Title

基于安全编码和增量管理的 DNA 存储系统研发

Alternative Title
DEVELOPMENT OF A DNA STORAGE SYSTEM BASED ON SECURE CODING AND INCREMENTAL MANAGEMENT
Name
袁涛
Name (Pinyin)
YUAN Tao
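Student ID
12132597
Degree Type
Master's
Degree Discipline
Electronic Science and Technology
Discipline Category / Professional Degree Category
08 Engineering
Supervisor
曲强
Supervisor's Affiliation
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
Thesis Defense Date
2024-05-08
Thesis Submission Date
2024-07-03
Degree-Granting Institution
Southern University of Science and Technology
Place of Degree Conferral
Shenzhen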
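Abstract
  In the current era of the big-data deluge, DNA molecules, as a cutting-edge new information storage medium, offer outstanding storage density, an extremely long service life, and very low energy consumption. They significantly outperform traditional physical storage media and have therefore attracted wide attention. DNA storage is regarded as an effective way to overcome the future bottleneck of large-scale data storage. The technology carries digital information in the ordered arrangement of base pairs in DNA molecules, marking a deep interdisciplinary fusion of synthetic biology and modern computer storage technology.
  However, despite the great potential of DNA storage, its information security has so far been explored only to a limited extent. There is currently no generally applicable encryption method dedicated to DNA storage that also satisfies its coding requirements. In addition, support for multi-user editing of stored information remains inadequate, and no mature secure storage architecture suitable for team collaboration and practical deployment has yet been built. This study therefore focuses on DNA storage information security, an area in urgent need of further work, and seeks to fill this research gap.
  To address the lack of suitable encryption methods in DNA storage and the resulting risk to information security, this thesis proposes a DNA encryption and encoding method that combines chaotic-system encryption with fountain-code encoding. The DNA sequences it generates not only meet DNA storage requirements for information density, biological constraints, and error-correction performance, but also resist a variety of cryptographic attacks, safeguarding the stored data.
  To further improve security and optimize information management efficiency, and to address the core problems of current DNA storage data management, namely the high complexity of multi-party editing, low processing efficiency, and high economic cost, this thesis proposes an incremental information management method for secure DNA storage. Building on the DNA encryption and encoding method above, it introduces an asymmetric encryption algorithm to design a hybrid encryption method suited to multi-user environments, and constructs a DNA incremental storage model that requires no gene editing, enabling secure and flexible editing and management of information stored in DNA.
  On the basis of this theoretical work, the thesis then builds a simple, practical prototype of a DNA storage system based on secure coding and an incremental management mechanism. The system integrates the methods above and implements multiple data-processing modules of the DNA storage workflow, effectively meeting the practical need for secure editing and management of DNA-stored information in multi-user environments. It also presents the system's results visually, verifying that the system performs well and offers a good user experience when used online.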
Other Abstract

In the current era of big data, DNA molecules have emerged as a cutting-edge information storage medium. Their exceptional storage density, ultra-long lifespan, and minimal energy consumption significantly surpass those of traditional physical storage media, which has drawn considerable attention. DNA storage technology is regarded as an effective solution to the future bottleneck of large-scale data storage. It uses the ordered arrangement of base pairs in DNA molecules to carry digital information, marking a deep convergence of synthetic biology and modern computer storage technology.

However, despite the enormous potential of DNA storage, its information security remains underexplored. There is currently no universally applicable encryption method designed specifically for DNA storage that also satisfies its coding requirements. In addition, support for multi-user editing of stored information is still inadequate, and a mature, practical secure storage architecture suitable for team collaboration and real deployment has not yet been established. This research therefore focuses on DNA storage information security and aims to fill this gap.

To address the lack of suitable encryption methods for DNA storage, which undermines its information security, this thesis proposes a DNA encryption coding method that combines chaotic system encryption with fountain code encoding. The DNA sequences generated by this method not only meet the requirements of DNA storage in terms of information density, biological constraints, and error correction performance, but can also resist various cryptographic attacks, ensuring the security of the stored data.
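
The general idea of encrypting with a chaotic keystream and then mapping to constraint-satisfying DNA can be sketched as follows. This is only an illustration under stated assumptions and not the method of the thesis: the logistic map stands in for the unspecified chaotic system, the 2-bit-per-base mapping and the thresholds (`gc_lo`, `gc_hi`, `max_run`) are hypothetical, and the fountain-code stage is reduced to the screening step that fountain-code pipelines such as DNA Fountain apply to candidate sequences. A bare logistic-map XOR is not a secure cipher.

```python
from itertools import groupby

BASES = "ACGT"  # illustrative mapping: 00->A, 01->C, 10->G, 11->T


def logistic_keystream(x0: float, r: float, n: int, burn_in: int = 100) -> bytes:
    """Generate n key bytes by iterating the logistic map x -> r*x*(1-x).

    The logistic map is an assumed stand-in for the chaotic system.
    """
    x = x0
    for _ in range(burn_in):  # discard the transient
        x = r * x * (1.0 - x)
    out = bytearray()
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)
    return bytes(out)


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a keystream of the same length."""
    return bytes(d ^ k for d, k in zip(data, key))


def bytes_to_dna(data: bytes) -> str:
    """Map each byte to four bases, two bits per base, most significant first."""
    return "".join(BASES[(b >> s) & 3] for b in data for s in (6, 4, 2, 0))


def dna_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_dna; len(seq) must be a multiple of 4."""
    v = [BASES.index(c) for c in seq]
    return bytes(
        (v[i] << 6) | (v[i + 1] << 4) | (v[i + 2] << 2) | v[i + 3]
        for i in range(0, len(v), 4)
    )


def satisfies_constraints(seq: str, gc_lo: float = 0.4, gc_hi: float = 0.6,
                          max_run: int = 3) -> bool:
    """Check GC content and longest homopolymer run (hypothetical thresholds)."""
    gc = sum(c in "GC" for c in seq) / len(seq)
    longest = max(len(list(g)) for _, g in groupby(seq))
    return gc_lo <= gc <= gc_hi and longest <= max_run


# Round trip: encrypt, encode to bases, decode, decrypt.
key = (0.3141592, 3.99)  # (x0, r) act as the secret key in this sketch
plaintext = b"DNA storage"
cipher = xor_bytes(plaintext, logistic_keystream(*key, len(plaintext)))
strand = bytes_to_dna(cipher)
recovered = xor_bytes(dna_to_bytes(strand), logistic_keystream(*key, len(plaintext)))
assert recovered == plaintext
```

In a fountain-code setting, a candidate sequence for which `satisfies_constraints` returns `False` would simply be discarded and another encoded packet generated in its place, which is why the rateless property pairs naturally with biological constraints.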

To further enhance security and improve information management efficiency, this thesis addresses three core issues in current DNA storage data management: the high complexity of multi-party editing, low processing efficiency, and high economic cost. It proposes an incremental information management method tailored to secure DNA storage. Building on the DNA encryption coding method above, it introduces asymmetric encryption algorithms to design a hybrid encryption method suitable for multi-user environments. It also constructs a DNA incremental storage model that requires no gene editing, enabling secure, flexible editing and management of information stored in DNA.
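
One way to read "incremental storage without gene editing" is as an append-only log: the originally stored data is never modified, each edit is written as a new small record, and the current version is rebuilt by replaying the records at read time. The sketch below illustrates that reading only. The `Delta` record layout and the `apply_delta` and `reconstruct` helpers are hypothetical names introduced here, and the abstract does not say that the thesis's model works this way.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Delta:
    """One edit, stored as a new record instead of rewriting existing data.

    Hypothetical layout: replace `delete` bytes at `offset` with `insert`.
    """
    version: int
    offset: int
    delete: int
    insert: bytes


def apply_delta(data: bytes, d: Delta) -> bytes:
    """Apply a single edit to an in-memory copy of the data."""
    return data[:d.offset] + d.insert + data[d.offset + d.delete:]


def reconstruct(base: bytes, deltas: Iterable[Delta]) -> bytes:
    """Replay all edits in version order on top of the unchanged base."""
    out = base
    for d in sorted(deltas, key=lambda d: d.version):
        out = apply_delta(out, d)
    return out


base = b"DNA storage system"
log = [
    Delta(version=2, offset=0, delete=0, insert=b"A "),              # later edit
    Delta(version=1, offset=4, delete=7, insert=b"secure storage"),  # earlier edit
]
current = reconstruct(base, log)
```

Under this reading, each `Delta` would itself be encrypted and encoded into new strands, so an edit costs only the synthesis of the increment rather than a rewrite of the whole pool.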

Building on this theoretical work, the thesis implements a concise, practical prototype of a DNA storage system based on secure coding and incremental management. The system integrates the methods above and implements multiple core data processing functions of the DNA storage workflow, meeting the practical requirements for secure editing and management of information in multi-user environments. It also visualizes the system's operational results, validating its performance and user experience in online use.

Keywords
Other Keywords
Language
Chinese
Training Category
Independent training
Year of Enrollment
2021
Year of Degree Conferral
2024-06
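Degree Evaluation Subcommittee
Electronic Science and Technology
Chinese Library Classification Number
TP309.7
Source Repository
Manual submission
Output Type
Thesis
Item Identifier
http://sustech.caswiz.com/handle/2SGJ60CL/778878
Collection
Joint training with 中国科学院深圳理工大学(筹)
Recommended Citation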
GB/T 7714
袁涛. 基于安全编码和增量管理的 DNA 存储系统研发[D]. 深圳. 南方科技大学,2024.
Files in This Item
File name / size: 12132597-袁涛-中国科学院深圳理 (4278KB)
Access type: Restricted