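Title

Deep Integration and Mining of Single-Cell Multimodal Omics Data and Database Construction (深度整合与挖掘单细胞多模态组学数据及其数据库建设)

Alternative title
DEEP INTEGRATION AND MINING OF SINGLE CELL MULTI OMICS DATA AND DATABASE CONSTRUCTION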
Name
尹昌辉
Name (pinyin)
YIN Changhui
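Student ID
12032579
Degree type
Master's
Degree discipline
071007 Genetics
Discipline category / professional degree category
07 Science
Supervisor
Chen Shanyi (陈善义)
Supervisor's affiliation
First People's Hospital Affiliated to Southern University of Science and Technology (南方科技大学附属第一人民医院)
Thesis defense date
2023-05-18
Thesis submission date
2023-06-29
Degree-granting institution
Southern University of Science and Technology
Place of degree conferral
Shenzhen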
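Abstract

Single-cell multi-omics sequencing, developed in recent years, measures multiple features of the same cell at single-cell resolution, and is widely used to systematically identify key cellular components and to investigate cellular heterogeneity. Profiling the transcriptome and chromatin accessibility within single cells is an effective way to dissect gene regulatory programs in complex tissues, and several technologies have been developed to sequence both modalities in the same cell. However, many studies examine only certain genes or regulatory factors, leaving the other useful information in the sequencing data unused, and no single-cell multi-omics database yet exists to collect such data.

This study collected publicly available datasets in which the transcriptome and chromatin accessibility were sequenced in the same cell. Through transcriptome analysis, chromatin accessibility analysis, and joint multi-omics analysis, we examined in depth the key information on cell types and cell states in each tissue. Based on the clustering results from multi-omics integration, we also investigated the highly variable genes in the transcriptome and the highly variable peaks in chromatin accessibility.

After organizing these results, we built SCMODB, the first online multi-omics database, to visualize the data above and promote the use of single-cell multi-omics data. The database is easy to use and interactive, so experimental scientists can readily access single-cell multi-omics data. Users can search for tissues of interest or search for genes of interest within a dataset, and a download function lets them carry out deeper analysis and mining of the data on the basis of the analyses provided here. Establishing the SCMODB online database can facilitate data sharing and exchange among research institutions, help scientists make better use of existing data, and advance research; it can also help reduce duplicated and wasted effort and improve research efficiency and the quality of results.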

Keywords
Language
Chinese
Training category
Independent training
Year of enrollment
2020
Year of degree conferral
2023-06-30

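Degree evaluation subcommittee
Biology
Chinese Library Classification number
Q811.4
Source database
Manual submission
Output type: Thesis
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/544506
Collection: School of Medicine, Southern University of Science and Technology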
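Recommended citation
GB/T 7714
YIN Changhui. Deep Integration and Mining of Single-Cell Multimodal Omics Data and Database Construction[D]. Shenzhen: Southern University of Science and Technology, 2023.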
Files in this item
File name/size | Document type | Version type | Access type | License
12032579-尹昌辉-南方科技大学医 (6300KB) | - | - | Restricted access | -