Title

Development of spatial multi-omics data integration method and cell segmentation method

Alternative title
空间多组学数据整合方法及细胞分割方法开发
Name
吴鑫超
Name (pinyin)
WU Xinchao
Student ID
12133069
Degree type
Master's
Degree major
0710 Biology
Discipline category / professional degree category
07 Science
Supervisor
靳文菲
Supervisor's affiliation
Department of Systems Biology
External supervisor
刘石平
External supervisor's affiliation
Hangzhou BGI Institute of Life Sciences (杭州华大生命科学研究院)
Thesis defense date
2024-05-13
Thesis submission date
2024-07-01
Degree-granting institution
Southern University of Science and Technology
Place of degree conferral
Shenzhen
Abstract

Spatial omics has advanced rapidly in recent years, driven by the development of a variety of spatial omics sequencing technologies and by great improvements in sequencing throughput and resolution. Among these technologies, spatially resolved CITE-seq can simultaneously profile the spatial expression of the transcriptome and proteome in the same tissue slice. However, the lack of analytical methods for spatial multi-omics data severely constrains the application and biological interpretation of these technologies. Further advancement and application of spatial omics therefore depend on the development of analytical methods for spatial omics data.

We developed GCAT, a method based on the graph cross-attention mechanism, to integrate spatial proximity information with transcriptomic and proteomic expression data. This allowed us to make full use of spatial multi-modal information and to reveal biological phenomena and mechanisms. We first evaluated the integration performance of GCAT on mouse spleen CITE-seq data. The results showed that GCAT effectively integrated transcriptomic and proteomic features and generated low-dimensional embeddings while correcting batch effects in the data. GCAT showed superior integration performance across different metrics compared with other single-cell multi-omics integration methods. Cell trajectory analysis revealed that the B cell subpopulations identified from GCAT embeddings were highly consistent with the developmental stages of B cells. On spatially resolved CITE-seq data, GCAT effectively integrated proteomic and transcriptomic information with spatial proximity relationships and accurately identified distinct functional regions of mouse spleen and thymus tissues. The distribution of these regions in the embedding space reflected their spatial proximity and functional connections. We also used GCAT to integrate Stereo-CITE-seq data from human tonsil tissue and found lymphoid follicles at different developmental stages and spatial locations in the tonsil, which further supported the effectiveness of GCAT in integrating spatial multi-omics information.

To make better use of spatial information, we also developed Starro for cell segmentation in high-resolution spatial transcriptomics data. The method is tailored to data generated by various spatial transcriptomics technologies and consists of two modules: nuclear segmentation and cytoplasmic expansion. Through comparative validation, we demonstrated the effectiveness of cytoplasmic expansion. Evaluation on various spatial transcriptomics datasets showed that Starro outperformed other methods, achieving the highest RNA signal intensity and the greatest concordance with the ground truth. These results suggested that Starro enabled better cell segmentation of spatial transcriptomics data.

In summary, we developed GCAT for spatial multi-omics data integration and Starro for cell segmentation in spatial transcriptomics, providing strong support for the integration and analysis of spatial multi-omics data. GCAT effectively integrated spatial proximity information with transcriptomic and proteomic expression data, offering new perspectives for understanding the structure and function of biological tissues. Starro addressed the challenge of cell segmentation in high-resolution spatial transcriptomics data, supporting single-cell-level studies of spatial transcriptomics.

 
Language
English
Training category
Independent training
Year of enrollment
2021
Year of degree conferral
2024-06
Degree evaluation subcommittee
Biology
Chinese Library Classification number
Q344
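Source database
Manual submission
Output type
Thesis
Item identifier
http://sustech.caswiz.com/handle/2SGJ60CL/778780
Collection
School of Life Sciences_Department of Biology
Recommended citation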
GB/T 7714
Wu XC. Development of spatial multi-omics data integration method and cell segmentation method[D]. Shenzhen: Southern University of Science and Technology, 2024.
Files in this item
File name/size | Document type | Version type | Access type | License
12133069-吴鑫超-生物系.pdf (8463KB) | -- | -- | Restricted access | --