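Title | TESTS OF HIGH-DIMENSIONAL COMPOSITIONAL DATA |
Alternative title | 高维成分数据的检验
|
Name | |
Name (pinyin) | LI Wenbo
|
Student ID | 12232887
|
Degree type | Master's
|
Degree program | 0701 Mathematics
|
Discipline category / professional degree category | 07 Science
|
Supervisor | |
Supervisor's affiliation | Department of Statistics and Data Science
|
Thesis defense date | 2024-05-20
|
Thesis submission date | 2024-07-04
|
Degree-granting institution | Southern University of Science and Technology
|
Place of degree conferral | Shenzhen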
|
Abstract | The human microbiome plays a crucial role in human health and disease. How to analyze microbiome data comprehensively and explore the relationship between microorganisms and human health is an active research topic. High-dimensional compositional data are an important type of microbiome data. Statistical inference on such data allows a deeper investigation of the potential relationships between the human microbiome and health and disease. This thesis focuses on hypothesis testing problems for high-dimensional compositional data, including one-sample and two-sample mean and covariance testing. We propose new testing methods that address the inapplicability of existing test statistics under the sum-to-one constraint of high-dimensional compositional data.
For mean testing, existing max-type test statistics are designed mainly for sparse high-dimensional compositional data; for data with weak, dense signals, their power decreases significantly. Sum-type test statistics are better suited to dense data, but most existing sum-type test statistics rely on the independent component model assumption and are not applicable to high-dimensional compositional data. We therefore modify the existing sum-type tests so that they apply to one-sample and two-sample mean testing of high-dimensional compositional data under more general conditions. Furthermore, we demonstrate the asymptotic independence of the sum-type and max-type test statistics and, on that basis, propose a max-sum combination test statistic that can handle both sparse and dense data. We establish the asymptotic distributions of these test statistics under the null hypothesis and analyze their power under the alternative hypothesis. Both theoretical derivations and numerical simulations indicate that the proposed max-sum test performs robustly regardless of the sparsity of the data.
We then consider the sphericity test for the covariance matrix of high-dimensional compositional data. We adopt John’s test statistic from classical multivariate analysis and modify it to be applicable to high-dimensional compositional data. To derive the asymptotic distribution of the modified John’s test statistic, we generalize the central limit theorem for linear spectral statistics of the sample covariance matrix of independent component data, so that it also applies to cases with a degenerate population covariance matrix, which include high-dimensional compositional data. Numerical simulations also show that the modified John’s test statistic maintains good power while controlling the empirical size. |
Keywords | |
Language | English
|
Training category | Independent training
|
Year of enrollment | 2022
|
Year of degree conferral | 2024-06
|
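Degree evaluation subcommittee | Mathematics
|
Chinese Library Classification number | O212.1
|
Source | Manual submission
|
Output type | Thesis |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/778956 |
Collection | College of Science_Department of Statistics and Data Science |
Recommended citation (GB/T 7714) |
Li WB. TESTS OF HIGH-DIMENSIONAL COMPOSITIONAL DATA[D]. Shenzhen: Southern University of Science and Technology, 2024.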
|
Files in this item | ||||||
File name/size | Document type | Version type | Access type | License | Action | |
12232887-李文博-统计与数据科学(2551KB) | -- | -- | Restricted access | -- | Request full text |