Title

New Classification Algorithm Based on Dimension Reduction for High Dimensional Data

Alternative Title
基于降维的高维数据新分类算法
Name
钟瑞娟
Name (Pinyin)
ZHONG Ruijuan
Student ID
12132913
Degree Type
Master
Degree Discipline
0701 Mathematics
Discipline Category / Professional Degree Category
07 Science
Supervisor
CHEN Xin
Supervisor's Affiliation
Department of Statistics and Data Science
Thesis Defense Date
2023-05-07
Thesis Submission Date
2023-06-29
Degree Granting Institution
Southern University of Science and Technology
Degree Granting Location
Shenzhen
Abstract
Modern biological technologies generate high-dimensional datasets, often containing more than 20,000 features, from which classifiers must be built. Because sample sizes are typically small, developing accurate models is a significant challenge, and traditional methods are ill-equipped to handle the rapid growth in the dimensionality of observation vectors. To address this issue, and inspired by the linear optimal low-rank (LOL) method, we propose a novel supervised classification algorithm based on dimensionality reduction. Specifically, we employ both class-conditional means and class-conditional covariance in constructing the projection. In the class-conditional covariance part, we combine two dimensionality reduction methods: principal component analysis (PCA) and partial least squares (PLS). Using cross-validated classification error, we tune two parameters, the number p of top eigenvectors taken from the PCA projection matrix and the number q of top directions taken from the PLS projection matrix, to obtain the best combination of PCA and PLS. Experimental simulations confirm that the proposed method significantly reduces data dimensionality and improves classification efficiency while maintaining classification accuracy. Across eight simulation settings, it achieves lower misclassification rates than both traditional methods and the LOL algorithm. Finally, we analyze the advantages and disadvantages of different dimensionality reduction methods and outline directions for future research.
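To make the scheme concrete, the sketch below illustrates one plausible reading of the projection described in this abstract: the class-mean difference direction is stacked with the top p PCA eigenvectors of the class-centered data and the top q PLS directions, and (p, q) are tuned by cross-validated misclassification error. This is a minimal sketch assuming a two-class problem with numeric labels; scikit-learn's PCA and PLSRegression stand in for the thesis's own estimators, a linear discriminant classifier serves as a placeholder for the downstream classifier, and the helper names build_projection and cv_error are illustrative rather than taken from the thesis.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold

def build_projection(X, y, p, q):
    """Stack the class-mean difference with top-p PCA and top-q PLS directions."""
    classes = np.unique(y)                    # numeric labels assumed, e.g. 0/1
    mu = np.array([X[y == c].mean(axis=0) for c in classes])
    delta = (mu[0] - mu[1]).reshape(-1, 1)    # class-mean difference, d x 1
    # Center each sample by its class mean so PCA acts on the
    # class-conditional (within-class) covariance.
    Xc = X - mu[np.searchsorted(classes, y)]
    pca_dirs = PCA(n_components=p).fit(Xc).components_.T           # d x p
    pls_dirs = PLSRegression(n_components=q).fit(X, y).x_weights_  # d x q
    # Concatenate and orthonormalize so the projected features are not redundant.
    Q, _ = np.linalg.qr(np.hstack([delta, pca_dirs, pls_dirs]))    # d x (1+p+q)
    return Q

def cv_error(X, y, p, q, folds=5):
    """Cross-validated misclassification rate used to tune (p, q)."""
    errs = []
    for tr, te in StratifiedKFold(n_splits=folds).split(X, y):
        Q = build_projection(X[tr], y[tr], p, q)  # refit on the training fold only
        clf = LinearDiscriminantAnalysis().fit(X[tr] @ Q, y[tr])
        errs.append(np.mean(clf.predict(X[te] @ Q) != y[te]))
    return float(np.mean(errs))
```

Note that the projection is refit on each training fold, so the cross-validated error used to select (p, q) is not contaminated by the held-out samples; a small grid search over p and q then mirrors the tuning step described above.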
Keywords
Language
English
Training Category
Independent Training
Year of Enrollment
2021
Year Degree Conferred
2023-06
References

[1] Seagate. (2017, May 10). 数据时代2025 [Data age 2025]. Retrieved from https://www.seagate.com/files/www-content/ourstory/trends/files/data-age-2025-white-paper-simplified-chinese.pdf
[2] Fan, J., Feng, Y., & Tong, X. (2012). A road to classification in high dimensional space: The regularized optimal affine discriminant. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 74(4), 745–771. https://doi.org/10.1111/j.1467-9868.2011.01035.x
[3] Ma, Y., & Zhu, L. (2013). A review on dimension reduction. International Statistical Review, 81(1), 134-150. https://doi.org/10.1111/j.1751-5823.2012.00182.x
[4] Vogelstein, J. T., Bridgeford, E. W., Tang, M., et al. (2021). Supervised dimensionality reduction for big data. Nature Communications, 12(1), 2872. https://doi.org/10.1038/s41467-021-23102-2
[5] Marron, J. S., Todd, M. J., & Ahn, J. (2007). Distance-weighted discrimination. Journal of the American Statistical Association, 102(480), 1267–1271. https://doi.org/10.1198/016214507000001015
[6] Marron, J. S. (2015). Distance-weighted discrimination. Wiley Interdisciplinary Reviews: Computational Statistics, 7(2), 109–114. https://doi.org/10.1002/wics.1343
[7] Wang, B., & Zou, H. (2018). Another look at distance-weighted discrimination. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 80(1), 177–198. https://doi.org/10.1111/rssb.12220
[8] Bair, E., Hastie, T., Paul, D., & Tibshirani, R. (2006). Prediction by supervised principal components. Journal of the American Statistical Association, 101(473), 119–137. https://doi.org/10.1198/016214505000000628
[9] Shao, R., Hu, W., Wang, Y., & Qi, X. (2014). The fault feature extraction and classification of gear using principal component analysis and kernel principal component analysis based on the wavelet packet transform. Measurement, 54, 118-132. https://doi.org/10.1016/j.measurement.2014.04.020
[10] Shin, H., & Eubank, R. L. (2011). Unit canonical correlations and high-dimensional discriminant analysis. Journal of Statistical Computation and Simulation, 81(2), 167–178. https://doi.org/10.1080/00949650903575808
[11] Boulesteix, A. L. (2004). PLS dimension reduction for classification with microarray data. Statistical Applications in Genetics and Molecular Biology, 3(1), 1-30. https://doi.org/10.2202/1544-6115.1029
[12] Abdi, H. (2003). Partial least squares (PLS) regression. Encyclopedia of Social Sciences Research Methods. https://doi.org/10.4135/9781412950589
[13] Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21-27. https://doi.org/10.1109/TIT.1967.1053964
[14] Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273-297. https://doi.org/10.1007/BF00994018
[15] Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32. https://doi.org/10.1023/A:1010933404324
[16] Tharwat, A., Gaber, T., Ibrahim, A., & Hassanien, A. E. (2017). Linear discriminant analysis: A detailed tutorial. AI Communications, 30, 169-190.
[17] Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5), 1189-1232. https://doi.org/10.1214/aos/1013203451
[18] Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386-408. https://doi.org/10.1037/h0042519
[19] Stone, M., & Brooks, R. J. (1990). Continuum regression: Cross-validated sequentially constructed prediction embracing ordinary least squares, partial least squares and principal components regression. Journal of the Royal Statistical Society: Series B (Methodological), 52(2), 237-258. https://doi.org/10.1111/j.2517-6161.1990.tb01786.x
[20] Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions (with discussion). Journal of the Royal Statistical Society: Series B (Methodological), 36(2), 111-147. https://doi.org/10.1111/j.2517-6161.1974.tb00994.x
[21] Tharwat, A., Gaber, T., Ibrahim, A., & Hassanien, A. E. (2017). Linear discriminant analysis: A detailed tutorial. AI Communications, 30, 169-190.
[22] Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2, 433-459. https://doi.org/10.1002/wics.101
[23] Abdi, H. (2003). Partial least squares (PLS) regression. Encyclopedia of Social Sciences Research Methods.
[24] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 58, 267-288.
[25] Fan, J., Feng, Y., & Tong, X. (2012). A road to classification in high dimensional space: The regularized optimal affine discriminant. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 74, 745-771.
[26] Hastie, T., Tibshirani, R., & Wainwright, M. (2015). Statistical learning with sparsity: The lasso and generalizations. Chapman and Hall/CRC.
[27] Su, W., Bogdan, M., & Candès, E. J. (2017). False discoveries occur early on the lasso path. The Annals of Statistics, 45, 2133-2150.
[28] Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290, 2323-2326. https://doi.org/10.1126/science.290.5500.2323
[29] Tenenbaum, J. B., de Silva, V., & Langford, J. C. (2000). A global geometric framework for nonlinear dimensionality reduction. Science, 290, 2319-2323. https://doi.org/10.1126/science.290.5500.2319
[30] Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323, 533-536. https://doi.org/10.1038/323533a0
[31] Wang, Z., Yan, J., & Oates, T. (2019). Generalized autoencoder: A neural network framework for dimensionality reduction. IEEE Transactions on Neural Networks and Learning Systems, 30(9), 2720-2733. https://doi.org/10.1109/TNNLS.2018.2888232
[32] Tang, J., Liu, Z., Zhao, M., & Huang, D. (2019). Structural learning of hierarchical representations for neural network-based data classification. IEEE Transactions on Neural Networks and Learning Systems, 30(2), 431-443. https://doi.org/10.1109/TNNLS.2018.2822994
[33] Ma, W., Zhang, X., & Zhou, D. (2017). Extreme learning machine autoencoder for dimensionality reduction and feature extraction. IEEE Transactions on Cybernetics, 47(7), 1808-1819. https://doi.org/10.1109/TCYB.2016.2549486
[34] Liu, X., Huang, G., Lin, Z., & Liao, X. (2018). Batch layerwise encoding extreme learning machine with manifold regularization for large-scale sparse data processing. IEEE Transactions on Neural Networks and Learning Systems, 29(6), 2026-2039. https://doi.org/10.1109/TNNLS.2017.2748418
[35] LeCun, Y., Bengio, Y., & Hinton, G. (1998). Convolutional networks for images, speech, and time series. In The handbook of brain theory and neural networks (2nd ed., pp. 255-258). MIT Press.
[36] He, X., Yan, S., & Niyogi, P. (2004). Locality preserving projections. Advances in Neural Information Processing Systems, 16, 153-160. https://proceedings.neurips.cc/paper/2003/hash/849f3e834d0f3e1f1e3e5c5f5d5d5b5e-Paper.pdf
[37] Lee, D. D., & Seung, H. S. (2001). Algorithms for non-negative matrix factorization. In Advances in neural information processing systems (pp. 556-562).
[38] Paatero, P. (1994). Least squares formulation of robust non-negative factor analysis. Chemometrics and Intelligent Laboratory Systems, 25(2), 233-245.
[39] Friedlander, M. P., & Saul, L. K. (2006). Active set algorithms for nonnegative matrix factorization with the Kullback-Leibler divergence. Neural Computation, 18(9), 2148-2174.
[40] Ding, C., Li, T., & Jordan, M. I. (2010). Convex and semi-nonnegative matrix factorizations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(1), 45-55. https://doi.org/10.1109/TPAMI.2008.67
[41] García Cuesta, E. (2022). Supervised local maximum variance preserving (SLMVP) dimensionality reduction method (1.0). Zenodo. https://doi.org/10.5281/zenodo.6832623
[42] Hastie, T., Tibshirani, R., & Friedman, J. H. (2004). The elements of statistical learning: Data mining, inference, and prediction. Publishing House of Electronics Industry.
[43] LeCun, Y., Cortes, C., & Burges, C. (2015). MNIST handwritten digit database. http://yann.lecun.com/exdb/mnist/
[44] Bengio, Y., et al. (2004). Out-of-sample extensions for LLE, isomap, MDS, eigenmaps, and spectral clustering. In S. Thrun, L. K. Saul, & B. Schölkopf (Eds.), Advances in neural information processing systems (pp. 177–184). MIT Press.
[45] Fan, J., Feng, Y., Tong, X., & Yu, T. (2015). Multi-view sure independence screening. The Annals of Statistics, 43(1), 122-154. https://doi.org/10.1214/14-AOS1279
[46] Chatterjee, S. (2019). A new coefficient of correlation. Statistics & Probability Letters, 148, 25-29. https://doi.org/10.1016/j.spl.2019.01.004

Degree Evaluation Subcommittee
Mathematics
CLC Number
O212.1
Source Repository
Manual Submission
Output Type
Thesis
Item Identifier
http://sustech.caswiz.com/handle/2SGJ60CL/544548
Collection
College of Science / Department of Statistics and Data Science
Recommended Citation (GB/T 7714)
Zhong RJ. New Classification Algorithm Based on Dimension Reduction for High Dimensional Data[D]. 深圳: 南方科技大学, 2023.
Files in This Item
File Name/Size: 12132913-钟瑞娟-统计与数据科学 (3852 KB); Access Type: Restricted
