Title

SPARSE ESTIMATION FOR CUBIC POLYNOMIAL SINGLE-INDEX MODEL

Other Title
三次多项式单指标模型的稀疏估计
Name
梅蕾蕾
Name (Pinyin)
MEI Leilei
Student ID
12032859
Degree Type
Master
Degree Discipline
0701 Mathematics
Discipline Category
07 Science
Supervisor
CHEN Xin
Supervisor's Affiliation
Department of Statistics and Data Science
Thesis Defense Date
2022-05-08
Thesis Submission Date
2022-06-21
Degree-Granting Institution
Southern University of Science and Technology
Place of Degree Conferral
Shenzhen
Abstract

The single-index model is a popular field of contemporary statistical research. Because a single-index model with sparsity can address the curse of dimensionality and high-dimensional data problems at the same time, sparse estimation of single-index models has received increasing attention. This thesis mainly considers the sparse-estimation problem for cubic polynomial single-index models. Within this semi-parametric class, the cubic polynomial single-index model approximates the link function of the single-index model by a cubic polynomial. Because the cubic polynomial single-index model assumes the form of the link function in advance, the error produced when estimating the link function of a traditional single-index model can, to a certain extent, be avoided. For sparse estimation, this thesis obtains sparse estimates by adding a Lasso penalty term.
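In symbols, the model described above can be sketched as follows; the notation here (index vector β, polynomial coefficients α_j, tuning parameter λ) is chosen for illustration and need not match the thesis's own:

```latex
% Cubic polynomial single-index model: the link function is
% approximated by a cubic polynomial in the index u = \beta^\top x.
y_i \;=\; \alpha_1(\beta^\top x_i) + \alpha_2(\beta^\top x_i)^2
        + \alpha_3(\beta^\top x_i)^3 + \varepsilon_i,
\qquad \|\beta\|_2 = 1,
% and a sparse estimate minimizes the Lasso-penalized least squares:
\min_{\alpha,\,\beta}\;
  \sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{3}\alpha_j(\beta^\top x_i)^j\Big)^2
  \;+\; \lambda\|\beta\|_1 .
```

The unit-norm condition on β is the usual identifiability constraint for single-index models, and the L1 term on β is what drives some coordinates of the estimated index exactly to zero.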
The literature review surveys recent estimation methods for single-index models and some variable-selection methods applicable to them. To handle variable selection, this thesis proposes an estimator for the cubic polynomial single-index model based on the Lasso penalty, the penalty function most commonly used in sparse-estimation problems. The great advantage of this approach is that it achieves two purposes at once: the model is estimated at the same time as variables are selected. To carry out the estimation, the thesis formulates the model with a Lasso penalty and derives the iterative steps of an algorithm that optimizes it. In the computation, Sequential Quadratic Programming is used to solve the resulting optimization problem. Optimizing the model yields an algorithmic procedure for the sparse-estimation problem of the cubic polynomial single-index model, from which the estimators are obtained. This sparse-estimation method inherits the advantages of both cubic polynomial estimation and the Lasso penalty, which is the innovation of this thesis.
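As a rough illustration of the optimization step, the penalized criterion can be handed to an off-the-shelf sequential quadratic programming solver. The function name, parameter choices, and starting values below are assumptions made for this sketch, not the thesis's implementation; in particular, the thesis derives its own iterative steps, and the non-smooth L1 term would be treated more carefully in a production method.

```python
import numpy as np
from scipy.optimize import minimize


def fit_cubic_si_lasso(X, y, lam=0.1):
    """Illustrative sparse fit of a cubic polynomial single-index model.

    Minimizes the mean squared error of a1*u + a2*u**2 + a3*u**3 with
    u = X @ b, plus lam * ||b||_1, under the identifiability constraint
    ||b||_2 = 1, using SciPy's SLSQP (an SQP method).
    """
    n, p = X.shape

    def objective(theta):
        b, a = theta[:p], theta[p:]
        u = X @ b
        pred = a[0] * u + a[1] * u**2 + a[2] * u**3
        return np.mean((y - pred) ** 2) + lam * np.abs(b).sum()

    # Deterministic start: equal index weights, unit polynomial coefficients.
    theta0 = np.concatenate([np.full(p, 1.0 / np.sqrt(p)), np.ones(3)])
    # Unit-norm constraint on the index vector for identifiability.
    cons = {"type": "eq", "fun": lambda th: th[:p] @ th[:p] - 1.0}
    res = minimize(objective, theta0, method="SLSQP",
                   constraints=[cons], options={"maxiter": 500})
    return res.x[:p], res.x[p:]
```

A direct call to a general-purpose solver like this is only a baseline; the finite-difference gradients SLSQP uses are inexact at the kinks of the L1 penalty, which is one motivation for deriving a dedicated iterative algorithm as the thesis does.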
In the simulation studies, this thesis specifies known data-generating models and randomly generates data sets from them. On these data sets, the iterative algorithm derived in the thesis is used to estimate the models. Compared with the results of other estimation methods, the simulation results verify the validity and accuracy of the proposed estimation and variable-selection method. In the last part, the thesis takes a widely used housing-price data set and applies the proposed estimation method to it for analysis; the results of the fitted model are consistent with the real data.
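A minimal sketch of the data-generating step used in such simulations is given below; the particular sparse index vector and the link g(u) = u + 0.5*u**3 are arbitrary illustrative choices, not the designs used in the thesis.

```python
import numpy as np


def make_dataset(n=200, p=8, noise=0.2, seed=1):
    """Draw (X, y) from a known sparse cubic single-index model.

    Only the first two of the p predictors are active; the true index
    vector and the link g(u) = u + 0.5*u**3 are illustrative choices.
    """
    rng = np.random.default_rng(seed)
    beta = np.zeros(p)
    beta[:2] = 1.0 / np.sqrt(2.0)  # sparse, unit-norm true index
    X = rng.standard_normal((n, p))
    u = X @ beta
    y = u + 0.5 * u**3 + noise * rng.standard_normal(n)
    return X, y, beta
```

Because the true index is known, estimates from any candidate method can be scored directly, e.g. by the angle between the estimated and true index vectors and by whether the inactive coordinates are estimated as zero.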

Other Abstract

The single-index model is a popular research area for contemporary scholars. Because a single-index model with sparsity can address the curse of dimensionality and high-dimensional data problems at the same time, sparse estimation of single-index models has received increasing attention. This thesis mainly considers the sparse-estimation problem of the cubic polynomial single-index regression model, which specifies the form of the link function of a single-index model as a cubic polynomial. Because the form of the link function is assumed in advance, the error produced when estimating the link function of a traditional single-index model can, to a certain extent, be avoided. For sparse estimation, the thesis obtains sparse estimates mainly by adding a Lasso penalty term.

The literature review surveys recent estimation methods for single-index models and some variable-selection methods applied to them. For variable selection, the thesis studies the application of the Lasso penalty to the cubic-polynomial-estimated single-index model. The Lasso penalty is the penalty function most commonly used in sparse-estimation problems; its great advantage is that it achieves two goals at once, completing model estimation while performing variable selection. To estimate the model, an objective with the Lasso penalty is constructed, and the iterative steps of an algorithm that optimizes it are derived. In the computation, a Sequential Quadratic Programming algorithm is used to solve the optimization problem. Optimizing the model yields the algorithmic procedure for the sparse-estimation problem of this cubic polynomial single-index model, from which its estimates are obtained. This sparse-estimation method inherits the advantages of both cubic polynomial single-index estimation and the Lasso penalty, which is an innovation of this thesis.

In the simulation part, known data models are constructed and data sets are randomly generated from them. On these data sets, the new algorithm is used for model estimation; comparison with the results of other single-index estimation methods shows that the simulations verify the validity and practicality of the proposed estimation and variable-selection method. Finally, the real-data analysis selects a widely used housing-price data set and models it with the proposed estimator; the results of the fitted model are consistent with reality.

Keywords
Other Keywords
Language
English
Training Category
Independent training
Year of Enrollment
2020
Year of Degree Conferral
2022-07

Degree Assessment Subcommittee
Department of Statistics and Data Science
Chinese Library Classification (CLC) Number
O212.1
Source Repository
Manually submitted
Output Type
Thesis
Item Identifier
http://sustech.caswiz.com/handle/2SGJ60CL/336418
Collection
College of Science_Department of Statistics and Data Science
Recommended Citation
GB/T 7714
Mei LL. SPARSE ESTIMATION FOR CUBIC POLYNOMIAL SINGLE-INDEX MODEL[D]. Shenzhen: Southern University of Science and Technology, 2022.
Files in This Item
12032859-梅蕾蕾-统计与数据科学 (5632 KB), restricted access (request full text)
