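Title | VARIABLE SELECTION BASED ON CONSTRAINED PROJECTION LEAST SQUARES FOR GENERALIZED PARTIAL LINEAR MODELS
Alternative Title | 基于约束投影最小二乘的广义部分线性模型变量选择
Name |
Student ID | 11930005
Degree Type | Master's
Degree Discipline | Mathematics
Advisor |
Thesis Defense Date | 2021-05-15
Thesis Submission Date | 2021-06-15
Degree-Granting Institution | Southern University of Science and Technology
Degree-Granting Location | Shenzhen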
Abstract | Generalized partial linear models are widely studied in statistical modeling, and the number of predictors usually exceeds the number of observations. However, only a finite number of covariates actually affect the response variable. Fan and Lv (2008) introduced the sure independence screening (SIS) procedure to reduce the dimensionality of the explanatory vector significantly while retaining the true model with overwhelming probability. But its sure screening property depends heavily on strong marginal correlations between the response and the important predictors, an assumption that rarely holds in practice. In this thesis, we extend the dimension reduced projection least-squares (DR-PLS) algorithm proposed by Jiang and Wang (2020) to carry out model selection in generalized partial linear models. We first extend ordinary least squares (OLS) estimation to high-dimensional generalized linear models. We then investigate the problem with employing OLS estimation in high-dimensional regression, namely that it fails to approximate the true fitting error. Finally, we optimize the loss function of the model under the oracle constraint and derive a projection least squares estimator (PLSE) to overcome this problem. The PLSE is essentially an ordinary least squares estimator based on the true feature space. To address the discontinuity points of the response, we introduce the concept of projection and show that the estimation error is affected only slightly and controllably. Asymptotically, we establish that our DR-PLS algorithm has the sure screening property. Furthermore, even when the response and the important variables are not strongly marginally correlated, our algorithm achieves model selection consistency with low computational complexity. Finally, we run numerical simulations in finite samples to examine the performance of the proposed screening approach.
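Alternative Abstract | Generalized partial linear models are widely studied in statistical modeling, and the number of predictors usually exceeds the number of observations. In practice, however, only a finite number of covariates affect the response variable. Fan and Lv (2008) proposed the sure independence screening (SIS) procedure to substantially reduce the dimension of the predictors while retaining the true model with very high probability. However, the sure screening property of this method relies on strong marginal correlations between the important variables and the response, an assumption that rarely holds in practice. This thesis extends the dimension reduced projection least-squares (DR-PLS) algorithm, proposed by Jiang and Wang (2020) for dimension reduction in high-dimensional linear models, to generalized partial linear models. We first extend ordinary least squares (OLS) estimation to high-dimensional generalized linear models. We then study the problem with using OLS estimation in high-dimensional regression: it fails to approximate the true fitting error. Finally, we optimize the loss function of the model under the oracle constraint and derive the projection least squares estimator (PLSE) to overcome this problem. The PLSE is essentially an ordinary least squares estimator based on the true feature space. To handle the discontinuity points of the response variable, we define a projection and prove that the estimation error is affected only slightly and controllably. We prove that our DR-PLS algorithm has the sure screening property and that, without assuming strong marginal correlation between the response and the important predictors, it achieves model selection consistency with low computational complexity. Numerical experiments demonstrate the finite-sample performance of the proposed screening method, which outperforms SIS in model selection under generalized partial linear models.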
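Keywords |
Other Keywords |
Language | English
Training Category | Independent training
Output Type | Thesis
Item Identifier | http://sustech.caswiz.com/handle/2SGJ60CL/229798
Collection | College of Science_Department of Statistics and Data Science
Author Affiliation | Southern University of Science and Technology
Recommended Citation (GB/T 7714) |
Ma Y. VARIABLE SELECTION BASED ON CONSTRAINED PROJECTION LEAST SQUARES FOR GENERALIZED PARTIAL LINEAR MODELS[D]. Shenzhen: Southern University of Science and Technology, 2021.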
Files in This Item |
File Name/Size | Document Type | Version Type | Access Type | License |
Variable selection b(725KB) | -- | -- | Restricted access | -- |