Title

BAYESIAN-BASED TENSOR RECOVERY FOR TRAFFIC DATA IMPUTATION

Alternative title
基于贝叶斯方法的交通数据张量补全 (Tensor Completion for Traffic Data Based on Bayesian Methods)
Name
黄荣平
Name (pinyin)
HUANG Rongping
Student ID
12232876
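Degree type
Master's
Degree major
0701 Mathematics
Discipline category / professional degree category
07 Science
Supervisor
杨丽丽
Supervisor's affiliation
Department of Statistics and Data Science
Thesis defense date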
2024-05-12
Thesis submission date
2024-06-19
Degree-granting institution
Southern University of Science and Technology
Place of degree conferral
Shenzhen
Abstract

        Traffic data are commonly collected by advanced sensors and exhibit spatiotemporal characteristics together with supplementary information. However, real-world uncertainties such as sensor failures and network fluctuations can cause missing data in intelligent transportation systems (ITS). Missing data make traffic forecasting, planning, and other tasks harder for the relevant departments. Consequently, handling missing traffic data remains a significant problem.
        For data imputation, traffic data are represented in different structures, such as vectors, matrices, and tensors, and corresponding techniques are explored for completing the data. As descriptions of the spatiotemporal characteristics of real-world traffic data become richer, studying high-dimensional traffic data has become an inevitable trend. At present, many emerging imputation models offer semantic interpretation based on the spatiotemporal characteristics of traffic data, and several of them adopt a tensor representation. The mainstream tensor imputation methods are tensor completion and tensor decomposition. However, common tensor decomposition techniques such as CP decomposition and Tucker decomposition face the challenge of selecting an appropriate tensor rank, which often demands significant computational resources and time. It is therefore necessary to explore a method that uses spatiotemporal information quickly and efficiently for tensor imputation of traffic data.

        An augmented tensor decomposition model (BACP) is proposed that applies the Multiplicative Gamma Process (MGP) shrinkage prior within a variational Bayes framework. The model addresses the rank estimation problem, and its structure allows semantic interpretation of the dimensional information of traffic data. Extensive experiments are conducted on three traffic data sets under two missing-data scenarios. The experiments show that the proposed BACP achieves the best accuracy among the baselines that determine the tensor rank. It can also interpret the dimensional information through the explicit patterns of the model, and it computes faster than traffic data imputation methods with similar structures.
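As a rough illustration of why an MGP shrinkage prior helps with rank estimation, the minimal sketch below draws component precisions from the standard MGP construction (the precision of component r is a cumulative product of Gamma-distributed factors) and samples CP factor columns with those precisions. It is not the BACP model from the thesis: the hyperparameters `a1` and `a2`, the tensor sizes, the choice to share one precision per component across all modes, and every function name are assumptions made for illustration only.

```python
import math
import random

def sample_mgp_precisions(rank, a1=2.0, a2=3.0, rng=random):
    """Draw tau_r = prod_{l<=r} delta_l, with delta_1 ~ Gamma(a1, 1)
    and delta_l ~ Gamma(a2, 1) for l >= 2 (hypothetical hyperparameters)."""
    taus, tau = [], 1.0
    for r in range(rank):
        shape = a1 if r == 0 else a2
        tau *= rng.gammavariate(shape, 1.0)  # arguments: shape, scale
        taus.append(tau)
    return taus

def sample_factor(n_rows, taus, rng=random):
    """Factor matrix whose r-th column has entries drawn from N(0, 1 / tau_r)."""
    return [[rng.gauss(0.0, 1.0 / math.sqrt(t)) for t in taus]
            for _ in range(n_rows)]

def cp_entry(factors, index):
    """One entry of a CP model: sum over r of the product of factor entries."""
    rank = len(factors[0][0])
    return sum(math.prod(f[i][r] for f, i in zip(factors, index))
               for r in range(rank))

def column_norms(factor):
    """Euclidean norm of each column, used here to inspect shrinkage."""
    rank = len(factor[0])
    return [math.sqrt(sum(row[r] ** 2 for row in factor)) for r in range(rank)]

rng = random.Random(0)                       # fixed seed for repeatability
taus = sample_mgp_precisions(rank=8, rng=rng)
factors = [sample_factor(n, taus, rng=rng) for n in (5, 4, 3)]  # assumed sizes
value = cp_entry(factors, (0, 1, 2))         # one entry of the sampled tensor
norms = column_norms(factors[0])             # per-component column norms
```

When `a2` is greater than 1, the precisions tend to grow with the component index, so later columns tend to have smaller norms. This is the sense in which surplus components are shrunk toward zero instead of the rank being fixed in advance. An inference procedure such as variational Bayes would then update these precisions from the observed entries.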

Keywords
Language
English
Training category
Independently trained
Year of enrollment
2022
Year of degree conferral
2024-07

Degree evaluation subcommittee
Mathematics
Chinese Library Classification number
O211.4
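Source repository
Manual submission
Output type
Thesis
Item identifier
http://sustech.caswiz.com/handle/2SGJ60CL/765686
Collection
Southern University of Science and Technology
College of Science_Department of Statistics and Data Science
Recommended citation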
GB/T 7714
Huang RP. BAYESIAN-BASED TENSOR RECOVERY FOR TRAFFIC DATA IMPUTATION[D]. Shenzhen. Southern University of Science and Technology, 2024.
Files in this item
File name/size  Document type  Version type  Access type  License
12232876-黄荣平-统计与数据科学 (1327KB)  --  --  Restricted access  --

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.