Title

Enhancing COVID-19 Epidemic Forecasting Accuracy by Combining Real-time and Historical Data from Multiple Internet-Based Sources: Analysis of Social Media Data, Online News Articles, and Search Queries

Authors
Corresponding author: Huang, Wei
Publication date
2022-06-01
DOI
Journal
EISSN
2369-2960
Volume: 8, Issue: 6
Abstract
Background: The SARS-CoV-2 virus and its variants pose extraordinary challenges for public health worldwide. Timely and accurate forecasting of the COVID-19 epidemic is key to sustaining interventions and policies and efficient resource allocation. Internet-based data sources have shown great potential to supplement traditional infectious disease surveillance, and the combination of different Internet-based data sources has shown greater power to enhance epidemic forecasting accuracy than using a single Internet-based data source. However, existing methods incorporating multiple Internet-based data sources only used real-time data from these sources as exogenous inputs but did not take all the historical data into account. Moreover, the predictive power of different Internet-based data sources in providing early warning for COVID-19 outbreaks has not been fully explored.
Objective: The main aim of our study is to explore whether combining real-time and historical data from multiple Internet-based sources could improve the COVID-19 forecasting accuracy over the existing baseline models. A secondary aim is to explore the COVID-19 forecasting timeliness based on different Internet-based data sources.
Methods: We first used core terms and symptom-related keyword-based methods to extract COVID-19–related Internet-based data from December 21, 2019, to February 29, 2020. The Internet-based data we explored included 90,493,912 online news articles, 37,401,900 microblogs, and all the Baidu search query data during that period. We then proposed an autoregressive model with exogenous inputs, incorporating real-time and historical data from multiple Internet-based sources. Our proposed model was compared with baseline models, and all the models were tested during the first wave of COVID-19 epidemics in Hubei province and the rest of mainland China separately. We also used lagged Pearson correlations for COVID-19 forecasting timeliness analysis.
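The abstract does not give the exact model specification, so the following is only a minimal sketch of what an autoregressive model with exogenous inputs can look like when it uses both the current ("real-time") value and past ("historical") values of each Internet-based series. The function names, the lag orders `p` and `q`, the ordinary least squares fit, and the synthetic data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def build_arx_design(y, X, p, q):
    """Build the regression matrix for an ARX(p, q) sketch.

    Each row for day t holds: an intercept, the p previous case counts
    y[t-1..t-p], and for every exogenous series its value on day t
    ("real-time") plus its q previous values ("historical").
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    start = max(p, q)
    rows = []
    for t in range(start, len(y)):
        ar_part = y[t - p:t][::-1]               # y[t-1], ..., y[t-p]
        exo_part = X[t - q:t + 1][::-1].ravel()  # X[t], X[t-1], ..., X[t-q]
        rows.append(np.concatenate(([1.0], ar_part, exo_part)))
    return np.array(rows), y[start:]

def fit_arx(y, X, p, q):
    """Estimate coefficients by ordinary least squares (an assumed choice)."""
    A, target = build_arx_design(y, X, p, q)
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return coef

# Synthetic, noise-free data (hypothetical): two exogenous series stand in
# for, e.g., search-query and microblog volumes.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
y_demo = np.zeros(200)
for t in range(1, 200):
    y_demo[t] = 0.5 + 0.6 * y_demo[t - 1] + 0.3 * X_demo[t, 0] + 0.2 * X_demo[t - 1, 1]

# Coefficient order: intercept, y[t-1], X[t,0], X[t,1], X[t-1,0], X[t-1,1]
coef_demo = fit_arx(y_demo, X_demo, p=1, q=1)
```

Because the synthetic series is generated without noise, the fit recovers the generating coefficients; with real surveillance data the lag orders would have to be chosen by validation.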
Results: Our proposed model achieved the highest accuracy in all 5 accuracy measures, compared with all the baseline models of both Hubei province and the rest of mainland China. In mainland China, except for Hubei, the COVID-19 epidemic forecasting accuracy differences between our proposed model (model i) and all the other baseline models were statistically significant (model 1, t=–8.722, P<.001; model 2, t=–5.000, P<.001; model 3, t=–1.882, P=.06; model 4, t=–4.644, P<.001; model 5, t=–4.488, P<.001). In Hubei province, our proposed model's forecasting accuracy improved significantly compared with the baseline model using historical new confirmed COVID-19 case counts only (model 1, t=–1.732, P=.09). Our results also showed that Internet-based sources could provide a 2- to 6-day earlier warning for COVID-19 outbreaks.
Conclusions: Our approach incorporating real-time and historical data from multiple Internet-based sources could improve forecasting accuracy for epidemics of COVID-19 and its variants, which may help improve public health agencies' interventions and resource allocation in mitigating and controlling new waves of COVID-19 or other relevant epidemics.
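The timeliness analysis mentioned in the abstract relies on lagged Pearson correlations. As a rough illustration only, the sketch below correlates an Internet-based signal on day t with case counts on day t + k for several lags k and reports the lag with the highest correlation. The function name, the lag convention, and the synthetic data are assumptions; the abstract does not state how the authors defined or selected lags.

```python
import numpy as np

def lagged_pearson(signal, cases, max_lag):
    """Pearson r between signal[t] and cases[t + k] for k = 0..max_lag.

    A high r at lag k suggests the signal moves about k days ahead of
    the case counts. Both series must have the same length.
    """
    signal = np.asarray(signal, dtype=float)
    cases = np.asarray(cases, dtype=float)
    if len(signal) != len(cases):
        raise ValueError("series must have the same length")
    result = {}
    for k in range(max_lag + 1):
        s = signal[:len(signal) - k]  # signal on day t
        c = cases[k:]                 # cases on day t + k
        result[k] = float(np.corrcoef(s, c)[0, 1])
    return result

# Synthetic example (hypothetical): cases copy the signal with a 3-day delay.
rng = np.random.default_rng(1)
signal_demo = rng.normal(size=80)
cases_demo = np.zeros(80)
cases_demo[3:] = signal_demo[:-3]

r_by_lag = lagged_pearson(signal_demo, cases_demo, max_lag=7)
best_lag = max(r_by_lag, key=r_by_lag.get)  # 3 by construction
```

On real data the correlation peak is usually much less sharp than in this constructed example, so the lag estimate is better read as a range than as a single value.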
Keywords
Related links: [Scopus record]
Language
English
University attribution
Corresponding author
Scopus record ID
2-s2.0-85132455883
Source database
Scopus
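Citation statistics
Times cited [WOS]: 0
Output type: Journal article
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/401643
Collection: College of Business
College of Business_Department of Information Systems and Management Engineering
Author affiliations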
1. School of Management, Xi'an Jiaotong University, Xi'an, China
2. Department of Information Systems, City University of Hong Kong, Hong Kong
3. National Center for Applied Mathematics Shenzhen, Shenzhen, China
4. College of Business, Southern University of Science and Technology, Shenzhen, China
5. Department of Information Systems and Intelligent Business, School of Management, Xi'an Jiaotong University, Xi'an, China
6. College of Public Health, University of Georgia, Athens, United States
7. School of Economics, University of Nottingham Ningbo China, Ningbo, China
8. School of Medicine and Health Management, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
Corresponding author affiliation: College of Business
Recommended citation
GB/T 7714
Li, Jingwei, Huang, Wei, Sia, Choon Ling, et al. Enhancing COVID-19 Epidemic Forecasting Accuracy by Combining Real-time and Historical Data from Multiple Internet-Based Sources: Analysis of Social Media Data, Online News Articles, and Search Queries[J]. JMIR Public Health and Surveillance, 2022, 8(6).
APA
Li, Jingwei, Huang, Wei, Sia, Choon Ling, Chen, Zhuo, Wu, Tailai, & Wang, Qingnan. (2022). Enhancing COVID-19 Epidemic Forecasting Accuracy by Combining Real-time and Historical Data from Multiple Internet-Based Sources: Analysis of Social Media Data, Online News Articles, and Search Queries. JMIR Public Health and Surveillance, 8(6).
MLA
Li, Jingwei, et al. "Enhancing COVID-19 Epidemic Forecasting Accuracy by Combining Real-time and Historical Data from Multiple Internet-Based Sources: Analysis of Social Media Data, Online News Articles, and Search Queries." JMIR Public Health and Surveillance 8.6 (2022).
Files in this item
No files are associated with this item.
