Title

PractiCPP: a deep learning approach tailored for extremely imbalanced datasets in cell-penetrating peptide prediction

Authors
Corresponding authors: Jing, Bingyi; Gao, Xin
Publication date
2024-02-01
DOI
Journal
ISSN
1367-4803
EISSN
1367-4811
Volume: 40 Issue: 2
Abstract
Motivation: Effective drug delivery systems are paramount in enhancing pharmaceutical outcomes, particularly through the use of cell-penetrating peptides (CPPs). These peptides are gaining prominence due to their ability to penetrate eukaryotic cells efficiently without inflicting significant damage to the cellular membrane, thereby ensuring optimal drug delivery. However, the identification and characterization of CPPs remain a challenge due to the laborious and time-consuming nature of conventional methods, despite advances in proteomics. Current computational models, however, are predominantly tailored for balanced datasets, an approach that falls short in real-world applications characterized by a scarcity of known positive CPP instances.

Results: To navigate this shortfall, we introduce PractiCPP, a novel deep-learning framework tailored for CPP prediction in highly imbalanced data scenarios. Uniquely designed with the integration of hard negative sampling and a sophisticated feature extraction and prediction module, PractiCPP facilitates an intricate understanding and learning from imbalanced data. Our extensive computational validations highlight PractiCPP's exceptional ability to outperform existing state-of-the-art methods, demonstrating remarkable accuracy, even in datasets with an extreme positive-to-negative ratio of 1:1000. Furthermore, through methodical embedding visualizations, we have established that models trained on balanced datasets are not conducive to practical, large-scale CPP identification, as they do not accurately reflect real-world complexities. In summary, PractiCPP potentially offers new perspectives in CPP prediction methodologies. Its design and validation, informed by real-world dataset constraints, suggest its utility as a valuable tool in supporting the acceleration of drug delivery advancements.

Availability and implementation: The source code of PractiCPP is available on Figshare at https://doi.org/10.6084/m9.figshare.25053878.v1.
Related links: [Source record]
Indexed in
Language
English
Institutional attribution
Corresponding
Funding
NSFC[12371290] ; King Abdullah University of Science and Technology (KAUST) Office of Research Administration (ORA)["URF/1/4352-01-01","FCC/1/1976-44-01","FCC/1/1976-45-01","REI/1/5234-01-01","REI/1/5414-01-01","REI/1/5289-01-01","REI/1/5404-01-01"]
WOS research areas
Biochemistry & Molecular Biology ; Biotechnology & Applied Microbiology ; Computer Science ; Mathematical & Computational Biology ; Mathematics
WOS categories
Biochemical Research Methods ; Biotechnology & Applied Microbiology ; Computer Science, Interdisciplinary Applications ; Mathematical & Computational Biology ; Statistics & Probability
WOS accession number
WOS:001163271600006
Publisher
ESI subject category
BIOLOGY & BIOCHEMISTRY
Source database
Web of Science
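Citation statistics
Times cited [WOS]: 3
Document type: Journal article
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/789280
Collection: College of Science_Department of Statistics and Data Science
Author affiliations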
1.Syneron Technol, Guangzhou 510000, Peoples R China
2.Hong Kong Univ Sci & Technol, Individualized Interdisciplinary Program Data Sci, Hong Kong, Peoples R China
3.Hong Kong Univ Sci & Technol Guangzhou, Data Sci & Analyt Thrust, Guangzhou 511400, Guangdong, Peoples R China
4.Southern Univ Sci & Technol, Dept Stat & Data Sci, Shenzhen 518000, Peoples R China
5.King Abdullah Univ Sci & Technol KAUST, Comp Sci Program, Comp Elect & Math Sci & Engn Div, Thuwal 23955, Saudi Arabia
6.King Abdullah Univ Sci & Technol KAUST, Computat Biosci Res Ctr, Thuwal 23955, Saudi Arabia
Corresponding author affiliation: Department of Statistics and Data Science
Recommended citation
GB/T 7714
Shi, Kexin,Xiong, Yuanpeng,Wang, Yu,et al. PractiCPP: a deep learning approach tailored for extremely imbalanced datasets in cell-penetrating peptide prediction[J]. BIOINFORMATICS,2024,40(2).
APA
Shi, Kexin.,Xiong, Yuanpeng.,Wang, Yu.,Deng, Yifan.,Wang, Wenjia.,...&Gao, Xin.(2024).PractiCPP: a deep learning approach tailored for extremely imbalanced datasets in cell-penetrating peptide prediction.BIOINFORMATICS,40(2).
MLA
Shi, Kexin,et al."PractiCPP: a deep learning approach tailored for extremely imbalanced datasets in cell-penetrating peptide prediction".BIOINFORMATICS 40.2(2024).
Files in this item
There are no files associated with this item.

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.