Title

Hierarchical Spatial-temporal Masked Contrast for skeleton action recognition

Authors
Wenming Cao, Aoyu Zhang, Zhihai He, Yicha Zhang, Xinpeng Yin
Publication date
2024
DOI
Journal
IEEE Transactions on Artificial Intelligence
ISSN
2691-4581
Volume: PP, Issue: 99
Abstract
In the field of 3D action recognition, self-supervised learning has shown promising results but remains a challenging task. Previous approaches to motion modeling often relied on selecting features solely from the temporal or spatial domain, which limited the extraction of higher-level semantic information. Additionally, traditional one-to-one approaches in multilevel comparative learning overlooked the relationships between different levels, hindering the learning representation of the model. To address these issues, we propose the Hierarchical Spatial-temporal Masked network (HSTM) for learning 3D action representations. HSTM introduces a novel masking method that operates simultaneously in both the temporal and spatial dimensions. This approach leverages semantic relevance to identify meaningful regions in time and space, guiding the masking process based on semantic richness. This guidance is crucial for learning useful feature representations effectively. Furthermore, to enhance the learning of potential features, we introduce cross-level distillation (CLD) to extend the comparative learning approach. By training the model with two types of losses simultaneously, each level of the multi-level comparative learning process can be guided by levels rich in semantic information. This allows for more effective supervision of comparative learning, leading to improved performance. Extensive experiments conducted on the NTU-60, NTU-120, and PKU-MMD datasets demonstrate the effectiveness of our proposed framework. The learned action representations exhibit strong transferability and achieve state-of-the-art results.
Related links: [IEEE record]
University attribution
Other
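Document type: Journal article
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/803257
Collection: College of Engineering_Department of Electronic and Electrical Engineering
Author affiliations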
1.State Key Laboratory of Radio Frequency Heterogeneous Integration, Shenzhen University
2.Department of Electronic and Electrical Engineering, Southern University of Science and Technology
3.Mechanical Design Engineering, UTBM Université de Technologie de Belfort-Montbéliard
Recommended citation
GB/T 7714
Wenming Cao, Aoyu Zhang, Zhihai He, et al. Hierarchical Spatial-temporal Masked Contrast for skeleton action recognition[J]. IEEE Transactions on Artificial Intelligence, 2024, PP(99).
APA
Wenming Cao, Aoyu Zhang, Zhihai He, Yicha Zhang, & Xinpeng Yin. (2024). Hierarchical Spatial-temporal Masked Contrast for skeleton action recognition. IEEE Transactions on Artificial Intelligence, PP(99).
MLA
Wenming Cao, et al. "Hierarchical Spatial-temporal Masked Contrast for skeleton action recognition." IEEE Transactions on Artificial Intelligence PP.99 (2024).
Files in this item
There are no files associated with this item.

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.