Title

A Segmented Auditory Attention Decoding Model and Its Application to Neurofeedback Based Target Speech Perception

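Alternative title
分段注意力解码方法及其在基于神经反馈的目标语音感知中的应用 (A Segmented Attention Decoding Method and Its Application to Neurofeedback-Based Target Speech Perception)
Name
Student ID
11650012
Degree type
Doctoral
Degree major
Electronic and Electrical Engineering
Discipline category / professional degree category
Electronic and Electrical Engineering
Supervisor
陈霏
External supervisor
Ed X. Wu
Thesis defense date
2022-03-09
Thesis submission date
2022-03-11
Degree-granting institution
The University of Hong Kong
Place of degree conferral
Hong Kong
Abstract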

Human listeners can effortlessly perceive a target speech stream in complex auditory scenes. Neuroimaging techniques such as electroencephalography (EEG) have been widely used to study the neural mechanisms of target speech perception and to decode patterns of auditory attention modulation in such scenes. Previous behavioral and neurological studies demonstrated that target speech perception depends on the regular hierarchical structures of speech; nevertheless, little is known about how auditory attention modulation interacts with different speech segments in target speech perception. This doctoral study mainly aimed to reveal the mechanism underlying auditory attention modulation for different root-mean-square (RMS)-level-based speech segments, and to develop advanced auditory attention decoding (AAD) methods that further support target speech perception in complex auditory scenes.
First, the contribution of different RMS-level-based segments to speech perception was examined through behavioral and neurological experiments. Behavioral results showed that different RMS-level-based segments, which carry distinct information, played different roles in speech intelligibility. In addition, neurological experiments demonstrated that each type of RMS-level-based speech segment elicited a specific cortical response pattern under auditory attention modulation, indicating that target speech perception was jointly affected by the type of RMS-level-based segment and by auditory attention modulation. These findings provide a new perspective on how auditory attention modulation shapes speech perception in complex auditory scenes.
Second, a speech-RMS-level-based segmented AAD model was proposed to improve AAD performance over a wide range of signal-to-masker ratios (SMRs). The proposed segmented AAD model consisted of three steps. First, a support vector machine classifier used the EEG signals to predict whether the perceived auditory stimulus belonged to a higher- or lower-RMS-level-based speech segment. Next, the speech envelope was reconstructed using the AAD model specific to that type of speech segment. Finally, the target speech was determined by comparing the correlation coefficients between the original and reconstructed speech envelopes. Compared with the traditional unified AAD model, which does not distinguish the functional roles of higher- and lower-RMS-level-based speech segments in AAD, the proposed segmented method significantly improved AAD accuracy, even at low SMRs and with short decoding windows.
Lastly, the proposed segmented AAD model was combined with advanced speech processing algorithms to develop an intention-adaptive speech signal processing system for competing-speaker environments. With a view to applying such a neurofeedback-based speech signal processing system in real-life scenes, subjects were required to focus their attention on one of the competing speakers or switch it between them, according to the experimental requirements. Results showed that cortical tracking of the target speech stream could be a reliable biomarker of the dynamics of auditory attention states. The neurofeedback-based intention-adaptive system could facilitate target speech perception at different SMRs when auditory attention was dynamically switched from one speaker stream to the other. These findings indicate that the neurofeedback-based speech separation system has the potential to improve target speech perception in complex auditory scenes.
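As a rough illustration of two ingredients of the segmented AAD idea described in the abstract, the sketch below labels speech frames as higher- or lower-RMS-level and then picks the target speaker by comparing correlation coefficients between a reconstructed envelope and the candidate envelopes. It is not the thesis implementation: the function names, the non-overlapping framing, and the 0 dB threshold relative to the whole-signal RMS are assumptions made for this example, and the SVM classifier and the EEG-based envelope reconstruction are not reproduced here.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def label_segments(signal, frame_len, threshold_db=0.0):
    """Label each non-overlapping frame as 'higher' or 'lower'.

    A frame is 'higher' when its RMS level, in dB relative to the RMS of
    the whole signal, is at or above threshold_db. The default threshold
    is an illustrative assumption, not a value taken from the abstract.
    """
    overall = rms(signal)
    if overall == 0:
        raise ValueError("signal is silent; relative RMS level is undefined")
    labels = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame_rms = rms(signal[start:start + frame_len])
        # Silent frames get -inf dB instead of triggering log10(0).
        if frame_rms > 0:
            level_db = 20 * math.log10(frame_rms / overall)
        else:
            level_db = -math.inf
        labels.append("higher" if level_db >= threshold_db else "lower")
    return labels


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    # A constant sequence has no defined correlation; treat it as 0.
    return cov / denom if denom > 0 else 0.0


def pick_target(reconstructed, candidate_envelopes):
    """Return (index, scores): the candidate envelope that correlates
    most strongly with the reconstructed envelope, plus all scores."""
    scores = [pearson(reconstructed, env) for env in candidate_envelopes]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy data: a loud frame followed by a quiet frame.
speech = [1.0, -1.0, 1.0, -1.0, 0.1, -0.1, 0.1, -0.1]
segment_labels = label_segments(speech, frame_len=4)

# Toy envelopes: the first candidate is a shifted copy of the reconstruction.
reconstructed_env = [0.1, 0.4, 0.9, 0.3]
speaker_envs = [[0.2, 0.5, 1.0, 0.4], [0.9, 0.3, 0.1, 0.6]]
target_index, corr_scores = pick_target(reconstructed_env, speaker_envs)
```

In a segmented pipeline of the kind the abstract describes, the segment label would come from the EEG-based classifier and would select which reconstruction model produces `reconstructed_env`; here both are replaced by toy inputs.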

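Keywords
Language
English
Training category
Joint training
Output type
Thesis
Item identifier
http://sustech.caswiz.com/handle/2SGJ60CL/406323
Collection
College of Engineering_Department of Electronic and Electrical Engineering
Recommended citation
GB/T 7714
Wang L. A Segmented Auditory Attention Decoding Model and Its Application to Neurofeedback Based Target Speech Perception[D]. Hong Kong: The University of Hong Kong, 2022.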
Files in this item
File name/size | Document type | Version type | Access type | License
A Segmented Auditory (10684KB) | -- | -- | Restricted access | --