Title | A Deep Learning Based Approach to Synthesize Intelligible Speech with Limited Temporal Envelope Information |
Authors | |
DOI | |
Publication date | 2022 |
ISSN | 2375-7477 |
ISBN | 978-1-7281-2783-5 |
Proceedings title | |
Pages | 1972-1976 |
Conference date | 11-15 July 2022 |
Conference location | Glasgow, Scotland, United Kingdom |
Abstract | Envelope waveforms can be extracted from multiple frequency bands of a speech signal, and they carry important intelligibility information for human speech communication. This study investigated whether a deep learning-based model using temporal envelope features could synthesize intelligible speech, and examined how reducing the number of temporal envelopes (from 8 to 2 in this work) affects the intelligibility of the synthesized speech. The short-time objective intelligibility (STOI) metric showed that, on average, speech synthesized by the proposed approach achieved higher STOI scores (i.e., 0.8) in each test condition, and a human listening test showed that the average word correct rate of eight listeners was higher than 97.5%. These findings indicate that the proposed deep learning-based system could, in the future, be an approach for synthesizing highly intelligible speech from limited envelope information. |
Keywords | |
University attribution | Other |
Related links | [IEEE record] |
Source database | IEEE |
Full-text link | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9871247 |
Citation statistics | Times cited [WOS]: 0 |
Output type | Conference paper |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/401559 |
Collections | College of Engineering_Department of Electrical and Electronic Engineering; College of Engineering_Department of Biomedical Engineering |
Author affiliations | 1. Department of Biomedical Engineering, National Yang Ming Chiao Tung University, Taipei, Taiwan; 2. Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China; 3. Department of Biomedical Engineering and Medical Device Innovation & Translation Center, National Yang Ming Chiao Tung University, Taipei, Taiwan |
Recommended citation (GB/T 7714) | Ching-Ju Hsiao, Fei Chen, Ji-Yan Han, et al. A Deep Learning Based Approach to Synthesize Intelligible Speech with Limited Temporal Envelope Information[C], 2022: 1972-1976. |
Files in this item | No files are associated with this item. |
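The abstract refers to temporal envelopes extracted from multiple frequency bands but does not say how they are obtained. The sketch below illustrates one classical vocoder-style way to do it, as an aid to reading the record rather than a description of the authors' method. Everything specific in it is an assumption: the function names (`band_edges`, `bandpass`, `extract_envelopes`), the 80-6000 Hz analysis range, the log-spaced bands, the second-order band-pass filters, and the 50 Hz envelope cutoff.

```python
import math


def band_edges(n_bands, f_lo=80.0, f_hi=6000.0):
    """Return n_bands + 1 log-spaced band edges in Hz (range is an assumption)."""
    ratio = (f_hi / f_lo) ** (1.0 / n_bands)
    return [f_lo * ratio ** i for i in range(n_bands + 1)]


def bandpass(signal, fs, f1, f2):
    """Second-order band-pass biquad with unity gain at the centre frequency."""
    f0 = math.sqrt(f1 * f2)          # geometric centre of the band
    q = f0 / (f2 - f1)               # quality factor from the bandwidth
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha           # b1 is zero for this filter form
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def envelope(band_signal, fs, cutoff=50.0):
    """Full-wave rectify, then smooth with a one-pole low-pass filter."""
    a = math.exp(-2.0 * math.pi * cutoff / fs)
    env = []
    y = 0.0
    for x in band_signal:
        y = (1.0 - a) * abs(x) + a * y
        env.append(y)
    return env


def extract_envelopes(signal, fs, n_bands=8, env_cutoff=50.0):
    """Return one temporal envelope per frequency band, lowest band first."""
    edges = band_edges(n_bands)
    return [
        envelope(bandpass(signal, fs, edges[i], edges[i + 1]), fs, env_cutoff)
        for i in range(n_bands)
    ]


if __name__ == "__main__":
    fs = 16000
    # A 1 kHz test tone stands in for speech here.
    tone = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(1600)]
    for n_bands in (8, 2):
        envs = extract_envelopes(tone, fs, n_bands=n_bands)
        print(n_bands, "bands ->", len(envs), "envelopes of", len(envs[0]), "samples")
```

Calling `extract_envelopes` with `n_bands=8` and then `n_bands=2` mirrors the range of envelope counts the abstract mentions. In such a setup the resulting envelopes would be the input features to a synthesis model, which this sketch does not attempt to reproduce.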