Southern University of Science and Technology Knowledge Commons (SUSTech KC): Comparing the Contributions of Amplitude and Phase to Speech Intelligibility in a Vocoder-based Speech Synthesis Model

Title	Comparing the Contributions of Amplitude and Phase to Speech Intelligibility in a Vocoder-based Speech Synthesis Model
Authors	Chen, Fei [1]; Chiao, Benson C. L. [2]; Int Speech Commun Assoc
Corresponding author	Chen, Fei
DOI	10.21437/Interspeech.2016-66
Publication date	2016
ISSN	1990-9772
Proceedings title	17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5
Volume	08-12-September-2016
Pages	1355-1358
Conference location	San Francisco, CA, United States
Place of publication	C/O EMMANUELLE FOXONET, 4 RUE DES FAUVETTES, LIEU DIT LOUS TOURILS, BAIXAS, F-66390, FRANCE
Publisher	ISCA-INT SPEECH COMMUNICATION ASSOC
Abstract	Vocoder-based speech synthesis models have long been used to assess the contribution of acoustic cues to speech recognition. This study compared the perceptual contributions of amplitude and phase by using two types of stimuli, i.e., amplitude- and phase-based vocoded stimuli. The amplitude-based vocoded stimuli were synthesized by preserving the amplitude fluctuation cue but discarding the phase cue (i.e., setting phase to zero), while the phase-based vocoded stimuli were synthesized by preserving the phase cue and discarding the amplitude cue (i.e., setting amplitude to unity). Listening experiments with normal-hearing participants showed, consistent with earlier studies, that the intelligibility scores of both amplitude- and phase-based vocoded stimuli increased when a larger number of channels was used in vocoder-based speech synthesis. In addition, under all tested conditions, the intelligibility scores of amplitude-based vocoded stimuli were significantly higher than those of phase-based vocoded stimuli, suggesting that amplitude might make a greater perceptual contribution than phase. This intelligibility advantage of amplitude over phase may be attributed to the difference in the amount of envelope information contained in the two types of vocoded stimuli.
Keywords	Speech intelligibility ; amplitude and phase ; vocoder simulation
Institutional attribution	First ; Corresponding
Language	English
Related links	[Source record]
Indexed in	CPCI-S ; ISSHP ; CPCI-SSH ; EI
Funding	National Natural Science Foundation of China[61571213]
WOS research areas	Acoustics ; Computer Science ; Engineering ; Linguistics
WOS categories	Acoustics ; Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic ; Linguistics
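WOS accession number	WOS:000409394400283
EI accession number	20164603003709
EI main headings	Speech communication ; Speech processing ; Speech recognition ; Speech synthesis ; Vocoders
EI classification codes	Speech:751.5 ; Sound Recording:752.2
Source database	Web of Science
Citation statistics	Times cited [WOS]: 0
Document type	Conference paper
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/24947
Collection	College of Engineering_Department of Electronic and Electrical Engineering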
Author affiliations	1. Southern Univ Sci & Technol, Dept Elect & Elect Engn, Shenzhen, Peoples R China; 2. Univ Hong Kong, Div Speech & Hearing Sci, Hong Kong, Hong Kong, Peoples R China
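First author affiliation	Department of Electronic and Electrical Engineering
Corresponding author affiliation	Department of Electronic and Electrical Engineering
First affiliation of the first author	Department of Electronic and Electrical Engineering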
Recommended citation (GB/T 7714)	Chen, Fei,Chiao, Benson C. L.,Int Speech Commun Assoc. Comparing the Contributions of Amplitude and Phase to Speech Intelligibility in a Vocoder-based Speech Synthesis Model[C]. C/O EMMANUELLE FOXONET, 4 RUE DES FAUVETTES, LIEU DIT LOUS TOURILS, BAIXAS, F-66390, FRANCE:ISCA-INT SPEECH COMMUNICATION ASSOC,2016:1355-1358.
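
Illustrative note: the abstract describes two stimulus types, one keeping amplitude with phase set to zero and one keeping phase with amplitude set to unity. The sketch below shows that idea in its simplest form, on a single Fourier frame with NumPy. It is not the authors' vocoder: this record does not specify their channel filters, envelope extraction, or synthesis steps, and the function name `split_amplitude_phase` and the test tone are hypothetical choices made only for illustration.

```python
import numpy as np


def split_amplitude_phase(signal):
    """Return (amplitude_only, phase_only) versions of a real 1-D signal.

    Assumption: one whole-signal Fourier frame stands in for the
    multichannel vocoder analysis used in the paper.
    """
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # Amplitude-based stimulus: keep the magnitude, set every phase to zero.
    amplitude_only = np.fft.irfft(magnitude, n=n)

    # Phase-based stimulus: keep the phase, set every magnitude to one.
    phase_only = np.fft.irfft(np.exp(1j * phase), n=n)

    return amplitude_only, phase_only


# Hypothetical input: a 440 Hz tone with a slow amplitude fluctuation.
sample_rate = 16000
t = np.arange(sample_rate // 10) / sample_rate
tone = (1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 440 * t)

amp_stimulus, phase_stimulus = split_amplitude_phase(tone)
```

In a channel-based vocoder the same split would presumably be applied per analysis band, which is why the number of channels matters for intelligibility in the reported results.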