Title | Estimation and Correction of Relative Transfer Function for Binaural Speech Separation Networks to Preserve Spatial Cues |
Authors | Feng, Zicheng; Tsao, Yu; Chen, Fei |
Corresponding author | Feng, Zicheng |
Publication date | 2021 |
ISSN | 2640-009X |
ISBN | 978-1-6654-4162-9 |
Proceedings title | |
Pages | 1239-1244 |
Conference date | 14-17 Dec. 2021 |
Conference location | Tokyo, Japan |
Abstract | Deep learning based approaches have achieved great success in mono-channel and multi-channel speech separation, but few studies have addressed binaural output, let alone the preservation of spatial cues. Existing speech separation networks preserve spatial cues by improving the signal-to-noise ratio (SNR) of the separated speech, without regard for the different requirements of reducing noise and preserving spatial cues. This work proposed a framework to optimize spatial cue preservation for binaural speech separation. It consisted of a relative transfer function (RTF) corrector, which replaced the distorted RTF of the separated speech with a correct one, and an RTF estimator, which extracted the correct RTF. A new RTF estimator was designed to obtain an accurate RTF. The framework was evaluated on a binaural version of the WSJ0-2mix dataset, spatialized with anechoic head-related impulse responses. Experimental results showed that the proposed framework significantly reduced the interaural time difference (ITD) and interaural level difference (ILD) errors of existing binaural separation networks without notably sacrificing the SNR of the separated speech signals. |
Keywords | |
Institutional attribution | First; Corresponding |
Language | English |
Related links | [Scopus record] |
Indexed in | |
EI accession number | 20221211827082 |
EI controlled terms | Deep learning; Separation; Signal to noise ratio; Source separation; Speech; Speech analysis |
EI classification codes | Ergonomics and Human Factors Engineering:461.4; Information Theory and Signal Processing:716.1; Speech:751.5; Chemical Operations:802.3; Mathematics:921 |
Scopus record ID | 2-s2.0-85126710873 |
Source database | Scopus |
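Full-text link | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9689486 |
Output type | Conference paper |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/328033 |
Collection | College of Engineering_Department of Electrical and Electronic Engineering |
Author affiliations | 1. Southern University of Science and Technology, Department of Electrical and Electronic Engineering, China 2. Research Center for Information Technology Innovation, Academia Sinica, Taiwan |
First author affiliation | Department of Electrical and Electronic Engineering |
Corresponding author affiliation | Department of Electrical and Electronic Engineering |
First affiliation of first author | Department of Electrical and Electronic Engineering |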
Recommended citation (GB/T 7714) | Feng,Zicheng,Tsao,Yu,Chen,Fei. Estimation and Correction of Relative Transfer Function for Binaural Speech Separation Networks to Preserve Spatial Cues[C],2021:1239-1244. |
Files in this item | No files are associated with this item. |