background
logo
ArxivPaperAI

Unrestricted Global Phase Bias-Aware Single-channel Speech Enhancement with Conformer-based Metric GAN

Author:
Shiqi Zhang, Zheng Qiu, Daiki Takeuchi, Noboru Harada, Shoji Makino
Keyword:
Electrical Engineering and Systems Science, Audio and Speech Processing, Audio and Speech Processing (eess.AS), Sound (cs.SD)
journal:
--
date:
2024-02-13 00:00:00
Abstract
With the rapid development of neural networks in recent years, the ability of various networks to enhance the magnitude spectrum of noisy speech in the single-channel speech enhancement domain has become exceptionally outstanding. However, enhancing the phase spectrum using neural networks is often ineffective, which remains a challenging problem. In this paper, we found that the human ear cannot sensitively perceive the difference between a precise phase spectrum and a biased phase (BP) spectrum. Therefore, we propose an optimization method of phase reconstruction, allowing freedom on the global-phase bias instead of reconstructing the precise phase spectrum. We applied it to a Conformer-based Metric Generative Adversarial Networks (CMGAN) baseline model, which relaxes the existing constraints of precise phase and gives the neural network a broader learning space. Results show that this method achieves a new state-of-the-art performance without incurring additional computational overhead.
PDF: Unrestricted Global Phase Bias-Aware Single-channel Speech Enhancement with Conformer-based Metric GAN.pdf
Empowered by ChatGPT