Abstract
Because English is so widely used in daily life, improving the quality of noisy English speech is an important research problem. This paper proposes an improved U-Net model for noisy English speech enhancement that incorporates an attention mechanism and an optimized loss function. The method was evaluated on the VoiceBank-DEMAND dataset, where the optimized loss function yielded better enhancement of noisy English speech. The model achieved a perceptual evaluation of speech quality (PESQ) score of 3.18 and a short-time objective intelligibility (STOI) score of 0.95, with CSIG, CBAK, and COVL scores of 4.41, 3.65, and 3.83, respectively. These results surpass those of other improved U-Net variants and of existing models. The findings validate the effectiveness of the proposed approach for enhancing noisy English speech and support its practical applicability.
