Abstract
A Fully Connected Deep Neural Network (FCDNN) is used for speech enhancement on a Hindi speech database contaminated by a diverse range of background noises. The database includes both stationary and nonstationary noises: Car Noise, Factory Noise, Machine Gun Noise, and Fighter Plane Noise. These noises are added artificially to the clean speech signal at input Signal-to-Noise Ratio (SNR) levels of −5, 0, 5, and 10 dB to simulate real-world scenarios with different levels of noise interference. Machine Gun Noise and Factory Noise are more nonstationary than Car Noise and Fighter Plane Noise. This distinction underlines the importance of evaluating speech enhancement systems under diverse noise conditions to assess their robustness in real-world applications. The proposed system demonstrates significant improvements in SNR, Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI) for all four noises. Even for speech corrupted by highly nonstationary Machine Gun Noise at an input SNR of −5 dB, an SNR improvement of 13.94 dB is observed, with a PESQ of 2.91 and an STOI of 0.94, which indicates that the quality and intelligibility of the recovered speech are retained. These results highlight the effectiveness of FCDNN-based approaches in removing both stationary and nonstationary background noises from corrupted speech signals. Overall, this research contributes to enhancing the quality and intelligibility of speech signals in noisy environments by leveraging deep learning techniques.
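The noise-mixing step described above can be illustrated with a minimal sketch. It assumes the standard power-ratio definition of SNR and scales the noise so that the mixture reaches a requested input SNR. The function names `mix_at_snr` and `measure_snr_db`, the tiling of short noise clips, and the synthetic test signals are illustrative assumptions, not details taken from the article.

```python
import numpy as np


def mix_at_snr(clean, noise, target_snr_db):
    """Return clean + scaled noise so that the input SNR equals target_snr_db.

    Hypothetical helper: assumes SNR = 10*log10(P_clean / P_noise),
    with power measured as the mean squared amplitude.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat the noise if it is shorter than the speech, then trim to length.
    if len(noise) < len(clean):
        reps = int(np.ceil(len(clean) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(clean)]

    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)

    # Gain that brings the noise power to p_clean / 10^(SNR/10).
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (target_snr_db / 10.0)))
    return clean + gain * noise


def measure_snr_db(reference, estimate):
    """SNR of `estimate` with respect to `reference`, in dB."""
    reference = np.asarray(reference, dtype=np.float64)
    error = reference - np.asarray(estimate, dtype=np.float64)
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(error ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    clean = np.sin(2.0 * np.pi * 220.0 * t)   # synthetic stand-in for speech
    noise = rng.standard_normal(8000)          # synthetic stand-in for noise

    for snr in (-5, 0, 5, 10):
        noisy = mix_at_snr(clean, noise, snr)
        print(snr, round(measure_snr_db(clean, noisy), 3))
```

Under the same assumption, an SNR improvement such as the one reported would be the difference between `measure_snr_db(clean, enhanced)` and `measure_snr_db(clean, noisy)`, where `enhanced` denotes the output of the enhancement system.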
