Abstract
Voice activity detection (VAD) identifies the presence or absence of human speech in each frame of a speech signal. Speech is easily detected in a clean signal, but detection accuracy decreases as the signal-to-noise ratio (SNR) decreases. Robust VAD improves the efficiency of speech-based automated applications such as speech enhancement, speaker identification, and hearing aid devices. In this paper, a new speech feature, “Peak of Log Magnitude Spectrum (PLMS)”, is introduced and used for VAD. PLMS, together with three existing acoustic features (MFCC, RASTA-PLP, and formant frequency), is used to train an SVM classifier for VAD. Experiments show that the PLMS coefficients play the most prominent role among these features. Experiments also show that the trained SVM classifier achieves higher VAD accuracy than the state-of-the-art methods it is compared with (Sohn VAD and G.729 VAD).
