Abstract
In recent years, a simple representation of a speech excerpt has been proposed: a binary matrix that gives easy access to speaker-discriminant information. Beyond its time-related capabilities, this representation allows the system to work with temporal information derived from the sequential changes present in the binary representation. A new type of temporal information is proposed for use in speaker recognition systems, along with a new specificity selection approach that applies a mask in the cumulative vector space. In this space, temporal information can also be exploited to compensate for the effects of session variability. A new variability compensation method in the temporal space is proposed to remove the unwanted attributes of session variability and the attributes common among speakers, with the aim of increasing effectiveness in the speaker binary key paradigm. Experimental validation on the NIST-SRE framework demonstrates the efficiency of the proposed solutions, with an EER improvement of 9%. Combining the i-vector and binary approaches using the proposed methods shows the complementarity of the discriminative information exploited by each.
