Data selection for i-vector based automatic speaker verification anti-spoofing

ABSTRACT

State-of-the-art i-vector based automatic speaker verification (ASV) systems achieve high performance, and voice has thus become one of the most important biometric modalities for person authentication. However, like other biometrics, ASV systems are highly vulnerable to spoofing attacks. Developing countermeasures that detect spoofing attacks is therefore important for addressing concerns about the reliability of ASV systems. Recent studies have shown that a simple Gaussian mixture model (GMM) classifier outperforms i-vector countermeasures. In this study, we focus on improving the spoofing detection performance of the i-vector system using cosine and probabilistic linear discriminant analysis (PLDA) scoring. Experimental results on the ASVspoof 2015 database reveal that the data used to train the two key elements of the i-vector system, the universal background model (UBM) and the i-vector extractor (T-matrix), play an important role in spoofing detection performance. We study how training these two elements on different types of data (genuine/human or spoofed) affects spoofing detection performance. In particular, extracting i-vectors using a UBM trained on genuine (human) speech utterances and a T-matrix trained on both genuine and spoofed utterances leads to a 50% performance improvement in spoofing detection. With the proposed scheme, unlike in previous results, the i-vector countermeasure outperforms the GMM classifier. Finally, experimental results show that the recently proposed constant Q cepstral coefficients (CQCC) outperform standard Mel-frequency cepstral coefficients (MFCC).
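To make the cosine scoring mentioned in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that i-vectors have already been extracted, that each class (genuine, spoofed) is represented by the mean of its training i-vectors, and that the detection score is the difference between the two cosine similarities. The function names, the dimensionality, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def cosine_score(w_test, w_model):
    """Cosine similarity between a test i-vector and a model i-vector."""
    return float(
        np.dot(w_test, w_model)
        / (np.linalg.norm(w_test) * np.linalg.norm(w_model))
    )


def spoof_detection_score(w_test, genuine_ivectors, spoofed_ivectors):
    """Hypothetical detection score: higher means 'more likely genuine'.

    Assumption: each class model is the mean of its training i-vectors,
    and the score is the difference of the two cosine similarities.
    """
    genuine_model = genuine_ivectors.mean(axis=0)
    spoofed_model = spoofed_ivectors.mean(axis=0)
    return cosine_score(w_test, genuine_model) - cosine_score(w_test, spoofed_model)


# Synthetic stand-ins for extracted i-vectors (real ones come from a UBM and T-matrix).
rng = np.random.default_rng(0)
dim = 8  # toy dimensionality, far smaller than a practical i-vector

genuine_center = np.zeros(dim)
genuine_center[0] = 1.0
spoofed_center = np.zeros(dim)
spoofed_center[1] = 1.0

genuine_train = genuine_center + 0.05 * rng.standard_normal((20, dim))
spoofed_train = spoofed_center + 0.05 * rng.standard_normal((20, dim))

score_genuine = spoof_detection_score(genuine_center, genuine_train, spoofed_train)
score_spoofed = spoof_detection_score(spoofed_center, genuine_train, spoofed_train)
print(f"genuine-like test score: {score_genuine:.3f}")
print(f"spoofed-like test score: {score_spoofed:.3f}")
```

In this toy setup a genuine-like test vector receives a positive score and a spoofed-like one a negative score. In the setting the abstract describes, the choice of data used to train the UBM and the T-matrix changes the i-vectors themselves, and therefore the scores, without changing the scoring rule.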