Dirichlet process mixture models for clustering i-vector data

Shreyas Seshadri, Ulpu Remes, Okko Räsänen

Research output: Chapter in book/report/conference proceeding › Conference contribution › Scientific › Peer-reviewed


Non-parametric Bayesian methods have recently gained popularity in several research areas dealing with unsupervised learning. These models are capable of simultaneously learning the cluster models and the number of clusters from the properties of a dataset. The most commonly applied models combine Dirichlet process priors with Gaussian components and are known as Dirichlet process Gaussian mixture models (DPGMMs). Recently, von Mises-Fisher mixture models (VMMs) have also been gaining popularity in modeling high-dimensional unit-normalized features such as text documents and gene expression data. VMMs are potentially more efficient than GMM-based models in modeling certain speech representations such as i-vector data, as they work with unit-normalized features based on cosine distance. The current work investigates the applicability of Dirichlet process VMMs (DPVMMs) to i-vector-based speaker clustering and verification, and shows that they outperform DPGMMs in both tasks. In addition, we introduce a publicly available implementation of DPVMMs with variational inference.
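
As a rough illustration of why unit-normalized features pair naturally with cosine distance, the sketch below scores toy vectors against a mean direction using the unnormalized von Mises-Fisher log-density, kappa * mu^T x. It is not the authors' implementation: the function names, the toy 3-dimensional vectors, and the value of kappa are all assumptions made for illustration, and the normalizing constant of the density is deliberately omitted.

```python
import math


def unit_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Dot product of the unit-normalized versions of a and b."""
    a_hat, b_hat = unit_normalize(a), unit_normalize(b)
    return sum(x * y for x, y in zip(a_hat, b_hat))


def vmf_unnormalized_log_density(x, mu, kappa):
    """Return kappa * mu^T x for unit-normalized x and mu.

    The normalizing constant (which depends only on kappa and the
    dimension) is omitted, so values are comparable only for a fixed kappa.
    """
    return kappa * cosine_similarity(x, mu)


# Hypothetical toy data standing in for i-vectors (made up for illustration).
mean_direction = [1.0, 1.0, 0.0]
close = [2.0, 2.1, 0.1]   # points in nearly the same direction as the mean
far = [-1.0, 0.5, 2.0]    # points in a clearly different direction
kappa = 10.0              # assumed concentration parameter

score_close = vmf_unnormalized_log_density(close, mean_direction, kappa)
score_far = vmf_unnormalized_log_density(far, mean_direction, kappa)
print(score_close > score_far)
```

Because the score depends only on the angle between a vector and the mean direction, rescaling an input vector leaves its score unchanged, which is the property that makes this family a candidate for length-normalized representations.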
Title of host publication: Acoustics, Speech and Signal Processing (ICASSP) : 2017 IEEE International Conference on
Number of pages: 5
Publication date: 19 June 2017
ISBN (print): 978-1-5090-4117-6
Status: Published - 19 June 2017
Externally published: Yes
MoE publication type: A4 Article in a conference publication
Event: IEEE International Conference on Acoustics, Speech and Signal Processing - New Orleans, United States (USA)
Duration: 5 March 2017 to 9 March 2017
Conference number: 42


Name: International Conference on Acoustics, Speech, and Signal Processing
ISSN (electronic): 2379-190X


  • 112 Statistics
  • 113 Computer and information sciences
