R&D result detail

Original title

Analysis of Speaker Diarization based on Bayesian HMM with Eigenvoice Priors

Type

WoS article

Original abstract

In our previous work, we introduced our Bayesian Hidden Markov Model with eigenvoice priors, which has recently been recognized as the state-of-the-art model for speaker diarization. In this paper, we present a more complete analysis of the diarization system. The inference of the model is fully described, and derivations of all update formulas are provided for a complete understanding of the algorithm. An extensive analysis of the effect, sensitivity, and interactions of all model parameters is provided, which may serve as a guide for their optimal setting. The newly introduced speaker regularization coefficient allows us to control the number of speakers inferred in an utterance. A naive speaker model merging strategy is also presented, which allows the variational inference to be driven out of local optima. Experiments for the different diarization scenarios are presented on the CALLHOME and DIHARD datasets.

Keywords

Hidden Markov Models, Bayes methods, Task analysis, Probabilistic logic, Training, Speech processing, Complexity theory

Authors

DIEZ SÁNCHEZ, M.; BURGET, L.; LANDINI, F.; ČERNOCKÝ, J.

RIV year

2020

Published

1 December 2020

ISSN

2329-9290

Journal

IEEE-ACM Transactions on Audio Speech and Language Processing

Volume

28

Issue

1

Country

United States of America

First page

355

Last page

368

Number of pages

14

URL

https://ieeexplore.ieee.org/document/8910412

BibTeX

@article{BUT161472,
  author="Mireia {Diez Sánchez} and Lukáš {Burget} and Federico Nicolás {Landini} and Jan {Černocký}",
  title="Analysis of Speaker Diarization based on Bayesian HMM with Eigenvoice Priors",
  journal="IEEE-ACM Transactions on Audio Speech and Language Processing",
  year="2020",
  volume="28",
  number="1",
  pages="355--368",
  doi="10.1109/TASLP.2019.2955293",
  issn="2329-9290",
  url="https://ieeexplore.ieee.org/document/8910412"
}

Documents

MDiez_IEEE_TASLP_2020
