Publication result detail

Dereverberation and Beamforming in Robust Far-Field Speaker Recognition

MOŠNER, L.; PLCHOT, O.; MATĚJKA, P.; NOVOTNÝ, O.; ČERNOCKÝ, J.

Original title

Dereverberation and Beamforming in Robust Far-Field Speaker Recognition

Type

Paper in proceedings indexed in the WoS or Scopus database

Original abstract

This paper deals with robust speaker verification (SV) in far-field sensing. The robustness is verified on a subset of NIST SRE 2010 corpus retransmitted in multiple real rooms of different acoustics and captured with multiple microphones. We experimented with various data preprocessing steps including different approaches to dereverberation and beamforming applied to ad-hoc microphone arrays. We found that significant improvements in accuracy can be achieved with neural network-based generalized eigenvalue beamformer preceded by weighted prediction error dereverberation. We also explored the effect of data augmentation by adding various real or simulated room acoustic properties to the Probabilistic Linear Discriminant Analysis (PLDA) training dataset. As a result, we developed a speaker recognition system whose performance is stable across different room acoustic conditions. It yields 41.4% relative improvement in performance over the system without multi-channel processing tested on the cleanest microphone data. With the best combination of data preprocessing and augmentation, we obtained a performance close to the one we achieved with the original clean test data.

Keywords

speaker verification, beamforming, dereverberation, autoencoder

Authors

MOŠNER, L.; PLCHOT, O.; MATĚJKA, P.; NOVOTNÝ, O.; ČERNOCKÝ, J.

RIV year

2019

Published

2 September 2018

Publisher

International Speech Communication Association

Place

Hyderabad

Book

Proceedings of Interspeech 2018

ISSN

1990-9772

Periodical

Proceedings of Interspeech

Volume

2018

Issue

9

Country

French Republic

Pages from

1334

Pages to

1338

Page count

5

URL

BibTex

@inproceedings{BUT155103,
  author="Ladislav {Mošner} and Oldřich {Plchot} and Pavel {Matějka} and Ondřej {Novotný} and Jan {Černocký}",
  title="Dereverberation and Beamforming in Robust Far-Field Speaker Recognition",
  booktitle="Proceedings of Interspeech 2018",
  year="2018",
  journal="Proceedings of Interspeech",
  volume="2018",
  number="9",
  pages="1334--1338",
  publisher="International Speech Communication Association",
  address="Hyderabad",
  doi="10.21437/Interspeech.2018-2306",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/Interspeech_2018/abstracts/2306.html"
}

Documents