Publication result detail
DELCROIX, M.; ŽMOLÍKOVÁ, K.; OCHIAI, T.; KINOSHITA, K.; NAKATANI, T.
Original title
Speaker activity driven neural speech extraction
English title
Type
Article in conference proceedings indexed in the WoS or Scopus database
Original abstract
Target speech extraction, which extracts the speech of a target speaker in a mixture given auxiliary speaker clues, has recently received increased interest. Various clues have been investigated such as pre-recorded enrollment utterances, direction information, or video of the target speaker. In this paper, we explore the use of speaker activity information as an auxiliary clue for single-channel neural network-based speech extraction. We propose a speaker activity driven speech extraction neural network (ADEnet) and show that it can achieve performance levels competitive with enrollment-based approaches, without the need for pre-recordings. We further demonstrate the potential of the proposed approach for processing meeting-like recordings, where speaker activity obtained from a diarization system is used as a speaker clue for ADEnet. We show that this simple yet practical approach can successfully extract speakers after diarization, which leads to improved ASR performance when using a single microphone, especially in high overlapping conditions, with a relative word error rate reduction of up to 25%.
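To make the idea in the abstract concrete, below is a minimal PyTorch sketch of activity-driven, mask-based extraction. It is not the paper's actual ADEnet architecture: the class name, layer sizes, and the masking formulation are illustrative assumptions. The target speaker's frame-level activity (e.g., produced by a diarization system) is appended to the mixture features, and a BLSTM predicts a time-frequency mask for that speaker.

import torch
import torch.nn as nn

class ActivityDrivenExtractor(nn.Module):
    """Sketch of activity-driven speech extraction (hypothetical, not
    the paper's exact ADEnet): the target speaker's frame-level
    activity is concatenated to the mixture magnitude spectrogram and
    a BLSTM predicts a time-frequency mask for that speaker."""

    def __init__(self, n_freq=257, hidden=256):
        super().__init__()
        # +1 input feature for the binary speaker-activity clue
        self.blstm = nn.LSTM(n_freq + 1, hidden, num_layers=2,
                             batch_first=True, bidirectional=True)
        self.mask = nn.Sequential(nn.Linear(2 * hidden, n_freq), nn.Sigmoid())

    def forward(self, mix_mag, activity):
        # mix_mag:  (batch, frames, n_freq) mixture magnitude spectrogram
        # activity: (batch, frames), 1 where the target speaker is active
        x = torch.cat([mix_mag, activity.unsqueeze(-1)], dim=-1)
        h, _ = self.blstm(x)
        m = self.mask(h)         # estimated mask for the target speaker
        return m * mix_mag       # masked (extracted) magnitude

# Toy usage: in practice the activity would come from diarization.
net = ActivityDrivenExtractor()
mix = torch.rand(1, 100, 257)
act = (torch.rand(1, 100) > 0.5).float()
target_est = net(mix, act)
print(target_est.shape)  # torch.Size([1, 100, 257])

The appeal of this conditioning, as the abstract notes, is that the clue requires no pre-recorded enrollment utterance: any system that can tell when the target speaker talks can drive the extractor.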
English abstract
Keywords
Speech extraction, Speaker activity, Speech enhancement, Meeting recognition, Neural network
Keywords in English
Authors
RIV year
2022
Published
06.06.2021
Publisher
IEEE Signal Processing Society
Place
Toronto
ISBN
978-1-7281-7605-5
Book
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Pages from
6099
Pages to
6103
Number of pages
5
URL
https://www.fit.vut.cz/research/publication/12479/
BibTeX
@inproceedings{BUT171749,
  author="DELCROIX, M. and ŽMOLÍKOVÁ, K. and OCHIAI, T. and KINOSHITA, K. and NAKATANI, T.",
  title="Speaker activity driven neural speech extraction",
  booktitle="ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
  year="2021",
  pages="6099--6103",
  publisher="IEEE Signal Processing Society",
  address="Toronto",
  doi="10.1109/ICASSP39728.2021.9414998",
  isbn="978-1-7281-7605-5",
  url="https://www.fit.vut.cz/research/publication/12479/"
}
Documents
delcroix_icassp2021_09414998