Publication result detail
DELCROIX, M.; ŽMOLÍKOVÁ, K.; OCHIAI, T.; KINOSHITA, K.; NAKATANI, T.
Original Title
Speaker activity driven neural speech extraction
Type
Paper in proceedings (conference paper)
Original Abstract
Target speech extraction, which extracts the speech of a target speaker in a mixture given auxiliary speaker clues, has recently received increased interest. Various clues have been investigated such as pre-recorded enrollment utterances, direction information, or video of the target speaker. In this paper, we explore the use of speaker activity information as an auxiliary clue for single-channel neural network-based speech extraction. We propose a speaker activity driven speech extraction neural network (ADEnet) and show that it can achieve performance levels competitive with enrollment-based approaches, without the need for pre-recordings. We further demonstrate the potential of the proposed approach for processing meeting-like recordings, where speaker activity obtained from a diarization system is used as a speaker clue for ADEnet. We show that this simple yet practical approach can successfully extract speakers after diarization, which leads to improved ASR performance when using a single microphone, especially in high overlapping conditions, with relative word error rate reduction of up to 25%.
Keywords
Speech extraction, Speaker activity, Speech enhancement, Meeting recognition, Neural network
RIV year
2022
Released
06.06.2021
Publisher
IEEE Signal Processing Society
Location
Toronto
ISBN
978-1-7281-7605-5
Book
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Pages from
6099
Pages to
6103
Pages count
5
URL
https://www.fit.vut.cz/research/publication/12479/
BibTex
@inproceedings{BUT171749,
  author="DELCROIX, M. and ŽMOLÍKOVÁ, K. and OCHIAI, T. and KINOSHITA, K. and NAKATANI, T.",
  title="Speaker activity driven neural speech extraction",
  booktitle="ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
  year="2021",
  pages="6099--6103",
  publisher="IEEE Signal Processing Society",
  address="Toronto",
  doi="10.1109/ICASSP39728.2021.9414998",
  isbn="978-1-7281-7605-5",
  url="https://www.fit.vut.cz/research/publication/12479/"
}
Documents
delcroix_icassp2021_09414998