Publication result detail
ZHANG, L.; STAFYLAKIS, T.; LANDINI, F.; DIEZ SÁNCHEZ, M.; SILNOVA, A.; BURGET, L.
Original Title
Do End-to-End Neural Diarization Attractors Need to Encode Speaker Characteristic Information?
Type
Paper in proceedings outside WoS and Scopus
Original Abstract
In this paper, we apply the variational information bottleneck approach to end-to-end neural diarization with encoder-decoder attractors (EEND-EDA). This allows us to investigate what information is essential for the model. EEND-EDA utilizes attractors, vector representations of speakers in a conversation. Our analysis shows that, attractors do not necessarily have to contain speaker characteristic information. On the other hand, giving the attractors more freedom to allow them to encode some extra (possibly speaker-specific) information leads to small but consistent diarization performance improvements. Despite architectural differences in EEND systems, the notion of attractors and frame embeddings is common to most of them and not specific to EEND-EDA. We believe that the main conclusions of this work can apply to other variants of EEND. Thus, we hope this paper will be a valuable contribution to guide the community to make more informed decisions when designing new systems.
Keywords
End-to-End Neural Diarization, Speaker Characteristic Information
Released
18.06.2024
Publisher
International Speech Communication Association
Location
Québec City
Book
Proceedings of Odyssey 2024: The Speaker and Language Recognition Workshop
Pages from
123
Pages to
130
Pages count
8
URL
https://www.isca-archive.org/odyssey_2024/zhang24_odyssey.pdf
BibTex
@inproceedings{BUT193432,
  author="ZHANG, L. and STAFYLAKIS, T. and LANDINI, F. and DIEZ SÁNCHEZ, M. and SILNOVA, A. and BURGET, L.",
  title="Do End-to-End Neural Diarization Attractors Need to Encode Speaker Characteristic Information?",
  booktitle="Proceedings of Odyssey 2024: The Speaker and Language Recognition Workshop",
  year="2024",
  pages="123--130",
  publisher="International Speech Communication Association",
  address="Québec City",
  doi="10.21437/odyssey.2024-18",
  url="https://www.isca-archive.org/odyssey_2024/zhang24_odyssey.pdf"
}
Documents
zhang_2024_odyssey