Publication result detail
ZHANG, L.; STAFYLAKIS, T.; LANDINI, F.; DIEZ SÁNCHEZ, M.; SILNOVA, A.; BURGET, L.
Original title
Do End-to-End Neural Diarization Attractors Need to Encode Speaker Characteristic Information?
English title
Type
Paper in proceedings not indexed in WoS or Scopus
Original abstract
In this paper, we apply the variational information bottleneck approach to end-to-end neural diarization with encoder-decoder attractors (EEND-EDA). This allows us to investigate what information is essential for the model. EEND-EDA utilizes attractors, vector representations of speakers in a conversation. Our analysis shows that, attractors do not necessarily have to contain speaker characteristic information. On the other hand, giving the attractors more freedom to allow them to encode some extra (possibly speaker-specific) information leads to small but consistent diarization performance improvements. Despite architectural differences in EEND systems, the notion of attractors and frame embeddings is common to most of them and not specific to EEND-EDA. We believe that the main conclusions of this work can apply to other variants of EEND. Thus, we hope this paper will be a valuable contribution to guide the community to make more informed decisions when designing new systems.
English abstract
Keywords
End-to-End Neural Diarization, Speaker Characteristic Information
Keywords in English
Authors
Published
18.06.2024
Publisher
International Speech Communication Association
Place
Québec City
Book
Proceedings of Odyssey 2024: The Speaker and Language Recognition Workshop
Pages from
123
Pages to
130
Number of pages
8
URL
https://www.isca-archive.org/odyssey_2024/zhang24_odyssey.pdf
BibTeX
@inproceedings{BUT193432,
  author="ZHANG, L. and STAFYLAKIS, T. and LANDINI, F. and DIEZ SÁNCHEZ, M. and SILNOVA, A. and BURGET, L.",
  title="Do End-to-End Neural Diarization Attractors Need to Encode Speaker Characteristic Information?",
  booktitle="Proceedings of Odyssey 2024: The Speaker and Language Recognition Workshop",
  year="2024",
  pages="123--130",
  publisher="International Speech Communication Association",
  address="Québec City",
  doi="10.21437/odyssey.2024-18",
  url="https://www.isca-archive.org/odyssey_2024/zhang24_odyssey.pdf"
}
Documents
zhang_2024_odyssey