Publication result detail
Alexander Polok, Jiangyu Han, Dominik Klement, Samuele Cornell, Jan Černocký, Lukáš Burget
Original title
BUT System for the MLC-SLM Challenge
English title
Type
Paper in proceedings not indexed in WoS or Scopus
Original abstract
We present a two-speaker automatic speech recognition (ASR) system that combines DiCoW, a diarization-conditioned variant of Whisper, with DiariZen, a diarization pipeline built on top of Pyannote. We first evaluate both systems in out-of-domain (OOD) multilingual scenarios without any fine-tuning. In this setting, DiariZen consistently outperforms the baseline Pyannote diarization model, demonstrating strong generalization. Despite being fine-tuned on English-only data for target-speaker ASR, DiCoW retains solid multilingual performance, indicating that encoder modifications preserve Whisper’s multilingual capabilities. We then fine-tune both DiCoW and DiariZen on the MLC-SLM challenge data. The fine-tuned DiariZen continues to outperform the fine-tuned Pyannote baseline, while DiCoW sees further gains from domain adaptation. Our final system achieves a micro-average tcpWER/CER of 16.75% and ranks second in Task 2 of the MLC-SLM challenge. Lastly, we identify several labeling inconsistencies in the training data, such as missing speech segments and incorrect silence annotations, which can hinder diarization fine-tuning. We propose simple mitigation strategies to address these issues and improve system robustness.
English abstract
Keywords
DiCoW, Multilingual Multi-Talker ASR, DiariZen, Whisper
Keywords in English
Authors
RIV year
2026
Published
22 August 2025
Publisher
ISCA
Place
Pages from
23
Pages to
27
Number of pages
5
URL
https://www.isca-archive.org/mlcslm_2025/polok25_mlcslm.pdf
BibTeX
@inproceedings{BUT199410,
  author="Alexander {Polok} and Jiangyu {Han} and Dominik {Klement} and Samuele {Cornell} and Jan {Černocký} and Lukáš {Burget}",
  title="BUT System for the MLC-SLM Challenge",
  year="2025",
  pages="23--27",
  publisher="ISCA",
  address="ISCA",
  doi="10.21437/mlcslm.2025-6",
  url="https://www.isca-archive.org/mlcslm_2025/polok25_mlcslm.pdf"
}
Documents
polok_mlcslm_interspeech-2025_satelite workshop