Publication result detail
HAN, J.; LANDINI, F.; ROHDIN, J.; SILNOVA, A.; DIEZ, M.; ČERNOCKÝ, J.; BURGET, L.
Original title
Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization
English title
Type
Paper in proceedings indexed in WoS or Scopus
Original abstract
Self-supervised learning (SSL) models like WavLM can be effectively utilized when building speaker diarization systems but are often large and slow, limiting their use in resource-constrained scenarios. Previous studies have explored compression techniques, but usually for the price of degraded performance at high pruning ratios. In this work, we propose to compress SSL models through structured pruning by introducing knowledge distillation. Different from the existing works, we emphasize the importance of fine-tuning SSL models before pruning. Experiments on far-field single-channel AMI, AISHELL-4, and AliMeeting datasets show that our method can remove redundant parameters of WavLM Base+ and WavLM Large by up to 80% without any performance degradation. After pruning, the inference speeds on a single GPU for the Base+ and Large models are 4.0 and 2.6 times faster, respectively. Our source code is publicly available.
English abstract
Keywords
fine-tuning | knowledge distillation | model compression | speaker diarization | structured pruning | WavLM
Keywords in English
Authors
RIV year
2026
Published
17.08.2025
Publisher
International Speech Communication Association
Place
Rotterdam, The Netherlands
Book
Proceedings of the Annual Conference of the International Speech Communication Association Interspeech
Periodical
Interspeech
Country
Netherlands
Pages from
1583
Pages to
1587
Number of pages
5
URL
https://www.isca-archive.org/interspeech_2025/han25_interspeech.pdf
BibTeX
@inproceedings{BUT199389,
  author="Jiangyu {Han} and Federico Nicolás {Landini} and Johan Andréas {Rohdin} and Anna {Silnova} and Mireia {Diez Sánchez} and Jan {Černocký} and Lukáš {Burget}",
  title="Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization",
  booktitle="Proceedings of the Annual Conference of the International Speech Communication Association Interspeech",
  year="2025",
  journal="Interspeech",
  pages="1583--1587",
  publisher="International Speech Communication Association",
  address="Rotterdam, The Netherlands",
  doi="10.21437/Interspeech.2025-484",
  url="https://www.isca-archive.org/interspeech_2025/han25_interspeech.pdf"
}
Documents
han_interspeech_2025