Publication result detail

Diacorrect: Error Correction Back-End for Speaker Diarization

HAN, J.; LANDINI, F.; ROHDIN, J.; DIEZ SÁNCHEZ, M.; BURGET, L.; CAO, Y.; LU, H.; ČERNOCKÝ, J.

Original title

Diacorrect: Error Correction Back-End for Speaker Diarization

Type

Paper in proceedings indexed in WoS or Scopus

Original abstract

In this work, we propose an error correction framework, named DiaCorrect, to refine the output of a diarization system in a simple yet effective way. This method is inspired by error correction techniques in automatic speech recognition. Our model consists of two parallel convolutional encoders and a transformer-based decoder. By exploiting the interactions between the input recording and the initial system's outputs, DiaCorrect can automatically correct the initial speaker activities to minimize the diarization errors. Experiments on 2-speaker telephony data show that the proposed DiaCorrect can effectively improve the initial model's results. Our source code is publicly available at https://github.com/BUTSpeechFIT/diacorrect.

Keywords

Speaker diarization, error correction, conversational telephone speech

Authors

HAN, J.; LANDINI, F.; ROHDIN, J.; DIEZ SÁNCHEZ, M.; BURGET, L.; CAO, Y.; LU, H.; ČERNOCKÝ, J.

RIV year

2025

Published

14.04.2024

Publisher

IEEE Signal Processing Society

Place

Seoul

ISBN

979-8-3503-4485-1

Book

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pages from

11181

Pages to

11185

Number of pages

5

URL

Full text in the Digital Library

BibTeX

@inproceedings{BUT189697,
  author="HAN, J. and LANDINI, F. and ROHDIN, J. and DIEZ SÁNCHEZ, M. and BURGET, L. and CAO, Y. and LU, H. and ČERNOCKÝ, J.",
  title="Diacorrect: Error Correction Back-End for Speaker Diarization",
  booktitle="ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)",
  year="2024",
  pages="11181--11185",
  publisher="IEEE Signal Processing Society",
  address="Seoul",
  doi="10.1109/ICASSP48485.2024.10446968",
  isbn="979-8-3503-4485-1",
  url="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10446968"
}

Documents