Publication result details

SoftCTC-semi-supervised learning for text recognition using soft pseudo-labels

KIŠŠ, M.; HRADIŠ, M.; BENEŠ, K.; BUCHAL, P.; KULA, M.

Original title

SoftCTC-semi-supervised learning for text recognition using soft pseudo-labels

Type

WoS article

Original abstract

This paper explores semi-supervised training for sequence tasks, such as optical character recognition or automatic speech recognition. We propose a novel loss function, SoftCTC, which extends CTC so that multiple transcription variants can be considered at the same time. This makes it possible to omit the confidence-based filtering step, which is otherwise a crucial component of pseudo-labeling approaches to semi-supervised learning. We demonstrate the effectiveness of our method on a challenging handwriting recognition task and conclude that SoftCTC matches the performance of a finely tuned filtering-based pipeline. We also evaluate SoftCTC in terms of computational efficiency, concluding that it is significantly more efficient than a naive CTC-based approach for training on multiple transcription variants, and we make our GPU implementation public.

Keywords

CTC, SoftCTC, OCR, Text recognition, Confusion networks

Authors

KIŠŠ, M.; HRADIŠ, M.; BENEŠ, K.; BUCHAL, P.; KULA, M.

RIV year

2024

Published

6 October 2023

Book

International Journal on Document Analysis and Recognition

ISSN

1433-2825

Periodical

International Journal on Document Analysis and Recognition

Volume

2024

Issue

27

Country

Federal Republic of Germany

Pages from

177

Pages to

193

Number of pages

17

URL

https://link.springer.com/article/10.1007/s10032-023-00452-9

BibTeX

@article{BUT185136,
  author="Martin {Kišš} and Michal {Hradiš} and Karel {Beneš} and Petr {Buchal} and Michal {Kula}",
  title="SoftCTC-semi-supervised learning for text recognition using soft pseudo-labels",
  journal="International Journal on Document Analysis and Recognition",
  year="2023",
  volume="2024",
  number="27",
  pages="177--193",
  doi="10.1007/s10032-023-00452-9",
  issn="1433-2833",
  url="https://link.springer.com/article/10.1007/s10032-023-00452-9"
}