SoftCTC -- Semi-Supervised Learning for Text Recognition using Soft Pseudo-Labels
Kišš, Martin, Hradiš, Michal, Beneš, Karel, Buchal, Petr, Kula, Michal
arXiv.org Artificial Intelligence
This paper explores semi-supervised training for sequence tasks, such as Optical Character Recognition or Automatic Speech Recognition. We propose a novel loss function, SoftCTC, which extends CTC so that multiple transcription variants can be considered at the same time. This makes it possible to omit the confidence-based filtering step, which is otherwise a crucial component of pseudo-labeling approaches to semi-supervised learning. We demonstrate the effectiveness of our method on a challenging handwriting recognition task and conclude that SoftCTC matches the performance of a finely tuned filtering-based pipeline. We also evaluate SoftCTC in terms of computational efficiency, concluding that it is significantly more efficient than a naïve CTC-based approach for training on multiple transcription variants, and we make our GPU implementation public.
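To make the comparison baseline concrete, the following is a minimal sketch of one plausible reading of the naïve approach mentioned in the abstract: run the standard CTC forward algorithm once per transcription variant and combine the resulting likelihoods with per-variant weights. It is not the authors' SoftCTC implementation. The function names, the weighting scheme, and the toy inputs are assumptions made for illustration only.

```python
import math


def ctc_likelihood(probs, label, blank=0):
    """Standard CTC forward algorithm: returns p(label | probs).

    probs[t][c] is the per-frame probability of class c at frame t.
    label is a list of class indices without blanks.
    """
    # Interleave blanks around the label symbols: [a, b] -> [_, a, _, b, _]
    ext = [blank]
    for c in label:
        ext += [c, blank]
    n_states = len(ext)

    # Initialise: a path may start in the first blank or the first symbol.
    alpha = [0.0] * n_states
    alpha[0] = probs[0][ext[0]]
    if n_states > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * n_states
        for s in range(n_states):
            total = alpha[s]                      # stay in the same state
            if s >= 1:
                total += alpha[s - 1]             # advance by one state
            # Skip over a blank only between two different symbols.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid paths end in the last symbol or the trailing blank.
    return alpha[-1] + (alpha[-2] if n_states > 1 else 0.0)


def naive_multi_variant_loss(probs, variants):
    """Assumed naive baseline: one CTC forward pass per variant.

    variants is a list of (label, weight) pairs; the weights are a
    hypothetical stand-in for pseudo-label confidences.
    """
    total = sum(w * ctc_likelihood(probs, lab) for lab, w in variants)
    return -math.log(total)


# Toy example: 2 frames, 2 classes (0 = blank, 1 = a symbol), uniform outputs.
uniform = [[0.5, 0.5], [0.5, 0.5]]
loss = naive_multi_variant_loss(uniform, [([1], 0.5), ([], 0.5)])
```

Under this reading, the cost grows with the number of variants because every variant needs its own forward pass, which is the kind of overhead a single loss over all variants would avoid.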
Sep-19-2023
- Country:
- Asia (0.14)
- Europe > Czechia (0.14)
- North America > United States (0.14)
- Genre:
- Research Report > New Finding (0.68)
- Technology: