Handwritten text generation and strikethrough characters augmentation

Bibliographic Details
Title: Handwritten text generation and strikethrough characters augmentation
Authors: Shonenkov, Alex; Karachev, Denis; Novopoltsev, Max; Potanin, Mark; Dimitrov, Denis; Chertok, Andrey
Publication Year: 2021
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition, I.7.5, I.4.6
Description: We introduce two data augmentation techniques, which, used with a Resnet-BiLSTM-CTC network, significantly reduce Word Error Rate (WER) and Character Error Rate (CER) beyond best-reported results on handwriting text recognition (HTR) tasks. We apply a novel augmentation that simulates strikethrough text (HandWritten Blots) and a handwritten text generation method based on printed text (StackMix), which proved to be very effective in HTR tasks. StackMix uses a weakly-supervised framework to get character boundaries. Because these data augmentation techniques are independent of the network used, they could also be applied to enhance the performance of other networks and approaches to HTR. Extensive experiments on ten handwritten text datasets show that HandWritten Blots augmentation and StackMix significantly improve the quality of HTR models.
Comment: 16 pages, 15 figures. arXiv admin note: substantial text overlap with arXiv:2108.11667
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2112.07395
Accession Number: edsarx.2112.07395
Database: arXiv