InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write

Bibliographic Details
Title: InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write
Authors: Mitrevski, Blagoj; Rak, Arina; Schnitzler, Julian; Li, Chengkun; Maksai, Andrii; Berent, Jesse; Musat, Claudiu
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Description: Digital note-taking is gaining popularity, offering a durable, editable, and easily indexable way of storing notes in vectorized form, known as digital ink. However, a substantial gap remains between this way of note-taking and traditional pen-and-paper note-taking, a practice still favored by a vast majority. Our work, InkSight, aims to bridge the gap by empowering physical note-takers to effortlessly convert their work (offline handwriting) to digital ink (online handwriting), a process we refer to as Derendering. Prior research on the topic has focused on the geometric properties of images, resulting in limited generalization beyond their training domains. Our approach combines reading and writing priors, which allows a model to be trained without large amounts of paired samples, which are difficult to obtain. To our knowledge, this is the first work that effectively derenders handwritten text in arbitrary photos with diverse visual characteristics and backgrounds. Furthermore, it generalizes beyond its training domain to simple sketches. Our human evaluation reveals that 87% of the samples produced by our model on the challenging HierText dataset are considered a valid tracing of the input image and 67% look like a pen trajectory traced by a human. Interactive visualizations of 100 word-level model outputs for each of the three public datasets are available in our Hugging Face space: https://huggingface.co/spaces/Derendering/Model-Output-Playground. Model release is in progress.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2402.05804
Accession Number: edsarx.2402.05804
Database: arXiv