Writer Adaptive Hand-Written Text Recognition With Confidence-Based Ensemble : Developing and implementing a pipeline to transcribe Swedish documents

Hand-written text recognition (HTR) has emerged in recent years as a transformative technology that significantly assists the study of historical documents and thereby advances digital humanities research. Conventional optical character recognition (OCR) techniques are sensitive to particular writing styles and therefore do not adapt well to new writers. Our study develops an adaptive pipeline for HTR of Swedish hand-written documents, contributing to the study of Swedish history, including the modern and contemporary democratization process, through the Labour’s Memory and Demokrati 100 datasets. The pipeline integrates transfer learning, fine-tuning techniques, and a novel confidence-based ensemble strategy to reduce the transcription error rate. Our results show a substantial reduction in transcription error rate compared to baseline methods. Notably, our transfer learning model achieves a Character Error Rate (CER) of 6.664%. Adding the confidence-based ensemble strategy lowers the CER to 5.976%, significantly outperforming every individual model and the baseline. We further propose an optimization for transfer learning: fine-tuning only the recurrent and dense layers balances performance and computational efficiency. This approach makes training more time-efficient on large datasets without compromising accuracy, offering practical benefits for real-world applications. Furthermore, our analysis reveals critical insights into the challenges of baseline detection and ground truth accuracy. We identify over-segmentation as a bottleneck in baseline detection and highlight the importance of addressing systematic errors in ground truth data.
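
As a rough illustration of the two ideas the abstract relies on, the sketch below computes a Character Error Rate as total edit distance divided by total reference characters, and combines two models by keeping, for each text line, the output with the higher confidence score. The abstract does not describe how the thesis defines or aggregates confidence, so the per-line "pick the most confident model" rule, the function names, and the toy Swedish strings and scores are all assumptions made for illustration only, not the author's method or data.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Character Error Rate: total edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return total_edits / total_chars


def pick_most_confident(candidates: Sequence[Tuple[str, float]]) -> str:
    """Hypothetical ensemble rule: keep the text with the highest confidence."""
    return max(candidates, key=lambda c: c[1])[0]


# Toy data (invented): ground truth lines and two models' (text, confidence).
refs = ["hej världen", "god morgon"]
model_a = [("hej världen", 0.95), ("gud morgan", 0.60)]
model_b = [("hej varlden", 0.70), ("god morgon", 0.90)]

# Per-line selection across the two models.
ensemble: List[str] = [
    pick_most_confident([a, b]) for a, b in zip(model_a, model_b)
]

cer_a = cer(refs, [text for text, _ in model_a])
cer_b = cer(refs, [text for text, _ in model_b])
cer_ensemble = cer(refs, ensemble)
print(f"model A: {cer_a:.3f}, model B: {cer_b:.3f}, ensemble: {cer_ensemble:.3f}")
```

In this constructed example each model is wrong on a different line and is less confident where it is wrong, so per-line selection beats both individual models. That outcome depends entirely on confidence being well correlated with correctness, which is an assumption of the toy data.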

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-530066
Date: January 2024
Creators: Yang, Zhihao
Publisher: Uppsala universitet, Institutionen för informationsteknologi
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
Relation: IT ; mDA 24 003