This master’s thesis explores the application of Artificial Intelligence (AI) to the digitization of unstructured documents that contain printed text, handwritten text, and numbers, a critical task in infrastructure management. As digitization progresses, handling such documents efficiently remains a considerable challenge because of their unstructured nature and the varying quality of the handwriting. The research evaluated several Optical Character Recognition (OCR) models, including Pytesseract, EasyOCR, KerasOCR, and docTR, to identify the most effective method for converting handwritten documents into digital, searchable formats. Each model was tested on a curated dataset of handwritten and printed documents of varying quality and complexity. The models were assessed on their ability to recognize characters and words accurately, handle multilingual documents, and process a mix of handwritten and printed content. Character Error Rate (CER) and Word Error Rate (WER) were used to quantify their accuracy.

The results reveal that each model has distinct strengths. Pytesseract excelled at converting high-quality images to text with minimal errors, while EasyOCR demonstrated robust recognition across multiple languages. KerasOCR and docTR proved effective at handling complex, unstructured documents, which the thesis attributes to their advanced AI architectures. Building on these technologies, the thesis proposes an optimized approach that integrates metadata extraction to improve the organization and searchability of digitized content. The proposed solution runs on both CPU and GPU platforms and reduces the time and resources required for manual processing, making it accessible to a broader audience.

This research contributes to the field by offering insights into the performance of different OCR models and by providing a practical, scalable solution for digitizing and managing unstructured handwritten documents. The solution promises to significantly improve the efficiency of document management, paving the way for future work in this area.
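For readers unfamiliar with the two accuracy metrics named in the abstract, the following is a minimal sketch of how CER and WER are commonly defined: the Levenshtein edit distance between the OCR output and the ground-truth text, divided by the length of the ground truth, counted in characters for CER and in words for WER. The function names (`edit_distance`, `cer`, `wer`) and the sample strings are illustrative assumptions; the abstract does not state how the thesis computed these metrics or which tooling it used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character-level edits / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word-level edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    # Hypothetical ground truth and OCR output, invented for illustration only.
    truth = "pipe diameter 250 mm"
    ocr_output = "pipe diametar 25O mm"
    print(f"CER = {cer(truth, ocr_output):.2f}")  # 2 edits / 20 characters
    print(f"WER = {wer(truth, ocr_output):.2f}")  # 2 wrong words / 4 words
```

Both metrics are error rates, so lower is better, and either can exceed 1.0 when the OCR output contains many spurious insertions. In the invented example, two misread characters give a CER of 0.10 but a WER of 0.50, which shows why the two metrics are usually reported together.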
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:ltu-105734 |
Date | January 2024 |
Creators | Qurban, Hamidullah Ehsani |
Publisher | Luleå tekniska universitet, Institutionen för system- och rymdteknik, hamqur-9@student.ltu.se |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |