81
Vyhodnocení testových formulářů pomocí OCR / Test form evaluation by OCR. Noghe, Petr. January 2013.
This thesis deals with the evaluation of test forms using optical character recognition. Image processing and the methods used for OCR are described in the first part of the thesis. In the practical part, a database of sample characters is created. The chosen method is based on correlation between the patterns and the recognized characters. The program is implemented in the MATLAB graphical environment. Finally, several forms are evaluated and the success rate of the proposed program is measured.
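The thesis program itself is not reproduced here, but the correlation approach it describes is simple to sketch. Below is a minimal Python illustration (the original used MATLAB), assuming pre-segmented character images and a template database of equal-size grayscale images; the function names are placeholders, not the author's code.

```python
import numpy as np

def normalized_correlation(a: np.ndarray, b: np.ndarray) -> float:
    """2-D correlation coefficient of two equal-size grayscale images."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def recognize_character(char_img: np.ndarray, templates: dict) -> str:
    """Return the label of the database template that best matches char_img.

    `templates` maps a character label to a template image of the same
    size as `char_img` (segmentation and resizing happen upstream).
    """
    return max(templates, key=lambda label: normalized_correlation(char_img, templates[label]))
```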
82
Improvement of Optical Character Recognition on Scanned Historical Documents Using Image Processing. Aula, Lara. January 2021.
As an effort to improve accessibility to historical documents, digitization of historical archives has been an ongoing process at many institutions since the origination of Optical Character Recognition. Old, scanned documents can contain deterioration acquired over time or caused by old printing methods. Common visual attributes seen in these documents are variations in style and font, broken characters, varying ink intensity, noise, and damage caused by folding or ripping, among others. Many of these attributes are unfavorable for modern Optical Character Recognition tools and can lead to failed character recognition. This study approaches the stated problem by using image processing methods to improve the result of character recognition. Furthermore, common image quality characteristics of scanned historical documents with unidentifiable text are analyzed. The Optical Character Recognition tool used to conduct this research was the open-source Tesseract software. Image processing methods such as Gaussian lowpass filtering, Otsu's optimum thresholding method, and morphological operations were used to prepare the historical documents for Tesseract. Using the precision and recall classification method, the OCR output was evaluated, and it was seen that recall improved by 63 percentage points and precision by 18 percentage points. This shows that using image pre-processing methods to increase the readability of historical documents for Optical Character Recognition tools is effective. Further, it was seen that characteristics especially disadvantageous for Tesseract are font deviations, occurrence of non-belonging objects, character fading, broken characters, and Poisson noise.
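As a rough sketch of the pre-processing chain named above (Gaussian lowpass filtering, Otsu thresholding, a morphological operation, then Tesseract), the following Python snippet uses OpenCV and pytesseract; the file name and kernel sizes are illustrative, not the thesis's actual settings.

```python
import cv2
import pytesseract

img = cv2.imread("historical_page.png", cv2.IMREAD_GRAYSCALE)

# Gaussian lowpass filter to suppress high-frequency noise.
blurred = cv2.GaussianBlur(img, (5, 5), 0)

# Otsu's method picks the global binarization threshold automatically.
_, binary = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# A small morphological opening removes residual specks.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
cleaned = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)

print(pytesseract.image_to_string(cleaned))
```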
83
CArDIS: A Swedish Historical Handwritten Character and Word Dataset for OCR. Thummanapally, Shivani; Rijwan, Sakib. January 2022.
Background: To preserve valuable sources and cultural heritage, digitization of handwritten characters is crucial. For this, Optical Character Recognition (OCR) systems were introduced and are widely used to recognize digital characters. In the case of ancient or historical characters, automatic transcription is more challenging due to lack of data, high complexity, and low quality of the source material. To address these problems, multiple image-based handwritten datasets have been collected from historical and modern document images, but these datasets also have limitations. To overcome the limitations, we were inspired to create a new image-based historical handwritten character and word dataset and evaluate its performance using machine learning algorithms. Objectives: The main objective of this thesis is to create the first Swedish historical handwritten character and word dataset, named CArDIS (Character Arkiv Digital Sweden), which will be publicly available for further research. In addition, we verify the correctness of the dataset and perform a quantitative analysis using different machine learning methods. Methods: Initially, we searched for existing character datasets to learn how modern character datasets differ from historical handwritten ones. We performed a literature review to identify the datasets most commonly used for OCR. We also studied different machine learning algorithms and their applications. Finally, we trained six different machine learning methods, namely Support Vector Machine, k-Nearest Neighbor, Convolutional Neural Network, Recurrent Neural Network, Random Forest, and SVM-HOG, on existing datasets and the newly created dataset to evaluate their performance and efficiency in recognizing ancient handwritten characters. Results: The evaluation results show that the machine learning classifiers struggle to recognize the ancient handwritten characters, achieving low recognition accuracy; of these, the CNN achieves the highest recognition accuracy. Conclusions: This thesis introduces the first historical handwritten character and word dataset in Swedish, named CArDIS. The character dataset contains 101,500 Latin and Swedish character images belonging to 29 classes, while the word dataset contains 10,000 word images of ten popular Swedish names belonging to 10 classes in RGB color space. The performance of six machine learning classifiers on CArDIS and existing datasets is also reported. The thesis concludes that classifiers trained on existing datasets and tested on the CArDIS dataset show low recognition accuracy, demonstrating that the CArDIS dataset has unique characteristics and features compared to existing handwritten datasets. Finally, this research provides the first Swedish character and word dataset, which is robust with a proven accuracy, and is publicly available for further research.
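A hedged sketch of the kind of quantitative analysis described, using scikit-learn: several of the classic classifiers named above trained on flattened character images and compared by test accuracy. Since CArDIS itself is not loaded here, scikit-learn's digits dataset stands in for it, and the CNN/RNN models from the thesis are omitted.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Stand-in data: replace with flattened CArDIS character images and labels.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

for name, clf in [("SVM", SVC()),
                  ("k-NN", KNeighborsClassifier()),
                  ("Random Forest", RandomForestClassifier())]:
    clf.fit(X_train, y_train)
    print(f"{name}: {clf.score(X_test, y_test):.3f}")  # test-set accuracy
```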
84
Retrofitting analogue meters with smart devices: A feasibility study of local OCR processes on an energy critical driven system. Andreasson, Joel; Ehrenbåge, Elin. January 2023.
Internet of Things (IoT) devices are becoming increasingly popular replacements for their analogue counterparts. However, there is still demand to keep analogue equipment that is already installed, such as analogue water meters, while also having automated monitoring of the equipment. A proposed solution to this problem is to install a battery-powered add-on component that can optically read meter values using Optical Character Recognition (OCR) and transmit the readings wirelessly. Two ways to do this are to offload the OCR process to a server, or to do the OCR processing locally on the add-on component. Since water meters are often located where reception is weak and the add-on component is battery powered, a suitable technology for data transmission is Long Range (LoRa) because of its low-power and long-range capabilities. Since LoRa has a low transfer rate, data transfers need to be kept small, which could make offloading a less favorable alternative compared to local OCR processing. The purpose of this thesis is therefore to research the feasibility, in terms of energy efficiency, of doing local OCR processing on the add-on component. The feasibility condition of this study is defined as being able to continually read an analogue meter over a 10-year lifespan while consuming under 2600 milliampere-hours (mAh) of energy. The two OCR algorithms developed for this study are a specialized OCR algorithm that utilizes pattern-matching principles, and a Sum of Absolute Differences (SAD) OCR algorithm. These two algorithms were compared against each other to determine which one is more suitable for the system. This comparison showed that the SAD algorithm was more suitable; it was then studied further using different image resolutions and settings to determine whether energy consumption could be reduced further. The results showed that it was possible to significantly reduce energy consumption by reducing the image resolution. The study also researched the possibility of reducing energy consumption further by not reading all digits on the tested water meter, depending on the measuring frequency and water flow. The study concluded that OCR processing is feasible on an energy-critical system when reading analogue meters, depending on the measuring frequency.
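The thesis's exact implementation is not reproduced here, but the SAD comparison at its core is straightforward. A minimal Python sketch, assuming cropped grayscale digit windows and one low-resolution reference image per digit (both assumptions, since the actual templates are not given):

```python
import numpy as np

def sad(window: np.ndarray, reference: np.ndarray) -> int:
    """Sum of absolute pixel differences between two equal-size grayscale images."""
    # Cast up from uint8 first so the subtraction cannot wrap around.
    return int(np.abs(window.astype(np.int16) - reference.astype(np.int16)).sum())

def read_digit(window: np.ndarray, references: list) -> int:
    """Return the digit whose reference image gives the lowest SAD score."""
    return int(np.argmin([sad(window, ref) for ref in references]))
```

Because SAD is only additions and subtractions over small images, lowering the image resolution shrinks the work roughly linearly, which is consistent with the energy savings the study reports.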
85
Analysis of the OCR System Application in Intermodal Terminals: Malmö Intermodal Terminal. Rubio Villalba, Ignacio. January 2020.
The analysis carried out in this thesis is made from two different points of view, qualitative and quantitative, using the case study of the Malmö intermodal terminal. The first analysis focuses on how the intermodal terminal works, which of its elements interact and how, in order to achieve the purpose of the terminal, and how the Intelligent Video Gate can affect this functioning, mainly in a positive way that allows the terminal to operate better. From the quantitative point of view, a timing and economic analysis of the Malmö intermodal terminal is carried out, based on the information obtained from the qualitative analysis and on data provided by the terminal operators. These allow different simulations to be run to compare the effect of implementing the Intelligent Video Gate in this specific terminal, results that could be extended to similar intermodal terminals located in regions with similar labour conditions that, like the European Union, have a highly standardized freight system. Finally, what the provided data show, despite not allowing the most complex and representative simulation, is that the aim of the Intelligent Video Gate is achieved successfully, with a great improvement in efficiency, which makes it possible to state with reasonable certainty that implementing the system is recommended in this kind of terminal.
86
Artificial Intelligence and Pattern Recognition Technologies for Cultural Heritage: Involvement of Optical Character Recognition Software for Citizen Science in the Processes for Crowdsourcing of Ancient Italian Texts. Ballerino, Julie. January 2022.
Cultural heritage refers to an extremely diverse set of sources. More specifically, historical artifacts as well as intangible elements of a community's history pertain to cultural heritage. However, when looking at the conservation, enrichment, and divulgation of these elements, the question becomes more complex, even more so in a context of nebulous regulation, unequal distribution of resources and funding among cultural heritage institutions, and a bureaucratically complex division of competencies between territories. This is the case in Italy, and more specifically Southern and Central Italy, where all these issues are present and further hinder the exploration of undiscovered historical material, as well as the organization and divulgation of discovered material. Following this discrepancy along the lines of legal and practical restrictions, this thesis aims to explore and evaluate how technology can mitigate these issues. As a methodological exercise, some ancient texts in Latin and Old Italian were scanned and processed with optical character recognition software. More specifically, this thesis applies the paradigm of citizen science for crowdsourcing to explore how well optical character recognition software works in terms of accessibility and efficiency. As such, this methodological exercise does not consist primarily of a technological evaluation but aims at opening up new ways for the public to interact with cultural heritage institutions and to exchange historical information while respecting the legal and practical considerations mentioned above. In conclusion, by highlighting this issue, it would be possible to further research and enrich the publicly available data on Italian educational history between the 18th and 19th centuries.
87
Complex Document Parsing with Vision Language Models. Yifei Hu. 17 December 2024.
This thesis explores the application of vision language models (VLMs) to document layout analysis (DLA) and optical character recognition (OCR). For document layout analysis, we found that VLMs excel at detecting text areas by leveraging their understanding of textual content, rather than relying solely on visual features. This approach proves more robust than traditional object detection methods, particularly for text-rich images typical in document analysis tasks. In addressing OCR challenges, we identified a critical bottleneck: the lack of high-quality, document-level OCR datasets. To overcome this limitation, we developed a novel synthetic data generation pipeline. This pipeline utilizes Large Language Models to create OCR training data by rendering markdown source text into images. Our experiments show that VLMs trained on this synthetic data outperform models trained on conventional datasets. This research highlights the potential of VLMs in document understanding tasks and introduces an innovative approach to generating training data for OCR. Our findings suggest that leveraging the dual image-text understanding capabilities of VLMs, combined with strategically generated synthetic data, can significantly advance the state of the art in document layout analysis and OCR.
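The pipeline's details are not spelled out in the abstract, so the following is only a hedged sketch of the rendering idea: pair source text with an image of it, so the text serves as a free ground-truth transcription. Pillow is used here for simplicity; the thesis renders LLM-generated markdown, which would need a proper layout engine.

```python
from PIL import Image, ImageDraw, ImageFont

def render_sample(text: str, path: str) -> None:
    """Render source text onto a white canvas and save it as a training image."""
    font = ImageFont.load_default()        # a real TTF would add font variety
    img = Image.new("RGB", (800, 200), "white")
    ImageDraw.Draw(img).multiline_text((10, 10), text, fill="black", font=font)
    img.save(path)

source = "# A heading\n\nA paragraph the model must transcribe."
render_sample(source, "sample_000.png")    # the image input...
# ...while `source` itself is the ground-truth transcription label.
```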
88
OCR of hand-written transcriptions of hieroglyphic text. Nederhof, Mark-Jan. 20 April 2016.
Encoding hieroglyphic texts is time-consuming. If a text already exists as a hand-written transcription, there is an alternative, namely OCR. Off-the-shelf OCR systems seem difficult to adapt to the peculiarities of Ancient Egyptian. Presented is a proof-of-concept tool designed to digitize texts of Urkunden IV in the handwriting of Kurt Sethe. It automatically recognizes signs and produces a normalized encoding, suitable for storage in a database or for display on a screen or on paper, requiring little manual correction.
The encoding of hieroglyphic text is RES (Revised Encoding Scheme) rather than (common dialects of) MdC (Manuel de Codage). Earlier papers argued against MdC and in favour of RES for corpus development. Arguments in favour of RES include the longevity of the encoding, as its semantics are font-independent. The present study provides evidence that RES is also much preferable to MdC in the context of OCR. With a well-understood parsing technique, the relative positioning of scanned signs can be straightforwardly mapped to suitable primitives of the encoding.
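As an illustration (not Nederhof's actual tool), the mapping from relative sign positions to grouping primitives could look like the following sketch, where the returned operator names are placeholders for the corresponding RES constructs:

```python
def group_operator(a, b):
    """Classify how sign b sits relative to sign a; boxes are (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    if bx >= ax + aw:            # b lies entirely to the right of a
        return "horizontal"      # would map to a RES horizontal group
    if by >= ay + ah:            # b lies entirely below a
        return "vertical"        # would map to a RES vertical group
    return "insertion"           # boxes overlap: one sign placed within another

print(group_operator((0, 0, 20, 20), (25, 0, 20, 20)))  # -> horizontal
```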
89
Capturing, Eliciting, and Prioritizing (CEP) Non-Functional Requirements Metadata during the Early Stages of Agile Software Development. Maiti, Richard Rabin. 01 January 2016.
Agile software engineering has been a popular methodology for developing software rapidly and efficiently. However, the Agile methodology often favors Functional Requirements (FRs), due to the nature of agile software development, and strongly neglects Non-Functional Requirements (NFRs). Neglecting NFRs has negative impacts on software products and has resulted in poor quality and higher costs to fix problems in later stages of software development.
This research developed the CEP ("Capture, Elicit, Prioritize") methodology to effectively gather NFR metadata from software requirement artifacts such as documents and images. The artifacts included the Optical Character Recognition (OCR) artifact, which gathered metadata from images, as well as the Database Artifact, NFR Locator Plus, the NFR Priority Artifact, and the Visualization Artifact. The gathered NFR metadata reduced false positives and allowed NFRs to be included in the early stages of software requirements gathering along with FRs. Furthermore, NFRs were prioritized using existing FR methodologies, which is important to stakeholders as well as software engineers in delivering quality software. This research built on prior studies by specifically focusing on NFRs during the early stages of agile software development.
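The NFR Locator Plus artifact is not described in detail in the abstract; the following is a hypothetical sketch of keyword-based NFR capture from requirement sentences, with made-up categories and indicator terms standing in for the real lexicon:

```python
NFR_INDICATORS = {
    "performance": ["response time", "throughput", "latency"],
    "security": ["encrypt", "authenticate", "authorization", "audit"],
    "usability": ["easy to use", "accessible", "intuitive"],
}

def detect_nfrs(sentence: str) -> list:
    """Return NFR categories whose indicator terms appear in the sentence."""
    s = sentence.lower()
    return [cat for cat, terms in NFR_INDICATORS.items()
            if any(term in s for term in terms)]

print(detect_nfrs("The system shall encrypt all bids and respond with low latency."))
# -> ['performance', 'security']
```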
Validation of the CEP methodology was accomplished by using the 26 requirements of the European Union (EU) eProcurement System. The NORMAP methodology was used as a baseline; in addition, the NERV methodology results were used for comparison. The results show that the CEP methodology successfully identified NFRs in 56 out of 57 requirement sentences that contained NFRs, compared to 50 for the baseline and 55 for the NERV methodology. The CEP methodology was thus successful in eliciting 98.24% of the requirement sentences, compared to 87.71% for the NORMAP methodology, an improvement of 10.53% over the baseline; the NERV methodology result was 96.49%, so CEP represents an improvement of 1.75%. The CEP methodology also successfully elicited 86 out of 88 NFRs, compared to 75 for the baseline NORMAP methodology and 82 for the NERV methodology. The NFR-count elicitation success for the CEP methodology was 97.73%, compared to 85.24% for the NORMAP methodology, an improvement of 12.49%; compared to the NERV methodology's 93.18%, CEP showed an improvement of 4.55%. The CEP methodology utilized the associated NFR Metadata (NFRM)/figures/images and linked them to the related requirements to improve over the NORMAP and NERV methodologies. There were 29 baseline NFRs found in the associated figures/images (NFRM), and 129 NFRs found both in the requirement sentence and in the associated figures/images (NFRM).
Another goal of this study was to improve the prioritization of NFRs compared to prior studies. This research provided effective techniques for prioritizing NFRs during the early stages of agile software development and examined the impacts that NFRs have on the software development process. The CEP methodology effectively prioritized NFRs by utilizing the αβγ-framework in a way similar to FRs. The sub-processes of the αβγ-framework were modified in a way that provides a very attractive feature to agile team members: parts of the αβγ-framework can be replaced to suit a team's specific needs in prioritizing NFRs. The top five requirements based on NFR prioritization were 12.3, 24.5, 15.3, 7.5, and 7.1. The prioritization of NFRs fits the agile software development cycle and allows agile developers and team members to plan accordingly to accommodate time and budget constraints.
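The αβγ-framework itself is not reproduced here; as a hypothetical illustration of the replaceable-sub-process idea, a generic weighted-scoring pass over per-requirement criteria might look like the sketch below, with weights, criteria, and scores as placeholders a team could swap out:

```python
def priority(scores: dict, weights: dict) -> float:
    """Weighted sum over criteria (here: stakeholder value, risk, cost-to-fix)."""
    return sum(weights[c] * scores[c] for c in weights)

weights = {"value": 0.5, "risk": 0.3, "cost": 0.2}   # placeholder weights
nfrs = {
    "12.3": {"value": 9, "risk": 8, "cost": 6},
    "24.5": {"value": 8, "risk": 9, "cost": 5},
    "15.3": {"value": 8, "risk": 7, "cost": 7},
}
ranked = sorted(nfrs, key=lambda r: priority(nfrs[r], weights), reverse=True)
print(ranked)   # requirement ids ordered by descending priority score
```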
90
Arabic text recognition of printed manuscripts: efficient recognition of off-line printed Arabic text using Hidden Markov Models, Bigram Statistical Language Model, and post-processing. Al-Muhtaseb, Husni Abdulghani. January 2010.
Arabic text recognition has not been researched as thoroughly as that of other natural languages, yet the need for automatic Arabic text recognition is clear. In addition to traditional applications like postal address reading, cheque verification in banks, and office automation, there is large interest in searching scanned documents available on the internet and in searching handwritten manuscripts. Other possible applications include building digital libraries, recognizing text on digitized maps, recognizing vehicle license plates, serving as a first phase in text readers for visually impaired people, and understanding filled forms.

This research work aims to contribute to current research in the field of optical character recognition (OCR) of printed Arabic text by developing novel techniques and schemes that advance the performance of state-of-the-art Arabic OCR systems. Statistical and analytical analysis of Arabic text was carried out to estimate the probabilities of occurrence of Arabic characters for use with Hidden Markov Models (HMMs) and other techniques. Since there is no publicly available dataset of printed Arabic text for recognition purposes, it was decided to create one. In addition, a minimal Arabic script is proposed. The proposed script contains all basic shapes of Arabic letters and provides an efficient representation of Arabic text in terms of effort and time.

Based on the success of using HMMs for speech and text recognition, the use of HMMs for the automatic recognition of Arabic text was investigated. The HMM technique adapts to noise and font variations and does not require word or character segmentation of Arabic line images. In the feature extraction phase, experiments were conducted with a number of different features to investigate their suitability for HMMs; finally, a novel set of features, which resulted in high recognition rates for different fonts, was selected. The developed techniques do not need word or character segmentation before the classification phase, as segmentation is a byproduct of recognition. This seems to be the most advantageous feature of using HMMs for Arabic text, since segmentation tends to produce errors which are usually propagated to the classification phase.

Eight different Arabic fonts were used in the classification phase. The recognition rates were in the range of 98% to 99.9%, depending on the font used. As far as we know, these are new results in their context. Moreover, the proposed technique could be used for other languages: a proof-of-concept experiment on English characters achieved a recognition rate of 98.9% using the same HMM setup, and the same techniques applied to Bangla characters achieved a recognition rate above 95%. The recognition of printed Arabic text with multiple fonts was also conducted using the same technique, with fonts categorized into different groups, and new high recognition results were achieved. To enhance the recognition rate further, a post-processing module was developed to correct the OCR output through character-level and word-level post-processing. The use of this module increased the recognition accuracy by more than 1%.
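The thesis's feature set and HMM topology are not given in the abstract, so the following is a simplified sketch of the general idea using the hmmlearn package: one Gaussian HMM per class is fit on sliding-window feature sequences extracted column-wise from line images, and a new image is assigned to the model with the highest log-likelihood. The column-density features and model sizes are assumptions, and this sketch classifies isolated units, whereas the thesis recognizes whole lines without prior segmentation, which requires composing character models.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

def column_features(line_img: np.ndarray, win: int = 3) -> np.ndarray:
    """One frame per sliding window of columns: mean ink density (simplified)."""
    cols = line_img.mean(axis=0)
    frames = [cols[i:i + win].mean() for i in range(0, len(cols) - win + 1, win)]
    return np.array(frames).reshape(-1, 1)

def train_models(samples: dict) -> dict:
    """Fit one Gaussian HMM per label on that label's feature sequences."""
    models = {}
    for label, images in samples.items():
        seqs = [column_features(img) for img in images]
        X, lengths = np.vstack(seqs), [len(s) for s in seqs]
        model = GaussianHMM(n_components=8, covariance_type="diag", n_iter=20)
        model.fit(X, lengths)
        models[label] = model
    return models

def classify(line_img: np.ndarray, models: dict) -> str:
    """Assign the image to the model with the highest log-likelihood."""
    feats = column_features(line_img)
    return max(models, key=lambda label: models[label].score(feats))
```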