81 |
Vyhodnocení testových formulářů pomocí OCR / Test form evaluation by OCR Noghe, Petr January 2013 (has links)
This thesis deals with the evaluation of test forms using optical character recognition. Image processing and the methods used for OCR are described in the first part of the thesis. In the practical part, a database of sample characters is created. The chosen method is based on correlation between patterns and the recognized characters. The program is designed in the MATLAB graphical environment. Finally, several forms are evaluated and the success rate of the proposed program is measured.
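The correlation matching the abstract describes can be sketched in a few lines. This is an illustrative reconstruction (not the thesis's MATLAB code), scoring a character patch against a toy template database with normalized cross-correlation:

```python
import numpy as np

def correlate_score(patch, template):
    """Normalized cross-correlation between a character patch and a template."""
    p = patch.astype(float) - patch.mean()
    t = template.astype(float) - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return float((p * t).sum() / denom) if denom else 0.0

def recognize(patch, templates):
    """Return the label of the best-correlating template in the database."""
    return max(templates, key=lambda label: correlate_score(patch, templates[label]))

# Toy 3x3 "sample character database" standing in for the real one.
templates = {
    "I": np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    "-": np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]]),
}
print(recognize(np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]), templates))  # I
```

Each segmented character patch is scored against every stored pattern and the best-scoring label is kept; a success rate like the one reported then follows from comparing these labels with the ground truth.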
|
82 |
Improvement of Optical Character Recognition on Scanned Historical Documents Using Image Processing Aula, Lara January 2021 (has links)
As an effort to improve accessibility to historical documents, digitization of historical archives has been an ongoing process at many institutions since the origination of Optical Character Recognition. Old, scanned documents can contain deteriorations acquired over time or caused by old printing methods. Common visual attributes of such documents are variations in style and font, broken characters, varying ink intensity, noise, and damage caused by folding or ripping. Many of these attributes are unfavorable for modern Optical Character Recognition tools and can lead to failed character recognition. This study approaches the stated problem by using image processing methods to improve the results of character recognition. Furthermore, common image quality characteristics of scanned historical documents with unidentifiable text are analyzed. The Optical Character Recognition tool used to conduct this research was the open-source Tesseract software. Image processing methods such as Gaussian lowpass filtering, Otsu's optimum thresholding method and morphological operations were used to prepare the historical documents for Tesseract. Using the precision and recall classification method, the OCR output was evaluated, and it was seen that recall improved by 63 percentage points and precision by 18 percentage points. This shows that using image pre-processing methods to increase the readability of historical documents for Optical Character Recognition tools is effective. Furthermore, it was seen that characteristics especially disadvantageous for Tesseract are font deviations, occurrence of non-belonging objects, character fading, broken characters, and Poisson noise.
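Otsu's thresholding, one of the pre-processing steps named above, can be implemented directly from the grayscale histogram. The sketch below is a minimal numpy-only illustration (the study itself presumably used an image-processing library), choosing the threshold that maximizes between-class variance before binarizing a page for Tesseract:

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method: pick the threshold maximizing between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0.0
    for t in range(256):
        w0 += hist[t]                 # pixels at or below t
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * hist[t]
        w1 = total - w0
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Two well-separated intensity populations: dark ink (~40) and light paper (~200).
img = np.array([[40, 42, 200], [41, 198, 201], [202, 199, 40]], dtype=np.uint8)
t = otsu_threshold(img)
binary = img > t   # True = paper background, False = ink
```

On degraded scans, this global threshold is typically applied after smoothing (e.g. the Gaussian lowpass filter mentioned above) so that noise does not distort the histogram's two modes.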
|
83 |
CArDIS: A Swedish Historical Handwritten Character and Word Dataset for OCR Thummanapally, Shivani, Rijwan, Sakib January 2022 (has links)
Background: To preserve valuable sources and cultural heritage, digitization of handwritten characters is crucial. For this, Optical Character Recognition (OCR) systems were introduced and are widely used to recognize digital characters. In the case of ancient or historical characters, automatic transcription is more challenging due to lack of data, high complexity and low quality of the sources. To address these problems, multiple image-based handwritten datasets have been collected from historical and modern document images, but these datasets also have limitations. To overcome them, we were inspired to create a new image-based historical handwritten character and word dataset and evaluate its performance using machine learning algorithms. Objectives: The main objective of this thesis is to create the first Swedish historical handwritten character and word dataset, named CArDIS (Character Arkiv Digital Sweden), which will be publicly available for further research. In addition, we verify the correctness of the dataset and perform a quantitative analysis using different machine learning methods. Methods: Initially, we searched for existing character datasets to learn how modern character datasets differ from historical handwritten ones. We performed a literature review of the datasets most commonly used for OCR, and also studied different machine learning algorithms and their applications. Finally, we trained six different machine learning methods, namely Support Vector Machine, k-Nearest Neighbor, Convolutional Neural Network, Recurrent Neural Network, Random Forest, and SVM-HOG, on existing datasets and the newly created dataset to evaluate their performance and efficiency in recognizing ancient handwritten characters. Results: The evaluation results show that the machine learning classifiers struggle to recognize the ancient handwritten characters, achieving lower recognition accuracy.
Among them, CNN achieves the highest recognition accuracy. Conclusions: This thesis introduces the first historical handwritten character and word dataset in Swedish, named CArDIS. The character dataset contains 101,500 Latin and Swedish character images belonging to 29 classes, while the word dataset contains 10,000 word images of ten popular Swedish names belonging to 10 classes, in RGB color space. The performance of six machine learning classifiers on CArDIS and existing datasets is also reported. The thesis concludes that classifiers trained on existing datasets and tested on CArDIS show low recognition accuracy, proving that CArDIS has unique characteristics and features compared to existing handwritten datasets. Finally, this research provides a first Swedish character and word dataset, which is robust with a proven accuracy and publicly available for further research.
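As a hedged illustration of the kind of quantitative analysis described, here is a minimal k-Nearest-Neighbor classifier over flattened character images. The thesis evaluated full implementations (SVM, CNN, RNN, etc.) on 101,500 real images; this sketch uses toy 2x2 "images" and plain Euclidean distance:

```python
import numpy as np

def knn_predict(train_X, train_y, x, k=1):
    """Classify a flattened character image by majority vote of its k nearest
    training images (Euclidean distance in pixel space)."""
    d = np.linalg.norm(train_X - x, axis=1)        # distance to every training sample
    nearest = np.argsort(d)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy 2x2 "character images", flattened: class A = dark left column, B = dark right.
train_X = np.array([[1, 0, 1, 0], [0.9, 0, 1, 0.1], [0, 1, 0, 1], [0.1, 1, 0, 0.9]])
train_y = np.array(["A", "A", "B", "B"])
print(knn_predict(train_X, train_y, np.array([0.95, 0.05, 0.9, 0.0]), k=3))  # A
```

Cross-dataset experiments like those in the thesis (train on one dataset, test on another) amount to calling such a classifier with `train_X` from the existing dataset and query images from CArDIS.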
|
84 |
Retrofitting analogue meters with smart devices : A feasibility study of local OCR processes on an energy critical driven system Andreasson, Joel, Ehrenbåge, Elin January 2023 (has links)
Internet of Things (IoT) devices are becoming increasingly popular replacements for their analogue counterparts. However, there is still demand to keep analogue equipment that is already installed, such as analogue water meters, while also having automated monitoring of that equipment. A proposed solution to this problem is to install a battery-powered add-on component that can optically read meter values using Optical Character Recognition (OCR) and transmit the readings wirelessly. Two ways to do this could be to either offload the OCR process to a server, or to do the OCR processing locally on the add-on component. Since water meters are often located where reception is weak and the add-on component is battery powered, a suitable technology for data transmission could be Long Range (LoRa) because of its low-power and long-range capabilities. Since LoRa has a low transfer rate, there is a need to keep data transfers small, which could make offloading a less favorable alternative compared to local OCR processing. The purpose of this thesis is therefore to research the feasibility, in terms of energy efficiency, of doing local OCR processing on the add-on component. The feasibility condition of this study is defined as being able to continually read an analogue meter over a 10-year lifespan while consuming under 2600 milliampere-hours (mAh) of energy. The two OCR algorithms developed for this study are a specialized OCR algorithm that utilizes pattern matching principles, and a Sum of Absolute Differences (SAD) OCR algorithm. These two algorithms were compared against each other to determine which one is more suitable for the system. The comparison showed that the SAD algorithm was more suitable; it was then studied further using different image resolutions and settings to determine whether energy consumption could be reduced further. The results showed that it was possible to significantly reduce energy consumption by reducing the image resolution.
The study also researched the possibility of reducing energy consumption further by not reading all digits on the tested water meter, depending on the measuring frequency and water flow. The study concluded that local OCR processing is feasible on an energy-critical system when reading analogue meters, depending on the measuring frequency.
|
85 |
Analysis of the OCR System Application in Intermodal Terminals : Malmö Intermodal Terminal RUBIO VILLALBA, IGNACIO January 2020 (has links)
The analysis carried out in this thesis is made from two different points of view, qualitative and quantitative, using the case study of the Malmö intermodal terminal. The qualitative analysis focuses on how intermodal terminals work, which of their elements interact and how, in order to achieve the purpose of the terminal, and how the Intelligent Video Gate affects this functioning, mainly in a positive way that allows the terminal to operate better. The quantitative analysis is a timing and economic study of the Malmö Intermodal Terminal, based on the information obtained from the qualitative analysis and on data provided by the terminal operators. These allow different simulations to be run to compare the effect of implementing the Intelligent Video Gate in this specific terminal; the results could be extended to similar intermodal terminals located in regions with similar labour conditions and, like the European Union, a highly standardized freight system. Finally, the provided data, despite not allowing the most complex and representative simulation, show that the aim of the Intelligent Video Gate is reached successfully, with a great improvement in efficiency, which makes it possible to state with reasonable certainty that implementing the system is recommended in this kind of terminal.
|
86 |
Artificial Intelligence and Pattern Recognition Technologies for Cultural Heritage : Involvement of Optical Character Recognition Software for Citizen Science in the processes for Crowdsourcing of Ancient Italian Texts. Ballerino, Julie January 2022 (has links)
Cultural heritage refers to an extremely diverse set of sources. More specifically, historical artifacts as well as intangible elements of a community's history pertain to cultural heritage. However, when looking at the conservation, enrichment, and divulgation of these elements, the question becomes more complex, even more so in a context of nebulous regulation, unequal distribution of resources and funding among cultural heritage institutions, and a bureaucratically complex division of competencies between territories. This is the case in Italy, and more specifically Southern and Central Italy, where all these issues are present and further hinder the exploration of undiscovered historical material, as well as the organization and divulgation of discovered material. Following this discrepancy along the lines of legal and practical restrictions, this thesis aims to explore and evaluate how technology can obviate said issues. As a methodological exercise, some ancient texts in Latin and Old Italian were scanned and optical character recognition software was run on them. More specifically, this thesis applies the paradigm of citizen science for crowdsourcing to explore how well optical character recognition software works in terms of accessibility and efficiency. As such, this methodological exercise is not primarily a technological evaluation but aims at opening up new ways for the public to interact with cultural heritage institutions and to exchange historical information while respecting the legal and practical considerations mentioned. In conclusion, by highlighting this issue, it becomes possible to further research and enrich the publicly available data on Italian educational history between the 18th and 19th centuries.
|
87 |
OCR of hand-written transcriptions of hieroglyphic textNederhof, Mark-Jan 20 April 2016 (has links) (PDF)
Encoding hieroglyphic texts is time-consuming. If a text already exists as a hand-written transcription, there is an alternative, namely OCR. Off-the-shelf OCR systems seem difficult to adapt to the peculiarities of Ancient Egyptian. Presented is a proof-of-concept tool that was designed to digitize texts of Urkunden IV in the hand-writing of Kurt Sethe. It automatically recognizes signs and produces a normalized encoding, suitable for storage in a database, for display on screen, or for printing on paper, requiring little manual correction.
The encoding of hieroglyphic text is RES (Revised Encoding Scheme) rather than (common dialects of) MdC (Manuel de Codage). Earlier papers argued against MdC and in favour of RES for corpus development. Arguments in favour of RES include longevity of the encoding, as its semantics are font-independent. The present study provides evidence that RES is also much preferable to MdC in the context of OCR. With a well-understood parsing technique, relative positioning of scanned signs can be straightforwardly mapped to suitable primitives of the encoding.
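The claim that relative positioning of scanned signs can be straightforwardly mapped to encoding primitives can be illustrated with a toy classifier over sign bounding boxes. The operator names below are simplified stand-ins; RES itself has a richer set of juxtaposition, insertion and overlay primitives:

```python
def relate(box_a, box_b):
    """Map the relative position of two sign bounding boxes (x, y, w, h) to a
    toy encoding operator: '*' = b beside a, ':' = b below a. A simplified
    sketch; real RES distinguishes many more spatial relations."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    if bx >= ax + aw:          # b starts to the right of a
        return "*"
    if by >= ay + ah:          # b starts below a
        return ":"
    return "?"                 # overlapping: would need insertion/overlay primitives

print(relate((0, 0, 10, 10), (12, 0, 8, 10)))  # *
print(relate((0, 0, 10, 4), (0, 6, 10, 4)))    # :
```

A parser then groups such pairwise relations into the nested horizontal/vertical structure of a quadrat, which is what makes the resulting encoding font-independent.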
|
88 |
Capturing, Eliciting, and Prioritizing (CEP) Non-Functional Requirements Metadata during the Early Stages of Agile Software DevelopmentMaiti, Richard Rabin 01 January 2016 (has links)
Agile software engineering has been a popular methodology to develop software rapidly and efficiently. However, the Agile methodology often favors Functional Requirements (FRs) due to the nature of agile software development, and strongly neglects Non-Functional Requirements (NFRs). Neglecting NFRs has negative impacts on software products that have resulted in poor quality and higher cost to fix problems in later stages of software development.
This research developed the CEP ("Capture, Elicit, Prioritize") methodology to effectively gather NFR metadata from software requirement artifacts such as documents and images. The artifacts included an Optical Character Recognition (OCR) artifact, which gathered metadata from images, as well as a Database Artifact, NFR Locator Plus, an NFR Priority Artifact, and a Visualization Artifact. The gathered NFR metadata reduced false positives and allowed NFRs to be included in the early stages of software requirements gathering along with FRs. Furthermore, NFRs were prioritized using existing FR methodologies, which is important to stakeholders as well as software engineers in delivering quality software. This research built on prior studies by specifically focusing on NFRs during the early stages of agile software development.
Validation of the CEP methodology was accomplished by using the 26 requirements of the European Union (EU) eProcurement System. The NORMAP methodology was used as a baseline, and the NERV methodology results were used for comparison. The research results show that the CEP methodology successfully identified NFRs in 56 out of 57 requirement sentences that contained NFRs, compared to 50 for the baseline and 55 for the NERV methodology. The CEP methodology thus elicited 98.24% of the NFR-bearing sentences, compared to 87.71% for the NORMAP baseline, an improvement of 10.53%; the NERV methodology result was 96.49%, so CEP represents an improvement of 1.75% over it. The CEP methodology successfully elicited 86 out of 88 NFRs, compared to 75 for the baseline NORMAP methodology and 82 for the NERV methodology. The NFR-count elicitation success for the CEP methodology was 97.73%, compared to 85.24% for NORMAP, an improvement of 12.49%; compared to the NERV methodology's 93.18%, CEP improves by 4.55%. The CEP methodology utilized the associated NFR Metadata (NFRM) figures/images and linked them to the related requirements, improving over the NORMAP and NERV methodologies. There were 29 baseline NFRs found in the associated figures/images (NFRM), and 129 NFRs found both in the requirement sentence and in the associated figures/images (NFRM).
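The elicitation rates above are plain ratios over the 57 NFR-bearing sentences and 88 NFRs; the arithmetic can be made explicit (note that the abstract appears to truncate rather than round some figures):

```python
def elicitation_rate(found, total):
    """Share of NFRs (or NFR-bearing sentences) correctly elicited, in percent."""
    return round(100.0 * found / total, 2)

# Figures reported for the EU eProcurement case study.
print(elicitation_rate(56, 57))   # CEP, sentence level: 98.25 (abstract: 98.24)
print(elicitation_rate(86, 88))   # CEP, NFR count: 97.73
print(elicitation_rate(75, 88))   # NORMAP baseline, NFR count: 85.23 (abstract: 85.24)
```

The improvement figures quoted in the abstract are simple differences of these percentages (e.g. 97.73 - 85.24 = 12.49).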
Another goal of this study was to improve the prioritization of NFRs compared to prior studies. This research provided effective techniques to prioritize NFRs during the early stages of agile software development and examined the impacts that NFRs have on the software development process. The CEP methodology effectively prioritized NFRs by utilizing the αβγ-framework in a similar way to FRs. The sub-processes of the αβγ-framework were modified in a way that provided a very attractive feature to agile team members: parts of the αβγ-framework can be replaced to suit a team's specific needs in prioritizing NFRs. The top five requirements based on NFR prioritization were 12.3, 24.5, 15.3, 7.5, and 7.1. The prioritization of NFRs fits the agile software development cycle and allows agile developers and team members to plan accordingly to accommodate time and budget constraints.
|
89 |
Arabic text recognition of printed manuscripts : efficient recognition of off-line printed Arabic text using Hidden Markov Models, Bigram Statistical Language Model, and post-processingAl-Muhtaseb, Husni Abdulghani January 2010 (has links)
Arabic text recognition has not been researched as thoroughly as that of other natural languages, yet the need for automatic Arabic text recognition is clear. In addition to traditional applications like postal address reading, check verification in banks, and office automation, there is large interest in searching scanned documents available on the internet and in searching handwritten manuscripts. Other possible applications are building digital libraries, recognizing text on digitized maps, recognizing vehicle license plates, serving as the first phase of text readers for visually impaired people, and understanding filled forms. This research work aims to contribute to current research in the field of optical character recognition (OCR) of printed Arabic text by developing novel techniques and schemes that advance the performance of state-of-the-art Arabic OCR systems. Statistical and analytical analysis of Arabic text was carried out to estimate the probabilities of occurrence of Arabic characters for use with Hidden Markov Models (HMMs) and other techniques. Since there is no publicly available dataset of printed Arabic text for recognition purposes, it was decided to create one. In addition, a minimal Arabic script is proposed. The proposed script contains all basic shapes of Arabic letters and provides an efficient representation of Arabic text in terms of effort and time. Based on the success of using HMMs for speech and text recognition, their use for the automatic recognition of Arabic text was investigated. The HMM technique adapts to noise and font variations and does not require word or character segmentation of Arabic line images. In the feature extraction phase, experiments were conducted with a number of different features to investigate their suitability for HMMs. Finally, a novel set of features, which resulted in high recognition rates for different fonts, was selected.
The developed techniques do not need word or character segmentation before the classification phase, as segmentation is a byproduct of recognition. This seems to be the most advantageous feature of using HMMs for Arabic text, since segmentation tends to produce errors which are usually propagated to the classification phase. Eight different Arabic fonts were used in the classification phase. The recognition rates were in the range of 98% to 99.9%, depending on the fonts used. As far as we know, these are new results in their context. Moreover, the proposed technique could be used for other languages: a proof-of-concept experiment was conducted on English characters with a recognition rate of 98.9% using the same HMM setup, and the same techniques were applied to Bangla characters with a recognition rate above 95%. The recognition of printed Arabic text with multiple fonts was also conducted using the same technique, with fonts categorized into different groups, and new high recognition results were achieved. To enhance the recognition rate further, a post-processing module was developed to correct the OCR output through character-level and word-level post-processing. The use of this module increased the recognition accuracy by more than 1%.
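At decode time, an HMM recognizer recovers the most likely state path, and hence the character boundaries, with the Viterbi algorithm; this is why segmentation falls out as a byproduct of recognition. Below is a minimal numpy Viterbi over toy log-probabilities (an illustrative sketch, not the thesis's HMM toolchain):

```python
import numpy as np

def viterbi(log_A, log_B, log_pi):
    """Most likely state path for a frame sequence.
    log_A: (S, S) transition log-probs; log_B: (T, S) per-frame emission
    log-likelihoods; log_pi: (S,) initial log-probs. Boundaries between
    character models fall out of the path, so no prior segmentation is needed."""
    T, S = log_B.shape
    delta = log_pi + log_B[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A          # best predecessor per state
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_B[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Two states (e.g. two character models) with sticky transitions; four frames.
log_A = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
log_pi = np.log(np.array([0.5, 0.5]))
log_B = np.log(np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]))
print(viterbi(log_A, log_B, log_pi))  # [0, 0, 1, 1]
```

In a real line recognizer the states belong to concatenated character models, so the point where the path switches models marks a character boundary in the sliding-window feature sequence.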
|
90 |
Embedded Arabic text detection and recognition in videos / Détection et reconnaissance du texte arabe incrusté dans les vidéos Yousfi, Sonia 06 July 2016 (has links)
This thesis focuses on Arabic embedded text detection and recognition in videos. Different approaches are proposed that are robust to Arabic text variability (fonts, scales, sizes, etc.) as well as to environmental and acquisition-condition challenges (contrast, degradation, complex backgrounds, low resolution, etc.). We introduce different machine learning-based solutions for robust text detection that do not rely on any pre-processing. The first method is based on Convolutional Neural Networks (ConvNet), while the others use a specific boosting cascade to select relevant hand-crafted text features. For text recognition, our methodology is segmentation-free: text images are transformed into sequences of features using a multi-scale scanning scheme. Standing apart from the dominant methodology of hand-crafted features, we propose to learn relevant text representations from data using different deep learning methods, namely Deep Auto-Encoders, ConvNets and unsupervised learning models. Each one leads to a specific OCR (Optical Character Recognition) solution.
Sequence labeling is performed without any prior segmentation, using a recurrent connectionist learning model. The proposed solutions are compared to other methods based on non-connectionist, hand-crafted features. In addition, we propose to enhance the recognition results using Recurrent Neural Network-based language models that are able to capture long-range linguistic dependencies. Both OCR and language model probabilities are incorporated in a joint decoding scheme, where additional hyper-parameters are introduced to boost recognition results and reduce the response time. Given the lack of public multimedia Arabic datasets, we built novel annotated datasets from Arabic TV streams. The OCR dataset, called ALIF, consists of 6,532 annotated text images and is publicly available for research purposes; to the best of our knowledge, it is the first public dataset dedicated to Arabic video OCR. Our proposed solutions were extensively evaluated. The obtained results highlight the genericity and efficiency of our approaches, reaching a text detection rate above 97% and a word recognition rate of 88.63% on the ALIF dataset, outperforming a well-known commercial OCR engine by more than 36%.
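The joint decoding scheme combines OCR and language-model evidence log-linearly under tunable hyper-parameters. A minimal sketch with illustrative weights (`lam` and `beta` are assumptions for the example, not the thesis's tuned values):

```python
import math

def joint_score(ocr_logp, lm_logp, n_chars, lam=0.7, beta=0.1):
    """Log-linear combination used at decode time: OCR evidence plus a
    weighted language-model term and a length bonus. lam scales the LM,
    beta counteracts the LM's bias toward shorter hypotheses."""
    return ocr_logp + lam * lm_logp + beta * n_chars

# Two candidate transcriptions of the same text image: the LM term can
# overturn a slightly better raw OCR score in favor of the likelier word.
cand_a = joint_score(ocr_logp=math.log(0.40), lm_logp=math.log(0.01), n_chars=4)
cand_b = joint_score(ocr_logp=math.log(0.35), lm_logp=math.log(0.20), n_chars=4)
print(cand_b > cand_a)  # True
```

In a full decoder this score is evaluated inside a beam search, with the hyper-parameters tuned on held-out data to trade off accuracy against response time.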
|