81 |
Design of a realtime high speed recognizer for unconstrained handprinted alphanumeric characters. Wong, Ing Hoo, January 1985.
This thesis presents the design of a recognizer for unconstrained handprinted alphanumeric characters. The design is based on a thinning process capable of producing thinned images with well-defined features that are considered essential for character image description and recognition. By choosing the topological points of the thinned ('line') character image as these desired features, the thinning process not only achieves a high degree of data reduction but also transforms a binary image into a discrete form of line drawing that can be represented by graphs. As a result, powerful graph-based analysis techniques can be applied to analyze and classify the image.
The image classification is performed in two stages. First, a technique for identifying the topological points in the thinned image is developed. These points represent the global features of the image and, because of their invariance to elastic deformations, they are used for image preclassification. Preclassification results in a substantial reduction in the entropy of the input image, so the subsequent stage can concentrate solely on differentiating images that are topologically equivalent. The preclassifier uses simple logic operations localized to the immediate neighbourhood of each pixel; these operations are highly independent of one another and easy to implement in VLSI. A graph-based technique for image extraction and representation, called the chain-coded digraph representation, is introduced. The technique uses the global features as nodes and Freeman's chain codes for digital curves as branches. The chain-coded digraph contains all the information present in the thinned image, which avoids relying on an image feature extraction approach for image description and data reduction (a process that is difficult to optimize) without sacrificing speed or adding complexity.
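To make the per-pixel preclassification logic concrete, the following Python sketch (an illustrative assumption, not the thesis's implementation) labels each pixel of a thinned binary image as an end point, ordinary curve point, or branch point by counting its 8-connected neighbours, using the same eight directions as Freeman's chain code.

import numpy as np

# Freeman chain code directions: 0 = east, then counter-clockwise in 45-degree steps
# (row offsets are negative upward in image coordinates).
FREEMAN_DIRS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def classify_topological_points(thinned):
    """Label each foreground pixel of a one-pixel-wide (thinned) binary image.

    Returns a dict mapping (row, col) -> 'end', 'curve', or 'branch', based on the
    number of 8-connected foreground neighbours:
    1 neighbour -> end point, 2 -> ordinary curve point, 3 or more -> branch point.
    """
    labels = {}
    rows, cols = thinned.shape
    for r in range(rows):
        for c in range(cols):
            if not thinned[r, c]:
                continue
            n = 0
            for dr, dc in FREEMAN_DIRS:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and thinned[rr, cc]:
                    n += 1
            labels[(r, c)] = 'end' if n == 1 else ('curve' if n == 2 else 'branch')
    return labels

# Example: a small T-shaped skeleton.
img = np.zeros((5, 5), dtype=bool)
img[0, 1:4] = True   # horizontal stroke
img[1:4, 2] = True   # vertical stroke
points = classify_topological_points(img)
print(sorted(p for p, kind in points.items() if kind == 'end'))  # [(3, 2)], the tip of the vertical stroke

Because each pixel is examined independently of the others, the same logic maps naturally onto the kind of parallel VLSI implementation the thesis envisages.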
After preclassification, the second stage of the recognition process analyses the chain-coded digraph using the concept of the attributed relational graph (ARG). An ARG representation of the image can be obtained readily from the chain-coded digraph through simple transformations or rewriting rules. The ARG representation of an image describes the shape primitives in the image and their relationships. Final classification of the input image is made by comparing its ARG with the ARGs of known characters, and it involves only the comparison of ARGs of a predetermined topology. This information is crucial to the design of a matching algorithm, called the reference-guided inexact matching procedure, designed for high-speed matching of character image ARGs. This graph matching procedure is shown to be much faster than conventional graph matching procedures. The recognizer is implemented in Pascal on PDP11/23 and VAX 11/750 computers. Tests using Munson's data show a high recognition rate of 91.46%. The recognizer is, however, designed with an eventual VLSI implementation in mind and as a basic recognizer for further research in reading machines, so its full potential is yet to be realized. Nevertheless, the experiments with Munson's data illustrate the effectiveness of the design approach and the advantages it offers as a basic system for future research. / Faculty of Applied Science / Department of Electrical and Computer Engineering / Graduate
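Since preclassification fixes the topology, final matching reduces to comparing attribute assignments between graphs with the same node set. The brute-force Python sketch below illustrates that idea only; the node and edge attributes are invented for the example, and the thesis's reference-guided inexact matching procedure is precisely a faster alternative to this kind of exhaustive search.

from itertools import permutations

# A toy ARG: node attributes (e.g. point type) and edge attributes keyed by node pair.
# These structures and unit mismatch costs are illustrative assumptions; both graphs
# are assumed to have the same number of nodes, as guaranteed by preclassification.
def match_cost(arg_a, arg_b, mapping):
    """Total attribute mismatch when arg_a's nodes are mapped onto arg_b's nodes."""
    cost = sum(arg_a["nodes"][i] != arg_b["nodes"][j] for i, j in enumerate(mapping))
    for (i, k), attr in arg_a["edges"].items():
        cost += attr != arg_b["edges"].get((mapping[i], mapping[k]))
    return cost

def best_match(arg_a, arg_b):
    """Brute-force inexact matching: try every node correspondence, keep the cheapest."""
    n = len(arg_b["nodes"])
    return min((match_cost(arg_a, arg_b, p), p) for p in permutations(range(n)))

template = {"nodes": ["end", "branch", "end"],
            "edges": {(0, 1): "straight", (1, 2): "curve"}}
candidate = {"nodes": ["end", "end", "branch"],
             "edges": {(0, 2): "straight", (2, 1): "curve"}}
print(best_match(candidate, template))  # (0, (0, 2, 1)): a zero-cost correspondence exists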
|
82 |
OCR modul pro rozpoznání písmen a číslic / OCR module for recognition of letters and numbers. Kapusta, Ján, January 2010.
This paper describes the basic methods used for optical character recognition. It explains the whole recognition procedure, from image adjustment and preprocessing through feature extraction to the matching algorithms. It compares methods and algorithms for recognizing characters in graphically distorted or otherwise modified images, the so-called CAPTCHAs in use today. It further compares a method based on invariant moments with a neural network as the final classifier against a method based on correlation between the normals and the recognized characters.
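As a rough illustration of the invariant-moment idea (a hedged sketch, not the paper's own code), Hu's seven moment invariants can be computed with OpenCV and compared against stored reference vectors with a simple nearest-neighbour rule; a neural network classifier, as in the paper, would consume the same feature vectors.

import cv2
import numpy as np

def hu_features(binary_img):
    """Seven Hu moment invariants of a binary character image, log-scaled so that
    their widely differing magnitudes become comparable."""
    hu = cv2.HuMoments(cv2.moments(binary_img, binaryImage=True)).flatten()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)

def classify(img, references):
    """Nearest-neighbour decision against a dict of label -> reference feature vector
    (the references would be built from clean template characters)."""
    feats = hu_features(img)
    return min(references, key=lambda label: np.linalg.norm(feats - references[label]))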
|
83 |
Handwritten digit recognition based on segmentation-free method. Zhao, Mengqiao, January 2020.
This thesis aims to implement a segmentation-free strategy for handwritten multi-digit string recognition. Three models, namely VGG-16, CRNN, and 4C, are built, evaluated, and benchmarked, and the effect of different training sets on model performance is also investigated.
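For orientation, a CRNN-style model of the kind named above can be sketched in Keras roughly as follows; the layer sizes, input shape, and training details are assumptions for illustration and are not claimed to match the thesis's CRNN or 4C configurations. The convolutional front end produces one feature vector per image column, a bidirectional LSTM reads that sequence, and training against CTC loss is what removes the need to segment the string into individual digits.

import tensorflow as tf
from tensorflow.keras import layers

# Grayscale digit-string images, height 32, width 128.
inputs = tf.keras.Input(shape=(32, 128, 1))
x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
x = layers.MaxPooling2D(2)(x)                        # -> (16, 64, 32)
x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
x = layers.MaxPooling2D(2)(x)                        # -> (8, 32, 64)
x = layers.Permute((2, 1, 3))(x)                     # width becomes the time axis
x = layers.Reshape((32, 8 * 64))(x)                  # one feature vector per column
x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
outputs = layers.Dense(11, activation="softmax")(x)  # 10 digits + 1 CTC blank symbol
crnn = tf.keras.Model(inputs, outputs)
crnn.summary()
# Training would pair this per-timestep output with tf.nn.ctc_loss, so no explicit
# segmentation of the digit string is required.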
|
84 |
Evaluating Methods for Optical Character Recognition on a Mobile Platform: comparing standard computer vision techniques with deep learning in the context of scanning prescription medicine labels. Bisiach, Jonathon; Zabkar, Matej, January 2020.
Deep learning has become ubiquitous as part of Optical Character Recognition (OCR), but there are few examples of research into whether the combination is feasible for deployment on a mobile platform. This study examines which method of OCR would be best suited to a mobile platform in the specific context of a prescription medication label scanner. A case study pitting three methods of OCR (classic computer vision techniques, standard deep learning, and specialised deep learning) against 100 prescription medicine label images shows that the best combination of accuracy, speed, and resource usage is provided by standard deep learning, Tesseract 4.1.1 in this particular case. Tesseract 4.1.1 achieved 76% accuracy, with a further 10% of results being one character away from accurate. Additionally, 9% of images were processed in less than one second and 41% in less than 10 seconds. Tesseract 4.1.1 also had very reasonable resource costs, comparable to the methods that did not use deep learning.
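For reference, driving Tesseract 4.1.1's LSTM engine from Python looks roughly like the following minimal sketch using the pytesseract wrapper; the file name and the page segmentation mode are assumptions for illustration, not settings taken from the study.

import pytesseract
from PIL import Image

# --oem 1 selects the LSTM (deep learning) engine introduced in Tesseract 4;
# --psm 6 assumes a single uniform block of text, a reasonable guess for a label.
config = "--oem 1 --psm 6"
text = pytesseract.image_to_string(Image.open("label_001.png"), config=config)
print(text)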
|
85 |
Form data enriching using a post OCR clustering process: Measuring accuracy of field names and field values clustering. Aboulkacim, Adil, January 2022.
With OCR technologies the text in a form can be read and the position and content of each word can be extracted, but the relation between the words cannot be understood. This thesis aims to solve the problem of enriching data from a structured form, without any pre-set configuration, by using clustering. The method combines a quantitative measurement on a developed prototype, counting correctly clustered text boxes, with a qualitative evaluation. The prototype works by feeding an image of an unfilled form and an image of a filled-in form, which contains the data to be enriched, to an OCR engine. The OCR engine extracts the text and its positions, which are then run through a post-processing step that, together with a modified Euclidean distance algorithm and a fuzzy string search algorithm, clusters field names and field values in the filled-in form image. The results of the prototype for three different form structures, with 15 images per structure, range from 100% to 92% accuracy depending on the form structure. The thesis thus shows that it is possible to cluster together field names and field values in a form, i.e. to enrich data from the form.
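A minimal Python sketch of the pairing idea appears below; the data layout, the fuzzy-match threshold, and the helper names are illustrative assumptions rather than the prototype's actual code. Words recognized on the blank form are treated as field names; each word on the filled-in form that fuzzily matches a blank-form word is discarded as a reprinted label, and every remaining word is assigned, as a field value, to the nearest field name by Euclidean distance between box centres.

from difflib import SequenceMatcher
from math import dist  # Euclidean distance (Python 3.8+)

# Each OCR box: (text, x_centre, y_centre), e.g. from any OCR engine's word-level output.
def cluster_fields(blank_boxes, filled_boxes, fuzz_threshold=0.85):
    """Group filled-form words under the nearest field name taken from the blank form."""
    field_names = [b[0] for b in blank_boxes]
    clusters = {name: [] for name in field_names}
    for text, x, y in filled_boxes:
        # Skip words that already appear on the blank form: they are field labels,
        # matched fuzzily to tolerate OCR noise.
        if any(SequenceMatcher(None, text.lower(), name.lower()).ratio() >= fuzz_threshold
               for name in field_names):
            continue
        # Otherwise treat the word as a field value and attach it to the closest label.
        nearest = min(blank_boxes, key=lambda b: dist((x, y), (b[1], b[2])))
        clusters[nearest[0]].append(text)
    return clusters

blank = [("Name", 40, 20), ("Address", 40, 60)]
filled = [("Name", 40, 20), ("Ada", 120, 22), ("Lovelace", 180, 22),
          ("Address", 40, 60), ("12", 120, 61), ("Ockham", 160, 61)]
print(cluster_fields(blank, filled))
# {'Name': ['Ada', 'Lovelace'], 'Address': ['12', 'Ockham']}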
|
86 |
Detection and Recognition of U.S. Speed Signs from Grayscale Images for Intelligent Vehicles. Kanaparthi, Pradeep Kumar, January 2012.
No description available.
|
87 |
Underwater Document Recognition. Shah, Jaimin Nitesh, 18 May 2021.
No description available.
|
88 |
The CAR (Confront, Address, Replace) Strategy: An Antiracist Engineering Pedagogy. Asfaw, Amman Fasil, 01 June 2021.
The CAR (confront, address, replace) Strategy is an antiracist pedagogy aiming to drive out exclusionary terminology in engineering education.
“Master-slave” terminology is still commonplace in engineering education and industry. However, questions have been raised about the negative impacts of such language. Usage of exclusionary terminology such as “master-slave” in academia can make students—especially those who identify as women and/or Black/African-American—feel uncomfortable, potentially evoking Stereotype Threat (Danowitz, 2020) and/or Curriculum Trauma (Buul, 2020). Indeed, prior research shows that students from a number of backgrounds find non-inclusive terminologies such as “master-slave” to be a major problem (Danowitz, 2020). Currently, women-identifying and gender nonbinary students are underrepresented in the engineering industry (ASEE, 2020) while Black/African-American students are underrepresented in the entire higher education system, including engineering fields (NSF, 2019).
The CAR Strategy, introduced here, stands for 1) confront, 2) address, and 3) replace, and aims to provide a framework for driving out iniquitous terminologies in engineering education such as "master-slave." The first step is to confront the historical significance of the terminology in question. The second step is to address the technical inaccuracies of the legacy terminology. Lastly, the problematic terminology is replaced with an optional but recommended replacement. This thesis reports on student perceptions and the effectiveness of The CAR Strategy piloted as a teaching framework in the computer engineering department at Cal Poly. Of the 64 students surveyed, 70% either agreed or strongly agreed that The CAR Strategy is an effective framework for driving out exclusionary terminologies.
Amman Asfaw first presented certain portions of this thesis at the virtual 2021 American Society for Engineering Education (ASEE) Annual Conference and Exposition. The original publication’s copyright is held by ASEE (Asfaw, 2021); secondary authors included Storm Randolph, Victoria Siaumau, Yumi Aguilar, Emily Flores, Dr. Jane Lehr, and Dr. Andrew Danowitz.
|
89 |
A New Approach to Synthetic Image Evaluation. Memari, Majid, 01 December 2023.
This study is dedicated to enhancing the effectiveness of Optical Character Recognition (OCR) systems, with a special emphasis on Arabic handwritten digit recognition. The choice to focus on Arabic handwritten digits is twofold: first, relatively little research has been conducted in this area compared to its English counterpart; second, the recognition of Arabic handwritten digits presents more challenges due to the inherent similarities between different Arabic digits. OCR systems, engineered to decipher both printed and handwritten text, often face difficulties in accurately identifying low-quality or distorted handwritten text. The quality of the input image and the complexity of the text significantly influence their performance. However, data augmentation strategies can notably improve these systems' performance. These strategies generate new images that closely resemble the original ones, albeit with minor variations, thereby enriching the model's learning and enhancing its adaptability. The research found Conditional Variational Autoencoders (C-VAE) and Conditional Generative Adversarial Networks (C-GAN) to be particularly effective in this context. These two generative models stand out due to their superior image generation and feature extraction capabilities. A significant contribution of the study is the formulation of the Synthetic Image Evaluation Procedure, a systematic approach designed to evaluate and amplify the generative models' image generation abilities. This procedure facilitates the extraction of meaningful features, computation of the Fréchet Inception Distance (FID) score, and supports hyper-parameter optimization and model modifications.
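For context, the Fréchet Inception Distance compares the mean and covariance of feature vectors extracted from real and synthetic images, typically with a pretrained Inception network. A minimal sketch over pre-extracted feature arrays, assuming NumPy and SciPy and not taken from the study's evaluation code, might look like this:

import numpy as np
from scipy.linalg import sqrtm

def frechet_inception_distance(feats_real, feats_fake):
    """FID between two sets of image features of shape (n_samples, n_features).

    FID = ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 * (C_r C_f)^(1/2)).
    Lower is better: the synthetic features match the real ones more closely.
    """
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    cov_sqrt = sqrtm(cov_r @ cov_f)
    if np.iscomplexobj(cov_sqrt):          # discard tiny imaginary parts from sqrtm
        cov_sqrt = cov_sqrt.real
    return float(np.sum((mu_r - mu_f) ** 2) + np.trace(cov_r + cov_f - 2.0 * cov_sqrt))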
|
90 |
Multiclassifier neural networks for handwritten character recognition. Chai, Sin-Kuo, January 1995.
No description available.
|