31 |
Compound document retrieval in noisy environments / Jaisimha, M. Y. January 1996
Thesis (Ph. D.)--University of Washington, 1996. / Vita. Includes bibliographical references (leaves [168]-173).
|
32 |
Quantifying the noise tolerance of the OCR engine Tesseract using a simulated environment / Nell, Henrik January 2014
Context. Optical Character Recognition (OCR), having a computer recognize text from an image, is not as intuitive as human recognition. Degradations that look small to human eyes can thwart the OCR result, and random, unknown degradations are unavoidable in a real-world setting.
Objectives. The noise tolerance of Tesseract, a state-of-the-art OCR engine, is evaluated with respect to salt-and-pepper noise, a type of image degradation. Noise tolerance is measured as the percentage of pixels that differ between two images (one with noise and one without).
Methods. A novel systematic approach for finding the noise tolerance of an OCR engine is presented. A simulated environment is developed in which the test parameters, called test cases (font, font size, text string), can be modified. The simulation program creates a text string image (white background, black text), degrades it iteratively using salt-and-pepper noise, and lets Tesseract perform OCR on it in each iteration. The iteration stops when the OCR result from Tesseract no longer matches the text string in the image.
Results. Simulation results are given as the percentage of changed pixels (the noise tolerance) between the clean text string image and the degraded image from the last iteration before Tesseract failed to recognize all characters. The results include 14400 test cases: 4 fonts (Arial, Calibri, Courier and Georgia), 100 font sizes (1-100) and 36 different strings (4*100*36=14400), resulting in about 1.8 million OCR attempts performed by Tesseract.
Conclusions. The noise tolerance depended on the test parameters. Font sizes smaller than 7 were not recognized at all, even without noise applied. The font size interval 13-22 was the peak performance interval, i.e. the interval with the highest noise tolerance, except for Courier, the only monospaced font tested, which had lower noise tolerance in that interval. For the font size interval 22-100, the noise tolerance decreased as the font size increased. The noise tolerance of Tesseract as a whole, given the experimental results, was circa 6.21 %, i.e. if 6.21 % of the pixels in the image have changed, Tesseract can still recognize all text in the image.
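The degrade-and-retry loop described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's simulation program: the image is a plain list of pixel rows, `make_clean_image` draws a black bar instead of rendered text, and `recognizes` is a hypothetical callback standing in for "run Tesseract and compare the result with the expected string". The stub used at the end simply fails once 5 % of the pixels have changed, a threshold chosen only for the demonstration.

```python
import random

BLACK, WHITE = 0, 255

def make_clean_image(width, height):
    """White background with a black bar standing in for rendered text (assumption)."""
    img = [[WHITE] * width for _ in range(height)]
    for y in range(height // 3, 2 * height // 3):
        for x in range(width // 4, 3 * width // 4):
            img[y][x] = BLACK
    return img

def add_salt_and_pepper(img, n_pixels, rng):
    """Return a copy in which n_pixels randomly chosen pixels are set to black or white."""
    noisy = [row[:] for row in img]
    height, width = len(img), len(img[0])
    for _ in range(n_pixels):
        y, x = rng.randrange(height), rng.randrange(width)
        noisy[y][x] = rng.choice((BLACK, WHITE))
    return noisy

def changed_pixel_percentage(clean, noisy):
    """Percentage of pixels that differ between the clean and the noisy image."""
    total = len(clean) * len(clean[0])
    changed = sum(c != n for c_row, n_row in zip(clean, noisy)
                  for c, n in zip(c_row, n_row))
    return 100.0 * changed / total

def noise_tolerance(clean, recognizes, step, rng, max_iters=1000):
    """Degrade iteratively; return the % changed in the last image still recognized.

    Returns None if even the clean image is not recognized.
    """
    if not recognizes(clean):
        return None
    current, last_good = clean, 0.0
    for _ in range(max_iters):
        current = add_salt_and_pepper(current, step, rng)
        if not recognizes(current):
            return last_good
        last_good = changed_pixel_percentage(clean, current)
    return last_good

# Demonstration with a stub recognizer (hypothetical, not Tesseract).
rng = random.Random(0)
clean = make_clean_image(40, 20)
stub = lambda img: changed_pixel_percentage(clean, img) < 5.0
tol = noise_tolerance(clean, stub, step=4, rng=rng)
print(f"tolerance with stub recognizer: {tol:.2f} %")
```

In a real experiment the stub would be replaced by a call into an OCR engine plus a string comparison, and the loop would be repeated for every combination of font, font size and text string.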
|
33 |
Separation and recognition of connected handprinted capital English characters / Ting, Voon-Cheung Roger January 1986
The subject of machine recognition of connected characters is investigated. A generic single character recognizer (SCR) assumes there is only one character in the image. The goal of this project is to design a connected character segmentation algorithm (CCSA) without the above assumption. The newly designed CCSA will make use of a readily available SCR.
The input image (e.g. a word with touching letters) is first transformed (thinned) into its skeletal form. The CCSA will then extract the image features (nodes and branches) and store them in a hierarchical form. The hierarchy stems from the left-to-right rule of writing of the English language. The CCSA will first attempt to recognize the first letter. When this is done, the first letter is deleted and the algorithm repeats.
After extracting the image features, the CCSA starts to create a set of test images from the beginning of the word (i.e. beginning of the description). Each test image contains one more feature than its predecessor. The number of test images in the set is constrained by a predetermined fixed width or a fixed total number of features. The SCR is then called to examine each test image. The recognizable test image(s) in the set are extracted. Let each recognizable test image be denoted by C₁. For each C₁, a string of letters C₂, C₃, ..., C_L is formed. C₂ is the best recognized test image in a set of test images created after the deletion of C₁ from the beginning of the current word. C₃ through C_L are created by the same method. All such strings are examined to determine which string contains the best recognized C₁.
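The search over candidate first letters described above can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: features are opaque string tokens rather than skeleton nodes and branches, `toy_scr` is a hypothetical lookup table standing in for the single character recognizer, and candidate strings are ranked by mean recognition score because the abstract does not say how the strings are compared.

```python
# Hypothetical SCR: maps a tuple of features to (letter, score), or None if unrecognizable.
TOY_TABLE = {
    ("a1",): ("I", 0.5),
    ("a1", "a2"): ("A", 0.9),
    ("b1",): ("I", 0.5),
    ("b1", "b2"): ("B", 0.8),
}

def toy_scr(features):
    return TOY_TABLE.get(features)

def candidates(features, scr, max_features):
    """Build test images of 1..max_features leading features; keep the recognizable ones."""
    found = []
    for n in range(1, min(max_features, len(features)) + 1):
        result = scr(tuple(features[:n]))
        if result is not None:
            letter, score = result
            found.append((n, letter, score))
    return found

def greedy_rest(features, scr, max_features):
    """Recognize the remaining letters, always taking the best-scoring test image."""
    letters, scores = [], []
    while features:
        cands = candidates(features, scr, max_features)
        if not cands:
            return None  # dead end: the remaining features cannot be recognized
        n, letter, score = max(cands, key=lambda c: c[2])
        letters.append(letter)
        scores.append(score)
        features = features[n:]  # delete the recognized letter from the word
    return letters, scores

def segment(features, scr, max_features=3):
    """Try every recognizable first letter C1 and keep the best complete string."""
    best = None
    for n, letter, score in candidates(features, scr, max_features):
        rest = greedy_rest(features[n:], scr, max_features)
        if rest is None:
            continue
        letters = [letter] + rest[0]
        scores = [score] + rest[1]
        mean = sum(scores) / len(scores)  # ranking rule is an assumption
        if best is None or mean > best[1]:
            best = ("".join(letters), mean)
    return best[0] if best else None

word = segment(["a1", "a2", "b1", "b2"], toy_scr)
print(word)
```

With this toy table, choosing "I" as the first letter leaves features that cannot be recognized, so only the string that starts with "A" survives.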
Experimental results on test images with two characters yield a recognition rate of 72.66%. Examples with more than two characters are also shown. Furthermore, the experimental results suggested that topologically simple test images can be more difficult to recognize than those which are topologically more complex. / Applied Science, Faculty of / Electrical and Computer Engineering, Department of / Graduate
|
34 |
A study in applying optical character recognition technology for the Foreign Broadcast Information Service field bureaus / Stine, William V. 17 March 2010
Master of Science
|
35 |
Adaptive optical music recognition / Fujinaga, Ichiro January 1996
No description available.
|
36 |
Optical character recognition using morphological operations / Castellanos, Francisco Alvaro 01 April 2000
No description available.
|
37 |
Artificial intelligence application for feature extraction in annual reports : AI-pipeline for feature extraction in Swedish balance sheets from scanned annual reports / Nilsson, Jesper January 2024
The continued handling of unstructured and physical documents in fields such as financial reporting causes notable inefficiencies. This thesis addresses the challenge of extracting data from unstructured financial documents, specifically balance sheets in Swedish annual reports, using an AI-driven pipeline. The objective is to develop a method that automates data extraction and enables improved data analysis. The project focused on automating the extraction of financial line items from balance sheets using a combination of Optical Character Recognition (OCR) and a Named Entity Recognition (NER) model. TesseractOCR was used to convert scanned documents into digital text, while a BERT-based NER model was fine-tuned to identify and classify relevant financial line items. A Python script was used to extract the numerical values associated with these items. The study found that the NER model achieved high performance, with an F1-score of 0.95, demonstrating its effectiveness in identifying financial entities. The full pipeline extracted over 99% of the items from balance sheets, with an accuracy of about 90% for numerical data. The project concludes that combining OCR and NER is a promising solution for automating data extraction from unstructured documents with attributes similar to those of annual reports. Future work could explore improving OCR accuracy and extending the extraction to other sections and to other types of unstructured documents.
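The last stage of such a pipeline, pairing a recognized line item with its numbers, might look roughly like the sketch below. It is not the thesis's script. It assumes that the OCR output keeps columns separated by at least two spaces or a tab, that thousands are grouped with single spaces, and it uses a hypothetical lookup table (`TOY_NER`) in place of the fine-tuned BERT model. The sample text and all amounts are made up for illustration.

```python
import re

# Hypothetical stand-in for the NER step: label text -> entity class.
TOY_NER = {
    "Kassa och bank": "CASH_AND_BANK",
    "Summa eget kapital": "TOTAL_EQUITY",
}

def parse_row(line):
    """Split one OCR line into a label and the integer values that follow it."""
    cells = re.split(r"\s{2,}|\t", line.strip())
    label, values = cells[0], []
    for cell in cells[1:]:
        digits = re.sub(r"[ \u00a0]", "", cell)  # drop thousands separators
        if re.fullmatch(r"-?\d+", digits):
            values.append(int(digits))
    return label, values

def extract_items(ocr_text):
    """Keep only rows whose label the (toy) NER step recognizes."""
    items = {}
    for line in ocr_text.splitlines():
        if not line.strip():
            continue
        label, values = parse_row(line)
        entity = TOY_NER.get(label)
        if entity and values:
            items[entity] = values
    return items

# Made-up sample resembling OCR output of a two-column balance sheet.
sample = (
    "Belopp i kr    2023    2022\n"
    "Kassa och bank    1 234 567    987 654\n"
    "Summa eget kapital    5 000 000    4 500 000\n"
)
items = extract_items(sample)
print(items)
```

The column-splitting rule matters: if OCR collapses the gap between two amounts to a single space, adjacent numbers become indistinguishable from one grouped number, which is one plausible source of errors in numerical extraction.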
|
38 |
Off-line signature verification / Coetzer, Johannes 03 1900
Thesis (PhD (Mathematical Sciences))--University of Stellenbosch, 2005. / A great deal of work has been done in the area of off-line signature verification over the past two decades. Off-line systems are of interest in scenarios where only hard copies of signatures are available, especially where a large number of documents need to be authenticated. This dissertation is inspired by, amongst other things, the potential financial benefits that the automatic clearing of cheques will have for the banking industry.
|
39 |
Freeform Cursive Handwriting Recognition Using a Clustered Neural Network / Bristow, Kelly H. 08 1900
Optical character recognition (OCR) software has advanced greatly in recent years. Machine-printed text can be scanned and converted to searchable text with word accuracy rates around 98%. Reasonably neat hand-printed text can be recognized with about 85% word accuracy. However, cursive handwriting remains a challenge, with state-of-the-art performance still around 75%. Algorithms based on hidden Markov models have been only moderately successful, while recurrent neural networks have delivered the best results to date. This thesis explored the feasibility of using a special type of feedforward neural network to convert freeform cursive handwriting to searchable text. The hidden nodes in this network were grouped into clusters, with each cluster being trained to recognize a unique character bigram. The network was trained on writing samples that were pre-segmented and annotated. Post-processing was facilitated in part by using the network to identify overlapping bigrams that were then linked together to form words and sentences. With dictionary-assisted post-processing, the network achieved word accuracy of 66.5% on a small, proprietary corpus. The contributions in this thesis are threefold: 1) the novel clustered architecture of the feedforward neural network, 2) the development of an expanded set of observers combining image masks, modifiers, and feature characterizations, and 3) the use of overlapping bigrams as the textual working unit to assist in context analysis and reconstruction.
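The idea of overlapping bigrams as the working unit can be shown with a short sketch. It only illustrates how consecutive bigrams that share a character chain into a word; the thesis's actual linking and dictionary-assisted post-processing are not described in the abstract, and the function names here are hypothetical.

```python
def word_to_bigrams(word):
    """Split a word into overlapping character bigrams, e.g. 'cat' -> ['ca', 'at']."""
    return [word[i:i + 2] for i in range(len(word) - 1)]

def link_bigrams(bigrams):
    """Chain bigrams in which each one starts with the last character of the previous."""
    if not bigrams:
        return ""
    word = bigrams[0]
    for bigram in bigrams[1:]:
        if len(bigram) != 2 or bigram[0] != word[-1]:
            raise ValueError(f"bigram {bigram!r} does not overlap with {word!r}")
        word += bigram[1]
    return word

recognized = ["he", "el", "ll", "lo"]  # e.g. the bigrams reported by the clusters
print(link_bigrams(recognized))
```

Because each interior character appears in two bigrams, a misrecognized bigram shows up as a broken overlap, which is one way context could help flag recognition errors.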
|
40 |
Optiese tegnologie / 20 November 2014
M.Com. (Informatics) / Please refer to full text to view abstract
|