1 |
Underwater Document RecognitionShah, Jaimin Nitesh 18 May 2021 (has links)
No description available.
|
2 |
Ocr: A Statistical Model Of Multi-engine Ocr SystemsMcDonald, Mercedes Terre 01 January 2004 (has links)
This thesis is a benchmark performed on three commercial Optical Character Recognition (OCR) engines. The purpose of this benchmark is to characterize the performance of the OCR engines with emphasis on the correlation of errors between each engine. The benchmarks are performed for the evaluation of the effect of a multi-OCR system employing a voting scheme to increase overall recognition accuracy. This is desirable since currently OCR systems are still unable to recognize characters with 100% accuracy. The existing error rates of OCR engines pose a major problem for applications where a single error can possibly effect significant outcomes, such as in legal applications. The results obtained from this benchmark are the primary determining factor in the decision of implementing a voting scheme. The experiment performed displayed a very high accuracy rate for each of these commercial OCR engines. The average accuracy rate found for each engine was near 99.5% based on a less than 6,000 word document. While these error rates are very low, the goal is 100% accuracy in legal applications. Based on the work in this thesis, it has been determined that a simple voting scheme will help to improve the accuracy rate.
|
3 |
Retrofitting analogue meters with smart devices : A feasibility study of local OCR processes on an energy critical driven systemAndreasson, Joel, Ehrenbåge, Elin January 2023 (has links)
Internet of Things (IoT) are becoming increasingly popular replacements for their analogue counterparts. However, there is still demand to keep analogue equipment that is already installed, while also having automated monitoring of the equipment, such as analogue water meters. A proposed solution for this problem is to install a battery powered add-on component that can optically read meter values using Optical Character Recognition (OCR) and transmit the readings wirelessly. Two ways to do this could be to either offload the OCR process to a server, or to do the OCR processing locally on the add-on component. Since water meters are often located where reception is weak and the add-on component is battery powered, a suitable technology for data transmission could be Long Range (LoRa) because of its low-power and long-range capabilities. Since LoRa has low transfer rate there is a need to keep data transfers small in size, which could make offloading a less favorable alternative compared to local OCR processing. The purpose of this thesis is therefore to research the feasibility, in terms of energy efficiency, of doing local OCR processing on the add-on component. The feasibility condition of this study is defined as being able to continually read an analogue meter for a 10-year lifespan, while consuming under 2600 milliampere hours (mAh) of energy. The two OCR algorithms developed for this study are a specialized OCR algorithm that utilizes pattern matching principles, and a Sum of Absolute Differences (SAD) OCR algorithm. These two algorithms have been compared against each other, to determine which one is more suitable for the system. This comparison yielded that the SAD algorithm was more suitable, and was then studied further by using different image resolutions and settings to determine if it was possible to further reduce energy consumption. The results showed that it was possible to significantly reduce energy consumption by reducing the image resolution. The study also researched the possibility of reducing energy consumption further by not reading all digits on the tested water meter, depending on the measuring frequency and water flow. The study concluded that OCR processing is feasible on an energy critical driven system when reading analouge meters, depending on the measuring frequency.
|
4 |
Arabic text recognition of printed manuscripts : efficient recognition of off-line printed Arabic text using Hidden Markov Models, Bigram Statistical Language Model, and post-processingAl-Muhtaseb, Husni Abdulghani January 2010 (has links)
Arabic text recognition was not researched as thoroughly as other natural languages. The need for automatic Arabic text recognition is clear. In addition to the traditional applications like postal address reading, check verification in banks, and office automation, there is a large interest in searching scanned documents that are available on the internet and for searching handwritten manuscripts. Other possible applications are building digital libraries, recognizing text on digitized maps, recognizing vehicle license plates, using it as first phase in text readers for visually impaired people and understanding filled forms. This research work aims to contribute to the current research in the field of optical character recognition (OCR) of printed Arabic text by developing novel techniques and schemes to advance the performance of the state of the art Arabic OCR systems. Statistical and analytical analysis for Arabic Text was carried out to estimate the probabilities of occurrences of Arabic character for use with Hidden Markov models (HMM) and other techniques. Since there is no publicly available dataset for printed Arabic text for recognition purposes it was decided to create one. In addition, a minimal Arabic script is proposed. The proposed script contains all basic shapes of Arabic letters. The script provides efficient representation for Arabic text in terms of effort and time. Based on the success of using HMM for speech and text recognition, the use of HMM for the automatic recognition of Arabic text was investigated. The HMM technique adapts to noise and font variations and does not require word or character segmentation of Arabic line images. In the feature extraction phase, experiments were conducted with a number of different features to investigate their suitability for HMM. Finally, a novel set of features, which resulted in high recognition rates for different fonts, was selected. The developed techniques do not need word or character segmentation before the classification phase as segmentation is a byproduct of recognition. This seems to be the most advantageous feature of using HMM for Arabic text as segmentation tends to produce errors which are usually propagated to the classification phase. Eight different Arabic fonts were used in the classification phase. The recognition rates were in the range from 98% to 99.9% depending on the used fonts. As far as we know, these are new results in their context. Moreover, the proposed technique could be used for other languages. A proof-of-concept experiment was conducted on English characters with a recognition rate of 98.9% using the same HMM setup. The same techniques where conducted on Bangla characters with a recognition rate above 95%. Moreover, the recognition of printed Arabic text with multi-fonts was also conducted using the same technique. Fonts were categorized into different groups. New high recognition results were achieved. To enhance the recognition rate further, a post-processing module was developed to correct the OCR output through character level post-processing and word level post-processing. The use of this module increased the accuracy of the recognition rate by more than 1%.
|
5 |
A Book Reader Design for Persons with Visual Impairment and BlindnessGalarza, Luis E. 16 November 2017 (has links)
The objective of this dissertation is to provide a new design approach to a fully automated book reader for individuals with visual impairment and blindness that is portable and cost effective. This approach relies on the geometry of the design setup and provides the mathematical foundation for integrating, in a unique way, a 3-D space surface map from a low-resolution time of flight (ToF) device with a high-resolution image as means to enhance the reading accuracy of warped images due to the page curvature of bound books and other magazines. The merits of this low cost, but effective automated book reader design include: (1) a seamless registration process of the two imaging modalities so that the low resolution (160 x 120 pixels) height map, acquired by an Argos3D-P100 camera, accurately covers the entire book spread as captured by the high resolution image (3072 x 2304 pixels) of a Canon G6 Camera; (2) a mathematical framework for overcoming the difficulties associated with the curvature of open bound books, a process referred to as the dewarping of the book spread images, and (3) image correction performance comparison between uniform and full height map to determine which map provides the highest Optical Character Recognition (OCR) reading accuracy possible. The design concept could also be applied to address the challenging process of book digitization. This method is dependent on the geometry of the book reader setup for acquiring a 3-D map that yields high reading accuracy once appropriately fused with the high-resolution image. The experiments were performed on a dataset consisting of 200 pages with their corresponding computed and co-registered height maps, which are made available to the research community (cate-book3dmaps.fiu.edu). Improvements to the characters reading accuracy, due to the correction steps, were quantified and measured by introducing the corrected images to an OCR engine and tabulating the number of miss-recognized characters. Furthermore, the resilience of the book reader was tested by introducing a rotational misalignment to the book spreads and comparing the OCR accuracy to those obtained with the standard alignment. The standard alignment yielded an average reading accuracy of 95.55% with the uniform height map (i.e., the height values of the central row of the 3-D map are replicated to approximate all other rows), and 96.11% with the full height maps (i.e., each row has its own height values as obtained from the 3D camera). When the rotational misalignments were taken into account, the results obtained produced average accuracies of 90.63% and 94.75% for the same respective height maps, proving added resilience of the full height map method to potential misalignments.
|
6 |
A Possibilistic Approach To Handwritten Script Identification Via Morphological Methods For Pattern RepresentationGhosh, Debashis 04 1900 (has links) (PDF)
No description available.
|
7 |
Arabic text recognition of printed manuscripts. Efficient recognition of off-line printed Arabic text using Hidden Markov Models, Bigram Statistical Language Model, and post-processing.Al-Muhtaseb, Husni A. January 2010 (has links)
Arabic text recognition was not researched as thoroughly as other natural languages. The need for automatic Arabic text recognition is clear. In addition to the traditional applications like postal address reading, check verification in banks, and office automation, there is a large interest in searching scanned documents that are available on the internet and for searching handwritten manuscripts. Other possible applications are building digital libraries, recognizing text on digitized maps, recognizing vehicle license plates, using it as first phase in text readers for visually impaired people and understanding filled forms.
This research work aims to contribute to the current research in the field of optical character recognition (OCR) of printed Arabic text by developing novel techniques and schemes to advance the performance of the state of the art Arabic OCR systems.
Statistical and analytical analysis for Arabic Text was carried out to estimate the probabilities of occurrences of Arabic character for use with Hidden Markov models (HMM) and other techniques.
Since there is no publicly available dataset for printed Arabic text for recognition purposes it was decided to create one. In addition, a minimal Arabic script is proposed. The proposed script contains all basic shapes of Arabic letters. The script provides efficient representation for Arabic text in terms of effort and time.
Based on the success of using HMM for speech and text recognition, the use of HMM for the automatic recognition of Arabic text was investigated. The HMM technique adapts to noise and font variations and does not require word or character segmentation of Arabic line images.
In the feature extraction phase, experiments were conducted with a number of different features to investigate their suitability for HMM. Finally, a novel set of features, which resulted in high recognition rates for different fonts, was selected.
The developed techniques do not need word or character segmentation before the classification phase as segmentation is a byproduct of recognition. This seems to be the most advantageous feature of using HMM for Arabic text as segmentation tends to produce errors which are usually propagated to the classification phase.
Eight different Arabic fonts were used in the classification phase. The recognition rates were in the range from 98% to 99.9% depending on the used fonts. As far as we know, these are new results in their context. Moreover, the proposed technique could be used for other languages. A proof-of-concept experiment was conducted on English characters with a recognition rate of 98.9% using the same HMM setup. The same techniques where conducted on Bangla characters with a recognition rate above 95%.
Moreover, the recognition of printed Arabic text with multi-fonts was also conducted using the same technique. Fonts were categorized into different groups. New high recognition results were achieved.
To enhance the recognition rate further, a post-processing module was developed to correct the OCR output through character level post-processing and word level post-processing. The use of this module increased the accuracy of the recognition rate by more than 1%. / King Fahd University of Petroleum and Minerals (KFUPM)
|
8 |
Sistema de reconocimiento de texto mecanografiado mediante redes neuronales para la gestión de boletas de pago en la Ugel FerreñafeBonilla Vilchez, Jonathan Alonso January 2024 (has links)
En este proyecto, se llevó a cabo un estudio con el objetivo de desarrollar un sistema de reconocimiento óptico de caracteres (OCR) diseñado para identificar y almacenar la información de las boletas de pago de docentes en la UGEL Ferreñafe. Esto se debió a la necesidad de agilizar la búsqueda de boletas en formato físico, un proceso que, en ocasiones, podía llevar semanas y requerir la contratación de personal adicional. Esta problemática impulsó la búsqueda de una solución eficaz y rentable.
Siguiendo las metodologías SCRUM y CRISP-DM, se optó por utilizar Redes Neuronales (RN) como la técnica principal. Esta elección se basó en investigaciones previas y tendencias identificadas en Google Trends. El objetivo fundamental era alcanzar un porcentaje de error bajo en la tasa de caracteres reconocidos, y se logró un hito significativo del 1.8%, a pesar de la degradación de la tinta en muchas boletas debido al paso del tiempo.
Para evaluar la usabilidad del sistema, se aplicó la escala SUS (System Usability Scale), y el sistema obtuvo una puntuación de 80, superando las expectativas iniciales. Esto resalta la alta usabilidad y satisfacción de los usuarios finales con la aplicación desarrollada. / In this project, a study was carried out with the objective of developing an optical character recognition (OCR) system designed to identify and store information from teacher pay slips at UGEL Ferreñafe. This was due to the need to expedite the search for physical ballots, a process that could sometimes take weeks and require the hiring of additional staff. This problem prompted the search for an effective and profitable solution.
Following the SCRUM and CRISP-DM methodologies, it was decided to use Neural Networks (RN) as the main technique. This choice was based on previous research and trends identified in Google Trends. The fundamental objective was to achieve a low error rate in the rate of recognized characters, and a significant milestone of 1.8% was achieved, despite the degradation of the ink on many ballots due to the passage of time.
To evaluate the usability of the system, the SUS scale (System Usability Scale) was applied, and the system obtained a score of 80, exceeding initial expectations. This highlights the high usability and satisfaction of end users with the developed application.
|
9 |
Simultaneous Detection and Validation of Multiple Ingredients on Product Packages: An Automated Approach : Using CNN and OCR Techniques / Simultant detektering och validering av flertal ingredienser på produktförpackningar: Ett automatiserat tillvägagångssätt : Genom användning av CNN och OCR teknikerFarokhynia, Rodbeh, Krikeb, Mokhtar January 2024 (has links)
Manual proofreading of product packaging is a time-consuming and uncertain process that can pose significant challenges for companies, such as scalability issues, compliance risks and high costs. This thesis work introduces a novel solution by employing advanced computer vision and machine learning methods to automate the proofreading of multiple ingredients’ lists corresponding to multiple products simultaneously within a product package. By integrating Convolutional Neural Network (CNN) and Optical Character Recognition (OCR) techniques, this study examines the efficacy of automated proofreading in comparison to manual methods. The thesis involves analyzing product package artwork to identify ingredient lists utilizing the YOLOv5 object detection algorithm and the optical character recognition tool EasyOCR for ingredient extraction. Additionally, Python scripts are employed to extract ingredients from corresponding INCI PDF files (document that lists the standardized names of ingredients used in cosmetic products). A comprehensive comparison is then conducted to evaluate the accuracy and efficiency of automated proofreading. The comparison of the extracted ingredients from the product packages and their corresponding INCI PDF files yielded a match of 12.7%. Despite the suboptimal result, insights from the study highlights the limitations of current detection and recognition algorithms when applied to complex artwork. A few examples of the insights have been that the trained YOLOv5 model cuts through sentences in the ingredient list or that EasyOCR cannot extract ingredients from vertically aligned product package images. The findings underscore the need for advancements in detection algorithms and OCR tools to effectively handle objects like product packaging designs. The study also suggests that companies, such as H&M, consider updating their artwork and INCI PDF files to align with the capabilities of current AI-driven tools. By doing so, they can enhance the efficiency and overall effectiveness of automated proofreading processes, thereby reducing errors and improving accuracy. / Manuell korrekturläsning av produktförpackningar är en tidskrävande och osäker process som kan skapa betydande utmaningar för företag, såsom skalbarhetsproblem, efterlevnadsrisker och höga kostnader. Detta examensarbete presenterar en ny lösning genom att använda avancerade metoder inom datorseende och maskininlärning för att automatisera korrekturläsningen av flera ingredienslistor som motsvarar flera produkter samtidigt inom en produktförpackning. Genom att integrera Convolutional Neural Network (CNN) och Optical Character Recognition (OCR) utreder denna studie effektiviteten av automatiserad korrekturläsning i jämförelse med manuella metoder. Avhandlingen analyserar designen av produktförpackningar för att identifiera ingredienslistor med hjälp av objektdetekteringsalgoritmen YOLOv5 och det optiska teckenigenkänningsverktyget EasyOCR för extrahera enskilda ingredienser från listorna. Utöver detta används Python-skript för att extrahera ingredienser från motsvarande INCI-PDF filer (dokument med standardiserade namn på ingredienser som används i kosmetika produkter). En omfattande jämförelse genomförs sedan för att utvärdera noggrannheten och effektiviteten hos automatiserad korrekturläsning. Jämförelsen av de extraherade ingredienserna från produktförpackningarna och deras korresponderande INCI-PDF filer gav ett matchnings resultat på 12.7%. Trots de mindre optimala resultaten belyser studien de begränsningar som finns hos de nuvarande detekterings- och teckenigenkänningsalgoritmerna när de appliceras på komplexa verk av produktförpackningar. Ett fåtal exempel på insikterna är bland annat att den tränade YOLOv5 modellen skär igenom meningar i ingredienslistan eller att EasyOCR inte kan extrahera ingredienser från stående ingredienslistor på produktförpackningsbilder. Resultaten understryker behovet av framsteg inom detekteringsalgoritmer och OCR-verktyg för att effektivt kunna hantera komplexa objekt som produktförpackningar. Studien föreslår även att företag, såsom H&M, överväger att uppdatera sina design av produktförpackningar och INCI-PDF filer för att anpassa sig till kapaciteten hos aktuella AI-drivna verktyg. Genom att utföra detta kan de förbättra både effektiviteten och den övergripande kvaliteten hos de automatiserade korrekturläsningsprocesserna, vilket minskar fel och ökar noggrannheten.
|
10 |
Effektivisering av Tillverkningsprocesser med Artificiell Intelligens : Minskad Materialförbrukning och Förbättrad KvalitetskontrollAl-Saaid, Kasim, Holm, Daniel January 2024 (has links)
This report explores the implementation of AI techniques in the manufacturing process at Ovako, focusing on process optimization, individual traceability, and quality control. By integrating advanced AI models and techniques at various levels within the production process, Ovako can improve efficiency, reduce material consumption, and prevent production stops. For example, predictive maintenance can be applied to anticipate and prevent machine problems, while image recognition algorithms and optical character recognition enable individual traceability of each rod throughout the process. Furthermore, AI-based quality control can detect defects and deviations with high precision and speed, leading to reduced risk of faulty products and increased product quality. By carefully considering the role of the workforce, safety and ethical issues, and the benefits and challenges of AI implementation, Ovako can maximize the benefits of these techniques and enhance its competitiveness in the market. / Denna rapport utforskar implementeringen av AI-tekniker i tillverkningsprocessen hos Ovako, med fokus på processoptimering, individuell spårbarhet och kvalitetskontroll. Genom att integrera avancerade AI-modeller och tekniker på olika nivåer inom produktionsprocessen kan Ovako förbättra effektiviteten, minska materialförbrukningen och förhindra produktionsstopp. Exempelvis kan prediktivt underhåll tillämpas för att förutse och förebygga maskinproblem, medan bildigenkänningsalgoritmer och optisk teckenigenkänning möjliggör individuell spårbarhet av varje stång genom processen. Dessutom kan AI-baserad kvalitetskontroll detektera defekter och avvikelser med hög precision och hastighet, vilket leder till minskad risk för felaktiga produkter och ökad produktkvalitet. Genom att noggrant överväga arbetskraftens roll, säkerhets- och etikfrågor samt fördelarna och utmaningarna med AI-implementeringen kan Ovako maximera nyttan av dessa tekniker och förbättra sin konkurrenskraft på marknaden.
|
Page generated in 0.153 seconds