141

Study of augmentations on historical manuscripts using TrOCR

Meoded, Erez 08 December 2023 (has links) (PDF)
Historical manuscripts are an essential source of original content, but for many reasons they are hard to recognize as text. This thesis used a state-of-the-art handwritten text recognizer, TrOCR, to recognize a 16th-century manuscript. TrOCR uses a vision transformer to encode the input images and a language transformer to decode them back to text. We showed that carefully preprocessed images and carefully designed augmentations can improve the performance of TrOCR, and we suggest an ensemble of augmented models to achieve even better performance.
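The ensemble idea in the abstract can be sketched without the transformer machinery: each augmented model produces its own transcription, and a per-character majority vote merges them. The equal-length alignment assumption and the example strings are illustrative, not taken from the thesis.

```python
from collections import Counter

def ensemble_decode(predictions):
    """Majority-vote ensemble over per-model transcriptions.

    predictions: list of strings, one transcription per augmented model.
    Assumes all transcriptions have equal length; a real decoder would
    need to align them first (e.g. via edit distance).
    """
    assert len({len(p) for p in predictions}) == 1, "align transcriptions first"
    merged = []
    for chars in zip(*predictions):
        merged.append(Counter(chars).most_common(1)[0][0])
    return "".join(merged)

# Three augmented models disagree on one character; the vote recovers
# the most plausible transcription.
votes = ["manuscripl", "manuscript", "manuscript"]
print(ensemble_decode(votes))  # → manuscript
```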
142

New Approaches to OCR for Early Printed Books

Weichselbaumer, Nikolaus, Seuret, Mathias, Limbach, Saskia, Dong, Rui, Burghardt, Manuel, Christlein, Vincent 29 May 2024 (has links)
Books printed before 1800 present major problems for OCR. One of the main obstacles is the lack of diversity of historical fonts in training data. The OCR-D project, consisting of book historians and computer scientists, aims to address this deficiency by focussing on three major issues. Our first target was to create a tool that identifies font groups automatically in images of historical documents. We concentrated on Gothic font groups that were commonly used in German texts printed in the 15th and 16th century: the well-known Fraktur and the lesser-known Bastarda, Rotunda, Textura and Schwabacher. The tool was trained with 35,000 images and reaches an accuracy level of 98%. It can not only differentiate between the above-mentioned font groups but also Hebrew, Greek, Antiqua and Italic. It can also identify woodcut images and irrelevant data (book covers, empty pages, etc.). In a second step, we created an online training infrastructure (okralact), which allows for the use of various open-source OCR engines such as Tesseract, OCRopus, Kraken and Calamari. At the same time, it facilitates training of specific models for font groups. The high accuracy of the recognition tool paves the way for the unprecedented opportunity to differentiate between the fonts used by individual printers. With more training data and further adjustments, the tool could help to fill a major gap in historical research.
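The multi-class setup described above (font groups plus non-text classes) can be illustrated with a deliberately simple stand-in classifier. The abstract's tool is a CNN trained on 35,000 images; the nearest-centroid rule and toy feature vectors below are not that model, only a sketch of the same label space and prediction interface.

```python
import numpy as np

# Label set taken from the abstract; the features and the nearest-centroid
# rule are stand-ins for the project's CNN.
FONT_GROUPS = ["Fraktur", "Bastarda", "Rotunda", "Textura", "Schwabacher",
               "Hebrew", "Greek", "Antiqua", "Italic", "woodcut", "irrelevant"]

def train_centroids(features, labels):
    """Mean feature vector per font group."""
    return {g: features[labels == g].mean(axis=0) for g in np.unique(labels)}

def classify(centroids, x):
    """Assign x to the font group with the nearest centroid."""
    return min(centroids, key=lambda g: np.linalg.norm(x - centroids[g]))

# Toy 2-D "features" for two of the classes.
feats = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]])
labels = np.array(["Fraktur", "Fraktur", "Antiqua", "Antiqua"])
centroids = train_centroids(feats, labels)
print(classify(centroids, np.array([0.05, 0.0])))  # → Fraktur
```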
143

From Historical Newspapers to Machine-Readable Data: The Origami OCR Pipeline

Liebl, Bernhard, Burghardt, Manuel 20 June 2024 (has links)
While historical newspapers have recently gained a lot of attention in the digital humanities, transforming them into machine-readable data by means of OCR poses some major challenges. In order to address these challenges, we have developed an end-to-end OCR pipeline named Origami. This pipeline is part of a current project on the digitization and quantitative analysis of the German newspaper “Berliner Börsen-Zeitung” (BBZ) from 1872 to 1931. The Origami pipeline reuses existing open-source OCR components and, on top of them, offers a new configurable architecture for layout detection, simple table recognition, a two-stage X-Y cut for reading-order detection, and a new robust implementation of document dewarping. In this paper we describe the different stages of the workflow and discuss how they meet the above-mentioned challenges posed by historical newspapers.
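The X-Y cut mentioned for reading-order detection can be sketched compactly: recursively split the set of text boxes at gaps in the projection profile, alternating axes, and emit columns left-to-right, blocks top-to-bottom. Origami's two-stage, noise-robust variant is more involved; the `(x0, y0, x1, y1)` box format here is an assumption.

```python
def xy_cut(boxes, vertical_first=True):
    """Recursive X-Y cut: order axis-aligned boxes (x0, y0, x1, y1)
    into reading order. Splits at the first gap in the projection
    profile along one axis, falling back to the other axis, and
    finally to a plain top-left sort when no gap exists."""
    if len(boxes) <= 1:
        return list(boxes)
    axis = 0 if vertical_first else 1  # 0: cut on x (vertical split)
    spans = sorted((b[axis], b[axis + 2]) for b in boxes)
    reach = spans[0][1]
    for i in range(1, len(spans)):
        if spans[i][0] > reach:  # gap found: split the box set here
            cut = spans[i][0]
            first = [b for b in boxes if b[axis + 2] <= reach]
            rest = [b for b in boxes if b[axis] >= cut]
            return xy_cut(first) + xy_cut(rest)
        reach = max(reach, spans[i][1])
    if vertical_first:  # no vertical gap: try a horizontal cut
        return xy_cut(boxes, vertical_first=False)
    return sorted(boxes, key=lambda b: (b[1], b[0]))  # fallback order

# Two columns, two blocks each: left column is read fully first.
page = [(20, 20, 30, 30), (0, 0, 10, 10), (20, 0, 30, 10), (0, 20, 10, 30)]
print(xy_cut(page))
# → [(0, 0, 10, 10), (0, 20, 10, 30), (20, 0, 30, 10), (20, 20, 30, 30)]
```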
144

Image Retrieval in Digital Libraries: A Large Scale Multicollection Experimentation of Machine Learning techniques

Moreux, Jean-Philippe, Chiron, Guillaume 16 October 2017 (has links)
While digital heritage libraries were historically first delivered in image mode, they quickly took advantage of OCR technology to index printed collections and consequently improve the scope and performance of the information-retrieval services offered to users. Access to iconographic resources, however, has not progressed in the same way, and these resources remain in the shadows: manual indexing is incomplete and heterogeneous, collections sit in silos by iconographic genre, and content-based image retrieval (CBIR) is still barely operational on heritage collections. Today, however, it would be possible to make better use of these resources, especially by exploiting the enormous volumes of OCR produced during the last two decades, and thus valorize these engravings, drawings, photographs, maps, etc. both for their own value and as an attractive entry point into the collections, supporting discovery and serendipity from document to document and collection to collection. This article presents an ETL (extract-transform-load) approach to this need, which aims to: identify and extract iconography wherever it may be found, in image collections but also in printed materials (dailies, magazines, monographs); transform, harmonize and enrich the image descriptive metadata (in particular with machine-learning classification tools); and load it all into a web app dedicated to image retrieval. The approach is pragmatic in two respects, since it involves leveraging existing digital resources and (virtually) off-the-shelf technologies.
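The "transform" stage described above, harmonizing heterogeneous metadata and enriching it with machine-learning classification, can be sketched minimally. The field names, the common schema, and the genre classifier below are hypothetical stand-ins, not the article's actual data model.

```python
# Sketch of an ETL "transform" step: map source-specific metadata keys
# onto one schema, then fill missing fields with a pluggable classifier.

def harmonize(record):
    """Map source-specific metadata keys onto a common schema."""
    return {
        "title": record.get("title") or record.get("dc:title") or "",
        "source": record.get("source", "unknown"),
        "genre": record.get("genre"),  # may be missing: enriched below
    }

def enrich(record, classify_genre):
    """Fill a missing genre using a (pluggable) ML classifier."""
    if not record["genre"]:
        record = dict(record, genre=classify_genre(record["title"]))
    return record

raw = {"dc:title": "Carte de Paris", "source": "gallica"}
fake_classifier = lambda title: "map" if "carte" in title.lower() else "photo"
print(enrich(harmonize(raw), fake_classifier))
# → {'title': 'Carte de Paris', 'source': 'gallica', 'genre': 'map'}
```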
145

Simultaneous Detection and Validation of Multiple Ingredients on Product Packages: An Automated Approach : Using CNN and OCR Techniques / Simultant detektering och validering av flertal ingredienser på produktförpackningar: Ett automatiserat tillvägagångssätt : Genom användning av CNN och OCR tekniker

Farokhynia, Rodbeh, Krikeb, Mokhtar January 2024 (has links)
Manual proofreading of product packaging is a time-consuming and uncertain process that can pose significant challenges for companies, such as scalability issues, compliance risks and high costs. This thesis introduces a novel solution, employing advanced computer vision and machine learning methods to automate the proofreading of multiple ingredient lists corresponding to multiple products simultaneously within a product package. By integrating Convolutional Neural Network (CNN) and Optical Character Recognition (OCR) techniques, this study examines the efficacy of automated proofreading in comparison to manual methods. The thesis involves analyzing product package artwork to identify ingredient lists using the YOLOv5 object detection algorithm and the optical character recognition tool EasyOCR for ingredient extraction. Additionally, Python scripts are employed to extract ingredients from the corresponding INCI PDF files (documents that list the standardized names of ingredients used in cosmetic products). A comprehensive comparison is then conducted to evaluate the accuracy and efficiency of automated proofreading. The comparison of the ingredients extracted from the product packages against their corresponding INCI PDF files yielded a match of 12.7%. Despite the suboptimal result, insights from the study highlight the limitations of current detection and recognition algorithms when applied to complex artwork: for example, the trained YOLOv5 model cuts through sentences in the ingredient list, and EasyOCR cannot extract ingredients from vertically aligned product-package images. The findings underscore the need for advancements in detection algorithms and OCR tools to effectively handle objects like product packaging designs. The study also suggests that companies, such as H&M, consider updating their artwork and INCI PDF files to align with the capabilities of current AI-driven tools. By doing so, they can enhance the efficiency and overall effectiveness of automated proofreading processes, thereby reducing errors and improving accuracy.
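The reported 12.7% figure comes from comparing OCR-extracted ingredients against the INCI reference list. A minimal version of that comparison can be sketched as below; the normalization rules and example ingredients are assumptions, not the thesis's actual pipeline.

```python
def normalize(name):
    """Crude normalization of an ingredient name before comparison."""
    return " ".join(name.lower().replace(",", " ").split())

def match_rate(ocr_ingredients, inci_ingredients):
    """Share of INCI ingredients recovered by OCR, in percent."""
    ocr = {normalize(i) for i in ocr_ingredients}
    inci = {normalize(i) for i in inci_ingredients}
    if not inci:
        return 0.0
    return 100.0 * len(ocr & inci) / len(inci)

# One OCR misread ("parfun") and one missed ingredient: 2 of 4 match.
print(match_rate(["AQUA", "glycerin", "parfun"],
                 ["Aqua", "Glycerin", "Parfum", "Citric Acid"]))  # → 50.0
```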
146

Automobilių registracijos numerių atpažinimo tyrimas / Analysis of car number plate recognition

Laptik, Raimond 17 June 2005 (has links)
This master's thesis, Analysis of car number plate recognition, reviews optical character recognition (OCR), OCR software, and OCR devices and systems. Image-processing operators and artificial neural networks are presented, and image-processing operators are analysed and applied to number-plate detection. Experimental results on estimating the learning parameters of Kohonen and multilayer feedforward artificial neural networks are presented; number-plate recognition itself is performed with a multilayer feedforward network. A model of a number-plate recognition system is created. The recognition software runs on the Microsoft Windows operating system and is written in C++. Experimental results of the system model's operation are presented.
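The multilayer feedforward network used for recognition can be sketched as a single forward pass: a tanh hidden layer followed by a softmax over character classes. The layer sizes (8x8 glyph inputs, 36 classes for digits and letters) and random weights are assumptions for illustration, not the thesis's trained model.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer feedforward network:
    tanh hidden layer, softmax output over character classes."""
    h = np.tanh(x @ W1 + b1)
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
n_in, n_hidden, n_classes = 64, 32, 36   # assumed: 8x8 glyphs, 0-9 + A-Z
W1, b1 = rng.normal(size=(n_in, n_hidden)) * 0.1, np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_hidden, n_classes)) * 0.1, np.zeros(n_classes)
probs = mlp_forward(rng.normal(size=(1, n_in)), W1, b1, W2, b2)
```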
147

Systémy třídění se zaměřením na třídění poštovních zásilek na třídicích strojích / Sorting systems focusing on mail sorting at the sorting machines

VESELÝ, Milan January 2016 (has links)
The introduction describes the history of the post office and outlines the current state and future plans of Czech Post (Česká pošta, s. p.). The thesis then explains the formatting of the address side of postcards and letters. Another part describes the operation of the SIEMENS IRV 3000 sorting machine and the placement of these machines at the individual collection and transport nodes. The conclusion discusses how to increase the share of mail pieces suitable for machine sorting.
148

Mobile Real-Time License Plate Recognition

Liaqat, Ahmad Gull January 2011 (has links)
License plate recognition (LPR) systems play an important role in numerous applications, such as parking accounting systems, traffic law enforcement, road monitoring, expressway toll systems, electronic-police systems, and security systems. In recent years there has been a lot of research on license plate recognition, and many recognition systems have been proposed and used, but these systems have been developed for desktop computers. In this project, we developed a mobile LPR system for the Android operating system. LPR involves three main components: license plate detection, character segmentation, and Optical Character Recognition (OCR). For license plate detection and character segmentation we used the JavaCV and OpenCV libraries, and for OCR we used tesseract-ocr, obtaining very good results with these libraries. We also stored records of license numbers in a database, using SQLite for that purpose.
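Of the three components, character segmentation is the easiest to sketch without OpenCV: on a binarized plate image, characters are separated wherever a column contains no ink (a projection-profile cut). The toy bitmap below is illustrative, not from the project.

```python
def segment_columns(bitmap):
    """Split a binarized image (rows of 0/1 pixels) into character
    slices: returns (start, end) column ranges that contain ink,
    separated by all-blank columns."""
    width = len(bitmap[0])
    ink = [any(row[c] for row in bitmap) for c in range(width)]
    segments, start = [], None
    for c, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = c
        elif not has_ink and start is not None:
            segments.append((start, c))
            start = None
    if start is not None:
        segments.append((start, width))
    return segments

plate = [
    [1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
]
print(segment_columns(plate))  # → [(0, 2), (3, 4), (6, 7)]
```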
149

Kontrola zobrazení textu ve formulářích / Quality Check of Text in Forms

Moravec, Zbyněk January 2017 (has links)
The purpose of this thesis is to check that button text displays correctly on photographed monitors. These photographs contain a variety of image distortions, which complicates the subsequent recognition of graphic elements. The paper outlines several ways to detect buttons on forms and elaborates on the implemented detection based on contour-shape description. After the buttons are found, their defects are detected. Additionally, the thesis describes automatic selection of the highest-quality picture for documentation purposes.
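A contour-shape test for button candidates can be sketched without OpenCV: a closed contour is button-like when its area fills most of its bounding box (high rectangularity). The threshold and the vertex-list contour format are assumptions, not the thesis's implementation.

```python
def polygon_area(points):
    """Shoelace area of a closed contour given as (x, y) vertices."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def is_button_like(contour, min_rectangularity=0.85):
    """Contour area vs. bounding-box area: rectangles score near 1."""
    xs = [p[0] for p in contour]
    ys = [p[1] for p in contour]
    box_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return box_area > 0 and polygon_area(contour) / box_area >= min_rectangularity

rect = [(0, 0), (40, 0), (40, 12), (0, 12)]       # a button-shaped contour
tri = [(0, 0), (40, 0), (20, 12)]                 # not button-like
print(is_button_like(rect), is_button_like(tri))  # → True False
```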
150

Rozpoznávání historických textů pomocí hlubokých neuronových sítí / Convolutional Networks for Historic Text Recognition

Kišš, Martin January 2018 (has links)
The aim of this work is to create a tool for automatic transcription of historical documents, focusing mainly on the recognition of early-modern texts printed in Fraktur. The problem is solved with a newly designed recurrent convolutional neural network and a Spatial Transformer Network. Part of the solution is an implemented generator of artificial historical texts. Using this generator, an artificial data set is created on which the convolutional neural network for line recognition is trained. The network is then tested on real historical lines of text, on which it achieves up to 89.0 % character accuracy. The contribution of this work is primarily the newly designed neural network for text-line recognition and the implemented artificial-text generator, with which it is possible to train the neural network to recognize real historical lines of text.
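The generator pattern described above, producing (ground-truth, artificial-observation) training pairs, can be sketched minimally. The thesis renders artificial page images; this sketch only corrupts the transcription string, and the noise rates and alphabet are assumptions.

```python
import random

def make_training_pair(text, rng, sub_rate=0.1, drop_rate=0.05,
                       alphabet="abcdefghijklmnopqrstuvwxyz "):
    """Return a (ground_truth, corrupted) pair for recognizer training.

    Mimics the label-plus-noisy-observation pattern of an artificial
    data generator: characters are randomly dropped (lost strokes) or
    substituted (misreads) while the ground truth stays clean.
    """
    noisy = []
    for ch in text:
        r = rng.random()
        if r < drop_rate:
            continue                            # simulate a lost character
        if r < drop_rate + sub_rate:
            noisy.append(rng.choice(alphabet))  # simulate a misread
        else:
            noisy.append(ch)
    return text, "".join(noisy)

rng = random.Random(42)  # seeded, so the data set is reproducible
truth, noisy = make_training_pair("historical text line", rng)
```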
