141 |
Wie sehr können maschinelle Indexierung und modernes Information Retrieval Bibliotheksrecherchen verbessern? / How much can machine indexing and modern information retrieval improve library searches? Hauer, Manfred 30 November 2004 (has links)
Machine-based methods can dramatically improve the quality of subject indexing. intelligentCAPTURE has been in productive use in libraries and documentation centres since 2002. Its components include modules for document acquisition, in particular scanning and OCR, accurate text extraction from PDF files and websites, and speech recognition for "text-less" objects. Additional information-extraction steps can optionally follow. Content identified as relevant is analysed automatically by the CAI engine (Computer Aided Indexing), where computational-linguistic methods (language-dependent morphology, syntax analysis, statistics) interact with semantic structures (classifications, taxonomies, thesauri, topic maps, RDF, semantic networks). Finally, the prepared content and the finished, human-editable index records are passed via freely definable export formats to the respective library systems and, as a rule, also to intelligentSEARCH. intelligentSEARCH is a central union database for exchange between all productive partners worldwide, from both the public and the private sector. For copyright reasons, the exchange is limited to exchangeable media, so far tables of contents. At the same time, this database is "open content" for the academic public, with particularly powerful retrieval functions, in particular semantic search options and the visualization of semantic structures (http://www.agi-imc.de/intelligentSEARCH.nsf). Different semantic structures can be used both for indexing and for retrieval, depending on the research interest, worldview, or language.
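The interplay of extracted terms and semantic structures during computer-aided indexing can be illustrated with a toy thesaurus lookup (the mapping and terms below are hypothetical; the actual CAI engine combines far richer linguistic and semantic resources):

```python
# Minimal sketch of thesaurus-based term normalization for computer-aided
# indexing. The thesaurus entries are hypothetical illustration data.
THESAURUS = {
    "ocr": "Optical Character Recognition",
    "optical character recognition": "Optical Character Recognition",
    "texterkennung": "Optical Character Recognition",
    "indexing": "Subject Indexing",
    "indexierung": "Subject Indexing",
}

def index_terms(extracted_terms):
    """Map raw extracted terms to preferred thesaurus terms, dropping
    unknowns and duplicates while preserving first-seen order."""
    preferred = []
    for term in extracted_terms:
        label = THESAURUS.get(term.lower().strip())
        if label and label not in preferred:
            preferred.append(label)
    return preferred

print(index_terms(["OCR", "Indexierung", "Optical Character Recognition"]))
# → ['Optical Character Recognition', 'Subject Indexing']
```

Because the lookup is language-independent once the thesaurus is multilingual, the same mechanism supports indexing and retrieval across languages, as the abstract describes.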
|
142 |
Analogue meters in a digital world : Minimizing data size when offloading OCR processes Davidsson, Robin, Sjölander, Fredrik January 2022 (has links)
Introduction: Instead of replacing existing analogue water meters with Internet of Things (IoT) connected substitutes, an alternative would be to attach an IoT-connected module to the analogue water meter that optically reads the meter value using Optical Character Recognition (OCR). Such a module would need to be battery-powered, given that access to the electrical grid is typically limited near water meters. Research has shown that offloading the OCR process can reduce the power dissipation from the battery, and that this dissipation can be reduced even further by reducing the amount of data that is transmitted. Purpose: To minimise energy consumption in the proposed solution, the purpose of the study is to find out to what extent an input image’s file size can be reduced, by means of resolution, colour depth, and compression, before the Google Cloud Vision OCR engine no longer returns feasible results. Method and implementation: 250 images of analogue water meter values were processed by the Google Cloud Vision OCR through 38,000 different combinations of resolution, colour depth, and upscaling. Results: The highest rate of successful OCR readings with a minimal file size was found among images with resolutions between 133 x 22 and 163 x 27 pixels and colour depths of 1 or 2 bits/pixel. Conclusion: The study shows that there is potential for minimising data sizes, and thereby energy consumption, when offloading the OCR process, by transmitting images of minimal file size.
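As a rough illustration of why resolution and colour depth dominate the transmitted data size, the uncompressed payload can be estimated directly. This is a sketch: the 163 x 27 / 1-bit figures come from the range reported above, the comparison image is purely illustrative, and a real transmission would also apply compression.

```python
def payload_bits(width, height, bits_per_pixel):
    """Uncompressed image payload in bits for a given resolution and colour depth."""
    return width * height * bits_per_pixel

# Upper end of the reported sweet spot: 163 x 27 pixels at 1 bit/pixel.
small = payload_bits(163, 27, 1)
# Illustrative comparison: 4x the resolution in each dimension, 24-bit RGB.
full = payload_bits(652, 108, 24)
print(small, full, round(full / small))
# → 4401 1689984 384
```

The roughly 384-fold difference shows why shrinking resolution and colour depth before transmission pays off for a battery-powered module.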
|
143 |
Text-image Restoration And Text Alignment For Multi-engine Optical Character Recognition Systems Kozlovski, Nikolai 01 January 2006 (has links)
Previous research showed that combining the results of three different optical character recognition (OCR) engines (ExperVision® OCR, Scansoft OCR, and Abbyy® OCR) by means of voting algorithms yields a higher accuracy rate than any of the engines achieves individually. While a voting algorithm has been realized, several aspects of automating the process and improving the accuracy rate needed further research. This thesis focuses on morphological image preprocessing and morphological restoration of the text that goes to the OCR engines. The method is similar to one used in restoring partial fingerprints. Series of morphological dilation and erosion filters with masks of various shapes and sizes were applied to text of different font sizes and types, with various kinds of noise added. These images were then processed by the OCR engines, and based on the results, successful combinations of text, noise, and filters were chosen. The thesis also deals with the problem of text alignment. Each OCR engine has its own way of dealing with noise and corrupted characters; as a result, the output texts of the OCR engines have different lengths and numbers of words. This, in turn, makes it impossible to use spaces as delimiters to separate the words for processing by the voting part of the system. Text alignment determines, using various techniques, what is an extra word, what is supposed to be two or more words instead of one, which words are missing in one document compared to the other, and so on. The alignment algorithm consists of a series of shifts in the two texts to determine which parts are similar and which are not. Since errors made by OCR engines are due to visual misrecognition, in addition to simple character comparison (equal or not), a technique was developed that allows characters to be compared based on how they look.
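The align-then-vote idea can be sketched with Python's standard library, using difflib as a stand-in for the thesis's shift-based alignment (the OCR outputs below are hypothetical, and the real system combines ExperVision, Scansoft and Abbyy results with a more elaborate algorithm):

```python
import difflib
from collections import Counter

def vote(texts):
    """Majority-vote characters from multiple OCR outputs, after aligning
    each candidate against the first output with difflib."""
    base = texts[0]
    # For each position in `base`, collect the characters the other
    # engines aligned to that position.
    votes = [[c] for c in base]
    for other in texts[1:]:
        sm = difflib.SequenceMatcher(None, base, other)
        for op, i1, i2, j1, j2 in sm.get_opcodes():
            # Only count positions with a one-to-one character alignment.
            if op in ("equal", "replace") and (i2 - i1) == (j2 - j1):
                for k in range(i2 - i1):
                    votes[i1 + k].append(other[j1 + k])
    return "".join(Counter(v).most_common(1)[0][0] for v in votes)

# Each "engine" misreads a different character; voting recovers the word.
print(vote(["recognitlon", "recognition", "recoqnition"]))
# → 'recognition'
```

A production system would also handle insertions and deletions (the differing word counts the abstract mentions), which this sketch simply skips.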
|
144 |
Arabic text recognition of printed manuscripts. Efficient recognition of off-line printed Arabic text using Hidden Markov Models, Bigram Statistical Language Model, and post-processing. Al-Muhtaseb, Husni A. January 2010 (has links)
Arabic text recognition has not been researched as thoroughly as that of other natural languages, yet the need for automatic Arabic text recognition is clear. In addition to traditional applications like postal address reading, cheque verification in banks, and office automation, there is large interest in searching scanned documents available on the internet and in searching handwritten manuscripts. Other possible applications are building digital libraries, recognizing text on digitized maps, recognizing vehicle license plates, serving as a first phase in text readers for visually impaired people, and understanding filled forms.
This research work aims to contribute to the current research in the field of optical character recognition (OCR) of printed Arabic text by developing novel techniques and schemes to advance the performance of the state of the art Arabic OCR systems.
Statistical and analytical analysis of Arabic text was carried out to estimate the probabilities of occurrence of Arabic characters, for use with Hidden Markov Models (HMMs) and other techniques.
Since there is no publicly available dataset of printed Arabic text for recognition purposes, it was decided to create one. In addition, a minimal Arabic script is proposed. The proposed script contains all basic shapes of Arabic letters and provides an efficient representation of Arabic text in terms of effort and time.
Based on the success of using HMM for speech and text recognition, the use of HMM for the automatic recognition of Arabic text was investigated. The HMM technique adapts to noise and font variations and does not require word or character segmentation of Arabic line images.
In the feature extraction phase, experiments were conducted with a number of different features to investigate their suitability for HMM. Finally, a novel set of features, which resulted in high recognition rates for different fonts, was selected.
The developed techniques do not need word or character segmentation before the classification phase as segmentation is a byproduct of recognition. This seems to be the most advantageous feature of using HMM for Arabic text as segmentation tends to produce errors which are usually propagated to the classification phase.
Eight different Arabic fonts were used in the classification phase. The recognition rates ranged from 98% to 99.9%, depending on the font. As far as we know, these are new results in their context. Moreover, the proposed technique could be used for other languages: a proof-of-concept experiment on English characters achieved a recognition rate of 98.9% using the same HMM setup, and the same technique applied to Bangla characters yielded a recognition rate above 95%.
Moreover, recognition of printed Arabic text in multiple fonts was also conducted using the same technique, with the fonts categorized into different groups. New high recognition results were achieved.
To enhance the recognition rate further, a post-processing module was developed that corrects the OCR output through character-level and word-level post-processing. This module increased the recognition rate by more than 1%. / King Fahd University of Petroleum and Minerals (KFUPM)
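The character-level post-processing step can be illustrated with a toy character-bigram model that rescores visually confusable readings (the probabilities and the English h/n confusion pair are hypothetical illustration data; the thesis estimates such statistics from Arabic text):

```python
import math

# Toy bigram log-probabilities; a real system would estimate these from a
# large corpus, as the thesis does for Arabic.
BIGRAM_LOGP = {
    ("t", "h"): math.log(0.30), ("h", "e"): math.log(0.25),
    ("t", "n"): math.log(0.001), ("n", "e"): math.log(0.05),
}
DEFAULT = math.log(1e-6)  # smoothing for unseen bigrams

def score(word):
    """Log-probability of a word under the character-bigram model."""
    return sum(BIGRAM_LOGP.get(pair, DEFAULT) for pair in zip(word, word[1:]))

def correct(ocr_word, candidates):
    """Pick the candidate the bigram model considers most plausible."""
    return max(candidates, key=score)

# 'h' misread as 'n' is a typical visual OCR confusion.
print(correct("tne", ["tne", "the"]))
# → 'the'
```

Word-level post-processing would then check the corrected tokens against a lexicon, the second stage the abstract describes.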
|
145 |
Study of augmentations on historical manuscripts using TrOCR Meoded, Erez 08 December 2023 (has links) (PDF)
Historical manuscripts are an essential source of original content, but for many reasons it is hard to recognize their text automatically. This thesis used a state-of-the-art handwritten text recognizer, TrOCR, to recognize a 16th-century manuscript. TrOCR uses a vision transformer to encode the input images and a language transformer to decode them back to text. We showed that carefully preprocessed images and carefully designed augmentations can improve the performance of TrOCR, and we suggest an ensemble of augmented models to achieve even better performance.
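In its simplest form, the suggested ensemble could take a majority vote over the transcriptions produced by differently augmented models (a sketch with hypothetical predictions, not the thesis's actual setup, which works at the model level):

```python
from collections import Counter

def ensemble_transcription(predictions):
    """Return the transcription most models agree on. `predictions` would
    come from TrOCR models fine-tuned with different augmentations; here
    they are hypothetical strings."""
    return Counter(predictions).most_common(1)[0][0]

preds = ["anno 1542", "anno 1542", "armo 1542"]  # one model misreads 'nn' as 'rm'
print(ensemble_transcription(preds))
# → 'anno 1542'
```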
|
146 |
New Approaches to OCR for Early Printed Books Weichselbaumer, Nikolaus, Seuret, Mathias, Limbach, Saskia, Dong, Rui, Burghardt, Manuel, Christlein, Vincent 29 May 2024 (links)
Books printed before 1800 present major problems for OCR. One of the main obstacles is the lack of diversity of historical fonts in training data. The OCR-D project, consisting of book historians and computer scientists, aims to address this deficiency by focussing on three major issues. Our first target was to create a tool that identifies font groups automatically in images of historical documents. We concentrated on Gothic font groups that were commonly used in German texts printed in the 15th and 16th century: the well-known Fraktur and the lesser-known Bastarda, Rotunda, Textura and Schwabacher. The tool was trained with 35,000 images and reaches an accuracy level of 98%. It can not only differentiate between the above-mentioned font groups but also Hebrew, Greek, Antiqua and Italic. It can also identify woodcut images and irrelevant data (book covers, empty pages, etc.). In a second step, we created an online training infrastructure (okralact), which allows for the use of various open source OCR engines such as Tesseract, OCRopus, Kraken and Calamari. At the same time, it facilitates training for specific models of font groups. The high accuracy of the recognition tool paves the way for the unprecedented opportunity to differentiate between the fonts used by individual printers. With more training data and further adjustments, the tool could help to fill a major gap in historical research.
|
147 |
From Historical Newspapers to Machine-Readable Data: The Origami OCR Pipeline Liebl, Bernhard, Burghardt, Manuel 20 June 2024 (links)
While historical newspapers have recently gained a lot of attention in the digital humanities, transforming them into machine-readable data by means of OCR poses some major challenges. In order to address these challenges, we have developed an end-to-end OCR pipeline named Origami. This pipeline is part of a current project on the digitization and quantitative analysis of the German newspaper "Berliner Börsen-Zeitung" (BBZ) from 1872 to 1931. The Origami pipeline reuses existing open source OCR components and, on top of them, offers a new configurable architecture for layout detection, simple table recognition, a two-stage X-Y cut for reading-order detection, and a new robust implementation of document dewarping. In this paper we describe the different stages of the workflow and discuss how they meet the above-mentioned challenges posed by historical newspapers.
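A recursive X-Y cut for reading-order detection can be sketched over text-block bounding boxes. This is a simplification under stated assumptions (axis-aligned boxes, zero gap threshold); Origami's actual two-stage implementation differs in its details.

```python
def xy_cut(blocks, cut_x=True):
    """Order text blocks (x0, y0, x1, y1) by recursive X-Y cuts: split on
    gaps in the x projection (columns) first, then on gaps in the y
    projection (paragraphs), recursing until regions are indivisible."""
    if len(blocks) <= 1:
        return list(blocks)
    lo, hi = (0, 2) if cut_x else (1, 3)
    blocks = sorted(blocks, key=lambda b: b[lo])
    groups, current, limit = [], [blocks[0]], blocks[0][hi]
    for b in blocks[1:]:
        if b[lo] >= limit:          # gap in the projection: cut here
            groups.append(current)
            current, limit = [b], b[hi]
        else:
            current.append(b)
            limit = max(limit, b[hi])
    groups.append(current)
    if len(groups) == 1:            # no gap on this axis
        return xy_cut(blocks, cut_x=False) if cut_x else list(blocks)
    order = []
    for g in groups:
        order.extend(xy_cut(g, cut_x=not cut_x))
    return order

# Two columns with two paragraphs each, given in scrambled order:
page = [(12, 0, 20, 4), (0, 5, 10, 9), (0, 0, 10, 4), (12, 5, 20, 9)]
print(xy_cut(page))
# → [(0, 0, 10, 4), (0, 5, 10, 9), (12, 0, 20, 4), (12, 5, 20, 9)]
```

Cutting on the x projection first yields column-wise reading order, which suits multi-column newspaper pages; a real pipeline would also tolerate small overlaps instead of requiring strict gaps.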
|
148 |
Image Retrieval in Digital Libraries: A Large Scale Multicollection Experimentation of Machine Learning techniques Moreux, Jean-Philippe, Chiron, Guillaume 16 October 2017 (links)
While historical digital heritage libraries were first powered in image mode, they quickly took advantage of OCR technology to index printed collections and consequently improve the scope and performance of the information retrieval services offered to users. But access to iconographic resources has not progressed in the same way, and the latter remain in the shadows: incomplete and heterogeneous manual indexing, data silos by iconographic genre, and content-based image retrieval (CBIR) that is still barely operational on heritage collections. Today, however, it would be possible to make better use of these resources, especially by exploiting the enormous volumes of OCR produced during the last two decades, and thus valorize these engravings, drawings, photographs, maps, etc. for their own value but also as an attractive entry point into the collections, supporting discovery and serendipity from document to document and collection to collection. This article presents an ETL (extract-transform-load) approach to this need, which aims to: identify and extract iconography wherever it may be found, in image collections but also in printed materials (dailies, magazines, monographs); transform, harmonize and enrich the image descriptive metadata (in particular with machine learning classification tools); and load it all into a web app dedicated to image retrieval. The approach is pragmatically dual, since it involves leveraging existing digital resources and (virtually) off-the-shelf technologies.
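The "transform" stage of such an ETL approach can be sketched as a label-harmonization pass over heterogeneous metadata records (the genre labels and the mapping are hypothetical; the article's enrichment additionally uses machine-learning classifiers):

```python
# Harmonize heterogeneous iconographic genre labels into one controlled
# vocabulary. Mapping and records are hypothetical illustration data.
GENRE_MAP = {
    "gravure": "engraving", "engraving": "engraving",
    "photographie": "photograph", "photo": "photograph",
    "carte": "map", "map": "map",
}

def transform(records):
    """Normalize the 'genre' field of image metadata records; labels not
    in the vocabulary are kept but flagged for manual review."""
    out = []
    for rec in records:
        genre = GENRE_MAP.get(rec.get("genre", "").lower())
        out.append({**rec,
                    "genre": genre or rec.get("genre"),
                    "needs_review": genre is None})
    return out

rows = [{"id": 1, "genre": "Gravure"}, {"id": 2, "genre": "woodcut"}]
print(transform(rows))
# → [{'id': 1, 'genre': 'engraving', 'needs_review': False},
#    {'id': 2, 'genre': 'woodcut', 'needs_review': True}]
```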
|
149 |
Simultaneous Detection and Validation of Multiple Ingredients on Product Packages: An Automated Approach : Using CNN and OCR Techniques / Simultant detektering och validering av flertal ingredienser på produktförpackningar: Ett automatiserat tillvägagångssätt : Genom användning av CNN och OCR tekniker Farokhynia, Rodbeh, Krikeb, Mokhtar January 2024 (links)
Manual proofreading of product packaging is a time-consuming and uncertain process that can pose significant challenges for companies, such as scalability issues, compliance risks and high costs. This thesis introduces a novel solution, employing advanced computer vision and machine learning methods to automate the proofreading of multiple ingredient lists, corresponding to multiple products, simultaneously within a product package. By integrating Convolutional Neural Network (CNN) and Optical Character Recognition (OCR) techniques, this study examines the efficacy of automated proofreading in comparison to manual methods. The thesis involves analyzing product package artwork to identify ingredient lists using the YOLOv5 object detection algorithm, with the optical character recognition tool EasyOCR used for ingredient extraction. Additionally, Python scripts are employed to extract ingredients from the corresponding INCI PDF files (documents that list the standardized names of ingredients used in cosmetic products). A comprehensive comparison is then conducted to evaluate the accuracy and efficiency of automated proofreading. Comparing the ingredients extracted from the product packages against their corresponding INCI PDF files yielded a match rate of 12.7%. Despite the suboptimal result, insights from the study highlight the limitations of current detection and recognition algorithms when applied to complex artwork: for example, the trained YOLOv5 model cuts through sentences in the ingredient list, and EasyOCR cannot extract ingredients from vertically aligned product package images. The findings underscore the need for advances in detection algorithms and OCR tools to handle objects such as product packaging designs effectively. The study also suggests that companies such as H&M consider updating their artwork and INCI PDF files to align with the capabilities of current AI-driven tools. By doing so, they can enhance the efficiency and overall effectiveness of automated proofreading processes, thereby reducing errors and improving accuracy.
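The final comparison step can be sketched as a normalized set intersection between the OCR output and the INCI reference list (the ingredient lists below are hypothetical; the thesis's matching pipeline is more involved):

```python
def match_rate(ocr_ingredients, inci_ingredients):
    """Fraction of INCI reference ingredients recovered by OCR, after
    light normalization of case and surrounding whitespace."""
    norm = lambda items: {s.strip().lower() for s in items if s.strip()}
    ocr, inci = norm(ocr_ingredients), norm(inci_ingredients)
    return len(ocr & inci) / len(inci) if inci else 0.0

# Hypothetical OCR output versus its INCI reference list:
ocr = ["Aqua", "glycerin", "Parfum "]
inci = ["AQUA", "GLYCERIN", "PARFUM", "CITRIC ACID"]
print(match_rate(ocr, inci))
# → 0.75
```

Normalization choices matter a great deal here: a low reported match rate can reflect extraction errors, but also overly strict string comparison.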
|
150 |
Automobilių registracijos numerių atpažinimo tyrimas / Analysis of car number plate recognition Laptik, Raimond 17 June 2005 (links)
This master's thesis, Analysis of Car Number Plate Recognition, reviews optical character recognition (OCR) as well as OCR software, devices and systems. Image-processing operators and artificial neural networks are presented. Image-processing operators are analysed and applied to the detection of number plates. Experimental results on the estimation of learning parameters for Kohonen and multilayer feedforward artificial neural networks are presented. Number plate recognition is performed using a multilayer feedforward artificial neural network, and a model of a number plate recognition system is created. The number plate recognition software runs on the Microsoft Windows operating system and is written in C++. Experimental results of the system model's operation are presented.
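The Kohonen network mentioned above learns by pulling the best-matching unit toward each input vector. A minimal sketch of one training step, with toy weights and learning rate (the thesis additionally tunes parameters such as neighbourhood size, which this sketch omits):

```python
def bmu(weights, x):
    """Index of the best-matching unit (smallest squared Euclidean distance)."""
    dist = lambda w: sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: dist(weights[i]))

def train_step(weights, x, lr=0.5):
    """One Kohonen update: move the winner toward the input by `lr`."""
    i = bmu(weights, x)
    weights[i] = [wi + lr * (xi - wi) for wi, xi in zip(weights[i], x)]
    return i

weights = [[0.0, 0.0], [1.0, 1.0]]
winner = train_step(weights, [0.5, 1.5])
print(winner, weights)
# → 1 [[0.0, 0.0], [0.75, 1.25]]
```

Repeating this step over many inputs clusters the weight vectors around the character prototypes, which is what makes the map useful as a recognizer front end.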
|