Global ETD Search

131	Přístupy k řešení digitalizace dokumentů. / Approaches to document digitalization solutions. Novotný, Vladimír. January 2011. The objective of this thesis is to provide a survey of document digitalization and to analyse the market of companies that offer document digitalization as an outsourced service. The first part of the thesis describes scanning technology and the methods of document recognition and data mining. It also describes the barcode systems used to identify documents. Furthermore, the thesis covers the principles of document storage and electronic (digital) signature issues from the viewpoint of Czech legislation. Its contribution lies in an analysis of the companies that provide outsourced document digitalization and in the perspective of a company using these services. A brief outlook on the future of this topic is included as well.
132	Arabic Text Recognition and Machine Translation. Alkhoury, Ihab. 13 July 2015. [EN] Research on Arabic Handwritten Text Recognition (HTR) and Arabic-English Machine Translation (MT) has usually been approached as two independent areas of study. However, creating one system that combines both areas, in order to generate English translations from images containing Arabic text, is still a very challenging task. This process can be interpreted as the translation of Arabic images. In this thesis, we propose a system that recognizes Arabic handwritten text images and translates the recognized text into English. This system is built from the combination of an HTR system and an MT system. Regarding the HTR system, our work focuses on the use of Bernoulli Hidden Markov Models (BHMMs). BHMMs have proven to work very well with Latin script. Indeed, empirical results based on them have been reported on well-known corpora such as IAM and RIMES. In this thesis, these results are extended to Arabic script, in particular to the well-known IfN/ENIT and NIST OpenHaRT databases for Arabic handwritten text. The need for transcribing Arabic text is not limited to handwritten text; it also applies to printed text. Printed Arabic text can be regarded as a simpler form of handwritten text. Thus, for this kind of text, we also propose Bernoulli HMMs. In addition, we propose to compare BHMMs with state-of-the-art technology based on neural networks. A key idea that has proven to be very effective in this application of Bernoulli HMMs is the use of a sliding window of adequate width for feature extraction. This idea has allowed us to obtain very competitive results in the recognition of both handwritten and printed Arabic text. Indeed, a system based on it ranked first at the ICDAR 2011 Arabic recognition competition on the Arabic Printed Text Image (APTI) database.
Moreover, this idea has been refined by using repositioning techniques for extracted windows, leading to further improvements in Arabic text recognition. In the case of handwritten text, this refinement improved our system which ranked first at the ICFHR 2010 Arabic handwriting recognition competition on IfN/ENIT. In the case of printed text, this refinement led to an improved system which ranked second at the ICDAR 2013 Competition on Multi-font and Multi-size Digitally Represented Arabic Text on APTI. Furthermore, this refinement was used with neural-network-based technology, which led to state-of-the-art results. For machine translation, the system was based on the combination of three state-of-the-art statistical models: the standard phrase-based models, the hierarchical phrase-based models, and the N-gram phrase-based models. This combination was done using the Recognizer Output Voting Error Reduction (ROVER) method. Finally, we propose three methods of combining HTR and MT to develop an Arabic image translation system. The system was evaluated on the NIST OpenHaRT database, where competitive results were obtained. / Alkhoury, I. (2015). Arabic Text Recognition and Machine Translation [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/53029 / Thesis; Arabic Image Translation; Arabic OCR; Arabic Recognition; Bernoulli HMMs; Statistics and Operations Research; Computer Languages and Systems
133	Rozpoznání textu s využitím neuronových sítí / Text recognition with artificial neural networks. Peřinová, Barbora. January 2018. This master's thesis deals with optical character recognition. The first part describes the basic types of optical character recognition tasks and divides the algorithm into individual phases. The most commonly used methods for each phase are then described. For the character recognition phase, artificial neural networks and their use in that phase are explained, specifically the multilayer perceptron and convolutional neural networks. The second part defines the requirements for a specific application that is to serve as feedback for a robotic system. Convolutional neural networks and the CNTK deep learning library, which is used to implement the algorithm in .NET, are introduced. Finally, the test results for the individual phases of the proposed solution and a comparison with the open-source Tesseract engine are discussed.
134	Detekce vad s využitím smart kamery / Defect detection using smart camera. Hons, Viktor. January 2021. This thesis deals with the application of smart cameras and the verification of their functions. In the first part, the term smart camera is defined, and its components and most common applications are presented. A review of smart cameras from different manufacturers on the market follows. After a suitable camera model is selected, three tasks from real industrial applications are specified: inspection of capacitor print, inspection of a beer label, and dimension measurement. The tasks are performed with the chosen camera, including the layout of the workplace, scene, and lighting. Finally, the reliability, success rate, and speed of the designed solution are tested.
135	OCR cíleně znehodnocených textů / OCR of image based web form protection. Peluch, Tibor. January 2009. The thesis deals with programming an application for the Windows operating system. The main features of the Microsoft Foundation Class application framework are briefly summarised. The following part presents the idea of implementing an application with a graphical user interface that makes it possible to work with data by means of a schema. The third part deals with implementing blocks as dynamic-link libraries and outlines the possibility of using the program's data as an external module, as well as the possibility of real-time processing of data such as images and sound. The last part verifies that the application works correctly. The application is tested in practice on recognising the deliberately degraded texts that protect web forms at www.centrum.cz. Blocks were designed for reading images directly from the internet, preprocessing, segmentation, feature extraction, and evaluation in a neural network, along with blocks for reading processed data from disk and saving it to disk.
136	Rozpoznání SPZ/RZ / LPR detection and OCR. Krajíček, Pavel. January 2010. This thesis deals with the detection and recognition of car license plates in images taken by an imaging device located at a crossing or inside a car. The thesis is divided into two basic parts. The first deals with searching for the presence of a license plate in the image. If a plate is found, the second part of the program follows and identifies the license plate that was found. The first part of the program attempts to locate the license plate using edge detectors. The second part classifies characters with a method based on an analytical description.
137	Jednoduché rozpoznávání písma / Simple Character Recognition. Duba, Nikolas. January 2011. This thesis focuses on optical character recognition and the processing of its results. The goal of the application is to make it easy to track daily expenses. It can be used by an individual or by a company as a monitoring tool. The guiding principle is to make the tool as user-friendly as possible. The application takes its input from hardware such as a scanner or camera and analyzes the content of a cash voucher for further processing. To analyze the voucher, the application employs several optical character recognition methods. The result is subsequently parsed. The methods used are explained in detail in the document. The application's output is a database populated with cash voucher details. Another part of the work is an information system whose main purpose is to display the collected data.
138	Zpracování obrazu v systému Android - odečet hodnoty plynoměru / Image processing using Android device - gas-meter value recognition. Wertheim, Michal. January 2016. This thesis describes the design of an image processing solution for the Android system, including the choice of development environment and the implementation. The workflow for solving the problem involves developing the Android application and its graphical user interface. The text describes the application's functionality, communication with the camera, and the storing and retrieving of data. It also describes the algorithms and image processing methods used to read values from the counter of the gas meter.
139	Wie sehr können maschinelle Indexierung und modernes Information Retrieval Bibliotheksrecherchen verbessern? / How much can machine indexing and modern information retrieval improve library searches? Hauer, Manfred. 30 November 2004. Automated methods can dramatically increase the quality of subject indexing. intelligentCAPTURE has been in productive use in libraries and documentation centres since 2002. Its procedures include modules for document acquisition, in particular scanning and OCR, correct text extraction from PDF files and websites, and speech recognition for "textless" objects. Additional information extraction procedures can optionally follow. Content recognised as relevant is analysed automatically by the CAI (Computer Aided Indexing) engine. There, computational-linguistic methods (language-dependent morphology, syntax analysis, statistics) interact with semantic structures (classifications, classification schemes, thesauri, topic maps, RDF, semantic networks). Prepared content and finished, human-editable index records are finally passed, via freely definable export formats, to the respective library systems and, as a rule, also to intelligentSEARCH. intelligentSEARCH is a central union database for exchange among all productive partners worldwide from the public and private sectors. For copyright reasons, the exchange is limited to exchangeable media, so far tables of contents. At the same time, this database is "Open Content" for the academic public, with particularly powerful retrieval functions, in particular semantic search options and the visualisation of semantic structures (http://www.agi-imc.de/intelligentSEARCH.nsf). Different semantic structures can be used for both indexing and searching, depending on research interest, world view, or language.
info:eu-repo/classification/ddc/020; ddc:020; info:eu-repo/classification/ddc/004; ddc:004; Classification; Scanning; Classification scheme; Thesaurus; OCR; RDF; computer aided indexing; topic map
140	Analogue meters in a digital world: Minimizing data size when offloading OCR processes. Davidsson, Robin; Sjölander, Fredrik. January 2022. Introduction: Instead of replacing existing analogue water meters with Internet of Things (IoT) connected substitutes, an alternative would be to attach an IoT-connected module to the analogue water meter that optically reads the meter value using Optical Character Recognition (OCR). Such a module would need to be battery-powered, given that access to the electrical grid is typically limited near water meters. Research has shown that offloading the OCR process can reduce the power drawn from the battery, and that this can be reduced even further by reducing the amount of data that is transmitted. Purpose: To minimise energy consumption in the proposed solution, the study aims to find out to what extent an input image's file size can be reduced by means of resolution, colour depth, and compression before the Google Cloud Vision OCR engine no longer returns feasible results. Method and implementation: 250 images of analogue water meter values were processed by the Google Cloud Vision OCR engine through 38 000 different combinations of resolution, colour depth, and upscaling. Results: The highest rate of successful OCR readings with a minimal file size was found among images with resolutions ranging from 133 x 22 to 163 x 27 pixels and colour depths between 1 and 2 bits per pixel. Conclusion: The study shows that there is potential for minimising data sizes, and thereby energy consumption, when offloading the OCR process by transmitting images of minimal file size. Optical Character Recognition; OCR; Offloading; Energy efficiency; Compression; Google Cloud Vision; Water meters; Internet of Things; IoT; Cloud; Embedded Systems; Computer Systems; Communication Systems
