111

Single image super-resolution based on neural networks for text and face recognition / Super-résolution d'image unique basée sur des réseaux de neurones pour la reconnaissance de texte et de visage

Peyrard, Clément 29 September 2017 (has links)
This thesis focuses on super-resolution (SR) methods for improving automatic recognition systems (optical character recognition, face recognition) in realistic contexts. SR methods generate high-resolution (HR) images from low-resolution (LR) ones. Unlike upsampling methods such as interpolation, they restore spatial high frequencies and compensate for artefacts such as blur or jagged edges. In particular, example-based approaches learn and model the relationship between the low- and high-resolution spaces from pairs of LR and HR images, and artificial neural networks are among the most effective systems for this task. This work demonstrates the value of neural-network-based SR methods for automatic recognition systems. Convolutional neural networks are especially well suited, as they can be trained to extract relevant non-linear two-dimensional features while simultaneously learning the mapping between the LR and HR spaces. On document text images, the proposed method improves OCR accuracy by +7.85 points compared with simple interpolation. The creation of an annotated image dataset and the organisation of an international competition (ICDAR2015) highlighted the interest and relevance of such approaches. Moreover, when a priori knowledge is available, it can be exploited by a suitable network architecture.
For facial images, facial features are critical for automatic recognition. A two-step method is proposed in which overall image quality is first improved, after which specialised models focus on the essential facial features. The performance of an off-the-shelf face verification system is improved by +6.91 to +8.15 points. Finally, for real-world low-resolution images, deep neural networks can absorb the variability of the blur kernels that characterise the LR images; with a single model, HR images with natural image statistics are produced without any knowledge of the exact observation model.
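The thesis' own network architectures are not reproduced here; purely as a hedged illustration of the general approach (a convolutional network trained to map low-resolution inputs to high-resolution outputs), the following PyTorch sketch follows an SRCNN-style layout with assumed layer sizes and a bicubic pre-upsampling step:

```python
# Minimal SRCNN-style sketch (assumed layer sizes, not the thesis architecture):
# the network learns a mapping from a bicubically upsampled low-resolution patch
# to its high-resolution counterpart.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SRCNNSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.extract = nn.Conv2d(1, 64, kernel_size=9, padding=4)      # patch feature extraction
        self.map = nn.Conv2d(64, 32, kernel_size=1)                    # non-linear LR -> HR feature mapping
        self.reconstruct = nn.Conv2d(32, 1, kernel_size=5, padding=2)  # HR image reconstruction

    def forward(self, lr, scale=2):
        # Pre-upsample with bicubic interpolation, then let the CNN restore high frequencies.
        x = F.interpolate(lr, scale_factor=scale, mode='bicubic', align_corners=False)
        x = F.relu(self.extract(x))
        x = F.relu(self.map(x))
        return self.reconstruct(x)

model = SRCNNSketch()
lr_patch = torch.rand(1, 1, 16, 16)                      # fake 16x16 grayscale LR patch
hr_patch = model(lr_patch)                               # 32x32 super-resolved output
loss = F.mse_loss(hr_patch, torch.rand(1, 1, 32, 32))    # trained against true HR patches
```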
112

Mobile Real-Time License Plate Recognition

Liaqat, Ahmad Gull January 2011 (has links)
License plate recognition (LPR) systems play an important role in numerous applications, such as parking accounting systems, traffic law enforcement, road monitoring, expressway toll systems, electronic-police systems, and security systems. In recent years there has been a great deal of research on license plate recognition, and many recognition systems have been proposed and used, but these systems have been developed for computers. In this project, we developed a mobile LPR system for the Android operating system (OS). LPR involves three main components: license plate detection, character segmentation, and optical character recognition (OCR). For license plate detection and character segmentation we used the JavaCV and OpenCV libraries, and for OCR we used tesseract-ocr; we obtained very good results with these libraries. We also store the recognised license numbers in a database, using SQLite for that purpose.
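The thesis implements this pipeline on Android with JavaCV/OpenCV and tesseract-ocr; the following Python sketch is only a rough desktop analogue of the same stages (detection, OCR, storage in SQLite), with the contour-based detection heuristic, thresholds and file names being assumptions rather than details from the project:

```python
# Hedged sketch of an LPR pipeline analogous to the one described above
# (the project uses JavaCV/OpenCV and tesseract-ocr on Android; this is a
# desktop Python analogue with assumed file names and thresholds).
import sqlite3
import cv2
import pytesseract

def find_plate_candidates(image_bgr):
    """Return bounding boxes whose aspect ratio roughly matches a licence plate."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4.x
    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if h > 0 and 2.0 < w / h < 6.0 and w > 80:   # assumed plate-like aspect ratio
            boxes.append((x, y, w, h))
    return boxes

def read_plate(image_bgr, box):
    """Run Tesseract on a cropped candidate region."""
    x, y, w, h = box
    crop = cv2.cvtColor(image_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
    _, crop = cv2.threshold(crop, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(crop, config='--psm 7').strip()  # single text line

# Store recognised plates, as the project does with SQLite.
conn = sqlite3.connect('plates.db')
conn.execute('CREATE TABLE IF NOT EXISTS plates (number TEXT, seen TEXT DEFAULT CURRENT_TIMESTAMP)')
frame = cv2.imread('car.jpg')                      # assumed input image
for box in find_plate_candidates(frame):
    text = read_plate(frame, box)
    if text:
        conn.execute('INSERT INTO plates (number) VALUES (?)', (text,))
conn.commit()
```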
113

運用光學字元辨識技術建置數位典藏全文資料庫之評估:以明人文集為例 / The Analysis of Use Optical Character Recognition to Establish the Full-text Retrieval Database: A Case Study of the Anthology of Chinese Literature in Ming

蔡瀚緯, Tsai, Han Wei Unknown Date (has links)
Digital archives convert collections into digital images and place them in networked systems for users to browse, which helps both preserve the collections and promote access to their content. In today's era of information explosion, however, a digital archive described only through metadata cannot effectively help users reach the information contained in the collection; only when it is built into a full-text retrieval system can users quickly retrieve the information they need, and optical character recognition (OCR) can assist in producing that full-text content. This study examines how the layout and image quality of ancient books affect recognition results by running OCR software on ancient books of the Ming dynasty, and it explores institutional and individual views and considerations through in-depth interviews with staff who have taken part in full-text digital archive projects. The results show that although both the layout and the image quality of the ancient books influence OCR recognition, the interviews indicate that current technology has largely overcome the limitations of layout while still demanding high image quality; in other words, the quality of the scanned images is the factor that most affects recognition accuracy. Although OCR technology has advanced to the point where it can assist in building full-text databases, most institutions have not yet applied it to the full-text digitisation of their archives because of unfamiliarity with the technology, budget constraints, limited human resources, and other factors. The study suggests that an institution interested in such a project should first establish a workflow suited to its own operations and understand the condition of the objects to be processed, so that an appropriate input and processing mode can be chosen; it should then communicate with OCR vendors to determine whether processing the selected objects is cost-effective; finally, taking the needs of both the archive institution and its users into account, the study recommends cooperating with OCR vendors so that users themselves select the objects to be recognised and return the proofread full text to the institution. This would not only reveal users' needs but also reduce the institution's proofreading costs.
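None of the study's tooling is reproduced here; purely as a hedged illustration of the kind of batch full-text extraction it evaluates, the following Python sketch runs Tesseract's traditional-Chinese model over scanned pages after a simple binarisation step (the directory layout and language-model name are assumptions):

```python
# Hedged sketch of batch full-text extraction from scanned ancient books.
# A simple Otsu binarisation stands in for the image-quality preparation the
# study found so decisive; paths and the language model name are assumptions.
import glob
import cv2
import pytesseract

for path in sorted(glob.glob('scans/*.png')):          # assumed scan directory
    page = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Poor-quality scans degrade badly at this binarisation step, which is
    # consistent with the study's finding that image quality dominates accuracy.
    _, binary = cv2.threshold(page, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    text = pytesseract.image_to_string(binary, lang='chi_tra')  # traditional Chinese model
    with open(path + '.txt', 'w', encoding='utf-8') as out:
        out.write(text)
```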
114

Fully Convolutional Neural Networks for Pixel Classification in Historical Document Images

Stewart, Seth Andrew 01 October 2018 (has links)
We use a Fully Convolutional Neural Network (FCNN) to classify pixels in historical document images, enabling the extraction of high-quality, pixel-precise and semantically consistent layers of masked content. We also analyze a dataset of hand-labeled historical form images of unprecedented detail and complexity. The semantic categories we consider in this new dataset include handwriting, machine-printed text, dotted and solid lines, and stamps. Segmentation of document images into distinct layers allows handwriting, machine print, and other content to be processed and recognized discriminatively, and therefore more intelligently than might be possible with content-unaware methods. We show that an efficient FCNN with relatively few parameters can accurately segment documents having similar textural content when trained on a single representative pixel-labeled document image, even when layouts differ significantly. In contrast to the overwhelming majority of existing semantic segmentation approaches, we allow multiple labels to be predicted per pixel location, which allows for direct prediction and reconstruction of overlapped content. We perform an analysis of prevalent pixel-wise performance measures, and show that several popular performance measures can be manipulated adversarially, yielding arbitrarily high measures based on the type of bias used to generate the ground-truth. We propose a solution to the gaming problem by comparing absolute performance to an estimated human level of performance. We also present results on a recent international competition requiring the automatic annotation of billions of pixels, in which our method took first place.
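The thesis network itself is not reproduced here; the following PyTorch sketch only illustrates the multi-label-per-pixel idea described above (an independent sigmoid per class and per pixel, trained with binary cross-entropy), with layer sizes and class count chosen arbitrarily:

```python
# Minimal fully convolutional sketch of multi-label pixel classification,
# illustrating prediction of overlapping content layers; layer sizes and
# class count are assumptions, not the thesis architecture.
import torch
import torch.nn as nn

N_CLASSES = 5  # e.g. handwriting, machine print, dotted line, solid line, stamp

class TinyFCNN(nn.Module):
    def __init__(self, n_classes=N_CLASSES):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, n_classes, 1),     # one logit map per class, same spatial size
        )

    def forward(self, x):
        return self.body(x)                 # (batch, n_classes, H, W) logits

model = TinyFCNN()
image = torch.rand(1, 1, 128, 128)                               # fake grayscale document crop
labels = torch.randint(0, 2, (1, N_CLASSES, 128, 128)).float()   # multi-hot labels per pixel
logits = model(image)
# BCE-with-logits treats every (class, pixel) pair independently, so a pixel
# can belong to several layers at once (e.g. a stamp over machine print).
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
```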
116

Active Learning pro zpracování archivních pramenů / Active Learning for Processing of Archive Sources

Hříbek, David January 2021 (has links)
This work deals with the creation of a system that allows users to upload and annotate scans of historical documents and then actively train character recognition (OCR) models on the available annotations (marked lines and their transcripts). The work describes the process, classifies the techniques, and presents an existing system for character recognition, with the main emphasis placed on machine learning methods. Furthermore, active learning methods are explained and a method for actively training the available OCR models from annotated scans is proposed. The rest of the work covers the system design, its implementation, the available datasets, the evaluation of the self-created OCR model, and the testing of the entire system.
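The system and OCR models from the thesis are not reproduced here; as a hedged illustration of an annotate-retrain loop based on uncertainty sampling, the following sketch stubs out the OCR engine and the annotator, and its batch size and number of rounds are assumptions:

```python
# Generic uncertainty-sampling sketch of an active learning loop for OCR;
# the OCR engine, annotator and retraining step are stubs, and the batch
# size is an assumption, not a detail taken from the thesis.
import random
from typing import List, Tuple

def recognize(line_image) -> Tuple[str, float]:
    """Stub for an OCR model: returns (transcript, confidence in [0, 1])."""
    return "", random.random()

def annotate(line_image) -> str:
    """Stub for the human annotator working in the web interface."""
    return "ground-truth transcript"

def retrain(model, labelled: List[Tuple[object, str]]):
    """Stub: fine-tune the OCR model on the annotated lines."""
    return model

def active_learning_round(model, unlabelled, labelled, batch_size=16):
    # Score every unannotated line with the current model and pick the ones
    # it is least sure about -- those are the most informative to annotate.
    scored = sorted(unlabelled, key=lambda line: recognize(line)[1])
    to_annotate, rest = scored[:batch_size], scored[batch_size:]
    labelled += [(line, annotate(line)) for line in to_annotate]
    return retrain(model, labelled), rest, labelled

model, pool, labelled = object(), [object() for _ in range(100)], []
for _ in range(3):  # a few rounds of the annotate-retrain loop
    model, pool, labelled = active_learning_round(model, pool, labelled)
```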
117

Studie řízení plynulých materiálových toků s využitím značení produktů / The Study of Control of Continous Flows with Using of Products Identification

Dvořáková, Alena January 2008 (has links)
This Master's thesis analyses the current methods and procedures for storing and marking goods at the company Disk obchod & technika, spol. s.r.o. It includes a proposal for goods identification that optimises continuous material flows, both by simplifying and speeding up the work and by providing simpler and more accurate ways of identifying goods. The proposal covers the choice of an appropriate identification method and the selection of a particular type of barcode, including the necessary hardware.
118

Detekce a rozpoznání registrační značky vozidla pro analýzu dopravy / License Plate Detection and Recognition for Traffic Analysis

Černá, Tereza January 2015 (has links)
This thesis describes the design and development of a system for the detection and recognition of license plates. The work is divided into three basic parts: license plate detection, locating character positions, and optical character recognition. To fulfil the goal of this work, a new dataset was collected, containing 2814 license plates used for training the classifiers and 2620 plates for evaluating the success rate of the system. A cascade classifier was used to train the license plate detector, which achieves a success rate of up to 97.8 %. The positions of the individual characters were then searched for within the detected license plate regions; if no characters were found, the detected region was rejected as not being a license plate. The success rate of license plate detection with all characters found is up to 88.5 %. Character recognition is performed by an SVM classifier, and the system recognises up to 97.7 % of all license plates without any error.
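The trained models from the thesis are not reproduced here; the following Python sketch only illustrates the two stages it describes, an OpenCV cascade classifier for plate detection and an SVM for character classification, with the cascade file, the 16x16 character size and the raw-pixel features being assumptions:

```python
# Hedged sketch of the detection + recognition stages described above:
# a cascade classifier for plate detection and an SVM for characters.
# The cascade XML path, character size and raw-pixel features are assumptions.
import cv2
import numpy as np
from sklearn.svm import SVC

plate_cascade = cv2.CascadeClassifier('lp_cascade.xml')   # assumed trained cascade

def detect_plates(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Returns a list of (x, y, w, h) candidate plate regions.
    return plate_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)

def to_feature(char_img):
    # Fixed-size character crop flattened into a normalised feature vector.
    resized = cv2.resize(char_img, (16, 16))
    return resized.astype(np.float32).ravel() / 255.0

# Dummy training data standing in for the thesis' annotated character set.
X = np.random.rand(200, 256).astype(np.float32)
y = np.random.choice(list('0123456789ABC'), size=200)
char_svm = SVC(kernel='rbf', C=10.0, gamma='scale').fit(X, y)

def read_characters(char_crops):
    feats = np.stack([to_feature(c) for c in char_crops])
    return ''.join(char_svm.predict(feats))
```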
119

Desenvolvimento de um sistema de visão de máquina para inspeção de conformidade em um produto industrial / Development of a machine vision system for conformity inspection of an industrial product

Poleto, Arthur Suzini. January 2019 (has links)
Advisor: João Antonio Pereira / Abstract: Machine vision is a growing multidisciplinary field in industry, which is increasingly concerned with reducing costs, automating processes, and meeting product quality requirements in order to serve its customers. Manual assembly processes with visual inspection and control are typically error-prone and susceptible to the use of non-conforming parts in the final product assembly. This work presents a proposal for the development of a machine vision system, based on digital image processing and analysis, for inspecting the characteristics and specifications of the parts and components used in the assembly of boat covers (capotas marítimas), aiming to verify and ensure the conformity of the final product. Inspection and conformity assessment of the product are done in stages using two cameras: one captures an image of the alphanumeric identification code of the product and the other inspects the set of fasteners. The images undergo a treatment process that involves spatial filtering with averaging masks for smoothing, contrast stretching to expand the range of intensities, and segmentation to form the objects of interest. An OCR function is used for character extraction and product code recognition, and the extraction of specific features of the fastener assembly is done with shape descriptors represented by the moment invariants. The specific characteristics of the fasteners are used to assess the conformity of the product with its respective code. The presentation of data and results of the implemented prop... (Complete abstract: click electronic access below) / Master's
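The implemented system is not reproduced here; the following Python sketch only illustrates the processing chain the abstract describes (averaging-mask smoothing, contrast stretching, segmentation, OCR of the identification code, and Hu moment invariants as shape descriptors), with the kernel size, thresholds and file name being assumptions:

```python
# Hedged sketch of the processing chain described in the abstract: mean-filter
# smoothing, contrast stretching, segmentation, OCR of the alphanumeric code,
# and Hu moment invariants as shape descriptors for the fasteners.
import cv2
import numpy as np
import pytesseract

image = cv2.imread('assembly.png', cv2.IMREAD_GRAYSCALE)            # assumed camera capture

smoothed = cv2.blur(image, (5, 5))                                  # averaging-mask spatial filter
stretched = cv2.normalize(smoothed, None, 0, 255, cv2.NORM_MINMAX)  # contrast stretching
_, segmented = cv2.threshold(stretched, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# OCR of the product identification code (camera 1 in the proposed system).
code = pytesseract.image_to_string(stretched, config='--psm 7').strip()

# Shape description of each segmented component via Hu moment invariants
# (camera 2): seven values invariant to translation, scale and rotation,
# which can be compared against reference values for the expected parts.
contours, _ = cv2.findContours(segmented, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
descriptors = [cv2.HuMoments(cv2.moments(c)).flatten() for c in contours]
```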
120

Prisestimering på bostadsrätter : Implementering av OCR-metoder och Random Forest regression för datadriven värdering / Price estimation in the housing cooperative market : Implementation of OCR methods and Random Forest regression for data-driven valuation

Lövgren, Sofia, Löthman, Marcus January 2023 (has links)
This thesis explores the implementation of Optical Character Recognition (OCR)-based text extraction and random forest regression analysis for housing market valuation, specifically focusing on the impact of value factors derived from OCR-extracted economic values in housing cooperatives’ annual reports. The objective is to perform price estimations using the Random Forest model to identify the key value factors that influence the estimation process and examine how the economic values from annual reports affect the sales price. The thesis aims to highlight the often-overlooked aspect that when purchasing an apartment, one also assumes the liabilities of the housing cooperative. The motivation for utilizing OCR techniques stems from the difficulties associated with manual data collection, as there is a lack of readily accessible structured data on the subject, emphasizing the importance of automation for effective data extraction. The findings indicate that OCR can effectively extract data from annual reports, but with limitations due to variation in report structures. The regression analysis reveals the Random Forest model’s effectiveness in estimating prices, with location and construction year emerging as the most influential factors. Furthermore, incorporating the economic values from the annual reports enhances the accuracy of price estimation compared to the model that excluded such factors. However, definitive conclusions regarding the precise impact of these economic factors could not be drawn due to the limited geographical spread of the data points and potential hidden value factors. The study concludes that the machine learning model can be used to make a credible price estimate for cooperative apartments and that OCR methods prove valuable in automating data extraction from annual reports, although standardising the report format would enhance their efficiency. The thesis highlights the significance of considering the housing cooperatives’ economic values when making property purchases.
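The thesis' dataset and model configuration are not reproduced here; the following scikit-learn sketch only illustrates the regression stage, fitting a Random Forest on listing features plus OCR-extracted figures from annual reports, with all column names and values being invented placeholders:

```python
# Hedged sketch of the regression stage: a Random Forest fitted on listing
# features plus OCR-extracted figures from cooperatives' annual reports.
# Column names and the toy data are assumptions, not the thesis' dataset.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

df = pd.DataFrame({                                           # stand-in for the assembled dataset
    'living_area':       [55, 72, 34, 90, 61],
    'construction_year': [1962, 2008, 1938, 2015, 1974],
    'monthly_fee':       [3400, 4100, 2600, 5200, 3700],
    'coop_debt_per_sqm': [5200, 11000, 3100, 14500, 6900],    # OCR-extracted from annual reports
    'price':             [2.1e6, 3.4e6, 1.8e6, 4.0e6, 2.6e6],
})

X, y = df.drop(columns='price'), df['price']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_train, y_train)
print(model.predict(X_test))
# Feature importances indicate which value factors drive the estimate.
print(dict(zip(X.columns, model.feature_importances_)))
```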
