About
The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
111

Simultaneous Detection and Validation of Multiple Ingredients on Product Packages: An Automated Approach : Using CNN and OCR Techniques / Simultant detektering och validering av flertal ingredienser på produktförpackningar: Ett automatiserat tillvägagångssätt : Genom användning av CNN och OCR tekniker

Farokhynia, Rodbeh, Krikeb, Mokhtar January 2024 (has links)
Manual proofreading of product packaging is a time-consuming and error-prone process that poses significant challenges for companies, such as scalability issues, compliance risks, and high costs. This thesis introduces a novel solution that employs advanced computer vision and machine learning methods to automate the proofreading of multiple ingredient lists, corresponding to multiple products, simultaneously within a single product package. By integrating Convolutional Neural Network (CNN) and Optical Character Recognition (OCR) techniques, the study examines the efficacy of automated proofreading in comparison with manual methods. The thesis analyzes product package artwork to identify ingredient lists using the YOLOv5 object detection algorithm, and extracts the individual ingredients with the optical character recognition tool EasyOCR. Additionally, Python scripts extract ingredients from the corresponding INCI PDF files (documents listing the standardized names of ingredients used in cosmetic products). A comprehensive comparison is then conducted to evaluate the accuracy and efficiency of automated proofreading. Comparing the ingredients extracted from the product packages against their corresponding INCI PDF files yielded a match rate of 12.7%. Despite the suboptimal result, insights from the study highlight the limitations of current detection and recognition algorithms when applied to complex artwork: for example, the trained YOLOv5 model cuts through sentences in the ingredient list, and EasyOCR cannot extract ingredients from vertically aligned product package images. The findings underscore the need for advancements in detection algorithms and OCR tools to handle objects such as product packaging designs effectively. The study also suggests that companies, such as H&M, consider updating their artwork and INCI PDF files to align with the capabilities of current AI-driven tools. By doing so, they can enhance the efficiency and overall effectiveness of automated proofreading processes, thereby reducing errors and improving accuracy.
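The comparison step this abstract describes — matching OCR-extracted ingredients against the INCI reference list — can be sketched with simple fuzzy string matching. This is a hedged illustration: the thesis's actual matching logic and threshold are not specified here, and the ingredient lists below are hypothetical.

```python
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    # Lowercase and keep only alphanumerics, since OCR often garbles
    # punctuation and spacing.
    return "".join(ch for ch in name.lower() if ch.isalnum())

def match_rate(ocr_items, inci_items, threshold=0.85):
    """Fraction of INCI ingredients that have a close OCR match."""
    hits = 0
    for ref in inci_items:
        best = max(
            (SequenceMatcher(None, normalize(ref), normalize(cand)).ratio()
             for cand in ocr_items),
            default=0.0,
        )
        if best >= threshold:
            hits += 1
    return hits / len(inci_items) if inci_items else 0.0

# Hypothetical lists: one OCR item is slightly garbled ("Glycer1n").
ocr = ["Aqua", "Glycer1n", "Parfum"]
inci = ["Aqua", "Glycerin", "Parfum", "Citric Acid"]
print(match_rate(ocr, inci))  # 0.75 — 3 of 4 reference ingredients matched
```

The fuzzy threshold lets near-misses from OCR noise still count as matches, which matters when, as reported above, recognition errors dominate the mismatch rate.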
112

Single image super-resolution based on neural networks for text and face recognition / Super-résolution d'image unique basée sur des réseaux de neurones pour la reconnaissance de texte et de visage

Peyrard, Clément 29 September 2017 (has links)
This thesis focuses on super-resolution (SR) methods for improving automatic recognition systems (optical character recognition, face recognition) in realistic contexts. SR methods generate high-resolution images from low-resolution ones. Unlike upsampling methods such as interpolation, they restore high spatial frequencies and compensate for artefacts such as blur or jagged edges. In particular, example-based approaches learn and model the relationship between the low- and high-resolution spaces from pairs of low- and high-resolution images, and artificial neural networks are among the most efficient systems for this problem. This work demonstrates the value of neural-network-based SR methods for automatic recognition systems. Convolutional neural networks are especially effective, as they are trained to extract relevant non-linear two-dimensional features while simultaneously learning the mapping between the low- and high-resolution spaces. On document text images, the proposed method improves OCR accuracy by +7.85 points compared with simple interpolation. The creation of an annotated image dataset and the organisation of an international competition (ICDAR2015) highlighted the interest and relevance of such approaches. Moreover, a priori knowledge, when available, can be exploited by a suitable network architecture. For facial images, facial features are critical for automatic recognition. A two-step method is proposed in which image quality is first improved globally, after which specialised models focus on the essential features. An off-the-shelf face verification system sees its performance improve by +6.91 up to +8.15 points. Finally, to address the variability of real-world low-resolution images, deep neural networks can absorb the diversity of the blur kernels that characterise such images: with a single model, high-resolution images with natural image statistics are produced without any knowledge of the actual observation model of the low-resolution image.
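The convolutional mapping described above can be sketched as an SRCNN-style forward pass: a feature-extraction layer, a non-linear mapping layer, and a reconstruction layer applied to a bicubic-upsampled input. This is a shape-level illustration with random weights, not the trained models from the thesis; layer sizes (9-1-5, 64/32 filters) follow the common SRCNN configuration and are an assumption here.

```python
import numpy as np

def conv2d(img, kernels, bias):
    """Valid 2-D convolution: img (H, W, Cin), kernels (k, k, Cin, Cout)."""
    k = kernels.shape[0]
    H, W, _ = img.shape
    out = np.zeros((H - k + 1, W - k + 1, kernels.shape[3]))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = img[i:i + k, j:j + k, :]
            out[i, j, :] = np.tensordot(patch, kernels, axes=3) + bias
    return out

rng = np.random.default_rng(0)
lr = rng.random((33, 33, 1))                               # bicubic-upsampled LR patch
w1, b1 = rng.random((9, 9, 1, 64)) * 1e-2, np.zeros(64)    # feature extraction
w2, b2 = rng.random((1, 1, 64, 32)) * 1e-2, np.zeros(32)   # non-linear mapping
w3, b3 = rng.random((5, 5, 32, 1)) * 1e-2, np.zeros(1)     # HR reconstruction
h = np.maximum(conv2d(lr, w1, b1), 0)                      # ReLU
h = np.maximum(conv2d(h, w2, b2), 0)
hr = conv2d(h, w3, b3)
print(hr.shape)  # (21, 21, 1) — valid convolutions shrink the patch
```

Training would fit the kernels so that `hr` approximates the ground-truth high-resolution patch, exactly the LR-to-HR mapping the abstract describes.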
113

Mobile Real-Time License Plate Recognition

Liaqat, Ahmad Gull January 2011 (has links)
License plate recognition (LPR) systems play an important role in numerous applications, such as parking accounting, traffic law enforcement, road monitoring, expressway toll collection, electronic-police systems, and security systems. Recent years have seen extensive research on license plate recognition, and many recognition systems have been proposed and deployed, but these systems were developed for desktop computers. In this project, we developed a mobile LPR system for the Android operating system. LPR involves three main components: license plate detection, character segmentation, and Optical Character Recognition (OCR). For license plate detection and character segmentation we used the JavaCV and OpenCV libraries, and for OCR we used Tesseract (tesseract-ocr), obtaining very good results with these libraries. Recognized license numbers are stored in a database, for which SQLite is used.
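The character segmentation stage of such an LPR pipeline is commonly implemented with a vertical projection profile: columns of the binarized plate with no ink separate the characters. A minimal sketch (not the thesis's JavaCV/OpenCV code; the toy "plate" below is synthetic):

```python
import numpy as np

def segment_columns(binary, min_width=2):
    """Split a binarized plate (1 = ink) into character column spans
    using the vertical projection profile."""
    profile = binary.sum(axis=0)
    spans, start = [], None
    for x, v in enumerate(profile):
        if v > 0 and start is None:
            start = x                      # entering an inked run
        elif v == 0 and start is not None:
            if x - start >= min_width:     # ignore specks narrower than min_width
                spans.append((start, x))
            start = None
    if start is not None and binary.shape[1] - start >= min_width:
        spans.append((start, binary.shape[1]))
    return spans

# Toy plate: three "characters" separated by blank columns.
plate = np.zeros((5, 20), dtype=int)
plate[:, 2:5] = 1
plate[:, 8:11] = 1
plate[:, 14:18] = 1
print(segment_columns(plate))  # [(2, 5), (8, 11), (14, 18)]
```

Each span is then cropped and passed to the OCR component; real plates additionally need deskewing and noise filtering before the projection is reliable.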
114

運用光學字元辨識技術建置數位典藏全文資料庫之評估:以明人文集為例 / The Analysis of Use Optical Character Recognition to Establish the Full-text Retrieval Database:A Case Study of the Anthology of Chinese Literature in Ming

蔡瀚緯, Tsai, Han Wei Unknown Date (has links)
Digital archiving converts collection objects into digital images and places them in a network system for users to browse, which both preserves the collection and promotes its content. In today's era of information explosion, however, a digital archive described only by metadata cannot effectively help users reach the information inside its objects; only by building full-text retrieval can it let users quickly find what they need, and Optical Character Recognition (OCR) can help produce that full-text content. This study recognizes ancient books of the Ming dynasty with OCR software to explore how the books' layout and image quality affect the recognition results, and conducts in-depth interviews with institutional staff experienced in full-text digital archive projects to explore institutional and individual views and considerations. The results show that although both layout and image quality influence recognition, the interviews indicate that current technology has overcome the limitations of ancient book layouts while still demanding high image quality; that is, the quality of the ancient books' images is the most influential factor in the recognition results. Although OCR has made breakthroughs that could assist in establishing full-text databases, most institutions have not yet applied the technology to full-text conversion of their digital archives, owing to unfamiliarity with the technology, budget constraints, limited human resources, and other factors. The study suggests that an institution interested in a full-text digital archive project should first develop an operating procedure suited to the institution and understand the condition of the objects to be processed, so that a suitable input mode can be chosen; second, it should communicate closely with OCR vendors to determine whether the chosen objects are cost-effective to process; and finally, weighing the needs of both the archive and its users, the study recommends cooperating with OCR vendors so that users select the objects they need for OCR recognition and return the proofread full text to the archive. This not only reveals user needs but also reduces the institution's proofreading costs.
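Proofreading cost in such full-text projects is driven by OCR character accuracy, which can be estimated by aligning OCR output against a proofread ground truth. A minimal sketch using Python's difflib (the strings are hypothetical, not from the study):

```python
from difflib import SequenceMatcher

def char_accuracy(ocr: str, truth: str) -> float:
    """Share of ground-truth characters reproduced by the OCR output,
    estimated from the longest matching subsequences."""
    matched = sum(b.size
                  for b in SequenceMatcher(None, truth, ocr).get_matching_blocks())
    return matched / len(truth) if truth else 1.0

# Hypothetical proofreading check: the trailing character was dropped.
print(char_accuracy("digital archive", "digital archives"))  # 0.9375
```

An archive can run this over a small proofread sample to predict how much manual correction a full OCR pass would require before committing a budget to it.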
115

Fully Convolutional Neural Networks for Pixel Classification in Historical Document Images

Stewart, Seth Andrew 01 October 2018 (has links)
We use a Fully Convolutional Neural Network (FCNN) to classify pixels in historical document images, enabling the extraction of high-quality, pixel-precise and semantically consistent layers of masked content. We also analyze a dataset of hand-labeled historical form images of unprecedented detail and complexity. The semantic categories we consider in this new dataset include handwriting, machine-printed text, dotted and solid lines, and stamps. Segmentation of document images into distinct layers allows handwriting, machine print, and other content to be processed and recognized discriminatively, and therefore more intelligently than might be possible with content-unaware methods. We show that an efficient FCNN with relatively few parameters can accurately segment documents having similar textural content when trained on a single representative pixel-labeled document image, even when layouts differ significantly. In contrast to the overwhelming majority of existing semantic segmentation approaches, we allow multiple labels to be predicted per pixel location, which allows for direct prediction and reconstruction of overlapped content. We perform an analysis of prevalent pixel-wise performance measures, and show that several popular performance measures can be manipulated adversarially, yielding arbitrarily high measures based on the type of bias used to generate the ground-truth. We propose a solution to the gaming problem by comparing absolute performance to an estimated human level of performance. We also present results on a recent international competition requiring the automatic annotation of billions of pixels, in which our method took first place.
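The metric-gaming problem the authors analyze can be illustrated with a degenerate predictor on a class-imbalanced page: plain pixel accuracy looks excellent while intersection-over-union exposes the failure. This is a constructed example, not the paper's data or its proposed human-baseline remedy.

```python
import numpy as np

# Ground truth: 1% foreground (e.g. handwriting pixels) on a 100x100 page.
gt = np.zeros((100, 100), dtype=bool)
gt[:10, :10] = True                 # 100 foreground pixels
pred_all_bg = np.zeros_like(gt)     # degenerate predictor: everything background

pixel_acc = (pred_all_bg == gt).mean()
inter = np.logical_and(pred_all_bg, gt).sum()
union = np.logical_or(pred_all_bg, gt).sum()
iou = inter / union if union else 1.0

print(pixel_acc)  # 0.99 — looks excellent
print(iou)        # 0.0  — reveals the predictor found nothing
```

The same effect is what allows ground-truth labeling bias to inflate popular measures; anchoring scores to an estimated human level of performance, as the abstract proposes, guards against reading such numbers at face value.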
117

Active Learning pro zpracování archivních pramenů / Active Learning for Processing of Archive Sources

Hříbek, David January 2021 (has links)
This work deals with the creation of a system for uploading and annotating scans of historical documents and for subsequently actively training character recognition (OCR) models on the available annotations (marked lines and their transcripts). The thesis describes the OCR process, classifies the relevant techniques, and presents an existing character recognition system, with particular emphasis on machine learning methods. It further explains active learning methods and proposes a procedure for actively training the available OCR models from annotated scans. The rest of the work covers the system design, its implementation, the available datasets, the evaluation of a self-trained OCR model, and testing of the entire system.
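The core of uncertainty-based active learning — spending annotation effort on the lines the model is least sure of — can be sketched as follows. The line names and confidences are hypothetical, and the thesis's actual selection strategy may differ.

```python
import heapq

def select_for_annotation(lines, confidences, budget=2):
    """Pick the transcription lines the OCR model is least confident about,
    so the annotator's effort goes where it helps training most."""
    return [line for _, line in heapq.nsmallest(budget, zip(confidences, lines))]

# Hypothetical line-level confidences from an OCR model.
lines = ["line-a", "line-b", "line-c", "line-d"]
conf = [0.98, 0.41, 0.87, 0.55]
print(select_for_annotation(lines, conf))  # ['line-b', 'line-d']
```

After the selected lines are transcribed, the model is retrained and the loop repeats, which is the iterative cycle the abstract describes.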
118

Studie řízení plynulých materiálových toků s využitím značení produktů / The Study of Control of Continous Flows with Using of Products Identification

Dvořáková, Alena January 2008 (has links)
This master's thesis analyses the current methods and procedures for storing and marking goods at the company Disk obchod & technika, spol. s.r.o. It proposes a goods-identification scheme that optimizes continuous material flows by simplifying and accelerating work and by making goods identification simpler and more accurate. The proposal covers the choice of an appropriate identification method and the selection of a particular barcode type, including the necessary hardware.
119

Detekce a rozpoznání registrační značky vozidla pro analýzu dopravy / License Plate Detection and Recognition for Traffic Analysis

Černá, Tereza January 2015 (has links)
This thesis describes the design and development of a system for the detection and recognition of license plates. The work is divided into three basic parts: license plate detection, locating character positions, and optical character recognition. To meet the goal of the work, a new dataset was collected, containing 2814 license plates used for training the classifiers and 2620 plates for evaluating the system's success rate. A cascade classifier was trained as the license plate detector and achieves a success rate of up to 97.8%. Next, the positions of the individual characters are searched for within each detected plate region; if no character is found, the detected region is rejected as not being a license plate. The success rate for detecting license plates with all their characters found is up to 88.5%. Character recognition is performed by an SVM classifier. The system recognizes up to 97.7% of all license plates without any errors.
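For illustration, the character-recognition stage can be approximated by a nearest-centroid classifier over fixed-size glyph vectors. This is a stand-in sketch with toy 2x2 "glyphs"; the thesis itself uses an SVM on real character images, and the feature vectors below are invented.

```python
import numpy as np

def train_centroids(samples, labels):
    """One mean feature vector per character class."""
    return {c: np.mean([s for s, l in zip(samples, labels) if l == c], axis=0)
            for c in set(labels)}

def classify(glyph, centroids):
    """Assign the class whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda c: np.linalg.norm(glyph - centroids[c]))

# Hypothetical flattened 2x2 glyphs for the characters '0' and '1'.
zeros = [np.array([1, 1, 1, 1.0]), np.array([1, 1, 1, 0.9])]
ones = [np.array([0, 1, 0, 1.0]), np.array([0, 1, 0, 0.9])]
cent = train_centroids(zeros + ones, ["0", "0", "1", "1"])
print(classify(np.array([1, 1, 1, 0.95]), cent))  # 0
```

An SVM replaces the centroid distance with a learned maximum-margin boundary, which is what gives the reported 97.7% error-free plate recognition its robustness to glyph variation.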
120

Development of a machine vision system for conformity inspection of an industrial product / Desenvolvimento de um sistema de visão de máquina para inspeção de conformidade em um produto industrial

Poleto, Arthur Suzini. January 2019 (has links)
Advisor: João Antonio Pereira / Abstract: Machine vision is a growing multidisciplinary field in industry, which is increasingly concerned with reducing costs, automating processes, and meeting product quality requirements to serve its customers. Manual assembly processes with visual inspection and control are typically error-prone and susceptible to the use of non-conforming parts in the final product assembly. This work proposes the development of a machine vision system based on digital image processing and analysis for inspecting the characteristics and specifications of the parts and components used in the assembly of marine boat covers (capotas marítimas), aiming to verify and ensure the conformity of the final product. Inspection and conformity assessment are done in stages using two cameras: one captures the image of the product's alphanumeric identification code, and the other inspects the set of fasteners. The images undergo a treatment process involving spatial filtering with an averaging mask for smoothing, contrast stretching to expand the range of intensities, and segmentation to form the objects of interest. An OCR function extracts the characters of the product code, and specific features of the fastener assembly are extracted with shape descriptors represented by moment invariants. These specific characteristics of the fasteners are used to assess the conformity of the product against its respective code. (Complete abstract: click electronic access below) / Master's
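The contrast-stretching step mentioned in the preprocessing chain can be sketched as a linear intensity mapping onto the full display range. This is a generic implementation; the thesis's exact parameters are not specified.

```python
import numpy as np

def stretch_contrast(img, lo=0.0, hi=255.0):
    """Linearly map the image's intensity range onto [lo, hi]."""
    img = img.astype(float)
    mn, mx = img.min(), img.max()
    if mx == mn:
        return np.full_like(img, lo)   # flat image: nothing to stretch
    return (img - mn) / (mx - mn) * (hi - lo) + lo

# A dull image occupying only the 100..160 band gets spread over 0..255.
dull = np.array([[100, 120], [140, 160]])
print(stretch_contrast(dull))
```

Widening the intensity range this way sharpens edges before segmentation and generally improves both the OCR step and the moment-invariant shape descriptors downstream.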
