111

Single image super-resolution based on neural networks for text and face recognition / Super-résolution d'image unique basée sur des réseaux de neurones pour la reconnaissance de texte et de visage

Peyrard, Clément 29 September 2017 (has links)
This thesis focuses on super-resolution (SR) methods for improving automatic recognition systems (optical character recognition, face recognition) in realistic contexts. SR methods generate high-resolution (HR) images from low-resolution (LR) ones. Unlike upsampling methods such as interpolation, they restore spatial high frequencies and compensate for artefacts such as blur or jagged edges. In particular, example-based approaches learn and model the relationship between the low- and high-resolution spaces from pairs of LR and HR images, and artificial neural networks are among the most effective systems for this task. This work demonstrates the value of neural-network-based SR methods for automatic recognition systems. Convolutional neural networks are especially well suited, as they can be trained to extract relevant non-linear two-dimensional features while simultaneously learning the mapping between the LR and HR spaces. On document text images, the proposed method improves OCR accuracy by +7.85 points compared with simple interpolation. The creation of an annotated image dataset and the organisation of an international competition (ICDAR2015) highlighted the interest and relevance of such approaches. Moreover, when a priori knowledge is available, it can be exploited by a suitable network architecture.
For facial images, facial features are critical for automatic recognition. A two-step method is proposed in which overall image quality is first improved, after which specialised models focus on the essential facial features. The performance of an off-the-shelf face verification system is improved by +6.91 to +8.15 points. Finally, for real-world low-resolution images, deep neural networks can absorb the variability of the blur kernels that characterise the LR images; with a single model, HR images with natural image statistics are produced without any knowledge of the exact observation model.
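The thesis' own network architectures are not reproduced here; purely as a hedged illustration of the general approach (a convolutional network trained to map low-resolution inputs to high-resolution outputs), the following PyTorch sketch follows an SRCNN-style layout with assumed layer sizes and a bicubic pre-upsampling step:

```python
# Minimal SRCNN-style sketch (assumed layer sizes, not the thesis architecture):
# the network learns a mapping from a bicubically upsampled low-resolution patch
# to its high-resolution counterpart.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SRCNNSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.extract = nn.Conv2d(1, 64, kernel_size=9, padding=4)      # patch feature extraction
        self.map = nn.Conv2d(64, 32, kernel_size=1)                    # non-linear LR -> HR feature mapping
        self.reconstruct = nn.Conv2d(32, 1, kernel_size=5, padding=2)  # HR image reconstruction

    def forward(self, lr, scale=2):
        # Pre-upsample with bicubic interpolation, then let the CNN restore high frequencies.
        x = F.interpolate(lr, scale_factor=scale, mode='bicubic', align_corners=False)
        x = F.relu(self.extract(x))
        x = F.relu(self.map(x))
        return self.reconstruct(x)

model = SRCNNSketch()
lr_patch = torch.rand(1, 1, 16, 16)                      # fake 16x16 grayscale LR patch
hr_patch = model(lr_patch)                               # 32x32 super-resolved output
loss = F.mse_loss(hr_patch, torch.rand(1, 1, 32, 32))    # trained against true HR patches
```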
112

Mobile Real-Time License Plate Recognition

Liaqat, Ahmad Gull January 2011 (has links)
License plate recognition (LPR) systems play an important role in numerous applications, such as parking accounting systems, traffic law enforcement, road monitoring, expressway toll systems, electronic-police systems, and security systems. In recent years there has been a great deal of research on license plate recognition, and many recognition systems have been proposed and used, but these systems have been developed for computers. In this project, we developed a mobile LPR system for the Android operating system (OS). LPR involves three main components: license plate detection, character segmentation, and optical character recognition (OCR). For license plate detection and character segmentation we used the JavaCV and OpenCV libraries, and for OCR we used tesseract-ocr; we obtained very good results with these libraries. We also store the recognised license numbers in a database, using SQLite for that purpose.
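The thesis implements this pipeline on Android with JavaCV/OpenCV and tesseract-ocr; the following Python sketch is only a rough desktop analogue of the same stages (detection, OCR, storage in SQLite), with the contour-based detection heuristic, thresholds and file names being assumptions rather than details from the project:

```python
# Hedged sketch of an LPR pipeline analogous to the one described above
# (the project uses JavaCV/OpenCV and tesseract-ocr on Android; this is a
# desktop Python analogue with assumed file names and thresholds).
import sqlite3
import cv2
import pytesseract

def find_plate_candidates(image_bgr):
    """Return bounding boxes whose aspect ratio roughly matches a licence plate."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4.x
    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if h > 0 and 2.0 < w / h < 6.0 and w > 80:   # assumed plate-like aspect ratio
            boxes.append((x, y, w, h))
    return boxes

def read_plate(image_bgr, box):
    """Run Tesseract on a cropped candidate region."""
    x, y, w, h = box
    crop = cv2.cvtColor(image_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
    _, crop = cv2.threshold(crop, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(crop, config='--psm 7').strip()  # single text line

# Store recognised plates, as the project does with SQLite.
conn = sqlite3.connect('plates.db')
conn.execute('CREATE TABLE IF NOT EXISTS plates (number TEXT, seen TEXT DEFAULT CURRENT_TIMESTAMP)')
frame = cv2.imread('car.jpg')                      # assumed input image
for box in find_plate_candidates(frame):
    text = read_plate(frame, box)
    if text:
        conn.execute('INSERT INTO plates (number) VALUES (?)', (text,))
conn.commit()
```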
113

運用光學字元辨識技術建置數位典藏全文資料庫之評估:以明人文集為例 / The Analysis of Use Optical Character Recognition to Establish the Full-text Retrieval Database: A Case Study of the Anthology of Chinese Literature in Ming

蔡瀚緯, Tsai, Han Wei Unknown Date (has links)
Digital archives convert collections into digital images and place them in networked systems for users to browse, which helps both preserve the collections and promote access to their content. In today's era of information explosion, however, a digital archive described only through metadata cannot effectively help users reach the information contained in the collection; only when it is built into a full-text retrieval system can users quickly retrieve the information they need, and optical character recognition (OCR) can assist in producing that full-text content. This study examines how the layout and image quality of ancient books affect recognition results by running OCR software on ancient books of the Ming dynasty, and it explores institutional and individual views and considerations through in-depth interviews with staff who have taken part in full-text digital archive projects. The results show that although both the layout and the image quality of the ancient books influence OCR recognition, the interviews indicate that current technology has largely overcome the limitations of layout while still demanding high image quality; in other words, the quality of the scanned images is the factor that most affects recognition accuracy. Although OCR technology has advanced to the point where it can assist in building full-text databases, most institutions have not yet applied it to the full-text digitisation of their archives because of unfamiliarity with the technology, budget constraints, limited human resources, and other factors. The study suggests that an institution interested in such a project should first establish a workflow suited to its own operations and understand the condition of the objects to be processed, so that an appropriate input and processing mode can be chosen; it should then communicate with OCR vendors to determine whether processing the selected objects is cost-effective; finally, taking the needs of both the archive institution and its users into account, the study recommends cooperating with OCR vendors so that users themselves select the objects to be recognised and return the proofread full text to the institution. This would not only reveal users' needs but also reduce the institution's proofreading costs.
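None of the study's tooling is reproduced here; purely as a hedged illustration of the kind of batch full-text extraction it evaluates, the following Python sketch runs Tesseract's traditional-Chinese model over scanned pages after a simple binarisation step (the directory layout and language-model name are assumptions):

```python
# Hedged sketch of batch full-text extraction from scanned ancient books.
# A simple Otsu binarisation stands in for the image-quality preparation the
# study found so decisive; paths and the language model name are assumptions.
import glob
import cv2
import pytesseract

for path in sorted(glob.glob('scans/*.png')):          # assumed scan directory
    page = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Poor-quality scans degrade badly at this binarisation step, which is
    # consistent with the study's finding that image quality dominates accuracy.
    _, binary = cv2.threshold(page, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    text = pytesseract.image_to_string(binary, lang='chi_tra')  # traditional Chinese model
    with open(path + '.txt', 'w', encoding='utf-8') as out:
        out.write(text)
```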
114

Fully Convolutional Neural Networks for Pixel Classification in Historical Document Images

Stewart, Seth Andrew 01 October 2018 (has links)
We use a Fully Convolutional Neural Network (FCNN) to classify pixels in historical document images, enabling the extraction of high-quality, pixel-precise and semantically consistent layers of masked content. We also analyze a dataset of hand-labeled historical form images of unprecedented detail and complexity. The semantic categories we consider in this new dataset include handwriting, machine-printed text, dotted and solid lines, and stamps. Segmentation of document images into distinct layers allows handwriting, machine print, and other content to be processed and recognized discriminatively, and therefore more intelligently than might be possible with content-unaware methods. We show that an efficient FCNN with relatively few parameters can accurately segment documents having similar textural content when trained on a single representative pixel-labeled document image, even when layouts differ significantly. In contrast to the overwhelming majority of existing semantic segmentation approaches, we allow multiple labels to be predicted per pixel location, which allows for direct prediction and reconstruction of overlapped content. We perform an analysis of prevalent pixel-wise performance measures, and show that several popular performance measures can be manipulated adversarially, yielding arbitrarily high measures based on the type of bias used to generate the ground-truth. We propose a solution to the gaming problem by comparing absolute performance to an estimated human level of performance. We also present results on a recent international competition requiring the automatic annotation of billions of pixels, in which our method took first place.
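The thesis network itself is not reproduced here; the following PyTorch sketch only illustrates the multi-label-per-pixel idea described above (an independent sigmoid per class and per pixel, trained with binary cross-entropy), with layer sizes and class count chosen arbitrarily:

```python
# Minimal fully convolutional sketch of multi-label pixel classification,
# illustrating prediction of overlapping content layers; layer sizes and
# class count are assumptions, not the thesis architecture.
import torch
import torch.nn as nn

N_CLASSES = 5  # e.g. handwriting, machine print, dotted line, solid line, stamp

class TinyFCNN(nn.Module):
    def __init__(self, n_classes=N_CLASSES):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, n_classes, 1),     # one logit map per class, same spatial size
        )

    def forward(self, x):
        return self.body(x)                 # (batch, n_classes, H, W) logits

model = TinyFCNN()
image = torch.rand(1, 1, 128, 128)                               # fake grayscale document crop
labels = torch.randint(0, 2, (1, N_CLASSES, 128, 128)).float()   # multi-hot labels per pixel
logits = model(image)
# BCE-with-logits treats every (class, pixel) pair independently, so a pixel
# can belong to several layers at once (e.g. a stamp over machine print).
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
```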
116

Active Learning pro zpracování archivních pramenů / Active Learning for Processing of Archive Sources

Hříbek, David January 2021 (has links)
This work deals with the creation of a system that allows users to upload and annotate scans of historical documents and then actively train character recognition (OCR) models on the available annotations (marked lines and their transcripts). The work describes the process, classifies the techniques, and presents an existing system for character recognition, with the main emphasis placed on machine learning methods. Furthermore, active learning methods are explained and a method for actively training the available OCR models from annotated scans is proposed. The rest of the work covers the system design, its implementation, the available datasets, the evaluation of the self-created OCR model, and the testing of the entire system.
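The system and OCR models from the thesis are not reproduced here; as a hedged illustration of an annotate-retrain loop based on uncertainty sampling, the following sketch stubs out the OCR engine and the annotator, and its batch size and number of rounds are assumptions:

```python
# Generic uncertainty-sampling sketch of an active learning loop for OCR;
# the OCR engine, annotator and retraining step are stubs, and the batch
# size is an assumption, not a detail taken from the thesis.
import random
from typing import List, Tuple

def recognize(line_image) -> Tuple[str, float]:
    """Stub for an OCR model: returns (transcript, confidence in [0, 1])."""
    return "", random.random()

def annotate(line_image) -> str:
    """Stub for the human annotator working in the web interface."""
    return "ground-truth transcript"

def retrain(model, labelled: List[Tuple[object, str]]):
    """Stub: fine-tune the OCR model on the annotated lines."""
    return model

def active_learning_round(model, unlabelled, labelled, batch_size=16):
    # Score every unannotated line with the current model and pick the ones
    # it is least sure about -- those are the most informative to annotate.
    scored = sorted(unlabelled, key=lambda line: recognize(line)[1])
    to_annotate, rest = scored[:batch_size], scored[batch_size:]
    labelled += [(line, annotate(line)) for line in to_annotate]
    return retrain(model, labelled), rest, labelled

model, pool, labelled = object(), [object() for _ in range(100)], []
for _ in range(3):  # a few rounds of the annotate-retrain loop
    model, pool, labelled = active_learning_round(model, pool, labelled)
```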
117

Studie řízení plynulých materiálových toků s využitím značení produktů / The Study of Control of Continous Flows with Using of Products Identification

Dvořáková, Alena January 2008 (has links)
This Master's thesis analyses the current methods and procedures for storing and marking goods at the company Disk obchod & technika, spol. s.r.o. It includes a proposal for goods identification that optimises continuous material flows, both by simplifying and speeding up the work and by providing simpler and more accurate ways of identifying goods. The proposal covers the choice of an appropriate identification method and the selection of a particular type of barcode, including the necessary hardware.
118

Detekce a rozpoznání registrační značky vozidla pro analýzu dopravy / License Plate Detection and Recognition for Traffic Analysis

Černá, Tereza January 2015 (has links)
This thesis describes the design and development of a system for the detection and recognition of license plates. The work is divided into three basic parts: license plate detection, locating character positions, and optical character recognition. To fulfil the goal of this work, a new dataset was collected, containing 2814 license plates used for training the classifiers and 2620 plates for evaluating the success rate of the system. A cascade classifier was used to train the license plate detector, which achieves a success rate of up to 97.8 %. The positions of the individual characters were then searched for within the detected license plate regions; if no characters were found, the detected region was rejected as not being a license plate. The success rate of license plate detection with all characters found is up to 88.5 %. Character recognition is performed by an SVM classifier, and the system recognises up to 97.7 % of all license plates without any error.
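The trained models from the thesis are not reproduced here; the following Python sketch only illustrates the two stages it describes, an OpenCV cascade classifier for plate detection and an SVM for character classification, with the cascade file, the 16x16 character size and the raw-pixel features being assumptions:

```python
# Hedged sketch of the detection + recognition stages described above:
# a cascade classifier for plate detection and an SVM for characters.
# The cascade XML path, character size and raw-pixel features are assumptions.
import cv2
import numpy as np
from sklearn.svm import SVC

plate_cascade = cv2.CascadeClassifier('lp_cascade.xml')   # assumed trained cascade

def detect_plates(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Returns a list of (x, y, w, h) candidate plate regions.
    return plate_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)

def to_feature(char_img):
    # Fixed-size character crop flattened into a normalised feature vector.
    resized = cv2.resize(char_img, (16, 16))
    return resized.astype(np.float32).ravel() / 255.0

# Dummy training data standing in for the thesis' annotated character set.
X = np.random.rand(200, 256).astype(np.float32)
y = np.random.choice(list('0123456789ABC'), size=200)
char_svm = SVC(kernel='rbf', C=10.0, gamma='scale').fit(X, y)

def read_characters(char_crops):
    feats = np.stack([to_feature(c) for c in char_crops])
    return ''.join(char_svm.predict(feats))
```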
119

Desenvolvimento de um sistema de visão de máquina para inspeção de conformidade em um produto industrial / Development of a machine vision system for conformity inspection of an industrial product

Poleto, Arthur Suzini. January 2019 (has links)
Advisor: João Antonio Pereira / Abstract: Machine vision is a growing multidisciplinary field in industry, which is increasingly concerned with reducing costs, automating processes, and meeting product quality requirements in order to serve its customers. Manual assembly processes with visual inspection and control are typically error-prone and susceptible to the use of non-conforming parts in the final product assembly. This work presents a proposal for the development of a machine vision system, based on digital image processing and analysis, for inspecting the characteristics and specifications of the parts and components used in the assembly of boat covers (capotas marítimas), aiming to verify and ensure the conformity of the final product. Inspection and conformity assessment of the product are done in stages using two cameras: one captures an image of the alphanumeric identification code of the product and the other inspects the set of fasteners. The images undergo a treatment process that involves spatial filtering with averaging masks for smoothing, contrast stretching to expand the range of intensities, and segmentation to form the objects of interest. An OCR function is used for character extraction and product code recognition, and the extraction of specific features of the fastener assembly is done with shape descriptors represented by the moment invariants. The specific characteristics of the fasteners are used to assess the conformity of the product with its respective code. The presentation of data and results of the implemented prop... (Complete abstract: click electronic access below) / Master's
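The implemented system is not reproduced here; the following Python sketch only illustrates the processing chain the abstract describes (averaging-mask smoothing, contrast stretching, segmentation, OCR of the identification code, and Hu moment invariants as shape descriptors), with the kernel size, thresholds and file name being assumptions:

```python
# Hedged sketch of the processing chain described in the abstract: mean-filter
# smoothing, contrast stretching, segmentation, OCR of the alphanumeric code,
# and Hu moment invariants as shape descriptors for the fasteners.
import cv2
import numpy as np
import pytesseract

image = cv2.imread('assembly.png', cv2.IMREAD_GRAYSCALE)            # assumed camera capture

smoothed = cv2.blur(image, (5, 5))                                  # averaging-mask spatial filter
stretched = cv2.normalize(smoothed, None, 0, 255, cv2.NORM_MINMAX)  # contrast stretching
_, segmented = cv2.threshold(stretched, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# OCR of the product identification code (camera 1 in the proposed system).
code = pytesseract.image_to_string(stretched, config='--psm 7').strip()

# Shape description of each segmented component via Hu moment invariants
# (camera 2): seven values invariant to translation, scale and rotation,
# which can be compared against reference values for the expected parts.
contours, _ = cv2.findContours(segmented, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
descriptors = [cv2.HuMoments(cv2.moments(c)).flatten() for c in contours]
```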
120

Prisestimering på bostadsrätter : Implementering av OCR-metoder och Random Forest regression för datadriven värdering / Price estimation in the housing cooperative market : Implementation of OCR methods and Random Forest regression for data-driven valuation

Lövgren, Sofia, Löthman, Marcus January 2023 (has links)
This thesis explores the implementation of Optical Character Recognition (OCR)-based text extraction and random forest regression analysis for housing market valuation, specifically focusing on the impact of value factors derived from OCR-extracted economic values in housing cooperatives’ annual reports. The objective is to perform price estimations using the Random Forest model to identify the key value factors that influence the estimation process and examine how the economic values from annual reports affect the sales price. The thesis aims to highlight the often-overlooked aspect that when purchasing an apartment, one also assumes the liabilities of the housing cooperative. The motivation for utilizing OCR techniques stems from the difficulties associated with manual data collection, as there is a lack of readily accessible structured data on the subject, emphasizing the importance of automation for effective data extraction. The findings indicate that OCR can effectively extract data from annual reports, but with limitations due to variation in report structures. The regression analysis reveals the Random Forest model’s effectiveness in estimating prices, with location and construction year emerging as the most influential factors. Furthermore, incorporating the economic values from the annual reports enhances the accuracy of price estimation compared to the model that excluded such factors. However, definitive conclusions regarding the precise impact of these economic factors could not be drawn due to the limited geographical spread of the data points and potential hidden value factors. The study concludes that the machine learning model can be used to make a credible price estimate for cooperative apartments and that OCR methods prove valuable in automating data extraction from annual reports, although standardising the report format would enhance their efficiency. The thesis highlights the significance of considering the housing cooperatives’ economic values when making property purchases.
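The thesis' dataset and model configuration are not reproduced here; the following scikit-learn sketch only illustrates the regression stage, fitting a Random Forest on listing features plus OCR-extracted figures from annual reports, with all column names and values being invented placeholders:

```python
# Hedged sketch of the regression stage: a Random Forest fitted on listing
# features plus OCR-extracted figures from cooperatives' annual reports.
# Column names and the toy data are assumptions, not the thesis' dataset.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

df = pd.DataFrame({                                           # stand-in for the assembled dataset
    'living_area':       [55, 72, 34, 90, 61],
    'construction_year': [1962, 2008, 1938, 2015, 1974],
    'monthly_fee':       [3400, 4100, 2600, 5200, 3700],
    'coop_debt_per_sqm': [5200, 11000, 3100, 14500, 6900],    # OCR-extracted from annual reports
    'price':             [2.1e6, 3.4e6, 1.8e6, 4.0e6, 2.6e6],
})

X, y = df.drop(columns='price'), df['price']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_train, y_train)
print(model.predict(X_test))
# Feature importances indicate which value factors drive the estimate.
print(dict(zip(X.columns, model.feature_importances_)))
```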
