1

Text Segmentation of Historical Degraded Handwritten Documents

Nina, Oliver 05 August 2010 (has links) (PDF)
The use of digital images of handwritten historical documents has increased in recent years. This has been made possible by the Internet, which gives users access to vast collections of historical documents and makes historical and data research more attainable. However, the sheer number of images available in these digital libraries is too large for a single user to read and process. Computers could help read these images through methods known as Optical Character Recognition (OCR), which have had significant success for printed materials but only limited success for handwritten ones. Most of these OCR methods work well only when the images have been preprocessed by removing anything in the image that is not text. This preprocessing step is usually known as binarization. Binarizing images of historical documents that are degraded and of poor image quality is difficult and continues to be a focus of research in image processing. We propose two novel approaches to this problem. One combines recursive Otsu thresholding and selective bilateral filtering to allow automatic binarization and segmentation of handwritten text images. The other adds background normalization and a post-processing step to make the algorithm more robust and to handle images that exhibit bleed-through artifacts. Our results show that these techniques segment the text in historical documents better than traditional binarization techniques.
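As a rough illustration of the first approach described in this abstract, the sketch below combines an edge-preserving bilateral pre-filter with a recursive Otsu pass using OpenCV. It is not the thesis's implementation: the recursion rule (re-running Otsu on the darker class until the threshold stops decreasing), the filter parameters, and the file names are assumptions.

```python
# Illustrative sketch only: recursive Otsu binarization after a bilateral filter.
# The recursion rule, parameters, and file names are assumptions, not the thesis's code.
import cv2
import numpy as np

def recursive_otsu(gray, max_depth=3):
    """Re-run Otsu on the darker (ink) class until the threshold stops decreasing."""
    mask = np.ones_like(gray, dtype=bool)       # pixels still under consideration
    threshold = 255.0
    for _ in range(max_depth):
        values = gray[mask].reshape(-1, 1)
        if values.size == 0:
            break
        t, _ = cv2.threshold(values, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        if t >= threshold:                      # no further refinement possible
            break
        threshold = t
        mask &= gray < threshold                # restrict attention to darker pixels
    return np.where(gray < threshold, 0, 255).astype(np.uint8)  # ink black, page white

gray = cv2.imread("historical_page.png", cv2.IMREAD_GRAYSCALE)
smoothed = cv2.bilateralFilter(gray, d=9, sigmaColor=75, sigmaSpace=75)  # edge-preserving denoise
cv2.imwrite("binarized.png", recursive_otsu(smoothed))
```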
2

Vyhodnocení testových formulářů pomocí OCR / Test form evaluation by OCR

Noghe, Petr January 2013 (has links)
This thesis deals with the evaluation of test forms using optical character recognition. Image processing and the methods used for OCR are described in the first part of the thesis. In the practical part, a database of sample characters is created. The chosen method is based on the correlation between patterns and the characters to be recognized. The program is implemented in the MATLAB graphical environment. Finally, several forms are evaluated and the success rate of the proposed program is measured.
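The thesis's program is written in MATLAB; purely as a language-neutral illustration of the correlation-based matching it describes, here is a small Python/NumPy sketch. The template labels and the 32x32 size are assumptions.

```python
# Illustrative sketch of correlation-based character matching (the thesis itself
# uses MATLAB). Templates and their 32x32 size are assumptions.
import numpy as np

def correlation(a, b):
    """Normalized (Pearson) correlation between two equally sized grayscale patches."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom else 0.0

def recognize(char_img, templates):
    """Return the label of the template most correlated with the character image."""
    return max(templates, key=lambda label: correlation(char_img, templates[label]))

# templates: dict mapping a label ('A', 'B', '0', ...) to a 32x32 grayscale array
# built from the database of sample characters mentioned in the abstract.
# text = "".join(recognize(c, templates) for c in segmented_characters)
```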
3

雲端筆記之混合式文字切割與辨識 / Segmentation and recognition of mixed characters for cloud-based notes

王冠智, Wang, Guan Jhih Unknown Date (has links)
文字辨識為常見的電腦視覺應用之一，隨著正確率逐漸的上升，許多新的服務相繼出現，本論文改善了筆記管理軟體最主要的問題-文字切割，並提出兩種新的中文印刷體及手寫體的分類方法。我們將筆記文件中較常見的重點標記過濾後，再使用新核心的文字結構濾波取得筆記文件中的文字區塊，新的核心數據大幅降低原始核心的計算時間。本論文也使用文字結構濾波作為分辨印刷體、手寫體的特徵值，由於文字結構濾波會依據筆畫結構給予能量回饋，使得較工整的印刷體與手寫體能有所區別，此外也使用Sobel搭配不同角度範圍進行字體辨識，實驗結果證實了本論文所提出的文字切割及字體分類方法對於筆記文件資訊的處理是有效的。 / Character recognition is an important and practical application of computer vision. With the advance of this technology, more and more services embedding text recognition functionality have become available. However, segmentation is still the central issue in many situations. In this thesis, we tackle the character segmentation problem in note taking and management applications. We propose novel methods for the discrimination of handwritten and machine-printed Chinese characters. First, we perform noise removal using heuristics and apply a stroke filter with modified kernels to efficiently compute the bounding box of the text area. The responses of the stroke filter also serve as clues for differentiating machine-printed and handwritten texts. This discrimination is further refined by an SVM-based classifier that takes aggregated directional responses of edge detectors as input. Experiment results have validated the efficacy of the proposed approaches in terms of text localization and style recognition.
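As a sketch of how the aggregated directional edge responses mentioned above could feed a printed-versus-handwritten classifier, the snippet below builds Sobel orientation histograms per text block and trains an SVM. The stroke filter itself is not reproduced; the bin count, kernel, and label encoding are assumptions.

```python
# Illustrative sketch: aggregated Sobel orientation histograms as features for an
# SVM that separates printed from handwritten blocks. Bin count and SVM settings
# are assumptions; the stroke filter described in the thesis is not reproduced.
import cv2
import numpy as np
from sklearn.svm import SVC

def orientation_histogram(block, bins=8):
    """Magnitude-weighted histogram of gradient orientations for one text block."""
    gx = cv2.Sobel(block, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(block, cv2.CV_32F, 0, 1, ksize=3)
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    orientation = np.arctan2(gy, gx) % np.pi        # fold to [0, pi): direction only
    hist, _ = np.histogram(orientation, bins=bins, range=(0, np.pi), weights=magnitude)
    return hist / (hist.sum() + 1e-9)               # scale-invariant per block

def train_style_classifier(blocks, labels):
    """blocks: grayscale text-block images; labels: 0 = printed, 1 = handwritten."""
    features = np.stack([orientation_histogram(b) for b in blocks])
    return SVC(kernel="rbf").fit(features, labels)
```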
4

結合中文斷詞系統與雙分群演算法於音樂相關臉書粉絲團之分析:以KKBOX為例 / Combing Chinese text segmentation system and co-clustering algorithm for analysis of music related Facebook fan page: A case of KKBOX

陳柏羽, Chen, Po Yu Unknown Date (has links)
近年智慧型手機與網路的普及，使得社群網站與線上串流音樂蓬勃發展。臉書(Facebook)用戶截至去年止每月總體平均用戶高達18.6億人，粉絲專頁成為公司企業特別關注的行銷手段。粉絲專頁上的貼文能夠在短時間內經過點閱、分享傳播至用戶的頁面，達到比起電視廣告更佳的效果，也節省了許多的成本。本研究提供了一套針對臉書粉絲專頁貼文的分群流程，考量到貼文字詞的複雜性，除了抓取了臉書粉絲專頁的貼文外，也抓取了與其相關的KKBOX網頁資訊，整合KKBOX網頁中的資料，對中文斷詞系統(Jieba)的語料庫進行擴充，以提高斷詞的正確性，接著透過雙分群演算法(Minimum Squared Residue Co-Clustering Algorithm)對貼文進行分群，並利用鑑別率(Discrimination Rate)與凝聚率(Agglomerate Rate)配合主成份分析(Principal Component Analysis)所產生的分佈圖來對分群結果進行評估，選出較佳的分群結果進一步去分析，進而找出分類的根據。在結果中，發現本研究的方法能夠有效的區分出不同類型的貼文，甚至能夠依據使用字詞、語法或編排格式的不同來進行分群。 / In recent years, as smartphones and the Internet have become more popular, social network sites and music streaming services have grown vigorously. The monthly average of Facebook users hit 1.86 billion last year, and the Facebook Fan Page has become a popular marketing tool. Posts on Facebook can be broadcast to millions of people in a short period of time through likes and shares, making Fan Pages more effective than television advertising while greatly reducing costs. This study presents a process for clustering posts on a Facebook Fan Page. Considering the complexity of the wording in posts, we collected not only the posts on the Facebook Fan Page but also related information from the KKBOX website. First, we integrated the information from the KKBOX website and expanded the dictionary of the Chinese text segmentation system Jieba to enhance the accuracy of word segmentation. Then, we clustered the posts into several groups with the Minimum Squared Residue Co-Clustering Algorithm and used the Discrimination Rate and the Agglomerate Rate, together with the distribution charts produced by Principal Component Analysis, to evaluate the clustering results, select the better result for further analysis, and identify the basis of the classification. As a result, we found that the method of this study can effectively separate different kinds of posts and can even cluster posts according to their wording, syntax, and formatting.
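As a sketch of this pipeline, the snippet below extends Jieba's dictionary with domain terms (for example, artist and album names collected from KKBOX), segments the posts, and co-clusters the post-term matrix. scikit-learn's SpectralCoclustering stands in for the Minimum Squared Residue algorithm used in the study; the file name and parameters are assumptions.

```python
# Illustrative sketch: custom-dictionary segmentation with Jieba, then co-clustering
# of the post-term matrix. SpectralCoclustering is a stand-in for the Minimum
# Squared Residue Co-Clustering Algorithm; file name and parameters are assumptions.
import jieba
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cluster import SpectralCoclustering

jieba.load_userdict("kkbox_terms.txt")              # one domain term (artist, album, ...) per line

posts = ["...", "..."]                              # crawled fan-page posts (placeholder)
tokenized = [" ".join(jieba.cut(post)) for post in posts]

X = CountVectorizer().fit_transform(tokenized)      # post-by-term count matrix
model = SpectralCoclustering(n_clusters=5, random_state=0).fit(X)
post_groups = model.row_labels_                     # cluster id assigned to each post
```

Loading the domain terms before segmentation matters because names such as song or artist titles would otherwise be split into meaningless fragments, which is exactly the problem the dictionary expansion in the study addresses.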
5

Rozpoznání textu s využitím neuronových sítí / Text recognition with artificial neural networks

Peřinová, Barbora January 2018 (has links)
This master’s thesis deals with optical character recognition. The first part describes the basic types of optical character recognition tasks and divides the algorithm into individual phases. The most commonly used methods for each phase are then described. Within the character recognition phase, artificial neural networks and their use in this phase are explained, specifically multilayer perceptrons and convolutional neural networks. The second part defines the requirements for a specific application intended to provide feedback for a robotic system. Convolutional neural networks and the CNTK deep learning library, used to implement the algorithm in .NET, are introduced. Finally, the test results of the individual phases of the proposed solution and a comparison with the open-source Tesseract engine are discussed.
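The thesis implements its model with the CNTK library in .NET; purely as an illustration of the kind of convolutional classifier described, here is a minimal tf.keras sketch for isolated character images. The layer sizes, input shape, and 36-class output are assumptions, not the thesis's architecture.

```python
# Illustrative tf.keras sketch of a small character-recognition CNN (the thesis
# itself uses CNTK in .NET). Layer sizes and the 36-class output are assumptions.
import tensorflow as tf

def build_char_cnn(num_classes=36, input_shape=(32, 32, 1)):
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=input_shape),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

model = build_char_cnn()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=10)   # (N, 32, 32, 1) arrays, integer labels
```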
6

Nástroj pro analýzu psaní uživatele na klávesnici / A Tool for Analysis of User's Typing Skills

Moltaš, Jaroslav January 2012 (has links)
This master's thesis deals with the analysis of a user's typing skills on the Windows system. The touch typing technique for fast keyboard typing is described. The possibilities of keyboard hooking and GUI creation in Windows are discussed. Segmentation of the written text and techniques for evaluating typing proficiency are also described. Another part deals with the implementation of the system. The resulting application collects data about the user's typing and evaluates typing quality and the number of mistakes.
7

Разработка системы автоматического распознавания автомобильных номеров в реальных дорожных условиях : магистерская диссертация / Development of a system for automatic recognition of license plates in real road conditions

Зайкис, Д. В., Zaikis, D. V. January 2023 (has links)
Цель работы – разработка автоматической системы распознавания номерных знаков автомобилей, в естественных дорожных условиях, в том числе в сложных погодных и физических условиях, таких как недостаточная видимость, загрязнение, умышленное или непреднамеренное частичное скрытие символов. Объектом исследования являются цифровые изображения автомобилей в естественной среде. Методы исследования: сверточные нейронные сети, в том числе одноэтапные детекторы (SSOD), комбинации сетей с промежуточными связями между слоями - Cross Stage Partial Network (CSPNet) и сети, объединяющей информацию с разных уровней сети – Path Aggregation Network (PANet), преобразования изображений с помощью библиотеки OpenCV, включая фильтры Собеля и Гауса, преобразование Кэнни, методы глубокого машинного обучения для обработки последовательностей LSTM, CRNN, CRAFT. В рамках данной работы разработана система распознавания автомобильных номеров, переводящая графические данные из цифрового изображения или видеопотока в текст в виде файлов различных форматов. Задача детекции автомобильных номеров на изображениях решена с помощью глубокой нейронной сети YoLo v5, представляющая собой современную модель обнаружения объектов, основанную на архитектуре с использованием CSPNet и PANet. Она обеспечивает высокую скорость и точность при обнаружении объектов на изображениях. Благодаря своей эффективности и масштабируемости, YoLov5 стала популярным выбором для решения задач компьютерного зрения в различных областях. Для решения задачи распознавания текса на обнаруженных объектах используется алгоритм детектирования объектов, основанный на преобразованиях Кэнни, фильтрах Собеля и Гаусса и нейронная сеть keras-ocr, на основе фреймворка keras, представляющая собой комбинацию сверточной нейронной сети (CNN) и рекуррентной нейронной сети (RNN), решающая задачу распознавания печатного текста. Созданный метод способен безошибочно распознавать 85 % предоставленных номеров, преимущественно российского стандарта. Полученный функционал может быть внедрен в существующую системы фото- или видео-фиксации трафика и использоваться в рамках цифровизации систем трекинга и контроля доступа и безопасности на дорогах и объектах транспортной инфраструктуры. Выпускная квалификационная работа в теоретической и описательной части выполнена в текстовом редакторе Microsoft Word и представлена в электронном формате. Практическая часть выполнялась в jupiter-ноутбуке на платформе облачных вычислений Google Collaboratory. / The goal of the work is to develop an automatic system for recognizing car license plates in natural road conditions, including difficult weather and physical conditions such as insufficient visibility, dirt, and intentional or unintentional partial occlusion of characters. The object of the study is digital images of cars in their natural environment. Research methods: convolutional neural networks, including single-stage detectors (SSOD), combinations of networks with intermediate connections between layers (Cross Stage Partial Network, CSPNet) and networks that combine information from different levels of the network (Path Aggregation Network, PANet); image transformations using the OpenCV library, including Sobel and Gaussian filters and the Canny transform; and deep learning methods for sequence processing (LSTM, CRNN, CRAFT). As part of this work, a license plate recognition system has been developed that converts graphic data from a digital image or video stream into text in the form of files in various formats.
The problem of detecting license plates in images is solved using the YOLOv5 deep neural network, a modern object detection model based on an architecture that uses CSPNet and PANet. It provides high speed and accuracy in detecting objects in images. Due to its efficiency and scalability, YOLOv5 has become a popular choice for solving computer vision problems in various fields. To solve the problem of text recognition on the detected objects, an object detection algorithm based on the Canny transform and Sobel and Gaussian filters is used together with the keras-ocr neural network, built on the keras framework, which combines a convolutional neural network (CNN) and a recurrent neural network (RNN) to recognize printed text. The created method is capable of correctly recognizing 85% of the provided license plates, mainly of the Russian standard. The resulting functionality can be integrated into existing systems for photo or video recording of traffic and used as part of the digitalization of tracking, access control, and security systems on roads and transport infrastructure facilities. The theoretical and descriptive parts of the thesis were prepared in the Microsoft Word text editor and are presented in electronic format. The practical part was carried out in a Jupyter notebook on the Google Colaboratory cloud computing platform.
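As a sketch of the two-stage pipeline described in the abstract, the snippet below crops plate regions with a YOLOv5 detector and passes them to keras-ocr. Loading custom plate weights from 'plate_weights.pt' is an assumption (the stock YOLOv5 weights do not include a license-plate class), and the Canny/Sobel preprocessing mentioned in the abstract is omitted.

```python
# Illustrative sketch of the detection + recognition pipeline. The custom weights
# file 'plate_weights.pt' is an assumption; preprocessing steps are omitted.
import cv2
import torch
import keras_ocr

detector = torch.hub.load("ultralytics/yolov5", "custom", path="plate_weights.pt")
reader = keras_ocr.pipeline.Pipeline()               # pretrained CRAFT detector + CRNN recognizer

image = cv2.cvtColor(cv2.imread("car.jpg"), cv2.COLOR_BGR2RGB)
detections = detector(image).xyxy[0].cpu().numpy()   # rows: x1, y1, x2, y2, confidence, class

for x1, y1, x2, y2, confidence, cls in detections:
    plate = image[int(y1):int(y2), int(x1):int(x2)]
    predictions = reader.recognize([plate])[0]       # list of (text, box) pairs for the crop
    print(" ".join(text for text, box in predictions))
```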
8

Methods for Text Segmentation from Scene Images

Kumar, Deepak January 2014 (has links) (PDF)
Recognition of text from camera-captured scene/born-digital images helps in the development of aids for the blind, unmanned navigation systems and spam filters. However, text in such images is not confined to any page layout, and its location within the image is random in nature. In addition, motion blur, non-uniform illumination, skew, occlusion and scale-based degradations increase the complexity of locating and recognizing the text in a scene/born-digital image. Text localization and segmentation techniques are proposed for the born-digital image data set. The proposed OTCYMIST technique won first place in the ICDAR 2011 and third place in the ICDAR 2013 robust reading competitions for the text segmentation task on the born-digital image data set. Here, Otsu’s binarization and Canny edge detection are separately carried out on the three colour planes of the image. Connected components (CC’s) obtained from the segmented image are pruned based on thresholds applied to their area and aspect ratio. CC’s with sufficient edge pixels are retained. The centroids of the individual CC’s are used as the nodes of a graph. A minimum spanning tree is built on these nodes, and its long edges are broken. The pairwise height ratio is used to remove likely non-text components. CC’s are grouped based on their proximity in the horizontal direction to generate bounding boxes (BB’s) of text strings. Overlapping BB’s are removed using an overlap area threshold. Non-overlapping and minimally overlapping BB’s are used for text segmentation. These BB’s are split vertically to localize text at the word level. A word cropped from a document image can easily be recognized using a traditional optical character recognition (OCR) engine. However, recognizing a word obtained by manually cropping a scene/born-digital image is not trivial. Existing OCR engines do not handle these kinds of scene word images effectively. Our intention is to first segment the word image and then pass it to existing OCR engines for recognition. This is advantageous in two ways: it avoids building a character classifier from scratch and reduces the word recognition task to a word segmentation task. Here, we propose two bottom-up approaches for the task of word segmentation, which choose different features at the initial stage of segmentation. A power-law transform (PLT) was applied to the pixels of the gray scale born-digital images to non-linearly modify the histogram. The recognition rate achieved on born-digital word images is 82.9%, which is 20% more than the top performing entry (61.5%) in the ICDAR 2011 robust reading competition. In addition, we explored applying PLT to colour planes such as the red, green, blue, intensity and lightness planes by varying the gamma value. We call this technique Nonlinear enhancement and selection of plane (NESP) for optimal segmentation, which is an improvement over PLT. NESP chooses a particular plane with a proper gamma value based on the Fisher discrimination factor. The recognition rate is 72.8% for the scene images of the ICDAR 2011 robust reading competition, which is 30% higher than the best entry (41.2%). Using NESP, the recognition rate is 81.7% and 65.9% for the born-digital and scene images of the ICDAR 2013 robust reading competition, respectively. Another technique, midline analysis and propagation of segmentation (MAPS), has also been proposed.
Here, the middle-row pixels of the gray scale image are first segmented, and the statistics of the segmented pixels are used to assign text and non-text labels to the rest of the image pixels using a min-cut method. A Gaussian model is fitted to the segmented middle-row pixels before the assignment of the other pixels. In MAPS, we assume that the middle-row pixels are the least affected by any of the degradations. This assumption is validated by the good word recognition rate of 71.7% on the scene images of the ICDAR 2011 robust reading competition. Using MAPS, the recognition rate is 83.8% and 66.0% for the born-digital and scene images of the ICDAR 2013 robust reading competition, respectively. The best reported result for ICDAR 2003 word images is 61.1%, obtained using custom lexicons containing the list of test words. In contrast, NESP and MAPS achieve 66.2% and 64.5% on ICDAR 2003 word images without using any lexicon. Using a similar custom lexicon, the recognition rates for ICDAR 2003 word images go up to 74.9% and 74.2% for the NESP and MAPS methods, respectively. In place of an image segmented by one of the methods, a manually segmented word image is also submitted to an OCR engine to benchmark the maximum possible recognition rate for each database. The recognition rates of the proposed methods and the benchmark results are reported on seven publicly available word image data sets and compared with the results reported in the literature. Since no good Kannada OCR is available, a classifier is designed to recognize Kannada characters and words from the Chars74k data set and our own image collection, respectively. The discrete cosine transform (DCT) and block DCT are used as features to train separate classifiers. Kannada words are segmented using the same techniques (MAPS and NESP) and further segmented into groups of components, since a Kannada character may be represented by a single component or a group of components in an image. The recognition rate on Kannada words is reported for different features with and without the use of a lexicon. The obtained performance for Kannada character recognition (11.4%) is three times the best performance (3.5%) reported in the literature.
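To make the NESP idea above concrete, here is a rough sketch: apply a power-law (gamma) transform to several colour planes, binarize each with Otsu, and keep the plane/gamma pair with the highest Fisher discrimination between the two pixel classes. The gamma grid, the set of planes, and the exact Fisher formulation are assumptions, not the author's code.

```python
# Illustrative sketch of NESP-style plane/gamma selection. The gamma grid, plane
# set, and Fisher score formulation are assumptions.
import cv2
import numpy as np

def fisher_score(plane, binary):
    """Separation between foreground and background pixel intensities."""
    fg, bg = plane[binary > 0], plane[binary == 0]
    if fg.size < 2 or bg.size < 2:
        return 0.0
    return (fg.mean() - bg.mean()) ** 2 / (fg.var() + bg.var() + 1e-9)

def nesp_segment(bgr, gammas=(0.5, 1.0, 1.5, 2.0)):
    planes = list(cv2.split(bgr)) + [cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)]
    best_score, best_binary = -1.0, None
    for plane in planes:
        for gamma in gammas:
            transformed = np.uint8(255 * (plane / 255.0) ** gamma)   # power-law transform
            _, binary = cv2.threshold(transformed, 0, 255,
                                      cv2.THRESH_BINARY + cv2.THRESH_OTSU)
            score = fisher_score(transformed, binary)
            if score > best_score:
                best_score, best_binary = score, binary
    return best_binary
```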
9

Découpage textuel dans la traduction assistée par les systèmes de mémoire de traduction / Text segmentation in human translation assisted by translation memory systems

Popis, Anna 13 December 2013 (has links)
L’objectif des études théoriques et expérimentales présentées dans ce travail était de cerner à l’aide des critères objectifs fiables un niveau optimum de découpage textuel pour la traduction spécialisée assistée par un système de mémoire de traduction (SMT) pour les langues française et polonaise. Afin de réaliser cet objectif, nous avons élaboré notre propre approche : une nouvelle combinaison des méthodes de recherche et des outils d’analyse proposés surtout dans les travaux de Simard (2003), Langlais et Simard (2001, 2003) et Dragsted (2004) visant l’amélioration de la viabilité des SMT à travers des modifications apportées à la segmentation phrastique considérée comme limitant cette viabilité. A la base des observations de quelques réalisations effectives du processus de découpage textuel dans la traduction spécialisée effectuée par l’homme sans aide informatique à la traduction, nous avons déterminé trois niveaux de segmentation potentiellement applicables dans les SMT tels que phrase, proposition, groupes verbal et nominal. Nous avons ensuite réalisé une analyse comparative des taux de réutilisabilité des MT du système WORDFAST et de l’utilité des traductions proposées par le système pour chacun de ces trois niveaux de découpage textuel sur un corpus de douze textes de spécialité. Cette analyse a permis de constater qu’il n’est pas possible de déterminer un seul niveau de segmentation textuelle dont l’application améliorerait la viabilité des SMT de façon incontestable. Deux niveaux de segmentation textuelle, notamment en phrases et en propositions, permettent en effet d’assurer une viabilité comparable des SMT. / The aim of the theoretical and experimental studies presented in this work was to define, using objective and reliable criteria, an optimal level of textual segmentation for specialized translation from French into Polish assisted by a translation memory system (TMS). To this end, we developed our own approach: a new combination of the research methods and analysis tools proposed notably by Simard (2003), Langlais and Simard (2001, 2003) and Dragsted (2004), aimed at improving TMS viability through modifications to the sentence-level segmentation that is considered to limit it. On the basis of observations of the text segmentation process carried out during specialized translation by a group of students without any computer aid, we defined three segmentation levels that could potentially be used in a TMS: sentences, clauses, and noun and verb phrases. We then carried out a comparative study of the influence of each of these levels on the reusability of WORDFAST translation memories and on the usefulness of the translations proposed by the system for a corpus of twelve specialized texts. This study showed that it is not possible to define a single text segmentation level whose application would unquestionably improve TMS viability. Sentences and clauses are in fact two text segmentation levels that ensure comparable TMS viability.
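As a toy illustration of comparing reuse at different segmentation levels, the sketch below looks up sentence-level and clause-level segments of a new source text against a tiny translation memory with a simple fuzzy match. The similarity measure, the 75% threshold, and the example entries are assumptions; WORDFAST's actual matching is not reproduced.

```python
# Toy sketch: fuzzy lookup of sentence-level vs. clause-level segments against a
# translation memory. Similarity measure, threshold, and entries are assumptions.
import re
from difflib import SequenceMatcher

translation_memory = {
    "Le contrat entre en vigueur à la date de sa signature.":
        "The contract takes effect on the date of its signature.",
}

def best_match(segment, memory, threshold=0.75):
    """Return (score, target) for the closest source segment, or None below threshold."""
    score, source = max((SequenceMatcher(None, segment, src).ratio(), src)
                        for src in memory)
    return (score, memory[source]) if score >= threshold else (score, None)

text = "Le contrat entre en vigueur à la date de sa signature, sauf disposition contraire."
sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
clauses = [c.strip() for c in re.split(r"[,;]", text) if c.strip()]

for level, segments in (("sentence", sentences), ("clause", clauses)):
    for segment in segments:
        print(level, best_match(segment, translation_memory))
```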
