  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Building Extraction in 2D Imagery Using Hough Transform

Zou, Rucong, Sun, Hong January 2014
The purpose of this paper is to determine whether the Hough transform is helpful for building extraction. The aim is a building extraction algorithm that captures building areas in images as accurately as possible and eliminates interfering background information, while allowing the extracted contour area to be slightly larger than the building itself. The core algorithm relies on the linear features of building edges and removes interfering information from the background. In tests on the ZuBuD database in Matlab, the buildings in the images were detected successfully. According to this study, the Hough transform works for extracting buildings from 2D images.
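As an illustration of the "linear feature of the building edge" idea, the following is a minimal sketch of a standard Hough line accumulator in pure Python. It is not the authors' Matlab algorithm: the function names, the one-degree angular resolution and the synthetic edge pixels are all assumptions made for the example.

```python
import math

def hough_accumulate(points, width, height, n_theta=180):
    """Vote in (rho, theta) space for a list of (x, y) edge pixels."""
    max_rho = math.ceil(math.hypot(width, height))
    # Rows index rho in [-max_rho, max_rho]; columns index theta in [0, pi).
    acc = [[0] * n_theta for _ in range(2 * max_rho + 1)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[round(rho) + max_rho][t] += 1
    return acc, max_rho

def strongest_line(acc, max_rho):
    """Return (rho, theta index, votes) of the accumulator cell with most votes."""
    votes, row, t = max(
        (v, r, t) for r, line in enumerate(acc) for t, v in enumerate(line)
    )
    return row - max_rho, t, votes

# Hypothetical input: a horizontal edge at y = 5 in a 100 x 20 image.
edge_pixels = [(x, 5) for x in range(100)]
acc, max_rho = hough_accumulate(edge_pixels, width=100, height=20)
rho, theta_index, votes = strongest_line(acc, max_rho)
```

With the default `n_theta=180` the theta index is the angle in degrees, so a long straight edge shows up as a single dominant cell, which is what makes straight building outlines easy to separate from irregular background texture.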
12

Extraction hybride et description structurelle de caractères pour une reconnaissance efficace de texte dans les documents hétérogènes scannés : Méthodes et Algorithmes parallèles / Hybrid extraction and structural description of characters for effective text recognition in heterogeneous scanned documents : Methods and Parallel Algorithms

Soua, Mahmoud 08 November 2016
Optical Character Recognition (OCR) is a process that converts text images into editable text documents. Today, OCR systems are widely used in dematerialization applications such as mail sorting and bill management. In this context, the aim of this thesis is to propose an OCR system that offers a better compromise between recognition rate and processing speed, enabling reliable, real-time document dematerialization. To be recognized, the text is first extracted from the background. It is then segmented into disjoint characters, which are described by their structural characteristics. Finally, the characters are recognized by matching their descriptors against those of a predefined database. Text extraction remains difficult in heterogeneous scanned documents with a complex, noisy background, where the text may be confused with a textured or multicolored background or distorted by digitization noise. Character description, in turn, is often complex (computation of geometric transformations, use of a large number of features) or poorly discriminating when the chosen features are sensitive to variations in scale, font, style, etc. To address this, we adapt binarization to the type of heterogeneous scanned document. We also provide a highly discriminating character description based on the structure of the characters according to their horizontal and vertical projections. To ensure real-time processing, we parallelize the developed algorithms on the graphics processing unit (GPU). Our main contributions to the proposed OCR system are as follows: (1) A new text extraction (binarization) method for heterogeneous scanned documents that include text regions with a complex or homogeneous background. In this method, an image analysis process is followed by a classification of the document regions into image regions (text on a complex background) and text regions (text on a homogeneous background). For text regions, the text is extracted with a hybrid classification method based on the Kmeans algorithm (CHK) that we developed for this purpose. This method combines local and global approaches. It improves the quality of the text/background separation while minimizing distortion when extracting text from scanned documents degraded by digitization noise. Image regions are enhanced with Gamma Correction (CG) before CHK is applied. In our experiments, this text extraction method yields a character recognition rate of about 98% on heterogeneous scanned documents. (2) A Unified Character Descriptor based on the study of character structure. It employs a sufficient number of features, obtained by unifying the descriptors of the horizontal and vertical projections of the characters, for efficient discrimination. Its advantages are high performance and simple computation. It supports the recognition of alphanumeric and multiscale characters, and achieves a character recognition rate of 100% for a given typeface and font size. (3) A parallelization of the proposed character recognition system, using the GPU as the parallelization platform. Flexible and powerful, this architecture provides an effective solution for accelerating intensive image processing algorithms. Our implementation combines coarse-grained and fine-grained parallelization strategies to speed up the steps of the OCR chain. In addition, CPU-GPU communication overheads are avoided and good memory management is ensured. The effectiveness of our implementation is validated through extensive experiments.
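The idea of describing a character by its horizontal and vertical projections can be sketched as follows. This is only an assumed, simplified illustration: the thesis does not spell out its Unified Character Descriptor here, so the concatenation of raw profiles, the L1 distance, the fixed glyph size and the tiny templates are all hypothetical choices.

```python
def projection_profiles(glyph):
    """glyph: list of rows of 0/1 values, where 1 marks an ink pixel."""
    horizontal = [sum(row) for row in glyph]        # ink count per row
    vertical = [sum(col) for col in zip(*glyph)]    # ink count per column
    return horizontal, vertical

def descriptor(glyph):
    """Assumed descriptor: row profile followed by column profile."""
    horizontal, vertical = projection_profiles(glyph)
    return horizontal + vertical

def classify(glyph, templates):
    """Return the template name whose descriptor is closest in L1 distance."""
    d = descriptor(glyph)

    def distance(name):
        return sum(abs(a - b) for a, b in zip(d, descriptor(templates[name])))

    return min(templates, key=distance)

# Hypothetical 3 x 3 templates and a glyph with one missing ink pixel.
TEMPLATES = {
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}
noisy_l = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]
label = classify(noisy_l, TEMPLATES)
```

Because each row and column sum is independent of the others, profiles like these are a natural fit for the fine-grained GPU parallelization the abstract mentions, although the sketch itself runs sequentially.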
13

Fingerprint image enhancement and minutiae extraction algorithm

Cao, Letian, Wang, Yazhou January 2017
This work studies the procedures of a fingerprint identification system and presents some efficient algorithms for pre-processing and minutiae extraction. Pre-processing mostly consists of normalization, segmentation and orientation estimation, which focus on decreasing the variance of fingerprints, separating foreground and background areas, and tracking the direction of ridge lines, respectively. Minutiae extraction is typically divided into two approaches: binarization-based methods and direct gray-scale extraction. We put the emphasis on the binarization-based method in this research, since it is the more commonly used method in research papers. Simulation results on a set of fingerprints downloaded from the FVC 2006 database show that the algorithms we used are accurate and reliable.
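A common way to extract minutiae from a binarized fingerprint is the crossing-number test on a thinned, one-pixel-wide ridge skeleton. The sketch below illustrates that technique under the assumption that thinning has already been done; the abstract does not state which extraction rule the authors used, and the small Y-shaped skeleton is a made-up example.

```python
# 8 neighbours as (row offset, column offset), listed in circular order.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def crossing_number(skel, r, c):
    """Half the number of 0/1 transitions around pixel (r, c)."""
    ring = [skel[r + dr][c + dc] for dr, dc in OFFSETS]
    return sum(abs(ring[i] - ring[(i + 1) % 8]) for i in range(8)) // 2

def find_minutiae(skel):
    """Return (ridge endings, bifurcations) as lists of (row, col)."""
    endings, bifurcations = [], []
    for r in range(1, len(skel) - 1):
        for c in range(1, len(skel[0]) - 1):
            if skel[r][c] != 1:
                continue
            cn = crossing_number(skel, r, c)
            if cn == 1:          # ridge stops here
                endings.append((r, c))
            elif cn == 3:        # ridge splits into two
                bifurcations.append((r, c))
    return endings, bifurcations

# Hypothetical skeleton: two ridges merging into one (a "Y" shape).
SKELETON = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
endings, bifurcations = find_minutiae(SKELETON)
```

Border pixels are skipped so that every examined pixel has a full 8-neighbourhood; a real pipeline would also filter spurious minutiae near the segmentation boundary.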
14

Handwritten Document Binarization Using Deep Convolutional Features with Support Vector Machine Classifier

Lai, Guojun, Li, Bing January 2020
Background. Historical handwritten documents have played important roles in the development of human civilization, and many of them have been preserved in digital form for further research. However, these documents usually suffer from various degradations that interfere with normal reading. Binarized versions can keep the meaningful content of the original document images without the degradations. Document image binarization typically serves as a pre-processing step before complex document analysis and recognition; it aims to extract the text from a document image, and good binarization benefits the subsequent processing steps. Better binarization performance requires efficient binarization methods. In recent years, machine learning centered on deep learning has attracted substantial attention in document image binarization. For example, Convolutional Neural Networks (CNNs) are widely applied because of their powerful feature extraction and classification abilities. Support Vector Machines (SVMs) are also used in image binarization. An SVM builds an optimal hyperplane that maximizes the margin between negative and positive samples, which can separate the foreground pixels from the background pixels of an image distinctly. Objectives. This thesis explores how CNN-based deep convolutional feature extraction and an SVM classifier can be integrated to binarize handwritten document images, and how the results compare with some state-of-the-art document binarization methods. Methods. To investigate the effect of the proposed method on document image binarization, it is implemented and trained. In the architecture, a CNN extracts features from the input images, and these features are then fed into an SVM for classification. The model is trained and tested with six different datasets. Its performance is then compared with that of other binarization methods, including some state-of-the-art methods, on three other datasets. Results. The results indicate that the proposed model not only works well but also performs better than some other recent handwritten document binarization methods. In particular, the evaluation on the DIBCO 2013 dataset indicates that our method outperforms the other chosen binarization methods on all four evaluation metrics. It is also able to deal with some degradations, which demonstrates good generalization and learning ability: when a new kind of degradation appears, the proposed method can address it properly even though it never appears in the training datasets. Conclusions. This thesis concludes that a CNN-based component and an SVM can be combined for handwritten document binarization. On certain datasets, the combination outperforms some other state-of-the-art binarization methods, and it generalizes well when dealing with some degradations.
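The abstract does not name its four evaluation metrics. As an assumption, the sketch below shows two measures that DIBCO-style binarization comparisons commonly report, F-measure and PSNR, computed on flat lists of 0/1 pixels where 1 marks text. The function names and the toy data are illustrative only.

```python
import math

def f_measure(pred, truth):
    """Harmonic mean of precision and recall on foreground (text) pixels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def psnr(pred, truth):
    """Peak signal-to-noise ratio in dB for images with values in {0, 1}."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)
    return math.inf if mse == 0 else 10 * math.log10(1 / mse)

# Hypothetical 4-pixel example: one text pixel is missed by the prediction.
truth = [1, 1, 0, 0]
pred = [1, 0, 0, 0]
score_f = f_measure(pred, truth)
score_psnr = psnr(pred, truth)
```

Higher values are better for both measures; F-measure focuses on the text pixels, while PSNR treats foreground and background errors alike.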
15

Deep Learning for Document Image Analysis

Tensmeyer, Christopher Alan 01 April 2019
Automatic machine understanding of documents from image inputs enables many applications in modern document workflows, digital archives of historical documents, and general machine intelligence, among others. Together, the techniques for understanding document images comprise the field of Document Image Analysis (DIA). Within DIA, the research community has identified several sub-problems, such as page segmentation and Optical Character Recognition (OCR). As the field has matured, there has been a trend of moving away from heuristic-based methods, designed for particular tasks and domains of documents, and moving towards machine learning methods that learn to solve tasks from examples of input/output pairs. Within machine learning, a particular class of models, known as deep learning models, has established itself as the state of the art for many image-based applications, including DIA. While traditional machine learning models typically operate on features designed by researchers, deep learning models are able to learn task-specific features directly from raw pixel inputs. This dissertation is a collection of papers that propose several deep learning models to solve a variety of tasks within DIA. The first task is historical document binarization, where an input image of a degraded historical document is converted to a bi-tonal image to separate foreground text from background regions. The next part of the dissertation considers document segmentation problems, including identifying the boundary between the document page and its background, as well as segmenting an image of a data table into rows, columns, and cells. Finally, a variety of deep models are proposed to solve recognition tasks. These tasks include whole document image classification, identifying the font of a given piece of text, and transcribing handwritten text in low-resource languages.
16

Evaluation of biometric security systems against artificial fingers

Blommé, Johan January 2003
Verification of users’ identities is normally carried out via PIN codes or ID cards. Biometric identification, the identification of unique body features, offers an alternative to these methods.
Fingerprint scanning is the most common biometric identification method in use today. It is a simple and quick method of identification and has therefore been favored over other biometric identification methods such as retina scans or signature verification.
In this report, biometric security systems based on fingerprint scanners are evaluated. The evaluation focuses on copies of real fingers, artificial fingers, as the intrusion method, but it also covers the algorithms currently used for identification and the strengths and weaknesses of the hardware solutions used.
The artificial fingers used in the evaluation were made of gelatin, as it resembles the surface of human skin in terms of moisture, electric resistance and texture. The artificial fingers were based on ten subjects, whose real fingers and artificial counterparts were tested on three different fingerprint scanners. All scanners tested accepted artificial fingers as substitutes for real fingers. Results varied between users and scanners, but the artificial fingers were accepted between about one fourth and one half of the time.
Techniques used in image enhancement, minutiae analysis and pattern matching are analyzed. Normalization, binarization, quality markup and low-pass filtering are described within image enhancement. In minutiae analysis, connectivity numbers, point identification and skeletonization (thinning algorithms) are analyzed. Within pattern matching, direction field analysis and principal component analysis are described. Finally, hybrid models, which combine minutiae analysis and pattern matching, are mentioned.
Based on the experiments made and the analysis of the techniques used, a recommendation for the future use and development of fingerprint scanners is made.
17

Projeto da arquitetura de hardware para binarização e modelagem de contextos para o CABAC do padrão de compressão de vídeo H.264/AVC / Hardware architecture design for binarization and context modeling for CABAC of H.264/AVC video compression

Martins, André Luis Del Mestre January 2011
Context-based Adaptive Binary Arithmetic Coding (CABAC), adopted by the H.264/AVC standard from the Main profile onward, is the state of the art in terms of bit-rate efficiency. However, CABAC takes 9.6% of the total encoding time, and its throughput is limited by bit-level data dependencies (LIN, 2010). Meeting real-time requirements at the highest levels of the H.264/AVC standard is therefore difficult for a pure software CABAC encoder, so CABAC must be accelerated through hardware implementation. The CABAC hardware architectures found in the literature focus on the Binary Arithmetic Encoder (BAE), while Binarization and Context Modeling (BCM) is treated as a secondary issue or not presented at all. Together, the BCM and the BAE constitute the CABAC. This dissertation describes in detail the set of algorithms that make up the BCM of the H.264/AVC standard. Then, the design of a novel hardware architecture for the BCM is presented. The proposed design is described in VHDL, and the synthesis results show that the architecture reaches sufficient performance, in FPGA and ASIC, to process videos in real time at level 5 of the H.264/AVC standard. The proposed design is 13.3% faster than the best related works in these respects, while being equally efficient in area.
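To make the binarization half of the BCM concrete, here is a behavioral sketch in Python of three bin-string schemes that H.264/AVC CABAC uses: unary, truncated unary, and the concatenated unary / k-th order Exp-Golomb (UEGk) code. It models unsigned values only and is not the dissertation's VHDL design; the function names and the parameter values in the usage lines are chosen for illustration.

```python
def unary(n):
    """n ones followed by a terminating zero."""
    return "1" * n + "0"

def truncated_unary(n, c_max):
    """Unary code that drops the terminating zero when n == c_max."""
    return "1" * n + ("0" if n < c_max else "")

def exp_golomb_k(n, k):
    """k-th order Exp-Golomb bin string as used for the UEGk suffix."""
    bits = []
    while n >= (1 << k):
        bits.append("1")
        n -= 1 << k
        k += 1
    bits.append("0")
    # Emit the remaining value as k bits, most significant bit first.
    for shift in range(k - 1, -1, -1):
        bits.append(str((n >> shift) & 1))
    return "".join(bits)

def ueg_k(n, k, u_coff):
    """Truncated-unary prefix (cut-off u_coff) plus Exp-Golomb suffix."""
    prefix = truncated_unary(min(n, u_coff), u_coff)
    if n < u_coff:
        return prefix            # small values need no suffix
    return prefix + exp_golomb_k(n - u_coff, k)

# Illustrative parameters: order 0 with a prefix cut-off of 14.
short_bins = ueg_k(5, k=0, u_coff=14)
long_bins = ueg_k(16, k=0, u_coff=14)
```

Each bin produced this way is then assigned a context model or bypass-coded before it reaches the arithmetic encoder. The variable length of the bin strings hints at why the bit-level data dependencies mentioned in the abstract make this stage hard to pipeline.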
18

Efficient Document Image Binarization using Heterogeneous Computing and Interactive Machine Learning

Westphal, Florian January 2018
Large collections of historical document images have been collected by companies and government institutions for decades. More recently, these collections have been made available to a larger public via the Internet. However, to make access to them truly useful, the contained images need to be made readable and searchable. One step in that direction is document image binarization, the separation of text foreground from page background. This separation makes the text shown in the document images easier to process for humans and for other image processing algorithms alike. While reasonably well-working binarization algorithms exist, it is not sufficient to perform the separation of foreground and background well. The separation also has to be achieved efficiently, both in terms of execution time and in terms of the training data used by machine learning based methods. This is necessary to make binarization not only theoretically possible but also practically viable. In this thesis, we explore different ways to achieve efficient binarization in terms of execution time by improving the implementation and the algorithm of a state-of-the-art binarization method. We find that parameter prediction, as well as mapping the algorithm onto the graphics processing unit (GPU), helps to improve its execution performance. Furthermore, we propose a binarization algorithm based on recurrent neural networks and evaluate the choice of its design parameters with respect to their impact on execution time and binarization quality. Here, we identify a trade-off between binarization quality and execution performance based on the algorithm’s footprint size and show that a dynamically weighted training loss tends to improve the binarization quality. Lastly, we address the problem of training data efficiency by evaluating the use of interactive machine learning for reducing the required amount of training data for our recurrent neural network based method. We show that user feedback can help to achieve better binarization quality with less training data and that visualized uncertainty helps to guide users to give more relevant feedback. / Scalable resource-efficient systems for big data analytics
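For readers unfamiliar with binarization as such, the separation of text foreground from page background can be illustrated with the classic global Otsu threshold. This is a simple baseline and not the state-of-the-art or recurrent-network method studied in the thesis; the function names and the six sample gray values are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Pick the gray level that maximizes the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]            # pixels with value <= t (dark class)
        sum0 += t * hist[t]
        w1 = total - w0          # pixels with value > t (bright class)
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

def binarize(pixels, threshold):
    """1 = dark text foreground, 0 = bright page background."""
    return [1 if p <= threshold else 0 for p in pixels]

# Hypothetical gray values: three ink pixels and three paper pixels.
gray = [10, 12, 11, 200, 210, 205]
t = otsu_threshold(gray)
mask = binarize(gray, t)
```

A single global threshold like this tends to fail on stained or unevenly lit historical pages, which is one reason more elaborate and more expensive methods are studied at all.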
