31

La syllabe dans la production écrite de mots / Syllable in handwritten word production

Sausset, Solen 06 November 2013
L'objectif général de cette thèse est de préciser le niveau de traitement auquel la syllabe intervient au cours de la production écrite de mots, le rôle qu'elle joue, ainsi que la dynamique de mobilisation de cette unité. Dans le premier chapitre expérimental nous explorons les relations entre l'activation des syllabes et les traitements graphomoteurs. L'activation syllabique apparaît dissociée des traitements graphomoteurs (Expériences 1a et 1b), et la dynamique d'activation des syllabes est sous l'influence des contraintes qui pèsent sur ces traitements graphomoteurs uniquement quand les contraintes sont très fortes (Expériences 2a, 2b, 2c). Les relations entre l'activation des syllabes et le traitement orthographique font l'objet du deuxième chapitre expérimental. Nos résultats montrent que les deux processus semblent dissociés (Expérience 3a et contrôle), et que la dynamique d'activation des syllabes apparaît modifiée en fonction des contraintes orthographiques, étudiée ici à travers la fréquence lexicale (Expérience 3b). L'ensemble de nos données tend à confirmer l'idée selon laquelle les syllabes sont des unités mobilisées à l'interface des traitements orthographiques et graphomoteurs, i.e., dans le buffer graphémique. Ces résultats sont discutés dans le cadre d'un modèle en cascade de la production écrite, auquel il semble que nous ajoutons un niveau de traitement spécifique à la syllabe. / This research aims at specifying the processing level at which the syllable is involved during handwriting, the role that it plays, as well as the dynamics of its activation. In the first experimental chapter, we explore the relations between syllable activation and graphomotor processing. Our results show that syllable activation and graphomotor processing appear to be distinct (Experiments 1a and 1b), and that the dynamics of syllable activation vary as a function of graphomotor constraints only when these constraints are very strong (Experiments 2a, 2b, and 2c). The relations between syllable activation and spelling are addressed in the second experimental chapter. The results show that both processes are distinct (Experiment 3a and control), and that the dynamics of syllable activation change according to spelling constraints, studied here via lexical frequency (Experiment 3b). Taken together, these data support the assumption that syllables are activated at the interface between spelling and graphomotor processing, i.e., in the graphemic buffer. These results are discussed within a cascade model of handwriting, to which a specific level of processing devoted to the syllable might be added.
32

Mário de Andrade e a literatura surrealista / Mário de Andrade and the surrealist literature

Isabel Gasparri 07 July 2008
Esta pesquisa tenciona compreender as relações do escritor Mário de Andrade (1893-1945) com a literatura surrealista, considerando manuscritos e livros em seu acervo, no patrimônio do Instituto de Estudos Brasileiros da Universidade de São Paulo (IEB-USP), bem como a sua obra publicada. A pesquisa reúne e transcreve textos literários e ensaísticos atinentes ao Surrealismo, mencionados nos manuscritos do Fichário Analítico de Mário de Andrade; promove o levantamento de obras e de periódico surrealistas franceses presentes na biblioteca do criador de Macunaíma, recuperando anotações de leitura, as quais sugerem diálogos de criação na esfera da crítica literária e da literatura; congrega igualmente juízos críticos de Mário de Andrade sobre o Surrealismo expressos em sua obra édita e correspondência / This research aims to understand the relationship of the writer Mário de Andrade (1893-1945) with Surrealist literature, considering manuscripts and books in his collection, held by the Institute of Brazilian Studies of the University of São Paulo (IEB-USP), as well as his published work. The research gathers and transcribes literary and essayistic texts pertaining to Surrealism that are mentioned in the manuscripts of Mário de Andrade's Analytical Card Index; it surveys the French Surrealist works and periodicals present in the library of Macunaíma's creator, recovering reading notes which suggest creative dialogues in the sphere of literary criticism and literature; it likewise gathers Mário de Andrade's critical judgments on Surrealism expressed in his published work and correspondence.
33

Vyhodnocení testových formulářů pomocí OCR / Test form evaluation by OCR

Noghe, Petr January 2013
This thesis deals with the evaluation of test forms using optical character recognition. Image processing and the methods used for OCR are described in the first part of the thesis. In the practical part, a database of sample characters is created. The chosen method is based on correlation between the patterns and the recognized characters. The program is designed in the MATLAB graphical environment. Finally, several forms are evaluated and the success rate of the proposed program is measured.
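As a rough illustration of the correlation-based matching this abstract describes, the sketch below compares a segmented character cell against a small template database using normalized cross-correlation. It is not code from the thesis (which uses MATLAB); the Python implementation, the 32x32 template size, and the random placeholder data are assumptions.

```python
import numpy as np

def normalized_correlation(patch, template):
    """Normalized cross-correlation between an image patch and a stored template."""
    p = patch.astype(float) - patch.mean()
    t = template.astype(float) - template.mean()
    denom = np.sqrt((p ** 2).sum() * (t ** 2).sum())
    return float((p * t).sum() / denom) if denom > 0 else 0.0

def recognize_character(patch, templates):
    """Return the label of the template that correlates best with the patch."""
    scores = {label: normalized_correlation(patch, tpl) for label, tpl in templates.items()}
    return max(scores, key=scores.get)

# Placeholder "database of sample characters": one 32x32 template per digit.
rng = np.random.default_rng(0)
templates = {c: rng.random((32, 32)) for c in "0123456789"}
cell = templates["7"] + 0.05 * rng.random((32, 32))   # a noisy segmented form cell
print(recognize_character(cell, templates))           # expected to print "7"
```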
34

Handwritten Document Binarization Using Deep Convolutional Features with Support Vector Machine Classifier

Lai, Guojun, Li, Bing January 2020
Background. Since historical handwritten documents have played important roles in promoting the development of human civilization, many of them have been preserved as digital versions for further scientific research. However, various degradations exist in these documents and can interfere with normal reading, whereas binarized versions keep the meaningful content of the original document images without those degradations. Document image binarization usually works as a pre-processing step before complex document analysis and recognition; it aims to extract the text from a document image, and good binarization performance benefits the subsequent processing steps. Achieving better binarization performance therefore requires efficient binarization methods. In recent years, machine learning centered on deep learning has gathered substantial attention in document image binarization; for example, Convolutional Neural Networks (CNNs) are widely applied because of their powerful feature extraction and classification abilities. Meanwhile, the Support Vector Machine (SVM) is also used in image binarization. Its objective is to build an optimal hyperplane that maximizes the margin between negative and positive samples, which can distinctly separate the foreground pixels of the image from the background pixels. Objectives. This thesis aims to explore how a CNN-based deep convolutional feature extractor and an SVM classifier can be integrated to binarize handwritten document images, and how the results compare with some state-of-the-art document binarization methods. Methods. To investigate the effect of the proposed method on document image binarization, it is implemented and trained. In the architecture, the CNN is used to extract features from input images; these features are then fed into the SVM for classification. The model is trained and tested on six different datasets, and then compared with other binarization methods, including some state-of-the-art methods, on three other datasets. Results. The performance results indicate that the proposed model not only works well but also performs better than some other novel handwritten document binarization methods. In particular, evaluation on the DIBCO 2013 dataset indicates that our method outperforms the other chosen binarization methods on all four evaluation metrics. It is also able to deal with some degradations, which demonstrates excellent generalization and learning ability: when a new kind of degradation appears, the proposed method can handle it properly even though it never appears in the training datasets. Conclusions. This thesis concludes that a CNN-based component and an SVM can be combined for handwritten document binarization, and that on certain datasets the combination outperforms some other state-of-the-art binarization methods. Its generalization and learning ability when dealing with degradations is also outstanding.
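A minimal sketch of the kind of pipeline this abstract describes, assuming patch-wise classification: a small CNN extracts features from image patches and an SVM labels each patch as foreground or background. The layer sizes, patch size, and toy data are illustrative assumptions, not the architecture evaluated in the thesis.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

class FeatureCNN(nn.Module):
    """Small convolutional feature extractor for grey-scale image patches."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, x):                     # x: (N, 1, 32, 32) patches
        return self.features(x).flatten(1)    # -> (N, 32) feature vectors

cnn = FeatureCNN().eval()

# Toy data: 200 random patches labelled foreground (1) or background (0).
patches = torch.rand(200, 1, 32, 32)
labels = np.tile([0, 1], 100)

with torch.no_grad():
    feats = cnn(patches).numpy()

# Maximum-margin classifier separating foreground from background patches.
svm = SVC(kernel="rbf")
svm.fit(feats, labels)
print(svm.predict(feats[:5]))
```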
35

Semi-automatic Segmentation & Alignment of Handwritten Historical Text Images with the use of Bayesian Optimisation

MacCormack, Philip January 2023
Effortless digitisation of historical documents has been of great interest for some time. Part of the digitisation is the annotation of the data. Such annotations are obtained in a process called alignment, which links words in an image to the transcript. Annotated data have many use cases, such as training handwritten text recognition models. Relevant to this application, the project aimed to develop an interactive algorithm for the segmentation and alignment of historical document images. Two developed methods (referred to as method 1 and method 2) were evaluated and compared on two data sets, Labour’s Memory and IAM. A method to incorporate self-learning was also developed and evaluated, with Bayesian optimisation used to set the algorithm's parameters automatically. The results showed that the algorithms perform better on the IAM data set, which could partly be explained by the difference in quality of the ground truth used to calculate the performance metrics. Moreover, method 2 slightly outperformed method 1 on both data sets. Bayesian optimisation proved to be a reasonable and more time-efficient way of setting parameters effectively, compared to finding parameters manually for each document. The work done in this project could serve as the basis for the future development of a useful, interactive tool for the alignment of text documents.
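The parameter-setting idea can be sketched with a generic Bayesian optimisation loop. Here scikit-optimize's gp_minimize stands in for whatever optimiser the thesis used, and the two parameters (a binarisation threshold and a minimum line gap) as well as the dummy objective are purely illustrative assumptions.

```python
from skopt import gp_minimize
from skopt.space import Integer, Real

def alignment_error(params):
    """Stand-in objective: run segmentation/alignment with these parameters and
    return an error to minimise (e.g. 1 - alignment accuracy)."""
    threshold, min_gap = params
    # ... run the segmentation + alignment pipeline here ...
    return (threshold - 0.5) ** 2 + (min_gap - 12) ** 2 / 400.0  # dummy surrogate

result = gp_minimize(
    alignment_error,
    dimensions=[Real(0.0, 1.0, name="threshold"), Integer(1, 40, name="min_gap")],
    n_calls=30,
    random_state=0,
)
print("best parameters:", result.x, "lowest error:", result.fun)
```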
36

Multivariate analysis of the parameters in a handwritten digit recognition LSTM system / Multivariat analys av parametrarna i ett LSTM-system för igenkänning av handskrivna siffror

Zervakis, Georgios January 2019
Throughout this project, we perform a multivariate analysis of the parameters of a long short-term memory (LSTM) system for handwritten digit recognition in order to understand the model’s behaviour. In particular, we are interested in explaining how this behaviour arises from its parameters, and what in the network is responsible for the model arriving at a certain decision. This problem is often referred to as the interpretability problem, and falls under the scope of Explainable AI (XAI). The motivation is to make AI systems more transparent, so that we can establish trust between humans. For this purpose, we make use of the MNIST dataset, which has been successfully used in the past for tackling the digit recognition problem. Moreover, the balance and the simplicity of the data make it an appropriate dataset for carrying out this research. We start by investigating the linear output layer of the LSTM, which is directly associated with the model’s predictions. The analysis includes several experiments, where we apply various methods from linear algebra, such as principal component analysis (PCA) and singular value decomposition (SVD), to interpret the parameters of the network. For example, we experiment with different setups of low-rank approximations of the output weight matrix, in order to see the importance of each singular vector for each class of digits. We found that when the fifth left and right singular vectors are cut off, the model practically loses its ability to predict eights. Finally, we present a framework for analysing the parameters of the hidden layer, along with our implementation of an LSTM-based variational autoencoder that serves this purpose. / I det här projektet utför vi en multivariatanalys av parametrarna för ett long short-term memory system (LSTM) för igenkänning av handskrivna siffror för att förstå modellens beteende. Vi är särskilt intresserade av att förklara hur detta uppträdande kommer ur parametrarna, och vad i nätverket som ligger bakom den modell som kommer fram till ett visst beslut. Detta problem kallas ofta för interpretability problem och omfattas av förklarlig AI (XAI). Motiveringen är att göra AI-systemen öppnare, så att vi kan skapa förtroende mellan människor. I detta syfte använder vi MNIST-datamängden, som tidigare framgångsrikt har använts för att ta itu med problemet med igenkänning av siffror. Dessutom gör balansen och enkelheten i uppgifterna det till en lämplig uppsättning uppgifter för att utföra denna forskning. Vi börjar med att undersöka det linjära utdatalagret i LSTM, som är direkt kopplat till modellernas förutsägelser. Analysen omfattar flera experiment, där vi använder olika metoder från linjär algebra, som principalkomponentanalys (PCA) och singulärvärdesfaktorisering (SVD), för att tolka nätverkets parametrar. Vi experimenterar till exempel med olika uppsättningar av lågrangordnade approximationer av viktutmatrisen för att se vikten av varje enskild vektor för varje klass av siffrorna. Vi upptäckte att om man skär av den femte vänster och högervektorn förlorar modellen praktiskt taget sin förmåga att förutsäga siffran åtta. Slutligen lägger vi fram ett ramverk för analys av parametrarna för det dolda lagret, tillsammans med vårt genomförande av en LSTM-baserad variational autoencoder som tjänar detta syfte.
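The weight-matrix analysis described above can be illustrated in a few lines of linear algebra: take the SVD of the output weight matrix, rebuild low-rank approximations, and drop individual singular components to observe their effect. The 10×128 matrix below is random stand-in data, not the trained LSTM weights.

```python
import numpy as np

W = np.random.randn(10, 128)   # stand-in for the 10-class output weight matrix
U, s, Vt = np.linalg.svd(W, full_matrices=False)

def reconstruct(U, s, Vt, keep):
    """Rebuild W from the singular components whose indices are in `keep`."""
    keep = np.asarray(list(keep))
    return (U[:, keep] * s[keep]) @ Vt[keep, :]

W_rank5 = reconstruct(U, s, Vt, range(5))                              # rank-5 approximation
W_drop5 = reconstruct(U, s, Vt, [i for i in range(len(s)) if i != 4])  # remove only the 5th component

print("rank-5 reconstruction error:", np.linalg.norm(W - W_rank5))
print("error after dropping the 5th component:", np.linalg.norm(W - W_drop5))
```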
37

Handwritten Recognition for Ethiopic (Ge’ez) Ancient Manuscript Documents / Handskrivet erkännande för etiopiska (Ge’ez) Forntida manuskriptdokument

Terefe, Adisu Wagaw January 2020
A handwriting recognition system learns a pattern from a given image of text. The recognition process usually combines a computer vision task with sequence learning techniques. Transcribing texts from scanned images remains a challenging problem, especially when the documents are highly degraded or contain heavy dust noise. Nowadays there are several handwriting recognition systems, both commercial and free, especially for Latin-based languages. However, no prior system has been built for Ge’ez handwritten ancient manuscript documents, even though the language holds many mysteries of the past in the human history of science, architecture, medicine and astronomy. In this thesis, we present two separate recognition systems: (1) a character-level recognition system which combines computer vision for character segmentation from ancient books with a vanilla Convolutional Neural Network (CNN) to recognize characters; (2) an end-to-end, segmentation-free handwriting recognition system using a CNN and a Multi-Dimensional Recurrent Neural Network (MDRNN) with Connectionist Temporal Classification (CTC) for Ethiopic (Ge’ez) manuscript documents. The proposed character-level recognition model exceeds 97.78% accuracy. The second model provides an encouraging result, which indicates that the language's properties should be studied further for better recognition of all the ancient books. / Det handskrivna igenkännings systemet är en process för att lära sig ett mönster från en viss bild av text. Erkännande Processen kombinerar vanligtvis en datorvisionsuppgift med sekvens inlärningstekniker. Transkribering av texter från den skannade bilden är fortfarande ett utmanande problem, särskilt när dokumenten är mycket försämrad eller har för omåttlig dammiga buller. Nuförtiden finns det flera handskrivna igenkänningar system både kommersiellt och i gratisversionen, särskilt för latin baserade språk. Det finns dock ingen tidigare studie som har byggts för Ge’ez handskrivna gamla manuskript dokument. I motsats till detta språk har många mysterier från det förflutna, i vetenskapens mänskliga historia, arkitektur, medicin och astronomi. I denna avhandling presenterar vi två separata igenkänningssystem. (1) Ett karaktärs nivå igenkänningssystem som kombinerar bildigenkänning för karaktär segmentering från forntida böcker och ett vanilj Convolutional Neural Network (CNN) för att erkänna karaktärer. (2) Ett änd-till-slut-segmentering fritt handskrivet igenkänningssystem som använder CNN, Multi-Dimensional Recurrent Neural Network (MDRNN) med Connectionist Temporal Classification (CTC) för etiopiska (Ge’ez) manuskript dokument. Den föreslagna karaktär igenkännings modellen överträffar 97,78% noggrannhet. Däremot ger den andra modellen ett uppmuntrande resultat som indikerar att ytterligare studera språk egenskaperna för bättre igenkänning av alla antika böcker.
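A hedged sketch of a segmentation-free line recogniser trained with CTC, in the spirit of the second system above. A plain bidirectional LSTM stands in for the MDRNN, and the image size, alphabet size, and layer widths are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    """Conv feature extractor + bidirectional LSTM, trained with CTC."""
    def __init__(self, n_classes=300):        # assumed alphabet size incl. CTC blank
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.rnn = nn.LSTM(64 * 16, 128, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(256, n_classes)

    def forward(self, x):                      # x: (N, 1, 64, W) text-line images
        f = self.conv(x)                       # (N, 64, 16, W/4)
        f = f.permute(0, 3, 1, 2).flatten(2)   # one time step per image column
        out, _ = self.rnn(f)
        return self.fc(out).log_softmax(2)     # (N, T, n_classes)

model = CRNN()
images = torch.rand(4, 1, 64, 256)                 # dummy batch of line images
log_probs = model(images).permute(1, 0, 2)         # CTC expects (T, N, C)
targets = torch.randint(1, 300, (4, 20))           # dummy label sequences (blank=0 excluded)
loss = nn.CTCLoss(blank=0)(
    log_probs, targets,
    input_lengths=torch.full((4,), log_probs.size(0), dtype=torch.long),
    target_lengths=torch.full((4,), 20, dtype=torch.long),
)
print(loss.item())
```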
38

Advances on the Transcription of Historical Manuscripts based on Multimodality, Interactivity and Crowdsourcing

Granell Romero, Emilio 01 September 2017
Natural Language Processing (NLP) is an interdisciplinary research field of Computer Science, Linguistics, and Pattern Recognition that studies, among others, the use of human natural languages in Human-Computer Interaction (HCI). Most of NLP research tasks can be applied for solving real-world problems. This is the case of natural language recognition and natural language translation, that can be used for building automatic systems for document transcription and document translation. Regarding digitalised handwritten text documents, transcription is used to obtain an easy digital access to the contents, since simple image digitalisation only provides, in most cases, search by image and not by linguistic contents (keywords, expressions, syntactic or semantic categories). Transcription is even more important in historical manuscripts, since most of these documents are unique and the preservation of their contents is crucial for cultural and historical reasons. The transcription of historical manuscripts is usually done by paleographers, who are experts on ancient script and vocabulary. Recently, Handwritten Text Recognition (HTR) has become a common tool for assisting paleographers in their task, by providing a draft transcription that they may amend with more or less sophisticated methods. This draft transcription is useful when it presents an error rate low enough to make the amending process more comfortable than a complete transcription from scratch. Thus, obtaining a draft transcription with an acceptable low error rate is crucial to have this NLP technology incorporated into the transcription process. The work described in this thesis is focused on the improvement of the draft transcription offered by an HTR system, with the aim of reducing the effort made by paleographers for obtaining the actual transcription on digitalised historical manuscripts. This problem is faced from three different, but complementary, scenarios: · Multimodality: The use of HTR systems allow paleographers to speed up the manual transcription process, since they are able to correct on a draft transcription. Another alternative is to obtain the draft transcription by dictating the contents to an Automatic Speech Recognition (ASR) system. When both sources (image and speech) are available, a multimodal combination is possible and an iterative process can be used in order to refine the final hypothesis. · Interactivity: The use of assistive technologies in the transcription process allows one to reduce the time and human effort required for obtaining the actual transcription, given that the assistive system and the palaeographer cooperate to generate a perfect transcription. Multimodal feedback can be used to provide the assistive system with additional sources of information by using signals that represent the whole same sequence of words to transcribe (e.g. a text image, and the speech of the dictation of the contents of this text image), or that represent just a word or character to correct (e.g. an on-line handwritten word). · Crowdsourcing: Open distributed collaboration emerges as a powerful tool for massive transcription at a relatively low cost, since the paleographer supervision effort may be dramatically reduced. 
Multimodal combination allows one to use the speech dictation of handwritten text lines in a multimodal crowdsourcing platform, where collaborators may provide their speech by using their own mobile device instead of using desktop or laptop computers, which makes it possible to recruit more collaborators. / El Procesamiento del Lenguaje Natural (PLN) es un campo de investigación interdisciplinar de las Ciencias de la Computación, Lingüística y Reconocimiento de Patrones que estudia, entre otros, el uso del lenguaje natural humano en la interacción Hombre-Máquina. La mayoría de las tareas de investigación del PLN se pueden aplicar para resolver problemas del mundo real. Este es el caso del reconocimiento y la traducción del lenguaje natural, que se pueden utilizar para construir sistemas automáticos para la transcripción y traducción de documentos. En cuanto a los documentos manuscritos digitalizados, la transcripción se utiliza para facilitar el acceso digital a los contenidos, ya que la simple digitalización de imágenes sólo proporciona, en la mayoría de los casos, la búsqueda por imagen y no por contenidos lingüísticos. La transcripción es aún más importante en el caso de los manuscritos históricos, ya que la mayoría de estos documentos son únicos y la preservación de su contenido es crucial por razones culturales e históricas. La transcripción de manuscritos históricos suele ser realizada por paleógrafos, que son personas expertas en escritura y vocabulario antiguos. Recientemente, los sistemas de Reconocimiento de Escritura (RES) se han convertido en una herramienta común para ayudar a los paleógrafos en su tarea, la cual proporciona un borrador de la transcripción que los paleógrafos pueden corregir con métodos más o menos sofisticados. Este borrador de transcripción es útil cuando presenta una tasa de error suficientemente reducida para que el proceso de corrección sea más cómodo que una completa transcripción desde cero. Por lo tanto, la obtención de un borrador de transcripción con una baja tasa de error es crucial para que esta tecnología de PLN sea incorporada en el proceso de transcripción. El trabajo descrito en esta tesis se centra en la mejora del borrador de transcripción ofrecido por un sistema RES, con el objetivo de reducir el esfuerzo realizado por los paleógrafos para obtener la transcripción de manuscritos históricos digitalizados. Este problema se enfrenta a partir de tres escenarios diferentes, pero complementarios: · Multimodalidad: El uso de sistemas RES permite a los paleógrafos acelerar el proceso de transcripción manual, ya que son capaces de corregir en un borrador de la transcripción. Otra alternativa es obtener el borrador de la transcripción dictando el contenido a un sistema de Reconocimiento Automático de Habla. Cuando ambas fuentes están disponibles, una combinación multimodal de las mismas es posible y se puede realizar un proceso iterativo para refinar la hipótesis final. · Interactividad: El uso de tecnologías asistenciales en el proceso de transcripción permite reducir el tiempo y el esfuerzo humano requeridos para obtener la transcripción correcta, gracias a la cooperación entre el sistema asistencial y el paleógrafo para obtener la transcripción perfecta. 
La realimentación multimodal se puede utilizar en el sistema asistencial para proporcionar otras fuentes de información adicionales con señales que representen la misma secuencia de palabras a transcribir (por ejemplo, una imagen de texto, o la señal de habla del dictado del contenido de dicha imagen de texto), o señales que representen sólo una palabra o carácter a corregir (por ejemplo, una palabra manuscrita mediante una pantalla táctil). · Crowdsourcing: La colaboración distribuida y abierta surge como una poderosa herramienta para la transcripción masiva a un costo relativamente bajo, ya que el esfuerzo de supervisión de los paleógrafos puede ser drásticamente reducido. La combinación multimodal permite utilizar el dictado del contenido de líneas de texto manuscrito en una plataforma de crowdsourcing multimodal, donde los colaboradores pueden proporcionar las muestras de habla utilizando su propio dispositivo móvil en lugar de usar ordenadores, / El Processament del Llenguatge Natural (PLN) és un camp de recerca interdisciplinar de les Ciències de la Computació, la Lingüística i el Reconeixement de Patrons que estudia, entre d'altres, l'ús del llenguatge natural humà en la interacció Home-Màquina. La majoria de les tasques de recerca del PLN es poden aplicar per resoldre problemes del món real. Aquest és el cas del reconeixement i la traducció del llenguatge natural, que es poden utilitzar per construir sistemes automàtics per a la transcripció i traducció de documents. Quant als documents manuscrits digitalitzats, la transcripció s'utilitza per facilitar l'accés digital als continguts, ja que la simple digitalització d'imatges només proporciona, en la majoria dels casos, la cerca per imatge i no per continguts lingüístics (paraules clau, expressions, categories sintàctiques o semàntiques). La transcripció és encara més important en el cas dels manuscrits històrics, ja que la majoria d'aquests documents són únics i la preservació del seu contingut és crucial per raons culturals i històriques. La transcripció de manuscrits històrics sol ser realitzada per paleògrafs, els quals són persones expertes en escriptura i vocabulari antics. Recentment, els sistemes de Reconeixement d'Escriptura (RES) s'han convertit en una eina comuna per ajudar els paleògrafs en la seua tasca, la qual proporciona un esborrany de la transcripció que els paleògrafs poden esmenar amb mètodes més o menys sofisticats. Aquest esborrany de transcripció és útil quan presenta una taxa d'error prou reduïda perquè el procés de correcció siga més còmode que una completa transcripció des de zero. Per tant, l'obtenció d'un esborrany de transcripció amb un baixa taxa d'error és crucial perquè aquesta tecnologia del PLN siga incorporada en el procés de transcripció. El treball descrit en aquesta tesi se centra en la millora de l'esborrany de la transcripció ofert per un sistema RES, amb l'objectiu de reduir l'esforç realitzat pels paleògrafs per obtenir la transcripció de manuscrits històrics digitalitzats. Aquest problema s'enfronta a partir de tres escenaris diferents, però complementaris: · Multimodalitat: L'ús de sistemes RES permet als paleògrafs accelerar el procés de transcripció manual, ja que són capaços de corregir un esborrany de la transcripció. Una altra alternativa és obtenir l'esborrany de la transcripció dictant el contingut a un sistema de Reconeixement Automàtic de la Parla. 
Quan les dues fonts (imatge i parla) estan disponibles, una combinació multimodal és possible i es pot realitzar un procés iteratiu per refinar la hipòtesi final. · Interactivitat: L'ús de tecnologies assistencials en el procés de transcripció permet reduir el temps i l'esforç humà requerits per obtenir la transcripció real, gràcies a la cooperació entre el sistema assistencial i el paleògraf per obtenir la transcripció perfecta. La realimentació multimodal es pot utilitzar en el sistema assistencial per proporcionar fonts d'informació addicionals amb senyals que representen la mateixa seqüencia de paraules a transcriure (per exemple, una imatge de text, o el senyal de parla del dictat del contingut d'aquesta imatge de text), o senyals que representen només una paraula o caràcter a corregir (per exemple, una paraula manuscrita mitjançant una pantalla tàctil). · Crowdsourcing: La col·laboració distribuïda i oberta sorgeix com una poderosa eina per a la transcripció massiva a un cost relativament baix, ja que l'esforç de supervisió dels paleògrafs pot ser reduït dràsticament. La combinació multimodal permet utilitzar el dictat del contingut de línies de text manuscrit en una plataforma de crowdsourcing multimodal, on els col·laboradors poden proporcionar les mostres de parla utilitzant el seu propi dispositiu mòbil en lloc d'utilitzar ordinadors d'escriptori o portàtils, la qual cosa permet ampliar el nombr / Granell Romero, E. (2017). Advances on the Transcription of Historical Manuscripts based on Multimodality, Interactivity and Crowdsourcing [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/86137
39

OCR of hand-written transcriptions of hieroglyphic text

Nederhof, Mark-Jan 20 April 2016
Encoding hieroglyphic texts is time-consuming. If a text already exists as hand-written transcription, there is an alternative, namely OCR. Off-the-shelf OCR systems seem difficult to adapt to the peculiarities of Ancient Egyptian. Presented is a proof-of-concept tool that was designed to digitize texts of Urkunden IV in the hand-writing of Kurt Sethe. It automatically recognizes signs and produces a normalized encoding, suitable for storage in a database, or for printing on a screen or on paper, requiring little manual correction. The encoding of hieroglyphic text is RES (Revised Encoding Scheme) rather than (common dialects of) MdC (Manuel de Codage). Earlier papers argued against MdC and in favour of RES for corpus development. Arguments in favour of RES include longevity of the encoding, as its semantics are font-independent. The present study provides evidence that RES is also much preferable to MdC in the context of OCR. With a well-understood parsing technique, relative positioning of scanned signs can be straightforwardly mapped to suitable primitives of the encoding.
40

Rotulação de símbolos matemáticos manuscritos via casamento de expressões / Labeling of Handwritten Mathematical Symbols via Expression Matching

Honda, Willian Yukio 23 January 2013
O problema de reconhecimento de expressões matemáticas manuscritas envolve três subproblemas importantes: segmentação de símbolos, reconhecimento de símbolos e análise estrutural de expressões. Para avaliar métodos e técnicas de reconhecimento, eles precisam ser testados sobre conjuntos de amostras representativos do domínio de aplicação. Uma das preocupações que tem sido apontada ultimamente é a quase inexistência de base de dados pública de expressões matemáticas, o que dificulta o desenvolvimento e comparação de diferentes abordagens. Em geral, os resultados de reconhecimento apresentados na literatura restringem-se a conjuntos de dados pequenos, não disponíveis publicamente, e muitas vezes formados por dados que visam avaliar apenas alguns aspectos específicos do reconhecimento. No caso de expressões online, para treinar e testar reconhecedores de símbolos, as amostras são em geral obtidas solicitando-se que as pessoas escrevam uma série de símbolos individualmente e repetidas vezes. Tal tarefa é monótona e cansativa. Uma abordagem alternativa para obter amostras de símbolos seria solicitar aos usuários a transcrição de expressões modelo previamente definidas. Dessa forma, a escrita dos símbolos seria realizada de forma natural, menos monótona, e várias amostras de símbolos poderiam ser obtidas de uma única expressão. Para evitar o trabalho de anotar manualmente cada símbolo das expressões transcritas, este trabalho propõe um método para casamento de expressões matemáticas manuscritas, no qual símbolos de uma expressão transcrita por um usuário são associados aos correspondentes símbolos (previamente identificados) da expressão modelo. O método proposto é baseado em uma formulação que reduz o problema a um problema de associação simples, no qual os custos são definidos em termos de características dos símbolos e estrutura da expressão. Resultados experimentais utilizando o método proposto mostram taxas médias de associação correta superiores a 99%. / The problem of recognizing handwritten mathematical expressions includes three important subproblems: symbol segmentation, symbol recognition, and structural analysis of expressions. In order to evaluate recognition methods and techniques, they should be tested on sample sets representative of the application domain. One concern that has been repeatedly raised recently is the near non-existence of public, representative datasets of mathematical expressions, which makes the development and comparison of distinct approaches difficult. In general, recognition results reported in the literature are restricted to small datasets, not publicly available, and often consisting of data aimed only at evaluating some specific aspects of the recognition. In the case of online expressions, samples for training and testing symbol recognizers are in general obtained by asking users to write a series of symbols individually and repeatedly. Such a task is boring and tiring. An alternative approach for obtaining symbol samples would be to ask users to transcribe previously defined model expressions. By doing so, writing would be more natural and less boring, and several symbol samples could be obtained from one transcription. To avoid the task of manually labeling the symbols of the transcribed expressions, this work proposes a method for handwritten expression matching, in which the symbols of a transcribed expression are assigned to the corresponding (previously identified) symbols of the model expression. The proposed method is based on a formulation that reduces the matching problem to a linear assignment problem, where costs are defined in terms of symbol features and expression structure. Experimental results using the proposed method show that a mean correct assignment rate above 99% is achieved.
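The assignment formulation at the heart of the method can be sketched with SciPy's Hungarian-algorithm solver. The cost below uses only symbol centroid distances, a deliberate simplification of the thesis's costs based on symbol features and expression structure; the coordinates are made-up examples.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Symbol centroids (x, y) of a model expression and of a user's transcription.
model_syms   = np.array([[0.0, 0.0], [1.0, 0.1], [2.1, 0.0], [1.0, 1.2]])
written_syms = np.array([[0.1, 0.1], [1.1, 0.0], [2.0, 0.2], [0.9, 1.1]])

# Cost of assigning each written symbol to each model symbol (Euclidean distance).
cost = np.linalg.norm(written_syms[:, None, :] - model_syms[None, :, :], axis=2)

rows, cols = linear_sum_assignment(cost)   # minimum-cost one-to-one assignment
for w, m in zip(rows, cols):
    print(f"written symbol {w} -> model symbol {m} (label copied from the model)")
```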
