181

O processo de construção das fontes digitais de simulação caligráfica / The development of digital typefaces simulating calligraphy

Fabio Pinto Lopes de Lima, 05 March 2009
Conceptualization of the study's object: the definition of digital typefaces simulating calligraphy, their origin and technological evolution. Identification of categories for understanding calligraphic shapes: visual characteristics associated with the instrumental universe of calligraphy, with the process of writing, with craft skill, and with calligraphy as a spatio-temporal event. The process of constructing digital typefaces simulating calligraphy. The steps in developing simulation typefaces based on concrete references: analysis of the original, digitization, conversion to outlines, spacing, and font generation. Strategies for designing simulation typefaces based on conceptual references. Articulation of structural and expressive features associated with the calligraphic craft. The suggestion of calligraphic ductus in digital simulation typefaces: continuous and interrupted construction. Strategies associated with the visual representation of instrumental skill: ornamentation, integration, and lack of skill. The concept of variance applied to digital type: manual, random, and planned variance. Presentation of the Zapfino, Bickham, and Champion Script projects. Comments on the relevance of digital typefaces simulating calligraphy: conservation, interaction, sharing, and increased appreciation of calligraphic practice.
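A hypothetical sketch (not from the thesis) of the "planned variance" strategy mentioned in the abstract: a script font ships several digitized alternates per letter, and the layout engine cycles through them so that repeated letters never render identically, mimicking a calligrapher's natural variation. The glyph names and alternate count below are invented placeholders.

```python
NUM_ALTERNATES = 3  # e.g. hypothetical glyphs "a.alt0", "a.alt1", "a.alt2"

def plan_alternates(text):
    """Assign an alternate glyph to each letter, never repeating the
    alternate used for the previous occurrence of the same letter."""
    last = {}   # letter -> alternate index used last time
    plan = []
    for ch in text:
        alt = (last.get(ch, -1) + 1) % NUM_ALTERNATES
        last[ch] = alt
        plan.append(f"{ch}.alt{alt}")
    return plan

print(plan_alternates("lettera"))
# ['l.alt0', 'e.alt0', 't.alt0', 't.alt1', 'e.alt1', 'r.alt0', 'a.alt0']
```

Note how the two t's and two e's receive different alternates, the deterministic ("planned") counterpart of the random variance the abstract also discusses.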
182

Habilidades de processamento fonológico e de escrita em crianças com distúrbio específico de linguagem: um estudo comparativo com a normalidade / Phonological processing and writing skills in children with specific language impairment - a comparative study

Paula Renata Pedott, 24 March 2016
Introduction: Children with specific language impairment (SLI) are likely to experience difficulty in literacy development due to the multiple language alterations they present. This study compared and characterized the performance of children with SLI and children with typical language development in alliteration, rhyme, phonological short-term memory, and word and pseudoword spelling tasks. The main hypothesis was that the SLI group would perform worse than the typically developing group on all of the abilities studied. Methods: Participants were 12 children with SLI (study group, SG) and 48 with typical language development (control group, CG), aged 7 years to 9 years 11 months. All children attended the 2nd or 3rd grade of elementary school and presented hearing thresholds within normal limits and preserved nonverbal intellectual performance. Receptive vocabulary, phonology, and socioeconomic status were assessed to characterize the groups. The experimental assessment comprised standardized alliteration, rhyme, and phonological short-term memory tests, plus word and pseudoword spelling tasks developed for this study. Results: Both groups performed worse on rhyme than on alliteration tasks, and the SG performed worse than the CG on both. The analysis of distractors showed that in the alliteration task the SG made more errors of the semantic type, whereas in the rhyme task phonological errors predominated. The SG performed worse than the CG in phonological short-term memory and in word and pseudoword spelling. The SG had more difficulty spelling pseudowords than words, while the CG showed no significant difference between the two spelling tasks. In word spelling, the SG made more whole-word errors; in pseudoword spelling, errors concentrated on the whole word and on the final syllable. Comparing performance by school grade, 2nd- and 3rd-grade CG children did not differ significantly, whereas 3rd-grade SG children outperformed 2nd-grade SG children on all experimental measures except phonological short-term memory. Conclusions: The SG had difficulty with phonological processing and writing tasks that the CG performed with relative ease. Children with SLI analyzed the stimuli in the phonological awareness tasks more globally, which led them to overlook important segmental aspects. This difficulty in approaching information analytically, added to their linguistic and phonological processing alterations, produced higher error rates in the spelling tasks. Despite these alterations, 3rd-grade SG children outperformed 2nd-grade SG children on all abilities except phonological short-term memory, which is the clinical marker of SLI. These data reinforce the need for early diagnosis and intervention in this population, and the abilities addressed in this study should be included in the therapeutic process.
183

Handskrift och maskinskrift i lågstadiet : Lågstadielärares val av inlärningsmetoder för handskrift och maskinskrift / Handwriting and computer writing in primary school : Primary school teachers' choice of learning methods for handwriting and computer writing

Hugosson, Anna, January 2017
The purpose of this study is to find out how primary school teachers perceive the digitization of writing instruction in school and how they teach pupils to write. Society has become increasingly digitized, and schools have naturally followed this development. Computers and tablets are common tools in schools, even though access to them differs between Swedish schools. The study is based on a socio-cultural and a pragmatic perspective, and was conducted through a survey aimed at teachers working in primary school. The results show that teachers are positive toward digitized writing instruction, mainly because the computer is a simple tool for editing text and because it lets the teacher focus more on content than on formalities. The results also show the importance of preserving handwriting, since it engages several senses, which facilitates writing acquisition. The teachers work with writing acquisition in different ways: some use one or more established methods, while others have their own approach.
185

The Effects of Self-evaluation and Response Restriction on Letter and Number Reversal in Young Children.

Strickland, Monica Kathleen, 08 1900
The purpose of this study was to evaluate the effects of a training package consisting of response restriction and the reinforcement of self-evaluation on letter reversal errors. Participants were 3 typically developing boys between the ages of 5 and 7. The results indicated that the training package was successful in correcting reversals in the absence of a model during training and on application tests. These improvements were maintained during subsequent follow-up sessions and generalized across trainers. Fading was not always necessary in correcting reversals, but was effective in correcting reversals that persisted during the overlay training procedures. The advantages of implementing a systematic intervention for reducing letter reversal errors in the classroom, as well as directions for future research, are discussed.
186

Comparison of a Traditional and an Integrated Program of Instruction in an Elementary School

Elder, Franklin L., January 1949
The purpose of this study is to determine whether elementary school children progress faster in academic or tool subjects when taught through interest units in an integrated curriculum or when taught the separate subjects by a traditional method. Reading, spelling, and handwriting are used as illustrative subjects in the sixth grade, with reading only in the second grade.
187

Um estudo empírico sobre classificação de símbolos matemáticos manuscritos / An empirical study on handwritten mathematical symbol classification

Marcelo Valentim de Oliveira, 25 August 2014
An important problem in the field of pattern recognition is handwriting recognition. The recognition of handwritten mathematical expressions is a particular case that has been studied for decades. It is considered challenging due to the large number of possible symbol types, the intrinsic variation of handwriting, and the complex two-dimensional arrangement of symbols within expressions. In this work we adopt the problem of recognizing online handwritten mathematical symbols in order to perform an empirical study on the behavior of multi-class classifiers. We examine basic methods for multi-class classification, especially the one-versus-all and all-versus-all approaches for decomposing a multi-class problem into a set of binary classification problems. To decompose the problem into smaller ones, we also propose an approach that uses a decision tree to hierarchically divide the dataset into subsets, such that each resulting subset corresponds to a simpler classification problem. These methods are examined using k-nearest-neighbor and support vector machine models as base classifiers (the latter combined through the one-versus-all approach). For classification, symbols are represented by a feature set known in the literature as HBF49, which was recently proposed specifically for online symbol recognition problems. Experiments were performed to evaluate classifier accuracy, classifier performance as the number of classes increases, training and testing times, and the use of different subsets of the full feature set. This work includes a description of the required background, details of the preprocessing and feature extraction used to represent the symbols, and an exposition and discussion of the empirical study. The data additionally collected for the experiments will be made publicly available.
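A minimal sketch, assuming scikit-learn and synthetic stand-in data rather than the HBF49 symbol corpus, of the two decomposition schemes the abstract compares (one-versus-all and all-versus-all) with SVM and k-NN base classifiers:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Stand-in for HBF49 feature vectors: 49 features, several symbol classes.
X, y = make_classification(n_samples=1000, n_features=49, n_informative=10,
                           n_classes=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

schemes = {
    "one-vs-all SVM": OneVsRestClassifier(SVC()),   # one binary SVM per class
    "all-vs-all SVM": OneVsOneClassifier(SVC()),    # one SVM per class pair
    "k-NN (natively multi-class)": KNeighborsClassifier(n_neighbors=5),
}
for name, clf in schemes.items():
    clf.fit(X_tr, y_tr)
    print(name, clf.score(X_te, y_te))
```

One-versus-all trains as many binary classifiers as there are classes, while all-versus-all trains one per pair of classes; with the dozens of symbol classes mentioned above, that trade-off drives the training and testing times the thesis measures.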
188

Interactive Transcription of Old Text Documents

Serrano Martínez-Santos, Nicolás, 09 June 2014
Nowadays, there are huge collections of handwritten text documents in libraries all over the world. The high demand for these resources has led to the creation of digital libraries in order to preserve the documents and provide electronic access to them. However, transcriptions of these document images are not always available to allow users to quickly search the information, or computers to process it, search for patterns, or draw statistics. The problem is that manual transcription of these documents is an expensive task from both the economic and the time viewpoint. This thesis presents a novel approach for efficient Computer Assisted Transcription (CAT) of handwritten text documents using state-of-the-art Handwritten Text Recognition (HTR) systems. The objective of CAT approaches is to efficiently complete a transcription task through human-machine collaboration, since the effort required to generate a manual transcription is high and automatically generated transcriptions from state-of-the-art systems still do not reach the required accuracy. This thesis is centered on a special application of CAT: transcribing old text documents when the amount of user effort available is limited, so that the entire document cannot be revised. In this approach, the objective is to generate the best possible transcription with the user effort available. The thesis provides a comprehensive view of the CAT process, from feature extraction to user interaction. First, a statistical approach to generalize interactive transcription is proposed. As its direct application is unfeasible, some assumptions are made to apply it to two different tasks: first, the interactive transcription of handwritten text documents, and second, the interactive detection of document layout. Next, the digitization and annotation of two real old text documents is described. This process was carried out because of the scarcity of similar resources and the need for annotated data to thoroughly test all the tools and techniques developed in this thesis. The two documents were carefully selected to represent the general difficulties encountered in HTR. Baseline results are presented on them to establish a benchmark with a standard HTR system, and the annotated documents were made freely available to the community. All the techniques and methods developed in this thesis have been assessed on these two documents. Then, a CAT approach for HTR under limited user effort is studied and extensively tested. The ultimate goal of CAT is achieved by combining three processes. Given a transcription produced by an HTR system, the first process locates (possibly) incorrect words and employs the available user effort to supervise them (if necessary). As most words cannot be supervised with the limited user effort available, only a few are selected for revision: the system presents the user a small subset of words chosen according to an estimate of their correctness, that is, their confidence level. The second process starts once these low-confidence words have been supervised: it updates the recognition of the document taking the user corrections into account, which improves the quality of the words that were not revised. Finally, the last process adapts the system using the partially revised (and possibly imperfect) transcription obtained so far. In this adaptation, the system intelligently selects the correct words of the transcription, so that the adapted system recognizes future transcriptions better. Transcription experiments show that this CAT approach is most effective when user effort is low. The last contribution of this thesis is a method for balancing the final transcription quality against the supervision effort applied in the CAT approach described above. In other words, this method allows the user to control the amount of error in the transcriptions obtained from a CAT approach. The motivation is to let users decide on the final quality of the desired documents, as partially erroneous transcriptions can be sufficient to convey the meaning, while the user effort required to produce them can be significantly lower than for a fully manual transcription. Consequently, the system estimates the minimum user effort required to reach the error level defined by the user. Error estimation is performed by computing separately the error produced by each recognized word, so that the user is asked to revise only the words in which most errors occur. Additionally, an interactive prototype is presented that integrates most of the interactive techniques developed in this thesis. It was designed for palaeographic experts with no background in HTR technologies: after slight fine-tuning by an HTR expert, it lets transcribers annotate the document manually or employ the CAT approach, with all automatic operations, such as recognition, performed in the background, detaching the transcriber from the details of the system. The prototype was assessed by an expert transcriber and shown to be adequate and efficient for its purpose. It is freely available under the GNU Public Licence (GPL). / Serrano Martínez-Santos, N. (2014). Interactive Transcription of Old Text Documents [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/37979
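A toy sketch of the first CAT process described above: given an automatic transcription with per-word confidence scores and a limited supervision budget, the system asks the user to revise only the least confident words. The word list and scores below are invented for illustration.

```python
def select_for_supervision(words, confidences, budget):
    """Return the indices of the `budget` lowest-confidence words."""
    ranked = sorted(range(len(words)), key=lambda i: confidences[i])
    return sorted(ranked[:budget])      # present them in reading order

words = ["quarenta", "e", "sete", "reales", "de", "vellon"]
conf  = [0.91, 0.99, 0.88, 0.42, 0.97, 0.55]

for i in select_for_supervision(words, conf, budget=2):
    print(f"ask user to check word {i}: {words[i]!r} (conf={conf[i]:.2f})")
```

The second and third processes would then re-decode the line conditioned on the corrected words and adapt the models from the partially revised output; those steps depend on the underlying HTR engine and are not sketched here.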
189

Contributions to Pen & Touch Human-Computer Interaction

Martín-Albo Simón, Daniel, 01 September 2016
Computers are now present everywhere, but their potential is not fully exploited due to a lack of acceptance. In this thesis, the pen computer paradigm is adopted, whose main idea is to replace all input devices by a pen and/or the fingers, given that the origin of the rejection lies in unfriendly interaction devices that must be replaced by something easier for the user. This paradigm, proposed several years ago, has only recently been fully implemented in products such as smartphones. But computers are in effect illiterates that do not understand gestures or handwriting, so a recognition step is required to "translate" the meaning of these interactions into computer-understandable language. For this input modality to be actually usable, its recognition accuracy must be high enough. To realistically contemplate the broader deployment of pen computing, it is necessary to improve the accuracy of handwriting and gesture recognizers. This thesis is devoted to studying different approaches to improve the recognition accuracy of those systems. First, we investigate how to take advantage of interaction-derived information to improve the accuracy of the recognizer, focusing on the interactive transcription of text images. Here the system initially proposes an automatic transcript; if necessary, the user makes corrections, implicitly validating a correct part of the transcript, and the system must take this validated prefix into account to suggest a suitable new hypothesis. Given that in such an application the user is constantly interacting with the system, it makes sense to adapt this interactive application to a pen computer. User corrections are provided by means of pen strokes, so a recognizer is needed to decode this kind of nondeterministic user feedback. Its performance can be boosted by exploiting interaction-derived information, such as the user-validated prefix. Then, this thesis studies human movements, in particular hand movements, from a generation point of view, drawing on the kinematic theory of rapid human movements and the Sigma-Lognormal model. Understanding how the human body generates movements, and particularly the origin of human movement variability, is important in the development of a recognition system. The contribution of this thesis to this topic is important, since a new technique (which improves on previous results) for extracting the Sigma-Lognormal model parameters is presented. Closely related to the previous work, this thesis studies the benefits of using synthetic data for training. The easiest way to train a recognizer is to provide "infinite" data representing all possible variations: in general, the more training data, the smaller the error. But it is usually not possible to increase the size of a training set indefinitely, since the participant recruiting, data collection, labeling, and so on required to do so can be time-consuming and expensive. One way to overcome this problem is to create and use synthetically generated data that looks like human data. We study how to create such synthetic data and explore different ways of using it, both for handwriting and for gesture recognition. The contributions of this thesis have obtained good results, producing several publications in international conferences and journals. Finally, three applications related to this work are presented. First, we created Escritorie, a digital desk prototype based on the pen computer paradigm for transcribing handwritten text images. Second, we developed "Gestures à Go Go", a web application for bootstrapping gestures. Third, we studied another interactive application under the pen computer paradigm: how translation reviewing can be done more ergonomically using a pen. / Martín-Albo Simón, D. (2016). Contributions to Pen & Touch Human-Computer Interaction [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/68482
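For readers unfamiliar with the Sigma-Lognormal model invoked above, its core idea is that the speed of each elementary stroke follows a lognormal profile in time, and a movement is the superposition of a few time-shifted strokes. The sketch below is a scalar simplification with invented parameters (the full model sums velocity vectors, and the thesis contributes a technique for estimating these parameters from real strokes):

```python
import math

def lognormal_speed(t, D, t0, mu, sigma):
    """Speed contributed at time t by one elementary stroke.

    D: stroke amplitude; t0: stroke onset time;
    mu, sigma: log-time delay and log-response time of the neuromuscular system.
    """
    if t <= t0:
        return 0.0                      # the stroke has not started yet
    x = t - t0
    return (D / (sigma * math.sqrt(2.0 * math.pi) * x)
            * math.exp(-((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2)))

# Two overlapping strokes (D, t0, mu, sigma); values invented for illustration.
strokes = [(5.0, 0.00, -1.7, 0.30),
           (3.0, 0.15, -1.6, 0.25)]

for k in range(0, 31, 5):               # sample the speed profile every 0.1 s
    t = k * 0.02
    v = sum(lognormal_speed(t, *p) for p in strokes)
    print(f"t={t:.2f}s  |v|={v:.3f}")
```

Varying the parameters slightly between renditions of the same symbol is also how lognormal-based synthesis produces the human-looking synthetic training data discussed in the abstract.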
190

Neural Networks for Document Image and Text Processing

Pastor Pellicer, Joan, 03 November 2017
Nowadays, the main libraries and document archives are investing considerable effort in digitizing their collections. Indeed, most of them are scanning the documents and publishing the resulting images without their corresponding transcriptions, which seriously limits the document exploitation possibilities. When the transcription is necessary, it is performed manually by human experts, a very expensive and error-prone task. Obtaining transcriptions of the required quality demands the intervention of human experts to review and correct the output of the recognition engines; to this end, it is extremely useful to provide interactive tools to obtain and edit the transcription. Although text recognition is the final goal, several previous steps (known as preprocessing) are necessary in order to get a good transcription from a digitized image. Document cleaning, enhancement, and binarization (if needed) are the first stages of the recognition pipeline. Historical handwritten documents, in addition, show degradations, stains, ink bleed-through, and other artifacts, so more sophisticated and elaborate methods are required when dealing with this kind of document, and in some cases even expert supervision is needed. Once images have been cleaned, the main zones of the image have to be detected: those that contain text, and other parts such as images, decorations, and versal letters. Moreover, the relations among them and with the final text have to be detected. These preprocessing steps are critical for the final performance of the system, since an error at this point will be propagated through the rest of the transcription process. The ultimate goal of the document image analysis pipeline is to obtain the transcription of the text (Optical Character Recognition and Handwritten Text Recognition). In this thesis we aimed to improve the main stages of the recognition pipeline, from the scanned document as input to the final transcription. We focused our effort on applying neural networks and deep learning techniques directly to the document images to extract suitable features for the different tasks addressed in this work: image cleaning and enhancement (document image binarization), layout extraction, text line extraction, text line normalization, and finally decoding (text line recognition). As one can see, this work focuses on incremental improvements across the several document image analysis stages, but also deals with some of the real challenges: historical manuscripts, documents without clear layouts, and very degraded documents. Neural networks are a central topic of the work collected in this document. Different convolutional models have been applied for document image cleaning and enhancement. Connectionist models have been used as well for text line extraction: first, for detecting interest points, combining them into text segments, and finally extracting the lines by means of aggregation techniques; and second, for pixel labeling to extract the main body area of the text and then the limits of the lines. For text line preprocessing, i.e., normalizing the text lines before recognizing them, similar models have been used to detect the main body area and then height-normalize the images, giving more importance to the central area of the text. Finally, convolutional neural networks and deep multilayer perceptrons have been combined with hidden Markov models to improve our transcription engine significantly. The suitability of all these approaches has been tested on different corpora for each of the stages addressed, giving competitive results for most of the methodologies presented. / Pastor Pellicer, J. (2017). Neural Networks for Document Image and Text Processing [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/90443
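A minimal sketch, in PyTorch rather than the thesis's actual framework, of the kind of convolutional pixel-labeling model described above for document cleaning and binarization: a small fully-convolutional network maps a grayscale page patch to per-pixel ink probabilities. The architecture and sizes are illustrative assumptions, not the thesis configuration.

```python
import torch
import torch.nn as nn

model = nn.Sequential(                  # input: (N, 1, H, W) grayscale patch
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=1),    # per-pixel logit
    nn.Sigmoid(),                       # probability that the pixel is ink
)

patch = torch.rand(1, 1, 64, 64)        # stand-in for a scanned-page patch
prob = model(patch)                     # (1, 1, 64, 64) ink probabilities
binarized = (prob > 0.5).float()        # threshold to a clean bilevel image
print(binarized.shape)
```

Because every layer is convolutional, the same network applies to pages of any size, and the same pixel-labeling scheme extends naturally to the text-body and line-limit detection tasks mentioned in the abstract.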
