Global ETD Search

1	Image description and image comparison Mortimer, Victoria G. January 2003 No description available. 006.424
2	Agent-based sketch recognition Mackenzie, Graham January 2003 No description available. 006.424
3	Generic named entity extraction Jara-Valencia, José Luis January 2005 This thesis proposes and evaluates different ways of performing generic named entity recognition, that is, the construction of a system capable of recognising names in free text which is not specific to any particular domain or task. The starting point is an implementation of a well-known baseline system based on maximum entropy models that use lexically-oriented features to recognise names in text. Although this system achieves good levels of performance, both maximum entropy models and lexically-oriented features have their limitations. Three alternative ways in which this system can be extended to overcome these limitations are then studied: (1) more linguistically-oriented features are extracted from a generic lexical source, namely WordNet®, and added to the pool of features of the maximum entropy model; (2) the maximum entropy model is biased towards training samples that are similar to the piece of text being analysed; (3) a bootstrapping procedure is introduced to allow maximum entropy models to collect new, valuable information from unlabelled text. Results in this thesis indicate that the maximum entropy model is a very strong approach whose levels of performance are very hard to improve on. However, these results also suggest that these extensions of the baseline system could yield improvements, though some difficulties must be addressed and more research is needed to reach firmer conclusions. This thesis has nonetheless provided important contributions: a novel approach to estimating the complexity of a named entity extraction task, a method for selecting the features to be used by the maximum entropy model from a large pool of features, and a novel procedure to bootstrap maximum entropy models. 006.424
4	Unsupervised detection of anomalous text Guthrie, David January 2008 This thesis describes work on the detection of anomalous material in text without the use of training data. We use the term anomalous to refer to text that is irregular, or deviates significantly from its surrounding context. In this thesis we show that identifying such abnormalities in text can be viewed as a type of outlier detection, because these anomalies will differ significantly from the writing style in the majority of the text. We consider segments of text which are anomalous with respect to topic (about a different subject), author (written by a different person), or genre (written for a different audience or from a different source) and experiment with whether it is possible to identify these anomalous segments automatically. Five different innovative approaches to this problem are introduced and assessed using many experiments over large document collections, created to contain randomly inserted anomalous segments. In order to identify anomalies in text successfully, we investigate and evaluate 166 stylistic and linguistic features used to characterize writing, some of which are well-established stylistic determiners, but many of which are original. Using these features with each of our methods, we examine the effect of segment size on our ability to detect anomalies, allowing segments of 100 words, 500 words and 1000 words. We show substantial improvements over a baseline in all cases for all methods, identify a novel method which performs consistently better than the others, and report the features that contribute most to unsupervised anomaly detection. 006.424
5	Offline printed Arabic character recognition AbdelRaouf, Ashraf M. January 2012 Optical Character Recognition (OCR) shows great potential for rapid data entry, but has had limited success when applied to the Arabic language. Normal OCR problems are compounded by the right-to-left nature of Arabic and because the script is largely connected. This research investigates current approaches to the Arabic character recognition problem and proposes a new approach. The main work involves a Haar-Cascade Classifier (HCC) approach adapted for the first time for Arabic character recognition. This technique eliminates the problematic steps in the pre-processing and recognition phases, in addition to the character segmentation stage. A classifier was produced for each of the 61 Arabic glyphs that exist after the removal of diacritical marks. These 61 classifiers were trained and tested on an average of about 2,000 images each. A Multi-Modal Arabic Corpus (MMAC) has also been developed to support this work. MMAC makes innovative use of the new concept of connected segments of Arabic words (PAWs), with and without diacritical marks. These new tokens have significance for linguistic as well as OCR research and applications, and have been applied here in the post-processing phase. A complete Arabic OCR application has been developed to manipulate the scanned images and extract a list of detected words. It consists of the HCC to extract glyphs, systems for parsing and correcting these glyphs, and the MMAC to apply linguistic constraints. The HCC produces a recognition rate for Arabic glyphs of 87%. MMAC is based on 6 million words, is published on the web, and has been applied and validated in both research and commercial use. 006.424
6	Design and implementation of a system for the automatic recognition of application forms and of the characters in their handwritten fields Λιόλιος, Νικόλαος 17 September 2009 - / - 006.424 Computers Digital signal processing Optical pattern recognition
7	Writer-independent system for automatic document processing and recognition of cursive handwritten characters Καβαλλιεράτου, Εργίνα 17 September 2009 - / - 006.424 Computers Optical pattern recognition Digital signal processing
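
The outlier-detection framing described in entry 4 can be illustrated with a minimal sketch. This is not the method from that thesis: the four features below are illustrative stand-ins chosen for brevity (the abstract mentions 166 features without listing them), the scoring rule (mean absolute z-score across features) is an assumption, and the function names `stylistic_features`, `anomaly_scores` and `most_anomalous` are hypothetical.

```python
import re
import statistics


def stylistic_features(segment: str) -> list[float]:
    """Compute a few simple, illustrative style features for one text segment."""
    words = re.findall(r"[A-Za-z']+", segment)
    sentences = [s for s in re.split(r"[.!?]+", segment) if s.strip()]
    n_words = max(len(words), 1)  # avoid division by zero on empty input
    return [
        sum(len(w) for w in words) / n_words,          # mean word length
        n_words / max(len(sentences), 1),              # mean sentence length in words
        len({w.lower() for w in words}) / n_words,     # type-token ratio
        sum(ch in ",;:" for ch in segment) / n_words,  # commas/semicolons/colons per word
    ]


def anomaly_scores(segments: list[str]) -> list[float]:
    """Score each segment by its mean absolute z-score across all features.

    No training data is used: each segment is compared only with the
    other segments of the same collection.
    """
    rows = [stylistic_features(s) for s in segments]
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # Fall back to 1.0 when a feature is constant, so it contributes zero.
    stdevs = [statistics.pstdev(c) or 1.0 for c in columns]
    return [
        sum(abs(x - m) / sd for x, m, sd in zip(row, means, stdevs)) / len(row)
        for row in rows
    ]


def most_anomalous(segments: list[str]) -> int:
    """Return the index of the segment that deviates most from the rest."""
    scores = anomaly_scores(segments)
    return max(range(len(segments)), key=scores.__getitem__)
```

Calling `most_anomalous` on a list of segments returns the position of the segment whose style differs most from the rest of the collection. A real system would need far richer features and a more robust notion of distance than this sketch uses.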
