In this thesis, we address some of the challenges involved in developing a robust, writer-independent, lexicon-free system to recognize online handwritten Tamil words. Tamil, being a Dravidian language, is morphologically rich and agglutinative, and thus does not have a finite lexicon. For example, a single verb root can easily lead to hundreds of words after morphological changes and agglutination. Further, a lexicon-free recognition approach suits form-filling applications, where a lexicon becomes cumbersome, if not impossible, to build because it would have to capture all possible names. Under such circumstances, one must necessarily explore the possibility of segmenting a Tamil word into its individual symbols.
The modern Tamil alphabet comprises 23 consonants and 11 vowels, which combine to form a total of 313 characters (aksharas). A minimal set of 155 distinct symbols has been derived to recognize these characters. A corpus of isolated Tamil symbols (IWFHR database) is used for deriving the various statistics proposed in this work. To address the challenges of segmentation and recognition (the primary focus of the thesis), Tamil words are collected using a custom application running on a tablet PC. A set of 10000 words (comprising 53246 symbols) has been collected from high school students and used for the experiments in this thesis. We refer to this database as the ‘MILE word database’.
In the first part of the work, a feedback-based word segmentation mechanism is proposed. Initially, the Tamil word is segmented based on a bounding box overlap criterion. This dominant overlap criterion segmentation (DOCS) generates a set of candidate stroke groups. Thereafter, attention is paid to certain attributes of the resulting stroke groups to detect any possible splits or under-segmentations. A decision on whether to correct a stroke group is taken by relying on feedback provided by (i) a priori knowledge of attributes such as the number of dominant points and inter-stroke displacements, (ii) the recognition label and likelihood of the primary SVM classifier, and (iii) linguistic knowledge on the detected stroke groups. Accordingly, we call the proposed segmentation ‘attention feedback segmentation’ (AFS). Across the words in the MILE word database, a segmentation rate of 99.7% is achieved at symbol level with AFS. The high segmentation rate (with feedback) in turn improves the symbol recognition rate of the primary SVM classifier from 83.9% (with DOCS alone) to 88.4%.
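The initial grouping of strokes by bounding box overlap can be illustrated with a minimal sketch. The abstract does not state the exact overlap measure or threshold used in DOCS, so the horizontal overlap ratio, the 0.5 threshold and the function names below are assumptions made purely for illustration.

```python
from typing import List, Tuple

Stroke = List[Tuple[float, float]]  # one pen-down trace as (x, y) points


def x_extent(stroke: Stroke) -> Tuple[float, float]:
    """Horizontal extent (x_min, x_max) of a stroke's bounding box."""
    xs = [p[0] for p in stroke]
    return min(xs), max(xs)


def overlap_ratio(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Horizontal overlap divided by the width of the narrower extent (assumed measure)."""
    inter = min(a[1], b[1]) - max(a[0], b[0])
    if inter < 0:
        return 0.0  # the two extents are disjoint
    narrower = min(a[1] - a[0], b[1] - b[0])
    if narrower <= 0:
        return 1.0  # a dot or vertical stroke lying inside the other extent
    return inter / narrower


def group_strokes(strokes: List[Stroke], threshold: float = 0.5) -> List[List[int]]:
    """Merge each stroke into the current group when the overlap is dominant.

    The threshold is a hypothetical value, not one taken from the thesis.
    """
    groups: List[List[int]] = []
    extents: List[Tuple[float, float]] = []
    for index, stroke in enumerate(strokes):
        ext = x_extent(stroke)
        if groups and overlap_ratio(extents[-1], ext) >= threshold:
            groups[-1].append(index)
            # Grow the group's extent to cover the newly merged stroke.
            extents[-1] = (min(extents[-1][0], ext[0]), max(extents[-1][1], ext[1]))
        else:
            groups.append([index])
            extents.append(ext)
    return groups


word = [[(0, 0), (10, 5)], [(4, 0), (8, 3)], [(12, 0), (20, 4)]]
candidate_groups = group_strokes(word)
```

In this toy word, the second stroke lies inside the horizontal extent of the first and joins its group, while the third starts a new group. Candidate groups of this kind are what a feedback stage would then inspect for splits or under-segmentations.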
For addressing the problem of segmentation, the SVM classifier fed with the x-y trace of the normalized and resampled online stroke groups is quite effective. However, the classifier is not robust enough to distinguish reliably between many sets of similar-looking symbols. To improve the symbol recognition performance, we explore two approaches, namely reevaluation strategies and language models.
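As a rough illustration of what an x-y trace of a normalized and resampled stroke group might look like, the sketch below scales a trace into a unit box and resamples it to a fixed number of points equally spaced along its arc length. The abstract does not give the normalization scheme or the number of resampled points; both, along with the function names, are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def normalize(points: List[Point]) -> List[Point]:
    """Shift to the origin and scale the larger side to 1, preserving aspect ratio."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    scale = max(max(xs) - x0, max(ys) - y0) or 1.0  # avoid dividing by zero for a dot
    return [((x - x0) / scale, (y - y0) / scale) for x, y in points]


def resample(points: List[Point], n: int) -> List[Point]:
    """Return n points equally spaced along the arc length of the trace."""
    cum = [0.0]  # cumulative arc length at each input point
    for (xa, ya), (xb, yb) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(xb - xa, yb - ya))
    total = cum[-1]
    if total == 0.0 or n == 1:
        return [points[0]] * n
    out: List[Point] = []
    j = 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the segment that contains the target arc length.
        while j < len(cum) - 2 and cum[j + 1] < target:
            j += 1
        seg = cum[j + 1] - cum[j]
        t = 0.0 if seg == 0 else (target - cum[j]) / seg
        (xa, ya), (xb, yb) = points[j], points[j + 1]
        out.append((xa + t * (xb - xa), ya + t * (yb - ya)))
    return out


def feature_vector(points: List[Point], n: int = 8) -> List[float]:
    """Flatten the normalized, resampled trace into [x1, y1, x2, y2, ...]."""
    return [c for p in resample(normalize(points), n) for c in p]


features = feature_vector([(2, 2), (6, 2), (6, 6)], n=3)
```

A fixed-length vector of this form is what a classifier such as an SVM can consume directly, regardless of how many raw points the pen device recorded.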
The reevaluation techniques, in particular, resolve the ambiguities in base consonants, pure consonants and vowel modifiers to a considerable extent. For the frequently confused sets (derived from the confusion matrix), a dynamic time warping (DTW) approach is proposed to automatically extract their discriminative regions. For each confusion set, novel localized cues are derived from the discriminative region for disambiguation. The proposed features are quite promising in improving the symbol recognition performance on the confusion sets. A comparative experimental analysis of these features against x-y coordinates is performed to judge their discriminative power. The confusions are resolved with expert networks, each comprising a discriminative region extractor, a feature extractor and an SVM. The proposed techniques improve the symbol recognition rate by 3.5% (from 88.4% to 91.9%) on the MILE word database over the primary SVM classifier.
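To make the role of DTW concrete, the sketch below aligns two point sequences with textbook DTW and flags the points of the first sequence whose aligned distance is well above the average along the warping path. The thesis's actual rule for extracting discriminative regions is not described in the abstract; the above-average criterion, the factor of 1.5 and the function names are assumptions standing in for it.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def dtw(a: List[Point], b: List[Point]) -> Tuple[float, List[Tuple[int, int]]]:
    """Classic DTW: return the total alignment cost and the warping path."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor cell.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def discriminative_indices(a: List[Point], b: List[Point], factor: float = 1.5) -> List[int]:
    """Indices of `a` whose aligned distance exceeds factor * mean (assumed criterion)."""
    _, path = dtw(a, b)
    dists = [math.dist(a[i], b[j]) for i, j in path]
    mean = sum(dists) / len(dists)
    return sorted({i for (i, _), d in zip(path, dists) if d > factor * mean})


symbol_a = [(0, 0), (1, 0), (2, 0), (3, 0)]
symbol_b = [(0, 0), (1, 0), (2, 0), (3, 2)]
total_cost, warping_path = dtw(symbol_a, symbol_b)
region = discriminative_indices(symbol_a, symbol_b)
```

Here the two toy traces differ only at their final point, so only that index is flagged. Localized features computed over such a region, rather than over the whole trace, are the kind of cue a dedicated expert classifier could use to separate a confusion pair.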
In the final part of the thesis, we integrate linguistic knowledge (derived from a text corpus) into the primary recognition system. The biclass, bigram and unigram language models at symbol level are compared in terms of recognition performance. Among the three models, the bigram model is shown to give the highest recognition accuracy. A class reduction approach for recognition is adopted by incorporating the bigram language model at the akshara level. Lastly, a judicious combination of reevaluation techniques with language models is proposed. Overall, an improvement of up to 4.7% (from 88.4% to 93.1%) in symbol-level accuracy is achieved.
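One common way to combine classifier likelihoods with a symbol-level bigram model is a Viterbi search over the candidate labels at each position, sketched below. This is a generic illustration and not necessarily the integration scheme used in the thesis; the label names, probabilities and the smoothing floor are invented toy values.

```python
import math
from typing import Dict, List, Tuple

Bigram = Dict[Tuple[str, str], float]


def viterbi(candidates: List[Dict[str, float]], bigram: Bigram, floor: float = 1e-6) -> List[str]:
    """Pick the label sequence maximizing classifier likelihood times bigram probability.

    candidates[t] maps each candidate label at position t to its classifier
    likelihood (must be > 0). Unseen bigrams fall back to `floor` (an assumed
    smoothing value). No start-of-word prior is used in this sketch.
    """
    best = {label: (math.log(p), [label]) for label, p in candidates[0].items()}
    for position in candidates[1:]:
        updated = {}
        for label, p in position.items():
            # Best predecessor for this label under the bigram model.
            score, path = max(
                (prev_score + math.log(bigram.get((prev, label), floor)), prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            updated[label] = (score + math.log(p), path + [label])
        best = updated
    return max(best.values())[1]


# Toy example: the classifier alone would pick "ka" then "la".
toy_candidates = [{"ka": 0.6, "ta": 0.4}, {"la": 0.55, "na": 0.45}]
toy_bigram = {("ka", "na"): 0.5, ("ka", "la"): 0.01, ("ta", "la"): 0.02, ("ta", "na"): 0.02}
decoded = viterbi(toy_candidates, toy_bigram)
```

In the toy data the second symbol is nearly a coin toss for the classifier, and the bigram probabilities tip the decision toward the sequence that is more plausible in text.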
The writer-independent and lexicon-free segmentation-recognition approach developed in this thesis for online handwritten Tamil word recognition is promising. The best performance of 93.1% (achieved at symbol level) is comparable to the highest reported accuracy in the literature for Tamil symbols. However, the latter is on a database of isolated symbols (IWFHR competition test dataset), whereas our accuracy is on a database of 10000 words and is thus a product of segmentation and classifier accuracies. The recognition performance may be enhanced further by experimenting with and choosing the best set of features and classifiers. Also, the word recognition performance can be improved very significantly by using a lexicon. However, these issues are not addressed by the thesis. We hope that the lexicon-free experiments reported in this work will serve as a benchmark for future efforts.
Identifier | oai:union.ndltd.org:IISc/oai:etd.ncsi.iisc.ernet.in:2005/2363 |
Date | 12 1900 |
Creators | Sundaram, Suresh |
Contributors | Ramakrishnan, A G |
Source Sets | India Institute of Science |
Language | en_US |
Detected Language | English |
Type | Thesis |
Relation | G24982 |