
Using lexical knowledge and parafoveal information for the recognition of common words and suffixes

Research over the past decade into the psychophysics of reading has demonstrated that information extracted from text falling on the parafoveal and peripheral regions of the retina is used by the human visual system to significantly increase reading speed. Recent results provide evidence that knowledge of word frequency is brought to bear in processing parafoveal data. There is other psychological evidence indicating the type of large-scale features used by the visual system to recognize isolated characters in parafoveal vision.
This thesis describes the design and implementation of a system able to recognize the most commonly occurring English words and suffixes from parafoveally available information by employing knowledge of their letter sequences and of large-scale features of lower-case characters. The Marr-Hildreth theory of edge detection provides a description of the information computed by the earliest stages of visual processing from parafoveal words. Large-scale features extracted from this description, while relatively invariant with respect to noise and font changes, are insufficient to uniquely identify most characters but are used to place each into one of several classes of similar characters.
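As a rough illustration of the Marr-Hildreth idea referred to here, the sketch below smooths a one-dimensional intensity profile with the second derivative of a Gaussian and reports the zero crossings of the response as edge locations. It is a simplified 1-D analogue written for this record, not the thesis's implementation, which operates on text images. The function names, the `sigma` value, and the edge-replication boundary handling are all assumptions.

```python
import math


def log_kernel_1d(sigma, radius=None):
    """Second derivative of a Gaussian: a 1-D analogue of the Laplacian of Gaussian."""
    if radius is None:
        radius = int(math.ceil(4 * sigma))  # assumed truncation at 4 sigma
    kernel = []
    for x in range(-radius, radius + 1):
        g = math.exp(-(x * x) / (2 * sigma * sigma))
        kernel.append((x * x / sigma ** 4 - 1 / sigma ** 2) * g)
    # Subtract the mean so the truncated kernel sums to zero;
    # regions of constant intensity then give (numerically) zero response.
    mean = sum(kernel) / len(kernel)
    return [k - mean for k in kernel]


def convolve_same(signal, kernel):
    """Convolve with edge replication so the output has the input's length."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = min(max(i + radius - j, 0), n - 1)  # clamp to the signal bounds
            acc += k * signal[idx]
        out.append(acc)
    return out


def zero_crossings(response, eps=1e-9):
    """Indices i where the response changes sign between samples i and i + 1."""
    return [
        i
        for i in range(len(response) - 1)
        if (response[i] > eps and response[i + 1] < -eps)
        or (response[i] < -eps and response[i + 1] > eps)
    ]


# A dark-to-light step between samples 19 and 20.
profile = [0.0] * 20 + [1.0] * 20
edges = zero_crossings(convolve_same(profile, log_kernel_1d(sigma=2.0)))
```

A larger `sigma` smooths more aggressively, which in this sketch trades edge localization for tolerance to noise.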
The sequence of these 'confusion classes' is found to place a strong constraint on word identity: of the 1000 most common words comprising the system's vocabulary, representing 70% of the volume of the Brown Corpus of printed English, 92% have mutually unique confusion class sequences. Word recognition is achieved by using the confusion class sequence as a key into the vocabulary, retrieving the word or words having the same sequence. Suffixes are recognized in a similar way.
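The keyed lookup described above can be sketched as a dictionary from confusion class sequences to vocabulary words. The class assignments below (ascenders, descenders, and so on) are hypothetical and chosen only for illustration; the abstract does not list the thesis's actual classes, and the toy vocabulary stands in for the 1000-word one.

```python
from collections import defaultdict

# HYPOTHETICAL confusion classes, not the ones used in the thesis.
CONFUSION_CLASSES = {
    "A": "bdfhklt",     # letters with ascenders
    "D": "gjpqy",       # letters with descenders
    "R": "aceosuvwxz",  # remaining x-height letters
    "N": "mnr",         # x-height letters with a left vertical stroke
    "I": "i",           # dotted stem
}

# Invert the table: letter -> class label.
LETTER_TO_CLASS = {
    letter: label
    for label, letters in CONFUSION_CLASSES.items()
    for letter in letters
}


def class_sequence(word):
    """Map a lower-case word to its confusion class sequence, e.g. 'the' -> 'AAR'."""
    return "".join(LETTER_TO_CLASS[ch] for ch in word)


def build_index(vocabulary):
    """Group vocabulary words by their confusion class sequence."""
    index = defaultdict(list)
    for word in vocabulary:
        index[class_sequence(word)].append(word)
    return dict(index)


def recognize(sequence, index):
    """Return every vocabulary word sharing the given class sequence."""
    return index.get(sequence, [])


def unique_fraction(index):
    """Fraction of vocabulary words whose class sequence is shared with no other word."""
    total = sum(len(words) for words in index.values())
    unique = sum(1 for words in index.values() if len(words) == 1)
    return unique / total


# Toy vocabulary for illustration only.
index = build_index(["the", "and", "for", "but", "not", "that", "had"])
```

With these assumed classes, `recognize(class_sequence("the"), index)` retrieves a single word, while "but" and "had" share the key `ARA` and are returned together, which is the ambiguous case the abstract's "word or words" wording covers.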
Results are presented demonstrating the system's ability to identify words and suffixes in text images over a range of simulated parafoveal eccentricities and in two different fonts, one with serifs and one without. Smoothing by the Marr-Hildreth operator, the simplicity and scale of the features, the size of the character classes, and the context provided by the character sequence give the system a degree of robustness.
Science, Faculty of / Computer Science, Department of / Graduate

Identifier: oai:union.ndltd.org:UBC/oai:circle.library.ubc.ca:2429/26520
Date: January 1987
Creators: Rhone, Brock William
Publisher: University of British Columbia
Source Sets: University of British Columbia
Language: English
Detected Language: English
Type: Text, Thesis/Dissertation
Rights: For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
