This thesis describes the design and implementation of an innovative drawing interpretation tool for the sight-impaired. The move towards Graphical User Interfaces in today's computer era presents many challenges to those with impaired vision. Although there are many tools to aid this group in reading and writing text-based electronic documents, few software packages are available to help the sight-impaired interpret electronic images. This new tool, known as ImagePilot 2.0, processes electronic image files that contain line drawings and produces audio feedback to guide the user through the drawing. ImagePilot 2.0 receives input through a pointing device, thereby providing the user with a means of examining a drawing interactively, and serving as an aid for recognizing the outlines of familiar shapes and objects.
In this study, a "line drawing" is a simple image that contains line segments and curves, without any shading or colors. It can be represented as a 2-dimensional array of picture elements (pixels), in which each pixel is either black or white. The drawing is represented by a pattern of black (foreground) pixels against a white background. Typically, groups of connected foreground pixels represent segments or curves that are associated with a single object. ImagePilot 2.0 makes it possible for a sight-impaired user to interrogate such an image.
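The representation described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the thesis code: the class name `LineDrawing`, the methods `parse` and `countRegions`, and the choice of 8-connectivity are all hypothetical, and the sketch uses modern Java collections rather than the Java 1.1.7 API targeted by the tool.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not taken from ImagePilot: a binary image as boolean[][],
// where true marks a black (foreground) pixel and false a white (background) pixel.
final class LineDrawing {

    /** Builds a binary image from text rows: '#' is foreground, anything else is background. */
    public static boolean[][] parse(String... rows) {
        boolean[][] img = new boolean[rows.length][];
        for (int y = 0; y < rows.length; y++) {
            img[y] = new boolean[rows[y].length()];
            for (int x = 0; x < rows[y].length(); x++) {
                img[y][x] = rows[y].charAt(x) == '#';
            }
        }
        return img;
    }

    /** Counts groups of 8-connected foreground pixels with an iterative flood fill. */
    public static int countRegions(boolean[][] img) {
        int h = img.length;
        boolean[][] seen = new boolean[h][];
        for (int y = 0; y < h; y++) {
            seen[y] = new boolean[img[y].length];
        }
        int regions = 0;
        Deque<int[]> stack = new ArrayDeque<>();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < img[y].length; x++) {
                if (!img[y][x] || seen[y][x]) continue;
                regions++;                       // found an unvisited region
                seen[y][x] = true;
                stack.push(new int[] {y, x});
                while (!stack.isEmpty()) {
                    int[] p = stack.pop();
                    for (int dy = -1; dy <= 1; dy++) {
                        for (int dx = -1; dx <= 1; dx++) {
                            int ny = p[0] + dy, nx = p[1] + dx;
                            if (ny < 0 || ny >= h || nx < 0 || nx >= img[ny].length) continue;
                            if (img[ny][nx] && !seen[ny][nx]) {
                                seen[ny][nx] = true;
                                stack.push(new int[] {ny, nx});
                            }
                        }
                    }
                }
            }
        }
        return regions;
    }

    public static void main(String[] args) {
        boolean[][] img = parse("#..#",
                                ".#.#",
                                "...#");
        System.out.println(countRegions(img)); // prints 2
    }
}
```

In the example, the two pixels on the diagonal form one region and the vertical stroke on the right forms another.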
Line drawings can be stored in many different electronic formats. ImagePilot 2.0 supports valid Graphics Interchange Format (GIF) files. If the foreground regions in the file are more than one pixel wide, a Zhang-Suen thinning algorithm is applied to thin the drawing. The tool identifies the separate regions in the drawing and decides on the best starting point for each region. Once a starting point is chosen, the drawing is processed using a modified chain-coding algorithm.
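For readers unfamiliar with the thinning step, the following is a minimal sketch of the standard Zhang-Suen algorithm as it is usually stated in the literature. It is not the thesis implementation: the class name `ZhangSuen`, the method `thin`, the `boolean[][]` image type, and the treatment of out-of-bounds pixels as background are assumptions made for illustration. The modified chain-coding step is not shown, because the abstract does not describe the modification.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of textbook Zhang-Suen thinning; true = foreground pixel.
final class ZhangSuen {
    // Offsets {dy, dx} of neighbours P2..P9, clockwise starting from north.
    private static final int[][] N = {
        {-1, 0}, {-1, 1}, {0, 1}, {1, 1}, {1, 0}, {1, -1}, {0, -1}, {-1, -1}
    };

    /** Returns a thinned copy of src; the input array is left unchanged. */
    public static boolean[][] thin(boolean[][] src) {
        int h = src.length;
        boolean[][] img = new boolean[h][];
        for (int y = 0; y < h; y++) img[y] = src[y].clone();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int pass = 0; pass < 2; pass++) {
                // Collect first, then clear, so each sub-iteration sees one consistent image.
                List<int[]> toClear = new ArrayList<>();
                for (int y = 0; y < h; y++) {
                    for (int x = 0; x < img[y].length; x++) {
                        if (img[y][x] && removable(img, y, x, pass)) {
                            toClear.add(new int[] {y, x});
                        }
                    }
                }
                for (int[] p : toClear) img[p[0]][p[1]] = false;
                changed |= !toClear.isEmpty();
            }
        }
        return img;
    }

    // Pixels outside the image are treated as background (an assumption of this sketch).
    private static boolean at(boolean[][] img, int y, int x) {
        return y >= 0 && y < img.length && x >= 0 && x < img[y].length && img[y][x];
    }

    private static boolean removable(boolean[][] img, int y, int x, int pass) {
        boolean[] p = new boolean[8];
        int b = 0;                               // B: number of foreground neighbours
        for (int i = 0; i < 8; i++) {
            p[i] = at(img, y + N[i][0], x + N[i][1]);
            if (p[i]) b++;
        }
        int a = 0;                               // A: background-to-foreground transitions around the ring
        for (int i = 0; i < 8; i++) {
            if (!p[i] && p[(i + 1) % 8]) a++;
        }
        if (b < 2 || b > 6 || a != 1) return false;
        boolean p2 = p[0], p4 = p[2], p6 = p[4], p8 = p[6];
        return pass == 0
            ? !(p2 && p4 && p6) && !(p4 && p6 && p8)   // first sub-iteration
            : !(p2 && p4 && p8) && !(p2 && p6 && p8);  // second sub-iteration
    }

    public static void main(String[] args) {
        boolean[][] bar = new boolean[3][7];
        for (boolean[] row : bar) java.util.Arrays.fill(row, true);
        for (boolean[] row : thin(bar)) {
            StringBuilder sb = new StringBuilder();
            for (boolean v : row) sb.append(v ? '#' : '.');
            System.out.println(sb);
        }
    }
}
```

Run on a solid bar three pixels high, the sketch leaves a one-pixel-wide run on the middle row, while a stroke that is already one pixel wide passes through unchanged.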
The audio feedback consists of two types of audio cues, verbal and tone, along with stereo playback. Verbal feedback guides the user with a set of verbal cues played through the speakers. The tone feedback uses three tones representing above, level, and below the horizontal. The left and right speakers provide left and right directional information at each level.
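One way to picture the tone and stereo scheme is as a mapping from the direction of the next step along a segment to a pair of cues. The sketch below is purely illustrative: the names `DirectionCue`, `Tone`, `Pan`, and `forStep` are hypothetical, and the `CENTER` value for purely vertical steps is an assumption, since the abstract does not say how that case is handled.

```java
// Hypothetical mapping from a pixel step to audio cues; not the thesis API.
final class DirectionCue {
    enum Tone { ABOVE, LEVEL, BELOW }   // the three tones relative to the horizontal
    enum Pan { LEFT, CENTER, RIGHT }    // which speaker carries the cue (CENTER is assumed)

    final Tone tone;
    final Pan pan;

    DirectionCue(Tone tone, Pan pan) {
        this.tone = tone;
        this.pan = pan;
    }

    /**
     * Maps a step (dx, dy) in image coordinates, with x growing to the right
     * and y growing downward, to a tone and a stereo position.
     */
    public static DirectionCue forStep(int dx, int dy) {
        Tone tone = dy < 0 ? Tone.ABOVE : dy > 0 ? Tone.BELOW : Tone.LEVEL;
        Pan pan = dx < 0 ? Pan.LEFT : dx > 0 ? Pan.RIGHT : Pan.CENTER;
        return new DirectionCue(tone, pan);
    }

    @Override
    public String toString() {
        return tone + "/" + pan;
    }

    public static void main(String[] args) {
        System.out.println(forStep(1, -1)); // prints ABOVE/RIGHT
        System.out.println(forStep(-1, 0)); // prints LEVEL/LEFT
    }
}
```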
Speed is a critical factor in image analysis and interpretation applications. While the tool is receiving and processing input, it must also respond to the user within an acceptable amount of time. ImagePilot 2.0 uses multi-threading and multi-tasking techniques to improve performance. After design and implementation, two groups of people tested the tool. The tool demonstrated the ability to help the user find and trace the segments in the test drawings efficiently and with acceptable response time.
This tool is written in pure Java and complies with Sun's Java API specification 1.1.7, released in October 1998. The tool functions on systems with multimedia capabilities that have a Java Virtual Machine (JVM) and Java Media Framework (JMF) installed.
Master of Science
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/31331 |
Date | 02 March 2000 |
Creators | Valad, Farzad M. |
Contributors | Electrical and Computer Engineering, Abbott, A. Lynn, Conners, Richard W., Nadler, Morton, Schmoldt, Daniel L. |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Detected Language | English |
Type | Thesis |
Format | application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |
Relation | ImagePilot.pdf |