
Document image retrieval with improvements in database quality

Abstract
Modern technology has made it possible to produce, process,
transmit and store digital images efficiently. Consequently, the
amount of visual information is increasing at an accelerating rate
in many diverse application areas. To fully exploit this, new content-based
image retrieval techniques are required. Document image retrieval
systems can be utilized in many organizations which are using document
image databases extensively.

This thesis presents document image retrieval techniques and
new approaches to improve database content. The goal of the thesis
is to develop a functional retrieval system and to demonstrate that
better retrieval results can be achieved with the proposed database generation
methods.

A retrieval system architecture, a document data model, and
tools for querying document image databases are introduced. The
retrieval framework presented allows users to interactively define,
construct and combine queries using document or image properties: physical
(structural), semantic, textual and visual image content. A technique
for combining primitive features like color, shape and texture into
composite features is presented. A novel search base reduction technique
which uses structural and content properties of documents is proposed
for speeding up the query process.
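
The two ideas in the paragraph above, reducing the search base with cheap document properties and then ranking by a composite of primitive features, can be illustrated with a minimal sketch. This is not the thesis's implementation: the `DocImage` fields, the filter arguments, and the weighted sum of Euclidean distances are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Optional


@dataclass
class DocImage:
    """Hypothetical database object; field names are illustrative only."""
    name: str
    doc_type: str          # assumed structural/content property
    has_text: bool         # assumed content property
    color: List[float]     # primitive feature vector
    texture: List[float]   # primitive feature vector


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reduce_search_base(db, doc_type: Optional[str] = None,
                       has_text: Optional[bool] = None):
    """Drop candidates whose cheap properties cannot match the query."""
    return [d for d in db
            if (doc_type is None or d.doc_type == doc_type)
            and (has_text is None or d.has_text == has_text)]


def composite_distance(query, doc, weights):
    """Combine primitive feature distances into one score (assumed: weighted sum)."""
    return (weights["color"] * euclidean(query.color, doc.color)
            + weights["texture"] * euclidean(query.texture, doc.texture))


def retrieve(query, db, weights, **filters):
    """Filter first, then rank only the remaining candidates."""
    candidates = reduce_search_base(db, **filters)
    return sorted(candidates,
                  key=lambda d: composite_distance(query, d, weights))
```

In this sketch the expensive feature comparison runs only on documents that survive the property filter, which is where a speed-up of the kind described would come from.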

A new model for database generation within the image retrieval
system is presented. An approach to automated document image defect
detection and management is proposed for building high-quality,
retrievable database objects. In image database population, image
feature profiles and their attributes are manipulated automatically
to better match with query requirements determined by the available
query methods, the application environment and the user.

Experiments were performed with multiple image databases containing
over one thousand images. They comprised a range of document and
scene images of different categories, properties and conditions.
The results show that better recall and accuracy for retrieval are
achieved with the proposed optimization techniques. The search base
reduction technique results in a considerable speed-up in overall
query processing. The constructed document image retrieval system
performs well in different retrieval scenarios and provides a consistent
basis for algorithm development. The proposed modular system structure and
interfaces facilitate its usage in a wide variety of document image
retrieval applications.

Identifier: oai:union.ndltd.org:oulo.fi/oai:oulu.fi:isbn951-42-5313-2
Date: 23 June 1999
Creators: Kauniskangas, H. (Hannu)
Publisher: University of Oulu
Source Sets: University of Oulu
Language: English
Detected Language: English
Type: info:eu-repo/semantics/doctoralThesis, info:eu-repo/semantics/publishedVersion
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess, © University of Oulu, 1999
Relation: info:eu-repo/semantics/altIdentifier/pissn/0355-3213, info:eu-repo/semantics/altIdentifier/eissn/1796-2226
