The focus of this thesis is the definition of a complete framework for texture-based annotation and retrieval. This framework is centred on the concept of "texture codes", so called because they encode the relative energy levels of Gabor filter responses. These codes are pixel-based descriptors that are robust to illumination variations, can be generated efficiently, and can be included in a fast retrieval process. They can act as local or global descriptors, and can be used in the representations of regions or objects. Our framework is therefore capable of supporting a wide range of queries and applications. During our research, we have been able to utilise the results of psychological studies on the perception of similarity and have explored non-metric similarity scores. As a result, we have found that similarity can be evaluated with simple measures that rely predominantly on the information extracted from the query, without a drastic loss in retrieval performance. We have been able to show that the simplest possible measure, counting the number of codes common to the query and a stored image, can for some algorithmic parameters outperform well-proven benchmarks. Importantly, our measures can all support partial comparisons, so that region-based queries can be answered without the need for segmentation. We have investigated refinements of the framework which endow it with the ability to localise queries in candidate images and to deal with user relevance feedback. The final framework can generate good and fast retrieval results, as demonstrated with a database of 3723 images, and can therefore be useful as a stand-alone system. The framework has also been applied to the problem of high-level annotation. In particular, it has been used as a cue detector, where a cue is a visual example of a particular concept such as a type of sport.
The detection results show that the system can predict the correct cue among a small set of cues, and can therefore provide useful information to an engine fusing the outputs of several cue detectors. An important aspect of this framework is therefore that it is expected to be an asset within a multi-cue annotation and/or retrieval system.
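As an illustration of the simplest measure mentioned in the abstract (counting the codes common to the query and a stored image), a minimal sketch follows. It is not the thesis's implementation: it assumes that each texture code is already available as an integer, that "common codes" means distinct codes present in both inputs, and the names `common_code_count` and `rank_images` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def common_code_count(query_codes: Iterable[int], image_codes: Iterable[int]) -> int:
    """Count the distinct codes that occur in both the query and the stored image.

    Assumption: codes are plain integers and duplicates are ignored.
    """
    return len(set(query_codes) & set(image_codes))


def rank_images(
    query_codes: Iterable[int], database: Dict[str, List[int]]
) -> List[Tuple[str, int]]:
    """Return (image name, score) pairs sorted by descending common-code count."""
    query = list(query_codes)  # materialise once so any iterable can be reused
    scores = [(name, common_code_count(query, codes)) for name, codes in database.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Because the score depends only on which query codes reappear in the stored image, the query may equally be a region of an image rather than a whole image, which is one way to read the abstract's point about partial comparisons without segmentation.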
Identifier | oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:343488 |
Date | January 2001 |
Creators | Levienaise-Obadia, B. |
Publisher | University of Surrey |
Source Sets | Ethos UK |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Source | http://epubs.surrey.ac.uk/842806/ |