Traditional content-based image indexing aims to develop algorithms that can analyze and index images based on their visual content. A typical approach is to measure image attributes, such as colors or textures, and save the results in image descriptors, which can then be used in recognition and retrieval applications. Two topics within content-based image indexing are addressed in this thesis: emotion-based image indexing and font recognition. The main contribution is the inclusion of high-level semantics in the indexing of multi-colored images. We focus on color emotions and color harmony, and introduce novel emotion- and harmony-based image descriptors, including global emotion histograms, a bag-of-emotions descriptor, an image harmony descriptor, and an indexing method based on Kobayashi's Color Image Scale. The first three are based on models from color science that analyze the emotional properties of single colors or color combinations. A majority of the descriptors are evaluated in psychophysical experiments. The results indicate that observers perceive color emotions and color harmony for multi-colored images in similar ways, and that observer judgments correlate with values obtained from the presented descriptors. The usefulness of the descriptors is illustrated in large-scale image classification experiments involving emotion-related image categories, where the presented descriptors are compared with standard global and local descriptors within this field of research. We also investigate whether these descriptors can predict the popularity of images. Three image databases are used in the experiments: one obtained from an image provider and two from a major image search service. The two from the search service were harvested from the Internet and contain image thumbnails together with keywords and user statistics. One of them is a traditional object database, whereas the other is a unique database focused on emotional image categories. A large part of the emotion database has been released to the research community.
The second contribution is visual font recognition. We implemented a font search engine capable of handling very large font databases. The input to the search engine is an image of a text line, and the output is the name of the font used when rendering the text. After pre-processing and segmentation of the input image, features are calculated for individual characters using eigenimages. The performance of the search engine is illustrated with a database containing more than 2700 fonts. A system for visualizing the entire font database is also presented. Both the font search engine and the descriptors related to emotions and harmony are implemented in publicly available search engines. The implementations are presented together with user statistics.
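As a rough illustration of the eigenimage idea mentioned for font recognition, the following sketch projects flattened character images onto a few principal components and matches a query character to its nearest neighbor in that feature space. It is not the thesis implementation: the function names (`fit_eigenimages`, `project`, `nearest_font`), the 3x3 toy glyphs, the font labels, and the use of power iteration with deflation are all assumptions made for this example.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_eigenimages(images, n_components=2, iterations=100, seed=0):
    """Estimate the mean image and the leading eigenimages.

    images: list of flattened character images (equal-length lists of floats).
    Uses power iteration with deflation, chosen here only to avoid
    third-party dependencies.
    """
    n = len(images)
    mean = [sum(col) / n for col in zip(*images)]
    residual = [[p - m for p, m in zip(img, mean)] for img in images]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.uniform(-1.0, 1.0) for _ in mean]
        norm = math.sqrt(dot(v, v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            # Multiply v by the scatter matrix X^T X without forming it.
            scores = [dot(r, v) for r in residual]
            w = [sum(s * r[j] for s, r in zip(scores, residual))
                 for j in range(len(mean))]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:  # no variance left in the residual
                v = None
                break
            v = [x / norm for x in w]
        if v is None:
            break
        components.append(v)
        # Deflate: remove the found direction from every residual image.
        residual = [[x - dot(r, v) * c for x, c in zip(r, v)]
                    for r in residual]
    return mean, components


def project(image, mean, components):
    """Feature vector: coordinates of the centered image on each eigenimage."""
    centered = [p - m for p, m in zip(image, mean)]
    return [dot(centered, c) for c in components]


def nearest_font(query_features, labelled_features):
    """Return the label whose feature vector is closest (Euclidean distance)."""
    def sq_dist(item):
        return sum((a - b) ** 2 for a, b in zip(query_features, item[1]))
    return min(labelled_features, key=sq_dist)[0]


# Hypothetical 3x3 glyphs standing in for one character rendered in three fonts.
glyphs = {
    "font_a": [1, 1, 1, 0, 1, 0, 0, 1, 0],
    "font_b": [1, 0, 0, 1, 0, 0, 1, 1, 1],
    "font_c": [0, 1, 0, 1, 1, 1, 0, 1, 0],
}
mean, components = fit_eigenimages(list(glyphs.values()), n_components=2)
database = [(name, project(img, mean, components)) for name, img in glyphs.items()]

# A slightly noisy version of the "font_a" glyph as the query character.
query = [0.9, 1.0, 0.8, 0.0, 1.0, 0.1, 0.0, 0.9, 0.0]
print(nearest_font(project(query, mean, components), database))
```

In a realistic setting each character would be a much larger image and the database would hold features for every character of every font. The low-dimensional projection is what keeps comparison against thousands of fonts cheap.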
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-64591 |
Date | January 2011 |
Creators | Solli, Martin |
Publisher | Linköpings universitet, Medie- och Informationsteknik, Linköpings universitet, Tekniska högskolan, Linköping : Linköping University Electronic Press |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Doctoral thesis, monograph, info:eu-repo/semantics/doctoralThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | Linköping Studies in Science and Technology. Dissertations, 0345-7524 ; 1362 |