Return to search

Semantic image understanding : from pixel to word

The aim of semantic image understanding is to reveal the semantic meaning behind the image pixel. This thesis investigates problems related to semantic image understanding, and have made the following contributions. Our first contribution is to propose the usage of histogram matching in Multiple Kernel Learning. We treat the two-dimensional kernel matrix as an image and transfer the histogram matching algorithm in image processing to kernel matrix. Experiments on various computer vision and machine learning datasets have shown that our method can always boost the performance of state of the art MKL methods. Our second contribution is to advocate the segment-then-recognize strategy in pixel-level semantic image understanding. We have developed a new framework which tries to integrate semantic segmentation with low-level segmentation for proposing object consistent regions. We have also developed a novel method trying to integrate semantic segmentation with interactive segmentation. We found this segment-then-recognize strategy also works well on medical image data, where we designed a novel polar space random field model for proposing gland-like regions. In the realm of image-level semantic image understanding, our contribution is a novel way to utilize the random forest. Most of the previous works utilizing random forest store the posterior probabilities at each leaf node, and each random tree in the random forest is considered to be independent from each other. In contrast, we store the training samples instead of the posterior probabilities at each leaf node. We consider the random forest as a whole and propose the concept of semantic nearest neighbor and semantic similarity measure. Based on these two concepts, we devise novel methods for image annotation and image retrieval tasks.

Identiferoai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:570344
Date January 2012
CreatorsFu, Hao
PublisherUniversity of Nottingham
Source SetsEthos UK
Detected LanguageEnglish
TypeElectronic Thesis or Dissertation
Sourcehttp://eprints.nottingham.ac.uk/12847/

Page generated in 0.0087 seconds