Recognizing describable attributes of textures and materials in the wild and clutter

Visual textures play an important role in image understanding because they are a key component of the semantics of many images. Furthermore, texture representations, which pool local image descriptors in an orderless manner, have had a tremendous impact on a wide range of computer vision problems, from texture recognition to object detection. In this thesis we make several contributions to the area of texture understanding. First, we add a new semantic dimension to texture recognition. Instead of focusing on instance or material recognition, we propose a human-interpretable vocabulary of texture attributes, inspired by studies in Cognitive Science, to describe common texture patterns. We also develop a corresponding dataset, the Describable Texture Dataset (DTD), for benchmarking. We show that these texture attributes produce intuitive descriptions of textures. We also show that they can be used to extract a very low-dimensional representation of any texture that is very effective in other texture analysis tasks, including improving the state of the art in material recognition on the most challenging datasets available today. Second, we look at the problem of recognizing texture attributes and materials in realistic, uncontrolled imaging conditions, including when textures appear in clutter. We build on top of the recently proposed Open Surfaces dataset, introduced by the graphics community, by deriving corresponding benchmarks for material recognition. In addition to material labels, we also augment a subset of Open Surfaces with semantic attributes. Third, we propose a novel texture representation, combining recent advances in deep learning with the power of Fisher Vector pooling. We provide a thorough evaluation of the new representation, and more generally revisit classic texture representations, including bag-of-visual-words, VLAD and the Fisher Vector, in the context of deep learning. We show that these pooling mechanisms have excellent efficiency and generalisation properties if the convolutional layers of a deep model are used as local features. We obtain in this manner state-of-the-art performance on numerous datasets, both in texture recognition and in image understanding in general. We show through our experiments that the proposed representation is an efficient way to apply deep features to image regions, and that it is an effective way of transferring deep features from one domain to another.

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:664833
Date: January 2015
Creators: Cimpoi, Mircea
Contributors: Vedaldi, Andrea
Publisher: University of Oxford
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://ora.ox.ac.uk/objects/uuid:805cb25c-61b4-4c84-9abf-c82ea2b64495
