Interpretable Fine-Grained Visual Categorization
Guo, Pei, 16 June 2021
Not all categories are created equal in object recognition. Fine-grained visual categorization (FGVC) is a branch of visual object recognition that aims to distinguish subordinate categories within a basic-level category. Examples include classifying an image of a bird into specific species like "Western Gull" or "California Gull". Such subordinate categories exhibit characteristics like small inter-category variation and large intra-class variation, making distinguishing them extremely difficult. To address such challenges, an algorithm should be able to focus on object parts and be invariant to object pose. Like many other computer vision tasks, FGVC has witnessed phenomenal advancement following the resurgence of deep neural networks. However, the proposed deep models are usually treated as black boxes. Network interpretation and understanding aims to unveil the features learned by neural networks and explain the reason behind network decisions. It is not only a necessary component for building trust between humans and algorithms, but also an essential step towards continuous improvement in this field. This dissertation is a collection of papers that contribute to FGVC and neural network interpretation and understanding. Our first contribution is an algorithm named Pose and Appearance Integration for Recognizing Subcategories (PAIRS) which performs pose estimation and generates a unified object representation as the concatenation of pose-aligned region features. As the second contribution, we propose the task of semantic network interpretation. For filter interpretation, we represent the concepts a filter detects using an attribute probability density function. We propose the task of semantic attribution using textual summarization that generates an explanatory sentence consisting of the most important visual attributes for decision-making, as found by a general Bayesian inference algorithm. 
Pooling has been a key component in convolutional neural networks and is of special interest in FGVC. Our third contribution is an empirical study that aims at a thorough yet intuitive understanding of popular pooling approaches, together with an extensive benchmark of them. Our fourth contribution is a novel LMPNet for weakly-supervised keypoint discovery. A novel leaky max pooling layer is proposed to explicitly encourage the learning of sparse feature maps. A learnable clustering layer is proposed to group the keypoint proposals into final keypoint predictions. The year 2020 marks the 10th year since the beginning of fine-grained visual categorization, so it is of great importance to summarize the representative works in this domain. Our last contribution is a comprehensive survey of FGVC containing nearly 200 relevant papers that cover 7 common themes.
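As a point of reference for the pooling discussion in this abstract, the following is a minimal sketch of the two most common pooling operations, max pooling and average pooling, over a single 2D feature map. It is a generic illustration in plain Python and not the leaky max pooling layer of LMPNet, whose definition the abstract does not give. The helper names `pool2d` and `mean` and the sample `feature_map` are hypothetical.

```python
from typing import Callable, List

Matrix = List[List[float]]


def pool2d(feature_map: Matrix, size: int,
           reduce: Callable[[List[float]], float]) -> Matrix:
    """Apply non-overlapping size x size pooling (stride equal to size)."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled: Matrix = []
    # Windows that would run past the border are dropped.
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            out_row.append(reduce(window))
        pooled.append(out_row)
    return pooled


def mean(values: List[float]) -> float:
    """Arithmetic mean, used as the reduction for average pooling."""
    return sum(values) / len(values)


# Hypothetical 4x4 activation map, for illustration only.
feature_map = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 8.0],
    [0.0, 0.0, 5.0, 5.0],
    [0.0, 4.0, 5.0, 5.0],
]

max_pooled = pool2d(feature_map, 2, max)   # keeps the strongest response per window
avg_pooled = pool2d(feature_map, 2, mean)  # smooths responses within each window
```

Max pooling keeps only the strongest activation in each window, while average pooling blends all of them; that contrast is one reason the choice of pooling matters when a classifier has to rely on small, localized parts.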
The FineView Dataset: a 3D Scanned Multi-View Object Dataset of Fine-Grained Category Instances
Onda, Suguru, 23 December 2023
In the past decade, state-of-the-art deep learning models have shown impressive performance in many computer vision tasks by learning from large and diverse image datasets. Most of these datasets consist of web-scraped image collections. This approach, however, makes it very challenging to obtain desirable data such as multiple views of the same object, 3D geometric information, or camera parameters for a large-scale image dataset. In this paper, we propose a 3D-scanned multi-view 2D image dataset of fine-grained category instances with accurate camera calibration parameters. We describe our bi-directional, multi-camera and 3D scanning system and the data collection pipeline. Our target objects are relatively small, highly detailed fine-grained category instances, such as insects. We present this dataset as a contribution to fine-grained visual categorization, 3D representation learning, and other computer vision tasks.
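To illustrate why camera calibration parameters are useful alongside 3D geometry, here is a minimal sketch of projecting a 3D point into an image under the standard pinhole camera model. The abstract does not state which camera model or file format the dataset uses, so the pinhole model, the function `project_point`, and all numeric values below are assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def project_point(point_world: Sequence[float],
                  rotation: Sequence[Sequence[float]],
                  translation: Sequence[float],
                  fx: float, fy: float,
                  cx: float, cy: float) -> Tuple[float, float]:
    """Project a 3D world point to pixel coordinates (pinhole model, no distortion)."""
    # Extrinsics: move the point into the camera frame, p_cam = R * p_world + t.
    x, y, z = (
        sum(rotation[i][k] * point_world[k] for k in range(3)) + translation[i]
        for i in range(3)
    )
    if z <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsics: perspective divide, then scale by focal length and shift
    # by the principal point.
    return fx * x / z + cx, fy * y / z + cy


# Hypothetical camera: identity rotation, placed at the world origin.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
u, v = project_point([0.5, -0.25, 2.0], identity, [0.0, 0.0, 0.0],
                     fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
```

With calibrated parameters for every camera, the same 3D point can be located in each view, which is what makes multi-view correspondence and 3D representation learning possible on such data.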