
Graph-based Inference with Constraints for Object Detection and Segmentation

For many fundamental problems in computer vision, adopting a graph-based framework can be straightforward and very effective. In this thesis, I propose several graph-based inference methods tailored to different computer vision applications. The thesis begins with contour-based object detection. In particular, we propose a novel framework for contour-based object detection that replaces the Hough-voting framework with dense subgraph inference. Compared to previous work, we propose a novel shape matching scheme suitable for partial matching of edge fragments. The shape descriptor has the same geometric units as shape context, but our shape representation is not histogram based. The key contribution is that the grouping of partial matching hypotheses into object detection hypotheses is expressed as maximum clique inference on a weighted graph. Consequently, each detection result not only identifies the location of the target object in the image but also provides a precise location of its contours, since we transform a complete model contour to the image. We achieve very competitive results on the ETHZ dataset in a pure shape-based framework, demonstrating that our method achieves not only accurate object detection but also precise contour localization on cluttered backgrounds. Similar to the task of grouping partial matches in the contour-based method, many computer vision problems require discovering a certain pattern among a large amount of data. One example is unsupervised video object segmentation, where we need to automatically identify the primary object and segment it out in every frame. We propose a novel formulation of selecting object region candidates simultaneously in all frames as finding a maximum weight clique in a weighted region graph. The selected regions are expected to have high objectness scores (unary potential) and to share similar appearance (binary potential).
Since both unary and binary potentials are unreliable, we introduce two types of mutex (mutual exclusion) constraints on regions in the same clique: intra-frame and inter-frame constraints. Both types of constraints are expressed in a single quadratic form. An efficient algorithm is applied to compute the maximal weight cliques that satisfy the constraints. We apply our method to challenging benchmark videos and obtain very competitive results that outperform state-of-the-art methods. We also show that the same formulation, maximum weight subgraph with mutex constraints, can be used to solve various computer vision problems, such as point matching, solving image jigsaw puzzles, and detecting objects using 3D contours. / Computer and Information Science
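
As a rough illustration of the formulation described in the abstract, the sketch below selects a subset of nodes that maximizes a quadratic score while respecting mutex constraints. It is not the thesis's algorithm: the abstract does not name its efficient solver, so exhaustive search is swapped in here and is only practical for tiny graphs. The function name `max_weight_subgraph_with_mutex`, the matrices `A` and `M`, the constraint form x^T M x = 0, and all numeric values are assumptions made for this example.

```python
from itertools import combinations

import numpy as np


def max_weight_subgraph_with_mutex(A, M):
    """Exhaustively maximize x^T A x over binary x subject to x^T M x == 0.

    A: symmetric (n, n) array; the diagonal holds unary scores and the
       off-diagonal entries hold pairwise affinities (hypothetical layout).
    M: symmetric (n, n) 0/1 array; M[i, j] == 1 marks nodes i and j as
       mutually exclusive (hypothetical encoding of the mutex constraints).
    """
    n = A.shape[0]
    best_score, best_nodes = 0.0, ()
    for k in range(1, n + 1):
        for nodes in combinations(range(n), k):
            x = np.zeros(n)
            x[list(nodes)] = 1.0
            if x @ M @ x > 0:  # at least one mutex pair is selected
                continue
            score = float(x @ A @ x)
            if score > best_score:
                best_score, best_nodes = score, nodes
    return best_nodes, best_score


# Toy data (made up): regions 0 and 1 lie in frame 0, regions 2 and 3 in frame 1.
A = np.array([
    [0.9, 0.0, 0.5, 0.1],
    [0.0, 0.8, 0.2, 0.9],
    [0.5, 0.2, 0.7, 0.0],
    [0.1, 0.9, 0.0, 0.6],
])
# Intra-frame mutex: at most one region may be chosen per frame.
M = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])

nodes, score = max_weight_subgraph_with_mutex(A, M)
print(nodes, score)
```

In this toy setup, regions 1 and 3 are selected even though region 0 has the highest unary score, because their pairwise affinity dominates and the mutex matrix forbids taking two regions from the same frame.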

Identifier: oai:union.ndltd.org:TEMPLE/oai:scholarshare.temple.edu:20.500.12613/1797
Date: January 2013
Creators: Ma, Tianyang
Contributors: Latecki, Longin, Ling, Haibin, Vucetic, Slobodan, Huang, Xiaolei
Publisher: Temple University. Libraries
Source Sets: Temple University
Language: English
Detected Language: English
Type: Thesis/Dissertation, Text
Format: 147 pages
Rights: IN COPYRIGHT - This Rights Statement can be used for an Item that is in copyright. Using this statement implies that the organization making this Item available has determined that the Item is in copyright and either is the rights-holder, has obtained permission from the rights-holder(s) to make their Work(s) available, or makes the Item available under an exception or limitation to copyright (including Fair Use) that entitles it to make the Item available. http://rightsstatements.org/vocab/InC/1.0/
Relation: http://dx.doi.org/10.34944/dspace/1779, Theses and Dissertations
