  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Exploring intrinsic structures from samples: supervised, unsupervised, and semisupervised frameworks.

January 2007
Wang, Huan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2007. / Includes bibliographical references (leaves 113-119). / Abstracts in English and Chinese. / Contents / Abstract --- p.i / Acknowledgement --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Learning Frameworks --- p.1 / Chapter 1.2 --- Sample Representation --- p.3 / Chapter 2 --- Background Study --- p.5 / Chapter 2.1 --- Tensor Algebra --- p.5 / Chapter 2.1.1 --- Tensor Unfolding (Flattening) --- p.6 / Chapter 2.1.2 --- Tensor Product --- p.6 / Chapter 2.2 --- Manifold Embedding and Dimensionality Reduction --- p.8 / Chapter 2.2.1 --- Principal Component Analysis (PCA) --- p.9 / Chapter 2.2.2 --- Metric Multidimensional Scaling (MDS) --- p.10 / Chapter 2.2.3 --- Isomap --- p.10 / Chapter 2.2.4 --- Locally Linear Embedding (LLE) --- p.11 / Chapter 2.2.5 --- Discriminant Analysis --- p.11 / Chapter 2.2.6 --- Laplacian Eigenmap --- p.14 / Chapter 2.2.7 --- Graph Embedding: A General Framework --- p.15 / Chapter 2.2.8 --- Maximum Variance Unfolding --- p.16 / Chapter 3 --- The Trace Ratio Optimization --- p.17 / Chapter 3.1 --- Introduction --- p.17 / Chapter 3.2 --- Dimensionality Reduction Formulations: Trace Ratio vs. 
Ratio Trace --- p.19 / Chapter 3.3 --- Efficient Solution of Trace Ratio Problem --- p.22 / Chapter 3.4 --- Proof of Convergency to Global Optimum --- p.23 / Chapter 3.4.1 --- Proof of the monotonic increase of λn --- p.23 / Chapter 3.4.2 --- Proof of Vn convergence and global optimum for λ --- p.24 / Chapter 3.5 --- Extension and Discussion --- p.27 / Chapter 3.5.1 --- Extension to General Constraints --- p.27 / Chapter 3.5.2 --- Discussion --- p.28 / Chapter 3.6 --- Experiments --- p.29 / Chapter 3.6.1 --- Dataset Preparation --- p.30 / Chapter 3.6.2 --- Convergence Speed --- p.31 / Chapter 3.6.3 --- Visualization of Projection Matrix --- p.31 / Chapter 3.6.4 --- Classification by Linear Trace Ratio Algorithms with Orthogonal Constraints --- p.33 / Chapter 3.6.5 --- Classification by Kernel Trace Ratio algorithms with General Constraints --- p.36 / Chapter 3.7 --- Conclusion --- p.36 / Chapter 4 --- A Convergent Solution to Tensor Subspace Learning --- p.40 / Chapter 4.1 --- Introduction --- p.40 / Chapter 4.2 --- Subspace Learning with Tensor Data --- p.43 / Chapter 4.2.1 --- Graph Embedding with Tensor Representation --- p.43 / Chapter 4.2.2 --- Computational Issues --- p.46 / Chapter 4.3 --- Solution Procedure and Convergency Proof --- p.46 / Chapter 4.3.1 --- Analysis of Monotonous Increase Property --- p.47 / Chapter 4.3.2 --- Proof of Convergency --- p.48 / Chapter 4.4 --- Experiments --- p.50 / Chapter 4.4.1 --- Data Sets --- p.50 / Chapter 4.4.2 --- Monotonicity of Objective Function Value --- p.51 / Chapter 4.4.3 --- Convergency of the Projection Matrices
--- p.52 / Chapter 4.4.4 --- Face Recognition --- p.52 / Chapter 4.5 --- Conclusions --- p.54 / Chapter 5 --- Maximum Unfolded Embedding --- p.57 / Chapter 5.1 --- Introduction --- p.57 / Chapter 5.2 --- Maximum Unfolded Embedding --- p.59 / Chapter 5.3 --- Optimize Trace Ratio --- p.60 / Chapter 5.4 --- Another Justification: Maximum Variance Embedding --- p.60 / Chapter 5.5 --- Linear Extension: Maximum Unfolded Projection --- p.61 / Chapter 5.6 --- Experiments --- p.62 / Chapter 5.6.1 --- Data set --- p.62 / Chapter 5.6.2 --- Evaluation Metric --- p.63 / Chapter 5.6.3 --- Performance Comparison --- p.64 / Chapter 5.6.4 --- Generalization Capability --- p.65 / Chapter 5.7 --- Conclusion --- p.67 / Chapter 6 --- Regression on MultiClass Data --- p.68 / Chapter 6.1 --- Introduction --- p.68 / Chapter 6.2 --- Background --- p.70 / Chapter 6.2.1 --- Intuitive Motivations --- p.70 / Chapter 6.2.2 --- Related Work --- p.72 / Chapter 6.3 --- Problem Formulation --- p.73 / Chapter 6.3.1 --- Notations --- p.73 / Chapter 6.3.2 --- Regularization along Data Manifold --- p.74 / Chapter 6.3.3 --- Cross Manifold Label Propagation --- p.75 / Chapter 6.3.4 --- Inter-Manifold Regularization --- p.78 / Chapter 6.4 --- Regression on Reproducing Kernel Hilbert Space (RKHS) --- p.79 / Chapter 6.5 --- Experiments --- p.82 / Chapter 6.5.1 --- Synthetic Data: Nonlinear Two Moons --- p.82 / Chapter 6.5.2 --- Synthetic Data: Three-class Cyclones --- p.83 / Chapter 6.5.3 --- Human Age Estimation --- p.84 / Chapter 6.6 --- Conclusions --- p.86 / Chapter 7 --- Correspondence Propagation --- p.88 / Chapter 7.1 --- Introduction --- p.88 / Chapter 7.2 --- Problem Formulation and Solution --- p.92 / Chapter 7.2.1 --- Graph Construction --- p.92 / Chapter 7.2.2 --- Regularization on categorical Product Graph --- p.93 / Chapter 7.2.3 --- Consistency in Feature Domain and Soft Constraints --- p.96 / Chapter 7.2.4 --- Inhomogeneous Pair Labeling
--- p.97 / Chapter 7.2.5 --- Reliable Correspondence Propagation --- p.98 / Chapter 7.2.6 --- Rearrangement and Discretizing --- p.100 / Chapter 7.3 --- Algorithmic Analysis --- p.100 / Chapter 7.3.1 --- Selection of Reliable Correspondences --- p.100 / Chapter 7.3.2 --- Computational Complexity --- p.102 / Chapter 7.4 --- Applications and Experiments --- p.102 / Chapter 7.4.1 --- Matching Demonstration on Object Recognition Databases --- p.103 / Chapter 7.4.2 --- Automatic Feature Matching on Oxford Image Transformation Database --- p.104 / Chapter 7.4.3 --- Influence of Reliable Correspondence Number --- p.106 / Chapter 7.5 --- Conclusion and Future Works --- p.106 / Chapter 8 --- Conclusion and Future Work --- p.110 / Bibliography --- p.113
2

Lateral inhibition and the area operator in visual pattern processing

Connor, Denis John January 1969
The static interaction of the receptor nerves in the lateral eye of the horseshoe crab, Limulus, is called lateral inhibition. It is described by the Hartline equations. A simulator has been built to study lateral inhibition with a view to applying it in a pre-processor for a visual pattern recognition system. The activity in a lateral inhibitory receptor network is maximal in regions of non-uniform illumination. This enhancement of intensity contours has been extensively studied for the case of black and white patterns. It is shown that the level of activity near a black-white boundary provides a measure of its local geometric properties. However, the level of activity is dependent on the boundary orientation. A number of methods for reducing this orientation dependence are explored. The activity in a lateral inhibitory network adjacent to a boundary can be modelled by an area operator. It is shown that the value of this operator along an intensity boundary provides a description of the boundary that is related to its intrinsic description: curvature as a function of arc length. Since the operator is maximal on an intensity boundary, this description has been called the ridge function for the boundary. A ridge function can also be obtained using a lateral inhibitory network. The properties of this function are discussed. It is shown how ridge functions might be incorporated into a pattern recognition algorithm. A novel method for detecting the bilateral and rotational symmetries in a pattern is described. / Applied Science, Faculty of / Electrical and Computer Engineering, Department of / Graduate
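The contour enhancement described in this abstract can be illustrated with a small sketch. It is not the simulator from the thesis. It assumes the commonly cited steady-state form of the Hartline-Ratliff equations, in which each receptor's response equals its excitation minus a weighted sum of its neighbours' supra-threshold responses. The 1-D layout, the uniform coefficient `k`, the neighbourhood `radius` and the `threshold` are all illustrative assumptions.

```python
def lateral_inhibition(excitation, k=0.1, radius=2, threshold=0.0, iters=200):
    """Relax a 1-D receptor array toward a steady state of Hartline-type equations.

    Assumed model: r[p] = max(0, e[p] - sum_j k * max(0, r[j] - threshold)),
    summed over neighbours j within `radius` of p (j != p).
    """
    n = len(excitation)
    r = list(excitation)
    for _ in range(iters):
        new_r = []
        for p in range(n):
            inhibition = 0.0
            for j in range(max(0, p - radius), min(n, p + radius + 1)):
                if j != p:
                    inhibition += k * max(0.0, r[j] - threshold)
            new_r.append(max(0.0, excitation[p] - inhibition))
        r = new_r  # synchronous update; a contraction when k * (2 * radius) < 1
    return r


# Dark-to-bright step edge: ten dim receptors followed by ten bright ones.
pattern = [1.0] * 10 + [5.0] * 10
response = lateral_inhibition(pattern)
```

With these assumed parameters the response overshoots on the bright side of the edge and undershoots on the dark side, so activity is most distinctive where illumination is non-uniform, which is the behaviour the abstract describes.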
3

Example-based interpolation for correspondence-based computer vision problems. / CUHK electronic theses & dissertations collection

January 2006
The EBI and iEBI mechanisms have all the desirable properties of a good interpolation: all given input-output examples are satisfied exactly, and the interpolation is smooth with minimum oscillations between the examples. / Example-Based Interpolation (EBI) is a powerful method for interpolating a function from a set of input-output examples. The first part of the dissertation examines EBI in detail and proposes a new enhanced EBI, indexed function Example-Based Interpolation (iEBI). The second part demonstrates the application of both EBI and iEBI to solve three well-defined problems of computer vision. / First, the dissertation analyzes the EBI solution in detail. It argues and demonstrates that there are three desired properties for any EBI solution. To satisfy all three desirable properties, the EBI solution must have adequate degrees of freedom. This dissertation shows in detail that, for the EBI solution to have enough degrees of freedom, it need only take a simple form: the sum of a basis function plus a linear function. This dissertation also shows that a particular EBI solution, in a certain least-squares-error sense, could satisfy exactly all three desirable properties. / Moreover, this dissertation also points out EBI's restriction and describes a new interpolation mechanism that could overcome it by constructing a general indexed function from examples. The new mechanism, referred to as the general indexed function Example-Based Interpolation (iEBI) mechanism, first applies EBI to establish the initial correspondences over all input examples, and then interpolates the general indexed function from those initial correspondences. / Novel View Synthesis (NVS) is an important problem in image rendering. It tries to synthesize an image of a scene at any specified (novel) viewpoint using only a few images of that scene at some sample viewpoints. 
To avoid explicit 3-D reconstruction of the scene, this dissertation formulates the problem of NVS as an indexed function interpolation problem by treating viewpoint and image as the input and output of a function. The interpolation formulation has at least two advantages. First, it allows certain imaging details like camera intrinsic parameters to be unknown. Second, the viewpoint specification need not be physical. For example, the specification could consist of any set of values that adequately describe the viewpoint space and need not be measured in metric units. This dissertation solves the NVS problem using the iEBI formulation and shows how the iEBI mechanism could be used to synthesize images at novel viewpoints and acquire quality novel views even from only a few example views. / Stereo matching, or the determination of corresponding image points projected by the same 3-D feature, is one of the fundamental and long-studied problems in computer vision. Yet, few have tried to solve it using interpolation. This dissertation presents an interpolation approach, Interpolation-based Iterative Stereo Matching (IISM), that could construct dense correspondences in a stereo image pair from sparse initial correspondences. IISM improves the existing EBI to ensure that the established correspondences satisfy exactly the epipolar constraint of the image pair and, to a certain extent, preserve discontinuities in the stereo disparity space of the imaged scene. IISM applies the improved EBI algorithm iteratively in a coarse-to-fine manner and eventually produces a dense disparity map for the stereo image pair. / The second part of the dissertation focuses on applying the EBI and iEBI methods to solve three correspondence-based problems in computer vision: (1) stereo matching, (2) novel view synthesis, and (3) viewpoint determination. 
/ This dissertation also illustrates, for all three problems, experimental results on a number of real and benchmark image datasets, and shows that interpolation-based methods could be effective in arriving at a good solution even with sparse input examples. / Viewpoint determination is the problem of, given an image, determining the viewpoint from which the image was taken. This dissertation demonstrates how to solve this problem without referring to or estimating any explicit 3-D structure of the imaged scene. Used for reference are a small number of sample snapshots of the scene, each of which has an associated viewpoint. By treating an image and its associated viewpoint as the input and output of a function, and the given snapshot-viewpoint pairs as examples of that function, the problem has a natural formulation as interpolation. As in NVS, the interpolation formulation allows the given images to be uncalibrated and does not require the viewpoint specification to be physically measured. This dissertation presents an interpolation-based solution using the iEBI mechanism that guarantees all given sample data are satisfied exactly with the least complexity in the interpolated function. / Liang Bodong. / "February 2006." / Adviser: Ronald Chi-kit Chung. / Source: Dissertation Abstracts International, Volume: 67-11, Section: B, page: 6516. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2006. / Includes bibliographical references (p. 127-145). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
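The abstract above describes an interpolant of the form "basis function plus linear function" that reproduces every example exactly. The following sketch shows that general idea in one dimension and is not the EBI or iEBI algorithm from the dissertation. It assumes a cubic radial kernel |r|^3 (which, with the linear term and the two side conditions, yields the natural cubic spline in 1-D). The names `solve` and `fit` are hypothetical helpers.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit(xs, ys):
    """Return f(x) = sum_i w_i * |x - x_i|^3 + c0 + c1 * x passing through all examples."""
    n = len(xs)
    phi = lambda r: abs(r) ** 3  # assumed kernel, chosen for illustration
    # Interpolation rows, then the side conditions sum(w) = 0 and sum(w * x) = 0.
    a = [[phi(xi - xj) for xj in xs] + [1.0, xi] for xi in xs]
    a.append([1.0] * n + [0.0, 0.0])
    a.append(list(xs) + [0.0, 0.0])
    coeffs = solve(a, list(ys) + [0.0, 0.0])
    w, (c0, c1) = coeffs[:n], coeffs[n:]

    def f(x):
        return sum(wi * phi(x - xi) for wi, xi in zip(w, xs)) + c0 + c1 * x

    return f


xs = [0.0, 1.0, 2.0, 4.0]
ys = [1.0, 3.0, 2.0, 5.0]
f = fit(xs, ys)
```

The linear term lets the interpolant reproduce straight-line data with all kernel weights at zero, while the kernel weights absorb whatever the linear part cannot explain. Distinct example inputs are assumed, otherwise the system is singular.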
4

Computer vision using shape spaces / Burzin Bhavnagri.

Bhavnagri, Burzin January 1998
Includes bibliography: p. 214-225 and index. / 232 p. ; 30 cm. / Title page, contents and abstract only. The complete thesis in print form is available from the University Library. / This thesis investigates a computational model of vision based on assumptions pertaining to the physical structure of a camera and the scattering of light from visible surfaces. A sufficient condition to detect occlusions, intensity discontinuities, discontinuities in derivatives of intensity, surface discontinuities and discontinuities in derivatives of surfaces is given. This leads to an algorithm with linear time and space complexity to generate a collection of feature points with attributes in cyclically ordered groups. Two approaches to rejecting false hypotheses of correspondence were developed: an error minimising approach and an approach based on formal language. A non-iterative algorithm that can use the rotation between two cameras to produce an exact reconstruction of a scene is presented. Two methods of comparing global shapes with occlusions are pointed out: one based on a grammar, the other on Le's inequality on Euclidean shapes. / Thesis (Ph.D.)--University of Adelaide, Dept. of Computer Science, 1998
5

Detecting irregularity in videos using spatiotemporal volumes.

January 2007
Li, Yun. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2007. / Includes bibliographical references (leaves 68-72). / Abstracts in English and Chinese. / Abstract --- p.I / Abstract (Chinese) --- p.III / Acknowledgments --- p.IV / List of Contents --- p.VI / List of Figures --- p.VII / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Visual Detection --- p.2 / Chapter 1.2 --- Irregularity Detection --- p.4 / Chapter 2 --- System Overview --- p.7 / Chapter 2.1 --- Definition of Irregularity --- p.7 / Chapter 2.2 --- Contributions --- p.8 / Chapter 2.3 --- Review of previous work --- p.9 / Chapter 2.3.1 --- Model-based Methods --- p.9 / Chapter 2.3.2 --- Statistical Methods --- p.11 / Chapter 2.4 --- System Outline --- p.14 / Chapter 3 --- Background Subtraction --- p.16 / Chapter 3.1 --- Related Work --- p.17 / Chapter 3.2 --- Adaptive Mixture Model --- p.18 / Chapter 3.2.1 --- Online Model Update --- p.20 / Chapter 3.2.2 --- Background Model Estimation --- p.22 / Chapter 3.2.3 --- Foreground Segmentation --- p.24 / Chapter 4 --- Feature Extraction --- p.28 / Chapter 4.1 --- Various Feature Descriptors --- p.29 / Chapter 4.2 --- Histogram of Oriented Gradients --- p.30 / Chapter 4.2.1 --- Feature Descriptor --- p.31 / Chapter 4.2.2 --- Feature Merits --- p.33 / Chapter 4.3 --- Subspace Analysis --- p.35 / Chapter 4.3.1 --- Principal Component Analysis --- p.35 / Chapter 4.3.2 --- Subspace Projection --- p.37 / Chapter 5 --- Bayesian Probabilistic Inference --- p.39 / Chapter 5.1 --- Estimation of PDFs --- p.40 / Chapter 5.1.1 --- K-Means Clustering --- p.40 / Chapter 5.1.2 --- Kernel Density Estimation --- p.42 / Chapter 5.2 --- MAP Estimation --- p.44 / Chapter 5.2.1 --- ML Estimation & MAP Estimation --- p.44 / Chapter 5.2.2 --- Detection through MAP --- p.46 / Chapter 5.3 --- Efficient Implementation --- p.47 / Chapter 5.3.1 --- K-D Trees --- p.48 / Chapter 5.3.2 --- Nearest Neighbor (NN) Algorithm --- p.49 / Chapter 6
--- Experiments and Conclusion --- p.51 / Chapter 6.1 --- Experiments --- p.51 / Chapter 6.1.1 --- Outdoor Video Surveillance - Exp. 1 --- p.52 / Chapter 6.1.2 --- Outdoor Video Surveillance - Exp. 2 --- p.54 / Chapter 6.1.3 --- Outdoor Video Surveillance - Exp. 3 --- p.56 / Chapter 6.1.4 --- Classroom Monitoring - Exp. 4 --- p.61 / Chapter 6.2 --- Algorithm Evaluation --- p.64 / Chapter 6.3 --- Conclusion --- p.66 / Bibliography --- p.68
6

Segmentation based variational model for accurate optical flow estimation.

January 2009
Chen, Jianing. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2009. / Includes bibliographical references (leaves 47-54). / Abstract also in Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Background --- p.1 / Chapter 1.2 --- Related Work --- p.3 / Chapter 1.3 --- Thesis Organization --- p.5 / Chapter 2 --- Review on Optical Flow Estimation --- p.6 / Chapter 2.1 --- Variational Model --- p.6 / Chapter 2.1.1 --- Basic Assumptions and Constraints --- p.6 / Chapter 2.1.2 --- More General Energy Functional --- p.9 / Chapter 2.2 --- Discontinuity Preserving Techniques --- p.9 / Chapter 2.2.1 --- Data Term Robustification --- p.10 / Chapter 2.2.2 --- Diffusion Based Regularization --- p.11 / Chapter 2.2.3 --- Segmentation --- p.15 / Chapter 2.3 --- Chapter Summary --- p.15 / Chapter 3 --- Segmentation Based Optical Flow Estimation --- p.17 / Chapter 3.1 --- Initial Flow --- p.17 / Chapter 3.2 --- Color-Motion Segmentation --- p.19 / Chapter 3.3 --- Parametric Flow Estimating Incorporating Segmentation --- p.21 / Chapter 3.4 --- Confidence Map Construction --- p.24 / Chapter 3.4.1 --- Occlusion detection --- p.24 / Chapter 3.4.2 --- Pixel-wise motion coherence --- p.24 / Chapter 3.4.3 --- Segment-wise model confidence --- p.26 / Chapter 3.5 --- Final Combined Variational Model --- p.28 / Chapter 3.6 --- Chapter Summary --- p.28 / Chapter 4 --- Experiment Results --- p.30 / Chapter 4.1 --- Quantitative Evaluation --- p.30 / Chapter 4.2 --- Warping Results --- p.34 / Chapter 4.3 --- Chapter Summary --- p.35 / Chapter 5 --- Application - Single Image Animation --- p.37 / Chapter 5.1 --- Introduction --- p.37 / Chapter 5.2 --- Approach --- p.38 / Chapter 5.2.1 --- Pre-Process Stage --- p.39 / Chapter 5.2.2 --- Coordinate Transform --- p.39 / Chapter 5.2.3 --- Motion Field Transfer --- p.41 / Chapter 5.2.4 --- Motion Editing and Apply --- p.41 / Chapter 5.2.5 --- Gradient-domain composition --- p.42 / Chapter 5.3 --- Experiments --- p.43 / Chapter 5.3.1 
--- Active Motion Transfer --- p.43 / Chapter 5.3.2 --- Animate Stationary Temporal Dynamics --- p.44 / Chapter 5.4 --- Chapter Summary --- p.45 / Chapter 6 --- Conclusion --- p.46 / Bibliography --- p.47
7

Motion sensation dependence on visual and vestibular cues

Zacharias, Greg January 1977
Thesis. 1977. Ph.D.--Massachusetts Institute of Technology. Dept. of Aeronautics and Astronautics. / MICROFICHE COPY AVAILABLE IN ARCHIVES AND AERO / Vita. / Bibliography : leaves 323-333. / by Greg L. Zacharias. / Ph.D.
8

Robust stereo motion and structure estimation scheme. / CUHK electronic theses & dissertations collection

January 2006
Another important contribution of this thesis is a novel and highly robust estimator, Kernel Density Estimation Sample Consensus (KDESAC), which combines the Random Sample Consensus algorithm with Kernel Density Estimation (KDE). The main advantage of KDESAC is that no prior information and no scale estimators are required in the estimation of the parameters. The computational load of KDESAC is much lower than that of robust algorithms which estimate the scale in every sample loop. The experiments on synthetic data show that the proposed method is more robust to heavily corrupted data than other algorithms. KDESAC can tolerate more than 80% outliers and multiple structures. Although Adaptive Scale Sample Consensus (ASSC) can achieve performance comparable to that of KDESAC, it is much slower. KDESAC is also applied to the SFM problem and to multi-motion estimation with real data. The experiments demonstrate that KDESAC is robust and efficient. / Structure from motion (SFM), the problem of estimating 3D structure from 2D images, is one of the most popular and well studied problems within computer vision. This thesis is a study within the area of SFM. The main objective of this work is to improve the robustness of the SFM algorithm so as to make it capable of tolerating a great number of outliers in the correspondences. To improve the robustness, a stereo image sequence is processed, so that random sampling algorithms can be employed in the structure and motion estimation. With this strategy, we employ Random Sample Consensus (RANSAC) in motion and structure estimation to exclude outliers. Since the RANSAC method needs prior information about the scale of the inliers, we propose an auto-scale RANSAC algorithm which determines the inliers by analyzing the probability density of the residuals. The experimental results demonstrate that SFM by the proposed auto-scale RANSAC is more robust and accurate than that by RANSAC. / Chan Tai. 
/ "September 2006." / Adviser: Yun Hui Liu. / Source: Dissertation Abstracts International, Volume: 68-03, Section: B, page: 1716. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2006. / Includes bibliographical references (p. 113-120). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
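For readers unfamiliar with the baseline this abstract builds on, here is a minimal sketch of plain RANSAC applied to 2-D line fitting. It is not KDESAC or the auto-scale variant from the thesis. The fixed `threshold` argument is the prior inlier scale that, according to the abstract, the proposed methods no longer need. The function name, the synthetic data and all parameter values are illustrative assumptions.

```python
import random


def ransac_line(points, threshold, iterations=200, seed=0):
    """Fit y = slope * x + intercept by plain RANSAC with a fixed inlier threshold."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # Minimal sample: two points define a candidate line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # a vertical line cannot be expressed as y = m x + c
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        # Consensus set: points whose vertical residual is within the threshold.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (slope, intercept), inliers
    return best_model, best_inliers


# Synthetic data: 30 exact inliers on y = 2x + 1 and 20 gross outliers.
data_rng = random.Random(1)
inlier_points = [(float(x), 2.0 * x + 1.0) for x in range(30)]
outlier_points = []
for _ in range(20):
    x = data_rng.uniform(0.0, 29.0)
    offset = data_rng.choice([-1.0, 1.0]) * data_rng.uniform(10.0, 40.0)
    outlier_points.append((x, 2.0 * x + 1.0 + offset))
points = inlier_points + outlier_points

model, consensus = ransac_line(points, threshold=0.5)
```

The result depends on choosing `threshold` to match the true noise scale. Set it too small and genuine inliers are rejected, too large and outliers are accepted, which is the sensitivity that scale-free estimators aim to remove.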
9

Learning mid-level representations for scene understanding.

January 2013
本論文包括了對場景分類框架的描述,并針對自然場景中學習中間層特徵表達的問題做了深入的探討。 / 當前的場景分類框架主要包括特徵提取,特徵編碼,空間信息整合和分類器學習幾個步驟。在這些步驟中,特徵提取是圖像理解的基礎環節。局部特徵表達被認為是計算機視覺在實際應用中成功的關鍵。但是近年來,中間層信息表達逐漸吸引了這個領域的眾多目光。本論文從兩個方面來理解中間層特徵。一個是局部底層信息的整合,另外一個是語義信息的嵌入。本文中,我們的工作同時覆蓋了“整合”和“語意”兩個方面。 / 在自然圖像的統計特徵中,我們發現圖像底層響應的相關性代表了局部結構信息。基於這個發現,我們構造了一個兩層學習模型。第一層是長得類似邊響應的底層信息,第二層是過完備的協方差特徵層,同時也是本文中提到的中間層信息。從“整合局部底層信息”的角度看,我們的方法在這個方向上更進一步。我們將中間層特徵用到了場景分類中,并取得了良好的效果。特別是與人工設計的特徵相比,我們的特徵完全來自于自動學習。我們的協方差特徵的有效性為未來的特徵學習提供了一個新的思路:對於低層響應的相互關係的研究可以幫助構造表達能力更強的特徵。 / 爲了將語義信息加入到中間層特徵的學習中,我們定義了一個名詞叫做“信息化組分”。所謂的信息化組分指的是那些能夠用來描述一類場景同時又能用來區分不同場景的結構化信息。基於固定秩的產生式模型的假設,我們設計了產生式模型和判別式分類器聯合學習的優化模型。通過將學習得到的信息化組分用到場景分類的實驗中,這類信息化結構的有效性得到了充分地證實。我們同時發現,如果將這一類信息化結構和底層的特徵表達聯合起來作為新的特徵表達,會使得分類的準確率得到進一步地提升。這個發現為我們未來的工作指引了方向:通過嘗試合併多層的特徵表達來提高整體的分類效果。 / This thesis reviews state-of-the-art scene classification frameworks and studies the learning of mid-level representations for scene understanding. / The current scene classification pipeline consists of feature extraction, feature encoding, spatial aggregation, and classifier learning. Among these steps, feature extraction is the most fundamental one for scene understanding. Beyond low-level features, obtaining effective mid-level representations has attracted considerable attention in the scene understanding field in recent years. We interpret mid-level representations from two perspectives. One is the aggregation of low-level cues and the other is the embedding of semantic information. In this thesis, our work covers both aspects, “aggregation” and “semantics”. / Given the observation from natural image statistics that correlations among patch-level responses contain strong structure information, we build a two-layer model. The first layer is the patch-level response with edge-let appearance, and the second layer contains sparse covariance patterns, which are considered the mid-level representation. From the view of “aggregation from low-level cues”, our work moves one step further in this direction. 
We use learned covariance patterns in scene classification. They show promising performance even compared with human-designed features. The effectiveness of our covariance patterns gives a new clue for feature learning, that is, correlations among lower-layer responses can help build more powerful feature representations. / With the motivation of coupling semantic information into building the mid-level representation, we define a new term, “informative components”, in this thesis. Informative components refer to those regions that are descriptive within one class and also distinctive among different classes. Based on a generative assumption that descriptive regions can fit a fixed-rank model, we provide an integrated optimization framework, which combines generative modeling and discriminative learning. Experiments on scene classification bear out the effectiveness of our informative components. We also find that by simply concatenating informative components with low-level responses, the classification performance can be further improved. This sheds light on a future direction: improving representation power via the combination of multiple-layer representations. / Detailed summary in vernacular field only. / Wang, Liwei. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2013. / Includes bibliographical references (leaves 62-72). / Abstracts also in Chinese. 
/ Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Scene Classification Pipeline --- p.1 / Chapter 1.2 --- Learning Mid-Level Representations --- p.6 / Chapter 1.3 --- Contributions and Organization --- p.7 / Chapter 2 --- Background --- p.9 / Chapter 2.1 --- Mid-level Representations --- p.9 / Chapter 2.1.1 --- Aggregation From Low Level Cues --- p.10 / Chapter 2.1.2 --- Embedding Semantic Information --- p.13 / Chapter 2.2 --- Scene Data Sets Description --- p.16 / Chapter 3 --- Learning Sparse Covariance Patterns --- p.20 / Chapter 3.1 --- Introduction --- p.20 / Chapter 3.2 --- Model --- p.26 / Chapter 3.3 --- Learning and Inference --- p.28 / Chapter 3.3.1 --- Inference --- p.28 / Chapter 3.3.2 --- Learning --- p.30 / Chapter 3.4 --- Experiments --- p.31 / Chapter 3.4.1 --- Structure Mapping --- p.33 / Chapter 3.4.2 --- 15-Scene Classification --- p.34 / Chapter 3.4.3 --- Indoor Scene Recognition --- p.36 / Chapter 3.5 --- Summary --- p.38 / Chapter 4 --- Learning Informative Components --- p.39 / Chapter 4.1 --- Introduction --- p.39 / Chapter 4.2 --- Related Work --- p.43 / Chapter 4.3 --- Our Model --- p.45 / Chapter 4.3.1 --- Component Level Representation --- p.45 / Chapter 4.3.2 --- Fixed Rank Modeling --- p.46 / Chapter 4.3.3 --- Informative Component Learning --- p.47 / Chapter 4.4 --- Experiments --- p.52 / Chapter 4.4.1 --- Informative Components Learning --- p.54 / Chapter 4.4.2 --- Scene Classification --- p.55 / Chapter 4.5 --- Summary --- p.58 / Chapter 5 --- Conclusion --- p.60 / Bibliography --- p.62
10

Calibration of an active vision system and feature tracking based on 8-point projective invariants.

January 1997
by Chen Zhi-Yi. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references. / List of Symbols S --- p.1 / Chapter 1 --- Introduction / Chapter 1.1 --- Active Vision Paradigm and Calibration of Active Vision System --- p.1.1 / Chapter 1.1.1 --- Active Vision Paradigm --- p.1.1 / Chapter 1.1.2 --- A Review of the Existing Active Vision Systems --- p.1.1 / Chapter 1.1.3 --- A Brief Introduction to Our Active Vision System --- p.1.2 / Chapter 1.1.4 --- The Stages of Calibrating an Active Vision System --- p.1.3 / Chapter 1.2 --- Projective Invariants and Their Applications to Feature Tracking --- p.1.4 / Chapter 1.3 --- Thesis Overview --- p.1.4 / References --- p.1.5 / Chapter 2 --- Calibration for an Active Vision System: Camera Calibration / Chapter 2.1 --- An Overview of Camera Calibration --- p.2.1 / Chapter 2.2 --- Tsai's RAC Based Camera Calibration Method --- p.2.5 / Chapter 2.2.1 --- The Pinhole Camera Model with Radial Distortion --- p.2.7 / Chapter 2.2.2 --- Calibrating a Camera Using Monoview Noncoplanar Points --- p.2.10 / Chapter 2.3 --- Reg Willson's Implementation of R. Y. 
Tsai's RAC Based Camera Calibration Algorithm --- p.2.15 / Chapter 2.4 --- Experimental Setup and Procedures --- p.2.20 / Chapter 2.5 --- Experimental Results --- p.2.23 / Chapter 2.6 --- Conclusion --- p.2.28 / References --- p.2.29 / Chapter 3 --- Calibration for an Active Vision System: Head-Eye Calibration / Chapter 3.1 --- Why Head-Eye Calibration --- p.3.1 / Chapter 3.2 --- Review of the Existing Head-Eye Calibration Algorithms --- p.3.1 / Chapter 3.2.1 --- Category I Classic Approaches --- p.3.1 / Chapter 3.2.2 --- Category II Self-Calibration Techniques --- p.3.2 / Chapter 3.3 --- R. Tsai's Approach for Hand-Eye (Head-Eye) Calibration --- p.3.3 / Chapter 3.3.1 --- Introduction --- p.3.3 / Chapter 3.3.2 --- Definitions of Coordinate Frames and Homogeneous Transformation Matrices --- p.3.3 / Chapter 3.3.3 --- Formulation of the Head-Eye Calibration Problem --- p.3.6 / Chapter 3.3.4 --- Using Principal Vector to Represent Rotation Transformation Matrix --- p.3.7 / Chapter 3.3.5 --- Calculating Rcg and Tcg --- p.3.9 / Chapter 3.4 --- Our Local Implementation of Tsai's Head Eye Calibration Algorithm --- p.3.14 / Chapter 3.4.1 --- Using Denavit-Hartenberg's Approach to Establish a Body-Attached Coordinate Frame for Each Link of the Manipulator --- p.3.16 / Chapter 3.5 --- Function of Procedures and Formats of Data Files --- p.3.23 / Chapter 3.6 --- Experimental Results --- p.3.26 / Chapter 3.7 --- Discussion --- p.3.45 / Chapter 3.8 --- Conclusion --- p.3.46 / References --- p.3.47 / Appendix I Procedures --- p.3.48 / Chapter 4 --- A New Tracking Method for Shape from Motion Using an Active Vision System / Chapter 4.1 --- Introduction --- p.4.1 / Chapter 4.2 --- A New Tracking Method --- p.4.1 / Chapter 4.2.1 --- Our approach --- p.4.1 / Chapter 4.2.2 --- Using an Active Vision System to Track the Projective Basis Across Image Sequence --- p.4.2 / Chapter 4.2.3 --- Using Projective Invariants to Track the Remaining Feature Points --- p.4.2 / 
Chapter 4.3 --- Using Factorisation Method to Recover Shape from Motion --- p.4.11 / Chapter 4.4 --- Discussion and Future Research --- p.4.31 / References --- p.4.32 / Chapter 5 --- Experiments on Feature Tracking with 3D Projective Invariants / Chapter 5.1 --- 8-point Projective Invariant --- p.5.1 / Chapter 5.2 --- Projective Invariant Based Transfer between Distinct Views of a 3-D Scene --- p.5.4 / Chapter 5.3 --- Transfer Experiments on the Image Sequence of a Calibration Block --- p.5.6 / Chapter 5.3.1 --- Experiment 1. Real Image Sequence 1 of a Camera Calibration Block --- p.5.6 / Chapter 5.3.2 --- Experiment 2. Real Image Sequence 2 of a Camera Calibration Block --- p.5.15 / Chapter 5.3.3 --- Experiment 3. Real Image Sequence 3 of a Camera Calibration Block --- p.5.22 / Chapter 5.3.4 --- Experiment 4. Synthetic Image Sequence of a Camera Calibration Block --- p.5.27 / Chapter 5.3.5 --- Discussions on the Experimental Results --- p.5.32 / Chapter 5.4 --- Transfer Experiments on the Image Sequence of a Human Face Model --- p.5.33 / References --- p.5.44 / Chapter 6 --- Conclusions and Future Researches / Chapter 6.1 --- Contributions and Conclusions --- p.6.1 / Chapter 6.2 --- Future Researches --- p.6.1 / Bibliography --- p.B.1
