Global ETD Search

Inter-modality image synthesis and recognition.

Inter-modality image synthesis and recognition has become an active topic in computer vision. Real-world applications involve diverse image modalities, such as sketch images for law enforcement and near-infrared images for illumination-invariant face recognition. Because image data in some modalities is difficult to acquire, it is often useful to transform images from one modality to another or to match images across modalities. These techniques provide great flexibility for computer vision applications.

In this thesis we study three problems: face sketch synthesis, example-based image stylization, and face sketch recognition.

For face sketch synthesis, we expand the frontier to synthesis from uncontrolled face photos. Previous methods work only under well-controlled conditions. We propose a robust algorithm for synthesizing a face sketch from a face photo with lighting and pose variations. It synthesizes local sketch patches using a multiscale Markov Random Field (MRF) model.
The robustness to lighting and pose variations is achieved with three components: shape priors specific to facial components to reduce artifacts and distortions, patch descriptors and robust metrics for selecting sketch patch candidates, and intensity compatibility and gradient compatibility to match neighboring sketch patches effectively. Experiments on the CUHK face sketch database and on celebrity photos collected from the web show that our algorithm significantly improves on the state of the art.

For example-based image stylization, we provide an effective approach for transferring artistic effects from a template image to photos. Most existing methods do not consider content and style separately. We propose a style transfer algorithm based on frequency band decomposition. An image is decomposed into low-frequency (LF), mid-frequency (MF), and high-frequency (HF) components, which describe the content, the main style, and the information along the boundaries, respectively. The style is then transferred from the template to the photo in the MF and HF components, which is formulated as an MRF optimization. Finally, a reconstruction step combines the LF component of the photo with the obtained style information to generate the artistic result. Compared to other algorithms, our method not only synthesizes the style but also preserves the image content well. Experiments on image stylization and personalized artwork demonstrate the effectiveness of our approach.

For face sketch recognition, we propose a new direction based on learning face descriptors from data. Recent research has focused either on transforming photos and sketches into the same modality for matching or on developing advanced classification algorithms to reduce the modality gap between features extracted from photos and sketches. We propose a novel approach that reduces the modality gap at the feature extraction stage.
A face descriptor based on coupled information-theoretic encoding is used to capture discriminative local face structures and to effectively match photos and sketches. Guided by maximizing the mutual information between photos and sketches in the quantized feature spaces, the coupled encoding is achieved by the proposed coupled information-theoretic projection forest. Experiments on the largest face sketch database show that our approach significantly outperforms the state-of-the-art methods.

Detailed summary in vernacular field only.

Zhang, Wei.

Thesis (Ph.D.)--Chinese University of Hong Kong, 2012.

Includes bibliographical references (leaves 121-137).

Abstract also in Chinese.

Abstract --- p.i
Acknowledgement --- p.v
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Multi-Modality Computer Vision --- p.1
Chapter 1.2 --- Face Sketches --- p.4
Chapter 1.2.1 --- Face Sketch Synthesis --- p.6
Chapter 1.2.2 --- Face Sketch Recognition --- p.7
Chapter 1.3 --- Example-based Image Stylization --- p.9
Chapter 1.4 --- Contributions and Summary of Approaches --- p.10
Chapter 1.5 --- Thesis Road Map --- p.13
Chapter 2 --- Literature Review --- p.14
Chapter 2.1 --- Related Works in Face Sketch Synthesis --- p.14
Chapter 2.2 --- Related Works in Example-based Image Stylization --- p.17
Chapter 2.3 --- Related Works in Face Sketch Recognition --- p.21
Chapter 3 --- Lighting and Pose Robust Sketch Synthesis --- p.27
Chapter 3.1 --- The Algorithm --- p.31
Chapter 3.1.1 --- Overview of the Method --- p.32
Chapter 3.1.2 --- Local Evidence --- p.34
Chapter 3.1.3 --- Shape Prior --- p.40
Chapter 3.1.4 --- Neighboring Compatibility --- p.42
Chapter 3.1.5 --- Implementation Details --- p.43
Chapter 3.1.6 --- Acceleration --- p.45
Chapter 3.2 --- Experimental Results --- p.47
Chapter 3.2.1 --- Lighting and Pose Variations --- p.49
Chapter 3.2.2 --- Celebrity Faces from the Web --- p.54
Chapter 3.3 --- Conclusion --- p.54
Chapter 4 --- Style Transfer via Band Decomposition --- p.58
Chapter 4.1 --- Introduction --- p.58
Chapter 4.2 --- Algorithm Overview --- p.63
Chapter 4.3 --- Image Style Transfer --- p.64
Chapter 4.3.1 --- Band Decomposition --- p.64
Chapter 4.3.2 --- MF and HF Component Processing --- p.67
Chapter 4.3.3 --- Reconstruction --- p.74
Chapter 4.4 --- Experiments --- p.76
Chapter 4.4.1 --- Comparison to State-of-the-Art --- p.76
Chapter 4.4.2 --- Extended Application: Personalized Artwork --- p.82
Chapter 4.5 --- Conclusion --- p.84
Chapter 5 --- Coupled Encoding for Sketch Recognition --- p.86
Chapter 5.1 --- Introduction --- p.86
Chapter 5.1.1 --- Related work --- p.89
Chapter 5.2 --- Information-Theoretic Projection Tree --- p.90
Chapter 5.2.1 --- Projection Tree --- p.91
Chapter 5.2.2 --- Mutual Information Maximization --- p.92
Chapter 5.2.3 --- Tree Construction with MMI --- p.94
Chapter 5.2.4 --- Randomized CITP Forest --- p.102
Chapter 5.3 --- Coupled Encoding Based Descriptor --- p.103
Chapter 5.4 --- Experiments --- p.106
Chapter 5.4.1 --- Descriptor Comparison --- p.108
Chapter 5.4.2 --- Parameter Exploration --- p.109
Chapter 5.4.3 --- Experiments on Benchmarks --- p.112
Chapter 5.5 --- Conclusions --- p.115
Chapter 6 --- Conclusion --- p.116
Bibliography --- p.121

Optical pattern recognition--Mathematics

Computer vision

Identifier	oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_328421
Date	January 2012
Contributors	Zhang, Wei, Chinese University of Hong Kong Graduate School. Division of Information Engineering.
Source Sets	The Chinese University of Hong Kong
Language	English, Chinese
Detected Language	English
Type	Text, bibliography
Format	electronic resource, electronic resource, remote, 1 online resource (xxiii, 137 leaves) : ill. (chiefly col.)
Rights	Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
