
Visual semantic complex network for web images / CUHK electronic theses & dissertations collection

The enormous and ever-growing number of web images has brought great challenges and opportunities to computer vision. Much recent research has been devoted to effective access to web images (e.g., web image re-ranking, content-based image retrieval), as well as to making better use of web images as auxiliary data sources. A key problem that needs to be solved is to model the semantic and visual structure of web images and to organize web image collections effectively. This thesis proposes the Visual Semantic Complex Network (VSCN), an automatically generated graph structure that models this structure and organizes web image collections. The nodes of the VSCN are compact clusters of images with visual and semantic consistency, named semantic concepts. These nodes are connected based on their visual and semantic correlations. The VSCN consists of 33,240 semantic concepts, covers 10 million images, and is expandable given more computational resources. / The VSCN enables a macroscopic understanding of the structure of the web image collection. We study its structural properties from the viewpoint of complex networks and derive a great deal of valuable information. We study the connectivity and efficiency of the VSCN and show that it is a well-connected and efficient network. A study of in-degrees reflects the unequal importance of the semantic concepts. We also show the presence of community structures in the VSCN by analyzing the clustering coefficient. These studies not only give us a better understanding of the VSCN and the web image collection at a macroscopic level, but also inspire a number of important applications of the VSCN in traditional vision tasks. / Because the VSCN models the structure of web images, it can be used to assist a variety of vision tasks and to provide effective access to web images. We demonstrate three applications of the VSCN in this thesis. In the first application, we devise the Anchor Concept Graph (ACG) distance based on the VSCN and use it for web image re-ranking. As the VSCN effectively models the structure of web images, the ACG distance is able to reduce the semantic gap and better represent similarities between images. We demonstrate its effectiveness on image re-ranking benchmarks. The second application leverages the VSCN to improve content-based image retrieval. Because relevant images on the web are connected via inter-concept correlations and grouped by community structures, we can effectively reduce the search space by exploiting the structure of web images encoded in the VSCN, from which the image retrieval task greatly benefits. In the third application, we develop a novel image browsing scheme based on the VSCN. With the help of the VSCN structure, the proposed scheme allows image browsing over the entire network by navigating inter-concept connections. The browsing process not only bridges relevant images indexed under different textual queries, but is also guided by natural and meaningful transitions between images. We demonstrate that the VSCN-based scheme leads to a better user experience for image browsing. Apart from enabling better access to web image collections, the VSCN can also be used as a large, structured auxiliary database to assist other vision tasks. In this thesis, we present an example application to object recognition. Our method employs both the image and textual data indexed by the VSCN, as well as its structural information. Experiments on benchmark datasets demonstrate the effectiveness of our method and the potential of the VSCN database.
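For a concrete picture of the data structure the abstract describes, the following is a minimal, illustrative sketch (not the thesis implementation) of a VSCN-like graph in Python with NetworkX: semantic concepts as nodes, weighted edges for visual/semantic correlation, and the complex-network statistics the thesis studies (connectivity, efficiency, in-degree, clustering coefficient). All concept names and edge weights below are invented for illustration only.

# Minimal, hypothetical sketch of a VSCN-style concept graph and the
# complex-network statistics mentioned in the abstract.
import networkx as nx

vscn = nx.DiGraph()

# Nodes: semantic concepts, i.e. compact, visually and semantically
# consistent clusters of web images (names invented here).
for concept in ["golden gate bridge", "san francisco bay",
                "suspension bridge", "sailing boat"]:
    vscn.add_node(concept)

# Edges: inter-concept visual/semantic correlation (placeholder weights).
vscn.add_edge("golden gate bridge", "san francisco bay", weight=0.8)
vscn.add_edge("san francisco bay", "golden gate bridge", weight=0.7)
vscn.add_edge("golden gate bridge", "suspension bridge", weight=0.9)
vscn.add_edge("suspension bridge", "golden gate bridge", weight=0.6)
vscn.add_edge("san francisco bay", "suspension bridge", weight=0.4)
vscn.add_edge("san francisco bay", "sailing boat", weight=0.5)
vscn.add_edge("sailing boat", "san francisco bay", weight=0.5)

undirected = vscn.to_undirected()
print("weakly connected:", nx.is_weakly_connected(vscn))       # connectivity
print("global efficiency:", nx.global_efficiency(undirected))  # efficiency
print("in-degrees:", dict(vscn.in_degree()))                    # concept importance
print("avg clustering:", nx.average_clustering(undirected))    # community tendency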
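The abstract does not give the exact formulation of the ACG distance, so the sketch below only illustrates the general idea under stated assumptions: each image is represented by its visual similarity to a few anchor concepts, those scores are smoothed through an inter-concept correlation matrix taken from a VSCN-like graph, and images are compared in the smoothed concept space, which is one way a concept-level representation can narrow the semantic gap. The matrix C, the helper acg_like_distance, and all numbers are hypothetical.

# Hedged toy sketch of an anchor-concept-based distance for re-ranking.
import numpy as np

# Inter-concept correlation matrix (rows/columns = anchor concepts); values invented.
C = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])

def acg_like_distance(a, b, C):
    """Compare two images via their anchor-concept similarity vectors a and b."""
    a_s, b_s = C @ a, C @ b                       # propagate scores over concept correlations
    a_s = a_s / np.linalg.norm(a_s)
    b_s = b_s / np.linalg.norm(b_s)
    return 1.0 - float(a_s @ b_s)                 # cosine distance in the smoothed concept space

img1 = np.array([0.9, 0.7, 0.05])  # strongly related to the first two concepts
img2 = np.array([0.8, 0.6, 0.10])
img3 = np.array([0.05, 0.1, 0.9])  # related mainly to the third concept

print(acg_like_distance(img1, img2, C))  # small: semantically close images
print(acg_like_distance(img1, img3, C))  # large: semantically distant images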
海量並且不斷增長的互聯網圖片給計算機視覺的研究帶來了很大的挑戰和機遇。許多近期的研究致力於網絡圖片的有效獲取(例如網絡圖片重排序,基於內容的圖片檢索即CBIR),以及更有效利用網絡圖片作為輔助數據源。其中一個亟待解決的關鍵問題是如何對網絡圖片間的語義和視覺結構進行建模,並據此組織這些圖片。本論文提出一個自動生成的網絡結構,即視覺語義復雜網絡(VSCN)來建模網絡圖片的結構。VSCN上的節點由一些在視覺和語義層面具有一致性的圖片聚類組成,稱之為語義概念。這些節點又由它們之間的視覺和語義相關性所連接。VSCN共包含33,240個語義概念並囊括了超過一千萬的圖片,並且可以隨著計算資源的增加而不斷增長。 / VSCN使得對網絡圖片結構的宏觀描述成為可能。我們從復雜網絡的角度對它的結構性質進行了研究,並且得出了許多有價值的信息。我們研究了VSCN的連通性和效率,說明它是一個連通性和效率較好的網絡。對網絡入度的研究反映了語義概念具有不均等的重要性。我們通過分析聚類系數,說明了VSCN存在較明顯的社區結構。這些研究不僅僅為我們提供了對於VSCN和網絡圖片結構的宏觀理解,並且啟發了許多VSCN在計算機視覺任務中的重要應用。 / 因為VSCN建模了網絡圖片結構,它可以被用於許多計算機視覺的任務中並提供對網絡圖片有效的獲取。在本論文中我們展示了三個VSCN的應用。在第一個應用中,我們基於VSCN設計了用於網絡圖片重排序的ACG距離。由於VSCN有效地建模了網絡圖片的結構,ACG距離能夠減少語義鴻溝並更好地表達圖片之間的相似性。我們在基准數據集上展現了此方法的有效性。第二個應用利用VSCN改進了CBIR系統。由於網絡中的相關圖片被VSCN中的語義概念間相關性所聯系,並組成了社區結構,我們可以利用此種結構信息來有效地減小搜索空間。這使得圖片檢索任務獲益,並使檢索結果得到很大改善。在第三個應用中,我們基於VSCN設計了一個新穎的圖片瀏覽模式。我們的瀏覽模式可以允許通過語義概念間的連接對VSCN進行瀏覽。這個瀏覽過程不僅聯系了屬於不同文本檢索詞的相關圖片,並且被具有指導性的圖片過渡所輔助。我們證明這種基於VSCN的瀏覽方式能夠帶來圖片瀏覽中更好的用戶體驗。除去能夠使得獲取互聯網圖片獲益之外,VSCN同樣可以被用作一個大規模的具有結構的輔助數據庫,並輔助其他計算機視覺的任務。在本論文中,我們介紹了一個在物體識別上的應用實例。我們的方法利用了VSCN所檢索的圖片和文本信息,以及VSCN本身的結構信息。在標准數據集上的實驗證明了我們方法以及VSCN作為輔助數據庫的有效性。 / Qiu, Shi. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2014. / Includes bibliographical references (leaves 145-164). / Abstracts also in Chinese. / Title from PDF title page (viewed on 4 October 2016). / Detailed summary in vernacular field only.

Identifier oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_1290669
Date January 2014
Contributors Qiu, Shi (author), Tang, Xiaoou (thesis advisor), Wang, Xiaogang, active 2003 (thesis advisor), Chinese University of Hong Kong Graduate School. Division of Information Engineering (degree granting institution)
Source Sets The Chinese University of Hong Kong
Language English, Chinese
Detected Language English
Type Text, bibliography, text
Format electronic resource, remote, 1 online resource (xxi, 164 leaves) : illustrations (some color), computer, online resource
Rights Use of this resource is governed by the terms and conditions of the Creative Commons "Attribution-NonCommercial-NoDerivatives 4.0 International" License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
