41 |
Efficient Techniques For Relevance Feedback Processing In Content-based Image Retrieval
Liu, Danzhou 01 January 2009
In content-based image retrieval (CBIR) systems, there are two general types of search: target search and category search. Unlike queries in traditional database systems, users in most cases cannot specify an ideal query to retrieve the desired results for either target search or category search in multimedia database systems, and have to rely on iterative feedback to refine their query. Efficient evaluation of such iterative queries can be a challenge, especially when the multimedia database contains a large number of entries, the search needs many iterations, and the underlying distance measure is computationally expensive. The overall processing costs, including CPU and disk I/O, increase further if there are numerous concurrent accesses. To address these limitations of relevance feedback processing, we propose a generic framework, including a query model, index structures, and query optimization techniques. Specifically, this thesis makes five main contributions. The first contribution is an efficient target search technique. We propose four target search methods: naive random scan (NRS), local neighboring movement (LNM), neighboring divide-and-conquer (NDC), and global divide-and-conquer (GDC). All these methods are built around a common strategy: they do not retrieve already checked images (i.e., they shrink the search space). Furthermore, NDC and GDC exploit Voronoi diagrams to aggressively prune the search space and move towards target images. We show theoretically and experimentally that GDC and NDC converge much faster than NRS and recent methods. The second contribution is a method to reduce the number of expensive distance computations when answering k-NN queries with non-metric distance measures. We propose an efficient distance mapping function that transforms non-metric measures into metric ones while preserving the original distance orderings. 
Then existing metric index structures (e.g., M-tree) can be used to reduce the computational cost by exploiting the triangle inequality. The third contribution is an incremental query processing technique for Support Vector Machines (SVMs). SVMs have been widely used in multimedia retrieval to learn a concept in order to find the best matches. SVMs, however, scale poorly as the database grows. To address this limitation, we propose an efficient query evaluation technique that employs incremental updates. The proposed technique also takes advantage of a tuned index structure to efficiently prune irrelevant data. As a result, only a small portion of the data set needs to be accessed for query processing. This index structure also provides an inexpensive means to process the set of candidates to evaluate the final query result. This technique can work with different kernel functions and kernel parameters. The fourth contribution is a method to avoid local optimum traps. Existing CBIR systems, designed around query refinement based on relevance feedback, suffer from local optimum traps that may severely impair the overall retrieval performance. We therefore propose a simulated annealing-based approach to address this important issue. When the search becomes stuck at a local optimum, we employ a neighborhood search technique (i.e., simulated annealing) to continue the search for additional matching images, thus escaping from the local optimum. We also propose an index structure to speed up such neighborhood search. Finally, the fifth contribution is a generic framework to support concurrent accesses. We develop new storage and query processing techniques to exploit sequential access and leverage inter-query concurrency to share computation. 
Our experimental results, based on the Corel dataset, indicate that the proposed optimization can significantly reduce average response time while achieving better precision and recall, and is scalable to support a large user community. This latter performance characteristic is largely neglected in existing systems, making them less suitable for large-scale deployment. With the growing interest in Internet-scale image search applications, our framework offers an effective solution to the scalability problem.
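The abstract notes that, once a distance is metric, the triangle inequality lets an index skip expensive distance computations. A minimal sketch of that pruning idea follows. It assumes a single pivot point and Euclidean distance, and the function name `knn_with_pivot` is hypothetical; it is not the thesis's mapping function or the M-tree.

```python
import heapq
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_with_pivot(query, points, pivot, k, dist=euclidean):
    """Return (neighbors, computed): the k nearest (distance, index) pairs
    and the number of query-to-point distances actually evaluated."""
    # In a real index these pivot distances would be precomputed offline.
    pivot_dists = [dist(p, pivot) for p in points]
    d_qp = dist(query, pivot)
    # Triangle inequality: |d(q, pivot) - d(p, pivot)| <= d(q, p).
    lower = [abs(d_qp - dp) for dp in pivot_dists]
    order = sorted(range(len(points)), key=lambda i: lower[i])
    heap = []  # max-heap on distance, stored as (-distance, index)
    computed = 0
    for i in order:
        if len(heap) == k and lower[i] >= -heap[0][0]:
            break  # every remaining point has an even larger lower bound
        d = dist(query, points[i])
        computed += 1
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))
    return sorted((-nd, i) for nd, i in heap), computed
```

Because candidates are visited in order of increasing lower bound, the scan can stop as soon as the bound reaches the current k-th best distance, so the result matches a brute-force scan while evaluating fewer distances on favorable data.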
|
42 |
IMAGE CAPTIONING FOR REMOTE SENSING IMAGE ANALYSIS
Hoxha, Genc 09 August 2022
Image Captioning (IC) aims to generate a coherent and comprehensive textual description that summarizes the complex content of an image. It is a combination of computer vision and natural language processing techniques to encode the visual features of an image and translate them into a sentence. In the context of remote sensing (RS) analysis, IC has been emerging as a new research area of high interest since it not only recognizes the objects within an image but also describes their attributes and relationships. In this thesis, we propose several IC methods for RS image analysis. We focus on the design of different approaches that take into consideration the peculiarity of RS images (e.g. spectral, temporal and spatial properties) and study the benefits of IC in challenging RS applications.
In particular, we focus our attention on developing a new decoder which is based on support vector machines. Compared to the traditional decoders that are based on deep learning, the proposed decoder is particularly interesting for situations in which only a few training samples are available, because it alleviates the problem of overfitting. The distinguishing features of the proposed decoder are its simplicity and efficiency. It has only one hyperparameter, does not require expensive processing units, and is very fast in terms of training and testing time, making it suitable for real-life applications. Despite the efforts made in developing reliable and accurate IC systems, the task is far from being solved. The generated descriptions are affected by several errors related to the attributes and the objects present in an RS scene. Once an error occurs, it is propagated through the recurrent layers of the decoder, leading to inaccurate descriptions. To cope with this issue, we propose two post-processing techniques that aim to improve the generated sentences by detecting and correcting potential errors. They are based on the Hidden Markov Model and the Viterbi algorithm. The former aims to generate a set of possible states while the latter aims at finding the optimal sequence of states. The proposed post-processing techniques can be injected into any IC system at test time to improve the quality of the generated sentences. While all the captioning systems developed in the RS community are devoted to single RGB images, we propose two captioning systems that can be applied to multitemporal and multispectral RS images. The proposed captioning systems are able to describe the changes that occurred in a given geographical area over time. We refer to this new paradigm of analysing multitemporal and multispectral images as change captioning (CC). To test the proposed CC systems, we construct two novel datasets composed of bitemporal RS images. 
The first one is composed of very high-resolution RGB images while the second one consists of medium-resolution multispectral satellite images. To advance the task of CC, the constructed datasets are publicly available at the following link: https://disi.unitn.it/~melgani/datasets.html. Finally, we analyse the potential of IC for content-based image retrieval (CBIR) and show its applicability and advantages compared to the traditional techniques. Specifically, we focus our attention on developing a CBIR system that represents an image with generated descriptions and uses sentence similarity to search and retrieve relevant RS images. Compared to traditional CBIR systems, the proposed system is able to search and retrieve images using either an image or a sentence as a query, making it more convenient for end users. The achieved results show the promise of our proposed methods compared to the baselines and state-of-the-art methods.
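The abstract pairs a Hidden Markov Model (which supplies candidate states) with the Viterbi algorithm (which picks the most probable state sequence). The sketch below is the textbook Viterbi recursion on a toy model; the state names, observation symbols and all probabilities are invented for illustration and are not taken from the thesis.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (path, probability) of the most likely hidden state sequence."""
    # best[t][s] = (probability of the best path ending in s at step t, predecessor)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        layer = {}
        for s in states:
            p, arg = max((prev[r][0] * trans_p[r][s], r) for r in states)
            layer[s] = (p * emit_p[s][obs], arg)
        best.append(layer)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return path, prob


# Toy model (hypothetical numbers): two hidden states, three observation symbols.
STATES = ["A", "B"]
START = {"A": 0.6, "B": 0.4}
TRANS = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
EMIT = {"A": {"x": 0.5, "y": 0.4, "z": 0.1}, "B": {"x": 0.1, "y": 0.3, "z": 0.6}}
```

In a captioning post-processor one could imagine the hidden states as candidate corrected words and the observations as the generated words, but that mapping is an assumption here, not a description of the proposed techniques.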
|
43 |
A HUMAN-COMPUTER INTEGRATED APPROACH TOWARDS CONTENT BASED IMAGE RETRIEVAL
Kidambi, Phani Nandan January 2010
No description available.
|
44 |
Ανάπτυξη μεθόδων ανάκτησης εικόνας βάσει περιεχομένου σε αναπαραστάσεις αντικειμένων ασαφών ορίων / Development of methods for content-based image retrieval in representations of fuzzily bounded objects
Καρτσακάλης, Κωνσταντίνος 11 March 2014
Τα δεδομένα εικόνων που προκύπτουν από την χρήση βιο-ιατρικών μηχανημάτων είναι από την φύση τους ασαφή, χάρη σε μια σειρά από παράγοντες ανάμεσα στους οποίους οι περιορισμοί στον χώρο, τον χρόνο, οι παραμετρικές αναλύσεις καθώς και οι φυσικοί περιορισμοί που επιβάλλει το εκάστοτε μηχάνημα. Όταν το αντικείμενο ενδιαφέροντος σε μια τέτοια εικόνα έχει κάποιο μοτίβο φωτεινότητας ευκρινώς διαφορετικό από τα μοτίβα των υπόλοιπων αντικειμένων που εμφανίζονται, είναι εφικτή η κατάτμηση της εικόνας με έναν απόλυτο, δυαδικό τρόπο που να εκφράζει επαρκώς τα όρια των αντικειμένων. Συχνά ωστόσο σε τέτοιες εικόνες υπεισέρχονται παράγοντες όπως η ανομοιογένεια των υλικών που απεικονίζονται, θόλωμα, θόρυβος ή και μεταβολές στο υπόβαθρο που εισάγονται από την συσκευή απεικόνισης με αποτέλεσμα οι εντάσεις φωτεινότητας σε μια τέτοια εικόνα να εμφανίζονται με έναν ασαφή, βαθμωτό, «μη-δυαδικό» τρόπο.
Μια πρωτοπόρα τάση στην σχετική βιβλιογραφία είναι η αξιοποίηση της ασαφούς σύνθεσης των αντικειμένων μιας τέτοιας εικόνας, με τρόπο ώστε η ασάφεια να αποτελεί γνώρισμα του εκάστοτε αντικειμένου αντί για ανεπιθύμητο χαρακτηριστικό: αντλώντας από την θεωρία ασαφών συνόλων, τέτοιες προσεγγίσεις κατατμούν μια εικόνα με βαθμωτό, μη-δυαδικό τρόπο αποφεύγοντας τον μονοσήμαντο καθορισμό ορίων μεταξύ των αντικειμένων. Μια τέτοια προσέγγιση καταφέρνει να αποτυπώσει με μαθηματικούς όρους την ασάφεια της θολής εικόνας, μετατρέποντάς την σε χρήσιμο εργαλείο ανάλυσης στα χέρια ενός ειδικού. Από την άλλη, το μέγεθος της ασάφειας που παρατηρείται σε τέτοιες εικόνες είναι τέτοιο ώστε πολλές φορές να ωθεί τους ειδικούς σε διαφορετικές ή και αντικρουόμενες κατατμήσεις, ακόμη και από το ίδιο ανθρώπινο χέρι. Επιπλέον, το παραπάνω έχει ως αποτέλεσμα την οικοδόμηση βάσεων δεδομένων στις οποίες για μια εικόνα αποθηκεύονται πολλαπλές κατατμήσεις, δυαδικές και μη.
Μπορούμε με βάση μια κατάτμηση εικόνας να ανακτήσουμε άλλες, παρόμοιες τέτοιες εικόνες των οποίων τα δεδομένα έχουν προέλθει από αναλύσεις ειδικών, χωρίς σε κάποιο βήμα να υποβαθμίζουμε την ασαφή φύση των αντικειμένων που απεικονίζονται; Πως επιχειρείται η ανάκτηση σε μια βάση δεδομένων στην οποία έχουν αποθηκευτεί οι παραπάνω πολλαπλές κατατμήσεις για κάθε εικόνα; Αποτελεί κριτήριο ομοιότητας μεταξύ εικόνων το πόσο συχνά θα επέλεγε ένας ειδικός να οριοθετήσει ένα εικονοστοιχείο μιας τέτοιας εικόνας εντός ή εκτός ενός τέτοιου θολού αντικειμένου;
Στα πλαίσια της παρούσας εργασίας προσπαθούμε να απαντήσουμε στα παραπάνω ερωτήματα, μελετώντας διεξοδικά την διαδικασία ανάκτησης τέτοιων εικόνων. Προσεγγίζουμε το πρόβλημα θεωρώντας ότι για κάθε εικόνα αποθηκεύονται στην βάση μας περισσότερες της μίας κατατμήσεις, τόσο δυαδικής φύσης από ειδικούς όσο και από ασαφείς από αυτόματους αλγορίθμους. Επιδιώκουμε εκμεταλλευόμενοι το χαρακτηριστικό της ασάφειας να ενοποιήσουμε την διαδικασία της ανάκτησης και για τις δυο παραπάνω περιπτώσεις, προσεγγίζοντας την συχνότητα με την οποία ένας ειδικός θα οριοθετούσε το εκάστοτε ασαφές αντικείμενο με συγκεκριμένο τρόπο καθώς και τα ενδογενή χαρακτηριστικά ενός ασαφούς αντικειμένου που έχει εξαχθεί από αυτόματο αλγόριθμο. Προτείνουμε κατάλληλο μηχανισμό ανάκτησης ο οποίος αναλαμβάνει την μετάβαση από τον χώρο της αναποφασιστικότητας και του ασαφούς στον χώρο της πιθανοτικής αναπαράστασης, διατηρώντας παράλληλα όλους τους περιορισμούς που έχουν επιβληθεί στα δεδομένα από την πρωταρχική ανάλυσή τους. Στην συνέχεια αξιολογούμε την διαδικασία της ανάκτησης, εφαρμόζοντας την νέα μέθοδο σε ήδη υπάρχον σύνολο δεδομένων από το οποίο και εξάγουμε συμπεράσματα για τα αποτελέσματά της. / Image data acquired through the use of bio-medical scanners are by nature fuzzy, thanks to a series of factors including limitations in spatial, temporal and parametric resolutions other than the physical limitations of the device. When the object of interest in such an image displays intensity patterns that are distinct from the patterns of other objects appearing together, a segmentation of the image in a hard, binary manner that clearly defines the borders between objects is feasible. It is frequent though that in such images factors like the lack of homogeneity between materials depicted, blurring, noise or deviations in the background pose difficulties in the above process. Intensity values in such an image appear in a fuzzy, gradient, “non-binary” manner.
An innovative trend in the field is to make use of the fuzzy composition of objects in such an image, so that fuzziness becomes a characteristic feature of the object instead of an undesirable trait: drawing on the theory of fuzzy sets, such approaches segment an image in a graded, non-binary manner and thus avoid fixing a crisp boundary between the depicted objects. Such approaches succeed in capturing the fuzziness of the blurry image in mathematical terms, turning it into a powerful tool of analysis in the hands of an expert. On the other hand, the degree of fuzziness observed in such images often leads experts to different or even contradictory segmentations, even when drawn by the same human hand. What is more, this results in image databases that store multiple segmentations for each image, both binary and fuzzy.
Starting from a segmentation of an image, can we retrieve other similar images whose segmented data have been produced by experts, without downgrading the fuzzy nature of the depicted objects at any step? How exactly are images retrieved from a database that stores multiple segmentations of each image? Is the frequency with which an expert would choose to include a pixel in, or exclude it from, a fuzzy object a criterion of similarity between objects depicted in images? Finally, how well can we treat fuzziness in a probabilistic manner, thus providing a valuable tool for bridging the gap between automatic segmentation algorithms and segmentations coming from field experts?
In this thesis, we tackle the aforementioned problems by studying thoroughly the process of image retrieval in a fuzzy context. We consider the case in which a database consists of images for which more than one segmentation exists, both crisp, derived from experts' analysis, and fuzzy, generated by segmentation algorithms. We attempt to unify the retrieval process for both cases by taking advantage of fuzziness, approximating the frequency with which an expert would draw the boundaries of the fuzzy object in a particular way, along with the intrinsic features of a fuzzy, algorithm-generated object. We propose a suitable retrieval mechanism that handles the transition from the domain of indecisiveness and fuzziness to that of a probabilistic representation, while preserving all the constraints imposed on the data by their initial analysis. Next, we evaluate the retrieval process by applying the new method to an existing dataset and draw conclusions on the effectiveness of the proposed scheme.
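One way to picture the idea of treating expert segmentations and fuzzy segmentations uniformly is to turn several binary expert masks into a per-pixel inclusion frequency, which lives in the same [0, 1] range as a fuzzy membership map. The sketch below does only that, plus a simple similarity score. Both functions and the similarity formula are illustrative assumptions, not the retrieval mechanism proposed in the thesis.

```python
def inclusion_frequency(masks):
    """Per-pixel fraction of binary masks (lists of 0/1 rows) that include the pixel."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [[sum(m[r][c] for m in masks) / n for c in range(cols)]
            for r in range(rows)]


def similarity(map_a, map_b):
    """1 minus the mean absolute difference of two maps with values in [0, 1]."""
    total = sum(abs(x - y)
                for row_a, row_b in zip(map_a, map_b)
                for x, y in zip(row_a, row_b))
    count = sum(len(row) for row in map_a)
    return 1.0 - total / count


# Two hypothetical expert masks of a 2x2 image and one fuzzy membership map.
expert_masks = [[[1, 1], [0, 0]],
                [[1, 0], [0, 0]]]
fuzzy_map = [[0.9, 0.5], [0.1, 0.0]]
frequency = inclusion_frequency(expert_masks)
score = similarity(frequency, fuzzy_map)
```

Here the second pixel is included by one of two experts, so its frequency of 0.5 can be compared directly with the membership value an automatic algorithm assigned to the same pixel.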
|
45 |
美術影像中顏色風格探勘之研究 / Mining Painting Color Style from Fine Arts
劉勁男, Chin-Nan Liu Unknown Date
資料探勘技術的研究,隨著資料庫系統的普遍建置而日益重要。但是尚沒有研究針對美術繪畫影像的風格探勘。本研究的目的也就是發展資料探勘的技術,從繪畫的低階影像特徵中探勘出繪畫風格,並以分類規則的方式來表示繪畫風格。畫家的畫風是指表現在大部分畫作裡的繪畫風格,也是與其他畫家相比,在畫作的共同特徵上之獨特性與差異性。基於以上的兩個特性,我們把畫風探勘分為三個議題︰一、feature extraction,從美術影像中萃取低階影像特徵,我們使用的有主要顏色與相鄰顏色。為了因應MPEG-7標準即將統一描述多媒體資料的內容表示方式,所以我們也針對MPEG-7規格的低階影像特徵。二、mining frequent patterns,從所有該畫家畫作的低階影像特徵找出共同的個人畫作特徵,我們利用association rule中mining frequent itemset的方法找出畫風中顏色的搭配,而且我們也發展了一個新的規則,frequent 2D sequential pattern,用來表示畫風中顏色的佈局。三、classification,找出每個畫家與別人不一樣的個人畫作特徵,就是定量描述的繪畫影像風格。我們分別利用C4.5與修改過的associative classification。我們提出了二個改進associative classification的分類演算法,single-feature variant support (SFVS) classification,容許各個class進行不同minimum support的mining以及與multi-feature variant support (MFVS) classification,同時用不同低階影像特徵進行分類。有關實驗的進行,我們有兩組測試畫家,一組是西方印象派畫家,另一組則是受西方印象派影響的臺灣本土畫家。每組畫家都進行兩人配對,分別建出2-way的associative classifier、SFVS classifier與MFVS classifier,並評估畫風探勘演算法的效果。最後,本論文實作了一個「影像風格查詢系統」。查詢系統的基本功能提供使用者以風格查詢藝術影像的功能。例如,使用者可以查詢具有梵谷畫風的畫作或是查詢融合雷諾瓦與莫內畫風的畫作。 / Data mining research has become increasingly important. However, no studies have investigated painting style mining of fine arts images. The purpose of this study is to develop a new approach for mining painting style from low-level image features of fine art images and to represent painting style as classification rules. The painting style of an artist is characterized not only by the frequent patterns that appear in most works but also by the patterns that discriminate the artist from others. According to these two characteristics, we identified three design issues for painting style mining: feature extraction, mining frequent patterns and classification. Feature extraction extracts low-level image features from fine arts images. In this thesis, we extract dominant color and adjacent color relationships as low-level image features. In addition, we extract MPEG-7 descriptors. Mining frequent patterns finds the frequent patterns that appear in all works by one artist. 
We apply the technique of frequent itemset mining from association rule mining to find which colors are likely to be used together in an artist's painting style. Moreover, we propose a new pattern, the frequent 2D subsequence, to represent painting style in terms of color layout. Classification finds the patterns that discriminate an artist from others and presents those patterns as painting style in a quantitative manner. We use C4.5 and modified associative classification as classification methods. We developed two associative classification algorithms: single-feature variant support (SFVS) classification and multi-feature variant support (MFVS) classification. The experiments are conducted on two groups of paintings. One is the work of impressionist artists and the other is the work of Taiwanese artists who were influenced by impressionism. The 2-way associative classifier, SFVS classifier and MFVS classifier are constructed for each group of paintings and their performance is evaluated. Finally, we implemented a "Painting Style Query System" that lets users query fine arts images by painting style. For example, a user can query images that match Van Gogh's style or images that match a blend of Renoir's and Monet's styles.
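The color-combination step can be illustrated with a small frequent itemset count: each painting is reduced to its set of dominant colors, and color sets that occur in at least a minimum fraction of paintings are kept. The sketch below enumerates candidate sets by brute force rather than with Apriori-style pruning, and the color names and support threshold are made up for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_color_sets(paintings, min_support, max_size=2):
    """Map each color set of up to max_size colors to its support (fraction of
    paintings containing it), keeping only sets whose support >= min_support."""
    counts = Counter()
    for colors in paintings:
        unique = sorted(set(colors))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    threshold = min_support * len(paintings)
    return {combo: n / len(paintings)
            for combo, n in counts.items() if n >= threshold}


# Hypothetical dominant colors extracted from four paintings by one artist.
paintings = [
    ["blue", "yellow", "green"],
    ["blue", "yellow"],
    ["red", "yellow"],
    ["blue", "yellow", "red"],
]
frequent = frequent_color_sets(paintings, min_support=0.75)
```

With these inputs the pair blue and yellow appears in three of four paintings, so it would count as a frequent color combination for this artist at a support threshold of 0.75.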
|
46 |
Efficient Image Retrieval with Statistical Color Descriptors
Viet Tran, Linh January 2003
Color has been widely used in content-based image retrieval (CBIR) applications. In such applications the color properties of an image are usually characterized by the probability distribution of the colors in the image. A distance measure is then used to measure the (dis-)similarity between images based on the descriptions of their color distributions in order to quickly find relevant images. The development and investigation of statistical methods for robust representations of such distributions, the construction of distance measures between them and their applications in efficient retrieval, browsing, and structuring of very large image databases are the main contributions of the thesis. In particular we have addressed the following problems in CBIR. Firstly, different non-parametric density estimators are used to describe color information for CBIR applications. Kernel-based methods using nonorthogonal bases together with a Gram-Schmidt procedure and the application of the Fourier transform are introduced and compared to previously used histogram-based methods. Our experiments show that efficient use of kernel density estimators improves the retrieval performance of CBIR. The practical problem of how to choose an optimal smoothing parameter for such density estimators as well as the selection of the histogram bin-width for CBIR applications are also discussed. Distance measures between color distributions are then described in a differential geometry-based framework. This allows the incorporation of geometrical features of the underlying color space into the distance measure between the probability distributions. The general framework is illustrated with two examples: Normal distributions and linear representations of distributions. The linear representation of color distributions is then used to derive new compact descriptors for color-based image retrieval. 
These descriptors are based on the combination of two ideas: incorporating information from the structure of the color space with information from images, and applying projection methods in the space of color distributions and the space of differences between neighboring color distributions. In our experiments we used several image databases containing more than 1,300,000 images. The experiments show that the method developed in this thesis is very fast and that the retrieval performance achieved compares favorably with existing methods. A CBIR system has been developed and is currently available at http://www.media.itn.liu.se/cse. We also describe color invariant descriptors that can be used to retrieve images of objects independent of geometrical factors and the illumination conditions under which these images were taken. Both statistics- and physics-based methods are proposed and examined. We investigated the interaction between light and material using different physical models and applied the theory of transformation groups to derive geometry color invariants. Using the proposed framework, we are able to construct all independent invariants for a given physical model. The dichromatic reflection model and the Kubelka-Munk model are used as examples for the framework. The proposed color invariant descriptors are then applied to CBIR, color image segmentation, and color correction applications. In the last chapter of the thesis we describe an industrial application where different color correction methods are used to optimize the layout of a newspaper page. / A search engine based on the methods described in this thesis can be found at http://pub.ep.liu.se/cse/db/?. Note that the question mark must be included in the address.
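The comparison between histogram descriptors and kernel density estimators can be made concrete with a one-dimensional toy case. The sketch below uses a plain Gaussian kernel sampled at bin centers and an L1 distance; the bandwidth, the bin count and the assumption that color values are scaled to [0, 1) are illustrative choices, and the Gram-Schmidt and Fourier-based variants mentioned in the abstract are not reproduced.

```python
import math


def histogram(values, bins):
    """Normalized histogram of values assumed to lie in [0, 1)."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return [c / len(values) for c in counts]


def kernel_estimate(values, bins, bandwidth):
    """Gaussian kernel density estimate sampled at bin centers, normalized to sum to 1."""
    centers = [(i + 0.5) / bins for i in range(bins)]
    density = [sum(math.exp(-0.5 * ((c - v) / bandwidth) ** 2) for v in values)
               for c in centers]
    total = sum(density)
    return [d / total for d in density]


def l1_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


# Two nearly identical colors that fall on opposite sides of a bin boundary.
a, b = [0.24], [0.26]
hist_gap = l1_distance(histogram(a, 4), histogram(b, 4))
kernel_gap = l1_distance(kernel_estimate(a, 4, 0.1), kernel_estimate(b, 4, 0.1))
```

The two values land in different histogram bins, so the histogram distance is at its maximum, while the smoothed estimates remain close. This is one intuition for why the choice of smoothing parameter and bin width matters for retrieval.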
|
47 |
Passage à l’échelle des méthodes de recherche sémantique dans les grandes bases d’images / Scalable search engines for content-based image retrieval task in huge image database
Gorisse, David 17 December 2010
Avec la révolution numérique de cette dernière décennie, la quantité de photos numériques mise à disposition de chacun augmente plus rapidement que la capacité de traitement des ordinateurs. Les outils de recherche actuels ont été conçus pour traiter de faibles volumes de données. Leur complexité ne permet généralement pas d'effectuer des recherches dans des corpus de grande taille avec des temps de calculs acceptables pour les utilisateurs. Dans cette thèse, nous proposons des solutions pour passer à l'échelle les moteurs de recherche d'images par le contenu. Dans un premier temps, nous avons considéré les moteurs de recherche automatique traitant des images indexées sous la forme d'histogrammes globaux. Le passage à l'échelle de ces systèmes est obtenu avec l'introduction d'une nouvelle structure d'index adaptée à ce contexte qui nous permet d'effectuer des recherches de plus proches voisins approximées mais plus efficaces. Dans un second temps, nous nous sommes intéressés à des moteurs plus sophistiqués permettant d'améliorer la qualité de recherche en travaillant avec des index locaux tels que les points d'intérêt. Dans un dernier temps, nous avons proposé une stratégie pour réduire la complexité de calcul des moteurs de recherche interactifs. Ces moteurs permettent d'améliorer les résultats en utilisant des annotations que les utilisateurs fournissent au système lors des sessions de recherche. Notre stratégie permet de sélectionner rapidement les images les plus pertinentes à annoter en optimisant une méthode d'apprentissage actif. / Over the last decade, the digital revolution has brought a massive increase in the quantity of digital pictures. Database sizes grow much faster than the processing capacity of computers. 
Current search engines, which were conceived for small data volumes, no longer allow searches in these new corpora with response times acceptable to users. In this thesis, we propose scalable content-based image retrieval engines. First, we considered automatic search engines where images are indexed with global histograms. Second, we looked at more sophisticated engines that improve search quality by working with bags of features. Finally, we proposed a strategy to reduce the complexity of interactive search engines. These engines improve the results by using labels that users supply to the system during search sessions.
|
48 |
Improving image representation using image saliency and information gain / Amélioration de la représentation des images : apport de la saillance et du gain d'information
Le, Huu Ton 23 November 2015
De nos jours, avec le développement des nouvelles technologies multimédia, la recherche d’images basée sur le contenu visuel est un sujet de recherche en plein essor avec de nombreux domaines d'application: indexation et recherche d’images, la graphologie, la détection et le suivi d’objets... Un des modèles les plus utilisés dans ce domaine est le sac de mots visuels qui tire son inspiration de la recherche d’information dans des documents textuels. Dans ce modèle, les images sont représentées par des histogrammes de mots visuels à partir d'un dictionnaire visuel de référence. La signature d’une image joue un rôle important car elle détermine la précision des résultats retournés par le système de recherche. Dans cette thèse, nous étudions les différentes approches concernant la représentation des images. Notre première contribution est de proposer une nouvelle méthodologie pour la construction du vocabulaire visuel en utilisant le gain d'information extrait des mots visuels. Ce gain d’information est la combinaison d’un modèle de recherche d’information avec un modèle d'attention visuelle. Ensuite, nous utilisons un modèle d'attention visuelle pour améliorer la performance de notre modèle de sacs de mots visuels. Cette étude de la saillance des descripteurs locaux souligne l’importance d’utiliser un modèle d’attention visuelle pour la description d’une image. La dernière contribution de cette thèse au domaine de la recherche d’information multimédia démontre comment notre méthodologie améliore le modèle des sacs de phrases visuelles. Finalement, une technique d’expansion de requêtes est utilisée pour augmenter la performance de la recherche par les deux modèles étudiés. 
/ Nowadays, along with the development of multimedia technology, content-based image retrieval (CBIR) has become an interesting and active research topic with an increasing number of application domains: image indexing and retrieval, face recognition, event detection, handwriting scanning, object detection and tracking, image classification, landmark detection... One of the most popular models in CBIR is Bag of Visual Words (BoVW), which is inspired by the Bag of Words model from the Information Retrieval field. In the BoVW model, images are represented by histograms of visual words from a visual vocabulary. By comparing the image signatures, we can tell the difference between images. Image representation plays an important role in a CBIR system as it determines the precision of the retrieval results. In this thesis, the image representation problem is addressed. Our first contribution is to propose a new framework for visual vocabulary construction using information gain (IG) values. The IG values are computed by a weighting scheme combined with a visual attention model. Secondly, we propose to use a visual attention model to improve the performance of the proposed BoVW model. This contribution addresses the importance of salient key-points in the images through a study of the saliency of local feature detectors. Inspired by the results of this study, we use saliency as a weighting or as an additional histogram for image representation. The last contribution of this thesis to CBIR shows how our framework enhances the Bag of Visual Phrases (BoVP) model. Finally, a query expansion technique is employed to increase the retrieval scores of both the BoVW and BoVP models.
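The BoVW representation and the idea of using saliency as a weighting can be sketched in a few lines: each local descriptor votes for its nearest visual word, and the vote is scaled by a saliency value for the key-point. The vocabulary, descriptors and saliency scores below are hypothetical two-dimensional toy values, and the function names are not from the thesis.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor (squared Euclidean distance)."""
    return min(range(len(vocabulary)),
               key=lambda i: sum((d - w) ** 2
                                 for d, w in zip(descriptor, vocabulary[i])))


def bovw_histogram(descriptors, vocabulary, saliency=None):
    """Normalized histogram of visual words; each descriptor votes with its
    saliency weight, or with weight 1.0 when no saliency is given."""
    hist = [0.0] * len(vocabulary)
    for j, desc in enumerate(descriptors):
        weight = 1.0 if saliency is None else saliency[j]
        hist[nearest_word(desc, vocabulary)] += weight
    total = sum(hist)
    return [h / total for h in hist] if total else hist


vocabulary = [(0, 0), (10, 10)]          # two toy visual words
descriptors = [(1, 1), (9, 9), (0, 1)]   # three toy local descriptors
plain = bovw_histogram(descriptors, vocabulary)
weighted = bovw_histogram(descriptors, vocabulary, saliency=[0.1, 0.8, 0.1])
```

Without weighting, the first word dominates because two descriptors map to it. With the saliency weights, the single highly salient key-point shifts most of the mass to the second word, which changes the image signature used for comparison.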
|
49 |
Arcabouço para recuperação de imagens por conteúdo visando à percepção do usuário / Content-based image retrieval aimed at reaching user's perception
Bugatti, Pedro Henrique 29 October 2012
Na última década observou-se grande interesse para o desenvolvimento de técnicas para Recuperação de Imagens Baseada em Conteúdo devido à explosão na quantidade de imagens capturadas e à necessidade de armazenamento e recuperação dessas imagens. A área médica especificamente é um exemplo que gera um grande fluxo de informações, principalmente imagens digitais para a realização de diagnósticos. Porém um problema ainda permanecia sem solução que tratava-se de como atingir a similaridade baseada na percepção do usuário, uma vez que para que se consiga uma recuperação eficaz, deve-se caracterizar e quantificar o melhor possível tal similaridade. Nesse contexto, o presente trabalho de Doutorado visou trazer novas contribuições para a área de recuperação de imagens por conteúdo. Dessa forma, almejou ampliar o alcance de consultas por similaridade que atendam às expectativas do usuário. Tal abordagem deve permitir ao sistema CBIR a manutenção da semântica da consulta desejada pelo usuário. Assim, foram desenvolvidos três métodos principais. O primeiro método visou a seleção de características por demanda baseada na intenção do usuário, possibilitando dessa forma agregação de semântica ao processo de seleção de características. Já o segundo método culminou no desenvolvimento de abordagens para coleta e agregação de perfis de usuário, bem como novas formulações para quantificar a similaridade perceptual dos usuários, permitindo definir dinamicamente a função de distância que melhor se adapta à percepção de um determinado usuário. O terceiro método teve por objetivo a modificação dinâmica de funções de distância em diferentes ciclos de realimentação. Para tanto foram definidas políticas para realizar tal modificação as quais foram baseadas na junção de informações a priori da base de imagens, bem como, na percepção do usuário no processo das consultas por similaridade. 
Os experimentos realizados mostraram que os métodos propostos contribuíram de maneira efetiva para caracterizar e quantificar a similaridade baseada na percepção do usuário, melhorando consideravelmente a busca por conteúdo segundo as expectativas dos usuários / In the last decade techniques for content-based image retrieval (CBIR) have been intensively explored due to the increase in the amount of capttured images and the need of fast retrieval of them. The medical field is a specific example that generates a large flow of information, especially digital images employed for diagnosing. One issue that still remains unsolved deals with how to reach the perceptual similarity. That is, to achieve an effectivs retrieval, one must characterize and quantify the perceptual similarity regarding the specialist in the field. Therefore, the present thesis was conceived tofill in this gap creating a consistent support to perform similarity queries over images, maintaining the semantics of a given query desired by tyhe user, bringing new contribuitions to the content-based retrieval area. To do so, three main methods were developed. The first methods applies a novel retrieval approach that integrates techniques of feature selection and relevance feedback to preform demand-driven feature selection guided by perceptual similarity, tuning the mining process on the fly, according to the user´s intention. The second method culminated in the development of approaches for harvesting and surveillance of user profiles, as well as new formulations to quantify the perceptual similarity of users , allowing to dynamically set the distance function that best fits the perception of a given user. The third method introduces a novel approach to enhance the retrieval process through user feedback and profiling, modifying the distance function in each feedback cycle choosing the best one for each cycle according to the user expectation. 
The experiments showed that the proposed metods effectively contributed to capture the perceptual similarity, improving in a great extent the image retrieval according to users´expectations
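To make the idea of choosing a distance function per feedback cycle concrete, here is a minimal Python sketch. It is not the thesis's method: the three candidate distances, the mean-rank selection criterion, and the names `pick_distance` and `mean_rank_of_relevant` are all assumptions introduced for illustration. In each cycle the user marks some returned images as relevant, and the sketch keeps the candidate distance that ranks those images closest to the query.

```python
import math

# Candidate distance functions (an assumed, illustrative set).
def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

CANDIDATES = {"L1": manhattan, "L2": euclidean, "Linf": chebyshev}

def mean_rank_of_relevant(query, images, relevant_ids, dist):
    """Average 1-based rank of the user-marked relevant images under `dist`."""
    ranked = sorted(images, key=lambda image_id: dist(query, images[image_id]))
    ranks = [ranked.index(image_id) + 1 for image_id in relevant_ids]
    return sum(ranks) / len(ranks)

def pick_distance(query, images, relevant_ids):
    """Hypothetical policy: keep the candidate with the lowest mean rank.

    `images` maps an image id to its feature vector; `relevant_ids` holds
    the ids the user marked as relevant in the current feedback cycle.
    Ties are resolved in favor of the first candidate in CANDIDATES.
    """
    return min(
        CANDIDATES,
        key=lambda name: mean_rank_of_relevant(
            query, images, relevant_ids, CANDIDATES[name]
        ),
    )

# Usage: two images described by 2-dimensional feature vectors.
query = (0.0, 0.0)
images = {"a": (3.0, 3.0), "b": (4.0, 0.0)}
chosen = pick_distance(query, images, ["a"])
```

Calling `pick_distance` again after each round of feedback would let the chosen function change from cycle to cycle; a real system would also need a policy that uses prior knowledge of the image database, which this sketch leaves out.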
|
50 |
"Recuperação de imagens por conteúdo através de análise multiresolução por Wavelets" / "Content based image retrieval through multiresolution wavelet analysis" Castañon, Cesar Armando Beltran 28 February 2003 (has links)
Content-based image retrieval (CBIR) systems can return images using other images as the search key. Given a query image, the focus of a CBIR system is to search the database for the "n" images most similar to the query image according to a given criterion. This research addressed the generation of feature vectors for a CBIR system over medical image databases, to support this type of query. A feature vector is a succinct numeric representation of an image or part of it, describing its most representative details. The feature vector is an "n"-dimensional vector containing these values. This new image representation can be stored in a database and thus speed up the image retrieval process. An alternative approach to characterizing images for a CBIR system is a domain transform. The main advantage of a transform is its effective characterization of the local properties of the image. Recently, researchers in applied mathematics and signal processing have developed practical "wavelet" techniques for the multiscale representation and analysis of signals. These new tools differ from traditional Fourier techniques in the way they localize information in the time-frequency plane; basically, they can move from one resolution to another, which makes them especially suitable for analyzing non-stationary signals. The "wavelet" transform consists of a set of basis functions that represents the signal in different frequency bands, each with a distinct resolution corresponding to its scale. They have been applied successfully to image compression, enhancement, analysis, classification, characterization, and retrieval. 
One of the areas that has benefited, where these properties have proved highly relevant, is medicine, through the representation and description of medical images. This work describes an approach for a medical image database, oriented toward feature extraction for a CBIR system based on multiresolution "wavelet" decomposition using Daubechies and Gabor filters. These new image features were also tested using the "Slim-tree" metric indexing structure. In this way, the semantic reach of the cbPACS (Content-Based Picture Archiving and Communication Systems) system can be increased; the system is currently under joint development by the Grupo de Bases de Dados e Imagens of ICMC-USP and the Centro de Ciências de Imagens e Física Médica of the Hospital das Clínicas de Ribeirão Preto-USP. / Content-based image retrieval (CBIR) refers to the ability to retrieve images on the basis of the image content. Given a query image, the goal of a CBIR system is to search the database and return the "n" images most similar (closest) to the query image according to a given criterion. Our research addresses the generation of feature vectors for a CBIR system over medical image databases. A feature vector is a numeric representation of an image, or part of it, that summarizes its most representative aspects. The feature vector is an "n"-dimensional vector holding these values. This new image representation can be stored in a database and allows fast image retrieval. An alternative for image characterization in a CBIR system is the domain transform. The principal advantage of a transform is its effective characterization of local image properties. In the past few years, researchers in applied mathematics and signal processing have developed practical "wavelet" methods for the multiscale representation and analysis of signals. 
These new tools differ from the traditional Fourier techniques in the way they localize information in the time-frequency plane; in particular, they are capable of trading one type of resolution for the other, which makes them especially suitable for the analysis of non-stationary signals. The "wavelet" transform uses a set of basis functions that represents signals in different frequency bands, each one with a resolution matching its scale. Wavelets have been successfully applied to image compression, enhancement, analysis, classification, characterization, and retrieval. One privileged area of application where these properties have been found to be relevant is medical imaging. In this work we describe an approach to CBIR for medical image databases focused on feature extraction based on multiresolution "wavelet" decomposition, taking advantage of the Daubechies and Gabor filters. Fundamental to our approach is how images are characterized, so that the retrieval procedure can bring back similar images within the domain of interest, using a metric indexing structure such as the "Slim-tree". This increases the semantic capability of the cbPACS (Content-Based Picture Archiving and Communication Systems), currently under joint development by the Database and Image Group of ICMC-USP and the Science Center for Images and Medical Physics of the Clinics Hospital of Ribeirão Preto-USP.
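As a rough illustration of the pipeline described in this abstract (multiresolution wavelet decomposition, a feature vector, then a query for the "n" most similar images), the Python sketch below makes several simplifying assumptions that are not taken from the thesis: it uses the Haar wavelet (the simplest member of the Daubechies family) instead of the Daubechies and Gabor filters the author used, it works on a one-dimensional row of pixel intensities instead of a full image, it takes band energies as the features, and it answers the query by a linear scan with Euclidean distance instead of a Slim-tree. The function names are hypothetical.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def wavelet_features(signal, levels):
    """Feature vector: detail-band energy per level, then final approximation energy.

    Assumes len(signal) is divisible by 2 ** levels.
    """
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features

def knn(query_features, database, k):
    """Ids of the k stored feature vectors closest to the query (linear scan)."""
    return sorted(
        database, key=lambda name: math.dist(query_features, database[name])
    )[:k]

# Usage: index two toy "images" (rows of intensities) and query with a third.
database = {
    "smooth": wavelet_features([5, 5, 5, 5], levels=2),
    "striped": wavelet_features([9, 1, 9, 1], levels=2),
}
nearest = knn(wavelet_features([8, 2, 8, 2], levels=2), database, k=1)
```

Each level halves the resolution, so the feature vector has one entry per frequency band; a metric index such as the Slim-tree would replace the linear scan in `knn` when the database is large.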
|