Global ETD Search

81	Digital Libraries with Superimposed Information: Supporting Scholarly Tasks that Involve Fine Grain Information Murthy, Uma 02 May 2011 (has links) Many scholarly tasks involve working with contextualized fine-grain information, such as a music professor creating a multimedia lecture on a musical style, while bringing together several snippets of compositions of that style. We refer to such contextualized parts of a larger unit of information (or whole documents), as subdocuments. Current approaches to work with subdocuments involve a mix of paper-based and digital techniques. With the increase in the volume and in the heterogeneity of information sources, the management, organization, access, retrieval, as well as reuse of subdocuments becomes challenging, leading to inefficient and ineffective task execution. A digital library (DL) facilitates management, access, retrieval, and use of collections of data and metadata through services. However, most DLs do not provide infrastructure or services to support working with subdocuments. Superimposed information (SI) refers to new information that is created to reference subdocuments in existing information resources. We combine this idea of SI with traditional DL services, to define and develop a DL with SI (an SI-DL). Our research questions are centered around one main question: how can we extend the notion of a DL to include SI, in order to support scholarly tasks that involve working with subdocuments? We pursued this question from a theoretical as well as a practical/user perspective. From a theoretical perspective, we developed a formal metamodel that precisely defines the components of an SI-DL, building upon related work in DLs, SI, annotations, and hypertext. From the practical/user perspective, we developed prototype superimposed applications and conducted user studies to explore the use of SI in scholarly tasks. 
We developed SuperIDR, a prototype SI-DL, which enables users to mark up subimages, annotate them, and retrieve information in multiple ways, including browsing, and text- and content-based image retrieval. We explored the use of subimages and evaluated the use of SuperIDR in fish species identification, a scholarly task that involves working with subimages. Findings from the user studies and other work in our research lead to theory- and experiment-based enhancements that can guide design of digital libraries with superimposed information. / Ph. D. Annotation Digital libraries Fish species identification Image retrieval Metamodel Subdocument Superimposed information User study
82	Backdrop Explorer: A Human-AI Collaborative Approach for Exploring Studio Backdrops in Civil War Portraits Lim, Ken Yoong 14 June 2023 (has links) In historical photo research, the presence of painted backdrops has the potential to help identify subjects, photographers, locations, and events surrounding certain photographs. Yet, research processes around these backdrops are poorly documented, with no known tools to aid in the task. We propose a four-step human-AI collaboration workflow to support the discovery and clustering of these backdrops. Focusing on the painted backdrops of the American Civil War (1861–1865), we present Backdrop Explorer, a content-based image retrieval (CBIR) system incorporating computer vision and novel user interactions. We evaluated Backdrop Explorer with nine users of diverse experience levels and found that all were able to use it effectively to find photos with similar backdrops. We also document current practices and pain points in Civil War backdrop research through user interviews. Finally, we discuss how our findings and workflow can be applied to other topics and domains. / Master of Science / In historical photo research, the presence of painted backdrops has the potential to help identify subjects, photographers, locations, and events surrounding certain photographs. Yet, research processes around these backdrops are poorly documented, with no known tools to aid in the largely manual task. We present Backdrop Explorer, a reverse image search system that helps users discover and subsequently group photos with similar backdrops. We evaluated the system and found that it effectively supported the tasks. We also document current practices and pain points in Civil War backdrop research. Finally, we discuss how our findings and system can be applied to other domains. Human-AI Interaction Computer Vision Content-Based Image Retrieval Digital Humanities
83	Saliency-weighted graphs for efficient visual content description and their applications in real-time image retrieval systems Ahmad, J., Sajjad, M., Mehmood, Irfan, Rho, S., Baik, S.W. 18 July 2019 (has links) Yes / The exponential growth in the volume of digital image databases is making it increasingly difficult to retrieve relevant information from them. Efficient retrieval systems require distinctive features extracted from visually rich contents, represented semantically in a human perception-oriented manner. This paper presents an efficient framework to model image contents as an undirected attributed relational graph, exploiting color, texture, layout, and saliency information. The proposed method encodes salient features into this rich representative model without requiring any segmentation or clustering procedures, reducing the computational complexity. In addition, an efficient graph-matching procedure implemented on specialized hardware makes it more suitable for real-time retrieval applications. The proposed framework has been tested on three publicly available datasets, and the results prove its superiority in terms of both effectiveness and efficiency in comparison with other state-of-the-art schemes. / Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2013R1A1A2012904). Attributed relational graph Image representation Content-based image retrieval Saliency map Real-time retrieval
84	Recherche multi-descripteurs dans les fonds photographiques numérisés / Multi-descriptor retrieval in digitalized photographs collections Bhowmik, Neelanjan 07 November 2017 (has links) La recherche d’images par contenu (CBIR) est une discipline de l’informatique qui vise à structurer automatiquement les collections d’images selon des critères visuels. Les fonctionnalités proposées couvrent notamment l’accès efficace aux images dans une grande base de données d’images ou l’identification de leur contenu par des outils de détection et de reconnaissance d’objets. Ils ont un impact sur une large gamme de domaines qui manipulent ce genre de données, telles que le multimedia, la culture, la sécurité, la santé, la recherche scientifique, etc.Indexer une image à partir de son contenu visuel nécessite d’abord de produire un résumé visuel de ce contenu pour un usage donné, qui sera l’index de cette image dans la collection. En matière de descripteurs d’images, la littérature est désormais trés riche: plusieurs familles de descripteurs existent, et dans chaque famille de nombreuses approches cohabitent. Bon nombre de descripteurs ne décrivant pas la même information et n’ayant pas les mêmes propriétés d’invariance, il peut être pertinent de les combiner de manière à mieux décrire le contenu de l’image. Cette combinaison peut être mise en oeuvre de différentes manières, selon les descripteurs considérés et le but recherché. Dans cette thése, nous nous concentrons sur la famille des descripteurs locaux, avec pour application la recherche d’images ou d’objets par l’exemple dans une collection d’images. Leurs bonnes propriétés les rendent très populaires pour la recherche, la reconnaissance et la catégorisation d'objets et de scènes. 
Deux directions de recherche sont étudiées:Combinaison de caractéristiques pour la recherche d’images par l’exemple: Le coeur de la thèse repose sur la proposition d’un modèle pour combiner des descripteurs de bas niveau et génériques afin d’obtenir un descripteur plus riche et adapté à un cas d’utilisation donné tout en conservant la généricité afin d’indexer différents types de contenus visuels. L’application considérée étant la recherche par l’exemple, une autre difficulté majeure est la complexité de la proposition, qui doit correspondre à des temps de récupération réduits, même avec de grands ensembles de données. Pour atteindre ces objectifs, nous proposons une approche basée sur la fusion d'index inversés, ce qui permet de mieux représenter le contenu tout en étant associé à une méthode d’accès efficace.Complémentarité des descripteurs: Nous nous concentrons sur l’évaluation de la complémentarité des descripteurs locaux existant en proposant des critères statistiques d’analyse de leur répartition spatiale dans l'image. Ce travail permet de mettre en évidence une synergie entre certaines de ces techniques lorsqu’elles sont jugées suffisamment complémentaires. Les critères spatiaux sont exploités dans un modèle de prédiction à base de régression linéaire, qui a l'avantage de permettre la sélection de combinaisons de descripteurs optimale pour la base considérée mais surtout pour chaque image de cette base. L'approche est évaluée avec le moteur de recherche multi-index, où il montre sa pertinence et met aussi en lumière le fait que la combinaison optimale de descripteurs peut varier d'une image à l'autre.En outre, nous exploitons les deux propositions précédentes pour traiter le problème de la recherche d'images inter-domaines, correspondant notamment à des vues multi-source et multi-date. Deux applications sont explorées dans cette thèse. 
La recherche d’images inter-domaines est appliquée aux collections photographiques culturelles numérisées d’un musée, où elle démontre son efficacité pour l’exploration et la valorisation de ces contenus à différents niveaux, depuis leur archivage jusqu’à leur exposition ou ex situ. Ensuite, nous explorons l’application de la localisation basée image entre domaines, où la pose d’une image est estimée à partir d’images géoréférencées, en retrouvant des images géolocalisées visuellement similaires à la requête. / Content-Based Image Retrieval (CBIR) is a discipline of Computer Science which aims at automatically structuring image collections according to visual criteria. The functionalities offered include efficient access to images in a large image database and the identification of their content through object detection and recognition tools. They affect a wide range of fields that handle this kind of data, such as multimedia, culture, security, health, scientific research, etc. Indexing an image from its visual content first requires producing a visual summary of this content for a given use, which will be the index of this image in the database. The literature on image descriptors is now very rich: several families of descriptors exist, and within each family many approaches coexist. Many descriptors do not describe the same information and do not have the same invariance properties. It is therefore relevant to combine some of them to better describe the image content. The combination can be implemented in different ways, depending on the descriptors involved and on the application. In this thesis, we focus on the family of local descriptors, with application to image and object retrieval by example in a collection of images. Their good properties make them very popular for retrieval, recognition and categorization of objects and scenes. 
Two directions of research are investigated: Feature combination applied to query-by-example image retrieval: the core of the thesis rests on the proposal of a model for combining low-level and generic descriptors in order to obtain a richer descriptor adapted to a given use case, while maintaining genericity in order to index different types of visual content. Since the considered application is query-by-example, another major difficulty is the complexity of the proposal, which must allow short retrieval times even with large datasets. To meet these goals, we propose an approach based on the fusion of inverted indices, which represents the content better while being associated with an efficient access method. Complementarity of the descriptors: We focus on evaluating the complementarity of existing local descriptors by proposing statistical criteria for analyzing their spatial distribution in the image. This work highlights a synergy between some of these techniques when they are judged sufficiently complementary. The spatial criteria are employed within a regression-based prediction model, which has the advantage of selecting suitable feature combinations globally for a dataset but, most importantly, for each image. The approach is evaluated within the fusion-of-inverted-indices search engine, where it shows its relevance and also highlights that the optimal combination of features may vary from one image to another. Additionally, we exploit the previous two proposals to address the problem of cross-domain image retrieval, where images are matched across different domains, including multi-source and multi-date contents. Two applications of cross-domain matching are explored. 
First, cross-domain image retrieval is applied to the digitized cultural photographic collections of a museum, where it demonstrates its effectiveness for the exploration and promotion of these contents at different levels from their archiving up to their exhibition in or ex-situ. Second, we explore the application of cross-domain image localization, where the pose of a landmark is estimated by retrieving visually similar geo-referenced images to the query images Recherche d’image par contenu Combinaison de caractéristiques Sac de mots Index inversé Complémentarité spatiale Recherche d’images inter-Domaines Content-Based image retrieval Feature combination Bag-Of-Features Inverted index Spatial complementarity Cross-Domain image retrieval
85	以眼動資訊增進基於內容的圖像檢索效能 / Improving the Performance of Content Based Image Retrieval by Eye Tracking 張京文, Jhang, Jing Wun Unknown Date (has links) 在現今的基於內容的圖像檢索的研究中，會將人的主觀認知考慮進去。因為傳統的圖像檢索中採取低階特徵來找出圖片上可能的重要區域的方法和人的感覺還是有著相當大的語意上的鴻溝。然而藉由考慮人對圖片的主觀認知，可以讓人找到對它而言圖片上重要的部分，再去做圖像檢索，找出使用者想要的圖片。這樣的作法是比較自然且直觀的。還能達到個人化的效果，因為每個人對同一張圖片上覺得重要的物體可能不盡相同。在本論文中的圖像檢索系統採用眼動軌跡當作人的主觀認知來輔助檢索。因為在心理學的研究中有提到，人在看圖片的時候會有較多的凝視點落在他覺得重要的區域上。所以藉由這個理論，本論文利用使用者看圖片的眼動軌跡即時的調整圖片上物體的重要性。最後將重要性高的數個物體去做圖像檢索，找出含有這些對這個使用者是重要的物體的圖片。經由實驗證實，眼動軌跡輔助圖像檢索的確可以減少不重要的物體對圖像檢索的干擾，繼而可以提升圖像檢索系統的效能。 / Recent research in Content-Based Image Retrieval (CBIR) focuses on incorporating knowledge about human perception into system design and implementation. This enables more natural and intuitive image retrieval techniques and helps overcome challenges faced by modern CBIR systems, such as the difficulty of extracting the important regions of an image. Psychological research indicates that a user's eye movements reflect his or her interest. In this CBIR system, the user's eye movements are therefore used online to adjust the importance of objects in the query image, so that only images containing the important objects are retrieved. In one experiment, the eye movements of participants on query images were recorded, and the performance of this approach was compared with that of a classic CBIR system. The results show that the proposed system achieves higher retrieval performance because it reduces the influence of unimportant objects on retrieval. 圖像檢索 眼動軌跡 眼動資訊 image retrieval eye tracking eye movement
86	Group-Theoretical Structure in Multispectral Color and Image Databases Hai Bui, Thanh January 2005 (has links) Many applications lead to signals with nonnegative function values. Understanding the structure of the spaces of nonnegative signals is therefore of interest in many different areas. Hence, constructing effective representation spaces with suitable metrics and natural transformations is an important research topic. In this thesis, we present our investigations of the structure of spaces of nonnegative signals and illustrate the results with applications in the fields of multispectral color science and content-based image retrieval. The infinite-dimensional Hilbert space of nonnegative signals is conical and convex. These two properties are preserved under linear projections onto lower dimensional spaces. The conical nature of these coordinate vector spaces suggests the use of hyperbolic geometry. The special case of three-dimensional hyperbolic geometry leads to the application of the SU(1,1) or SO(2,1) groups. We introduce a new framework to investigate nonnegative signals. We use PCA-based coordinates and apply group theoretical tools to investigate sequences of signal coordinate vectors. We describe these sequences with one-parameter subgroups of SU(1,1) and show how to compute the one-parameter subgroup of SU(1,1) from a given set of nonnegative signals. In our experiments we investigate the following signal sequences: (i) blackbody radiation spectra; (ii) sequences of daylight/twilight spectra measured in Norrköping, Sweden and in Granada, Spain; (iii) spectra generated by the SMARTS2 simulation program; and (iv) sequences of image histograms. The results show that important properties of these sequences can be modeled in this framework. We illustrate the usefulness with examples where we derive illumination invariants and introduce an efficient visualization implementation. Content-Based Image Retrieval (CBIR) is another topic of the thesis. 
In such retrieval systems, images are first characterized by descriptor vectors. Retrieval is then based on these content-based descriptors. Selection of content-based descriptors and defining suitable metrics are the core of any CBIR system. We introduce new descriptors derived by using group theoretical tools. We exploit the symmetry structure of the space of image patches and use the group theoretical methods to derive low-level image filters in a very general framework. The derived filters are simple and can be used for multispectral images and images defined on different sampling grids. These group theoretical filters are then used to derive content-based descriptors, which will be used in a real implementation of a CBIR. group theory PCA (Principal Component Analysis) CBIR (Content-based Image Retrieval) filter color image multispectral database Signal processing Signalbehandling
87	A Common Representation Format for Multimedia Documents Jeong, Ki Tai 12 1900 (has links) Multimedia documents are composed of multiple file format combinations, such as image and text, image and sound, or image, text and sound. The type of multimedia document determines the form of analysis for knowledge architecture design and retrieval methods. Over the last few decades, theories of text analysis have been proposed and applied effectively. In recent years, theories of image and sound analysis have been proposed to work with text retrieval systems and progressed quickly due in part to rapid progress in computer processing speed. Retrieval of multimedia documents formerly was divided into the categories of image and text, and image and sound. While the standard retrieval process begins from text only, methods are developing that allow the retrieval process to be accomplished simultaneously using text and image. Although image processing for feature extraction and text processing for term extraction are well understood, there are no prior methods that can combine these two features into a single data structure. This dissertation will introduce a common representation format for multimedia documents (CRFMD) composed of both images and text. For image and text analysis, two techniques are used: the Lorenz Information Measurement and the Word Code. A new process named Jeong's Transform is demonstrated for extraction of text and image features, combining the two previous measurements to form a single data structure. Finally, this single data structure is analyzed by using multi-dimensional scaling. This allows multimedia objects to be represented on a two-dimensional graph as vectors. The distance between vectors represents the magnitude of the difference between multimedia documents. 
This study shows that image classification on a given test set is dramatically improved when text features are encoded together with image features. This effect appears to hold true even when the available text is diffused and is not uniform with the image features. This retrieval system works by representing a multimedia document as a single data structure. CRFMD is applicable to other areas of multimedia document retrieval and processing, such as medical image retrieval, World Wide Web searching, and museum collection retrieval. Multimedia systems. Document imaging systems. Information retrieval. Content-based image retrieval text and image Lorenz information measurement single data structure
88	Latent Semantic Analysis as a Method of Content-Based Image Retrieval in Medical Applications Makovoz, Gennadiy 01 January 2010 (has links) The research investigated whether a Latent Semantic Analysis (LSA)-based approach to image retrieval can map pixel intensity into a smaller concept space with good accuracy and reasonable computational cost. From a large set of computed tomography (CT) images, a retrieval query found all images for a particular patient based on semantic similarity. The effectiveness of the LSA retrieval was evaluated based on precision, recall, and F-score. This work extended the application of LSA to high-resolution CT radiology images. The images were chosen for their unique characteristics and their importance in medicine. Because CT images are intensity-only, they carry less information than color images. They typically have greater noise, higher intensity, greater contrast, and fewer colors than a raw RGB image. The study targeted level of intensity for image features extraction. The focus of this work was a formal evaluation of the LSA method in the context of large number of high-resolution radiology images. The study reported on preprocessing and retrieval time and discussed how reduction of the feature set size affected the results. LSA is an information retrieval technique that is based on the vector-space model. It works by reducing the dimensionality of the vector space, bringing similar terms and documents closer together. Matlab software was used to report on retrieval and preprocessing time. In determining the minimum size of concept space, it was found that the best combination of precision, recall, and F-score was achieved with 250 concepts (k = 250). This research reported precision of 100% on 100% of the queries and recall close to 90% on 100% of the queries with k=250. Selecting a higher number of concepts did not improve recall and resulted in significantly increased computational cost. 
CBIR with High Resolution CT Images Content-Based Image Retrieval LSA LSA with CT images Medical Imaging Computer Sciences
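
As an illustration of the LSA step described in record 88 (reducing the vector space to k concepts so that similar items move closer together), the following is a minimal sketch rather than the dissertation's Matlab implementation. It assumes NumPy, a small feature-by-document matrix, and the standard fold-in formula for queries; the function names `lsa_fit`, `lsa_project`, and `cosine_rank` are hypothetical.

```python
import numpy as np

def lsa_fit(feature_doc, k):
    """Truncated SVD of a (features x documents) matrix; keeps the top k concepts."""
    u, s, vt = np.linalg.svd(feature_doc, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]

def lsa_project(u_k, s_k, query):
    """Fold a raw feature vector into the k-dimensional concept space."""
    return (u_k.T @ query) / s_k

def cosine_rank(doc_concepts, query_concept):
    """Return document indices sorted by decreasing cosine similarity to the query."""
    docs = doc_concepts / np.linalg.norm(doc_concepts, axis=0, keepdims=True)
    q = query_concept / np.linalg.norm(query_concept)
    return np.argsort(-(docs.T @ q))

# Toy data (an assumption for illustration): 4 features x 4 documents,
# where documents 0-1 share one pattern and documents 2-3 share another.
A = np.array([[4, 3, 0, 0],
              [3, 4, 0, 0],
              [0, 0, 5, 1],
              [0, 0, 1, 5]], dtype=float)
u_k, s_k, vt_k = lsa_fit(A, k=2)
query = lsa_project(u_k, s_k, A[:, 0])   # use document 0 as the query
ranking = cosine_rank(vt_k, query)       # documents 0 and 1 rank first
```

With k = 2 the two documents that share a pattern collapse onto the same concept direction, which is the "bringing similar documents closer together" effect the abstract mentions; the choice of k trades retrieval quality against computation, as the record reports for k = 250.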
89	Enhanced image and video representation for visual recognition / Représentations d'image et de vidéo pour la reconnaissance visuelle Jain, Mihir 09 April 2014 (has links) L'objectif de cette thèse est d'améliorer les représentations des images et des vidéos dans le but d'obtenir une reconnaissance visuelle accrue, tant pour des entités spécifiques que pour des catégories plus génériques. Les contributions de cette thèse portent, pour l'essentiel, sur des méthodes de description du contenu visuel. Nous proposons des méthodes pour la recherche d'image par le contenu ou par des requêtes textuelles, ainsi que des méthodes pour la reconnaissance et la localisation d'action dans des vidéos. En recherche d'image, les contributions se fondent sur des méthodes à base de plongements de Hamming. Tout d'abord, une méthode de comparaison asymétrique vecteur-à-code est proposée pour améliorer la méthode originale, symétrique et utilisant une comparaison code-à-code. Une méthode de classification fondée sur l'appariement de descripteurs locaux est ensuite proposée. Elle s'appuie sur une classification opérée dans un espace de similarités associées au plongement de Hamming. En reconnaissance d'action, les contributions portent essentiellement sur des meilleures manières d'exploiter et de représenter le mouvement. Finalement, une méthode de localisation est proposée. Elle utilise une partition de la vidéo en super-voxels, qui permet d'effectuer un échantillonnage 2D+t de suites de boîtes englobantes autour de zones spatio-temporelles d'intérêt. Elle s'appuie en particulier sur un critère de similarité associé au mouvement. Toutes les méthodes proposées sont évaluées sur des jeux de données publics. Ces expériences montrent que les méthodes proposées dans cette thèse améliorent l'état de l'art au moment de leur publication. / This thesis addresses image and video representations for visual recognition. 
This thesis first focuses on image search, both for image and textual queries, and then considers the classification and the localization of actions in videos. In image retrieval, images similar to the query image are retrieved from a large dataset. On this front, we propose an asymmetric version of the Hamming Embedding method, where the comparison of query and database descriptors relies on a vector-to-binary code comparison. For image classification, where the task is to identify if an image contains any instance of the queried category, we propose a novel approach based on a match kernel between images, more specifically based on Hamming Embedding similarity. We also present an effective variant of the SIFT descriptor, which leads to a better classification accuracy. Action classification is improved by several methods to better employ the motion inherent to videos. This is done by dominant motion compensation, and by introducing a novel descriptor based on kinematic features of the visual flow. The last contribution is devoted to action localization, whose objective is to determine where and when the action of interest appears in the video. A selective sampling strategy produces 2D+t sequences of bounding boxes, which drastically reduces the candidate locations. The method advantageously exploits a criterion that takes into account how motion related to actions deviates from the background motion. We thoroughly evaluated all the proposed methods on real world images and videos from challenging benchmarks. Our methods outperform the previously published related state of the art and remain competitive with the subsequently proposed methods. Représentations visuelles Recherche d'image Classification d'image Reconnaissance d'action Localisation d'actions Visual representation Image retrieval Image classification Action recognition Action localization
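
To make the difference between the symmetric (code-to-code) and asymmetric (vector-to-code) comparisons in record 89 more concrete, here is a hedged sketch. It is not the thesis's formulation: it assumes that binary signatures are obtained by thresholding each projected component, and that the asymmetric distance adds the query's margin to the threshold on every bit where query and database code disagree. All function names are hypothetical.

```python
def binarize(projected, thresholds):
    """Binary signature: bit i is 1 when the projected value exceeds its threshold."""
    return [1 if p > t else 0 for p, t in zip(projected, thresholds)]

def hamming(code_a, code_b):
    """Symmetric code-to-code comparison: number of differing bits."""
    return sum(a != b for a, b in zip(code_a, code_b))

def asymmetric_distance(query_projected, db_code, thresholds):
    """Vector-to-code comparison (illustrative assumption): for each bit where the
    query lies on the other side of the threshold from the database code, add the
    query's distance to that threshold."""
    total = 0.0
    for p, b, t in zip(query_projected, db_code, thresholds):
        query_bit = 1 if p > t else 0
        if query_bit != b:
            total += abs(p - t)
    return total

# Toy values (assumptions): both database codes differ from the query in one bit,
# so the symmetric distance cannot separate them, but the asymmetric one can.
thresholds = [0.0, 0.0, 0.0]
query = [0.1, -2.0, 3.0]
code_near = [0, 0, 1]   # disagrees on bit 0, where the query barely crosses the threshold
code_far = [1, 1, 1]    # disagrees on bit 1, where the query is far from the threshold
```

In this toy case `hamming` gives 1 for both database codes, while `asymmetric_distance` is smaller for `code_near` than for `code_far`, because the query keeps its real-valued position instead of being quantized first.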
90	Processamento de consultas por similaridade em imagens médicas visando à recuperação perceptual guiada pelo usuário / Similarity Queries Processing Aimed at Retrieving Medical Images Guided by the User´s Perception Silva, Marcelo Ponciano da 19 March 2009 (has links) O aumento da geração e do intercâmbio de imagens médicas digitais tem incentivado profissionais da computação a criarem ferramentas para manipulação, armazenamento e busca por similaridade dessas imagens. As ferramentas de recuperação de imagens por conteúdo, foco desse trabalho, têm a função de auxiliar na tomada de decisão e na prática da medicina baseada em estudo de casos semelhantes. Porém, seus principais obstáculos são conseguir uma rápida recuperação de imagens armazenadas em grandes bases e reduzir o gap semântico, caracterizado pela divergência entre o resultado obtido pelo computador e aquele esperado pelo médico. No presente trabalho, uma análise das funções de distância e dos descritores computacionais de características está sendo realizada com o objetivo de encontrar uma aproximação eficiente entre os métodos de extração de características de baixo nível e os parâmetros de percepção do médico (de alto nível) envolvidos na análise de imagens. O trabalho de integração desses três elementos (Extratores de Características, Função de Distância e Parâmetro Perceptual) resultou na criação de operadores de similaridade, que podem ser utilizados para aproximar o sistema computacional ao usuário final, visto que serão recuperadas imagens de acordo com a percepção de similaridade do médico, usuário final do sistema / The continuous growth of the medical images generation and their use in the day-to-day procedures in hospitals and medical centers has motivated the computer science researchers to develop algorithms, methods and tools to store, search and retrieve images by their content. Therefore, the content-based image retrieval (CBIR) field is also growing at a very fast pace. 
Algorithms and tools for CBIR, which are at the core of this work, can support decision making when the specialist is analyzing images, because the specialist can retrieve cases similar to the one under evaluation. However, the main obstacle to the use of CBIR is achieving fast and effective retrieval, in the sense that the specialist gets what he or she expects. That is, the problem is to bridge the semantic gap, the divergence between the result automatically delivered by the system and what the user expects. This work proposes the perceptual parameter, which complements the relationship between feature extraction algorithms and distance functions, with the aim of finding the combination that best delivers to the user what he or she expects from the query. This research therefore integrated the three main elements of similarity queries (the image features, the distance function, and the perceptual parameter), which resulted in similarity operators. The experiments performed show that these operators can narrow the distance between the system and the specialist, contributing to bridging the semantic gap. Consultas por Similaridade Content-Based Image Retrieval Medical Image Processing Processamento de Imagens Médicas Similarity Queries
