41 |
Reconnaissance et classification d’images de documents / Document image retrieval and classification
Augereau, Olivier, 14 February 2013 (has links)
Ces travaux de recherche ont pour ambition de contribuer à la problématique de la classification d’images de documents. Plus précisément, ces travaux tendent à répondre aux problèmes rencontrés par des sociétés de numérisation dont l’objectif est de mettre à disposition de leurs clients une version numérique des documents papiers accompagnés d’informations qui leur sont relatives. Face à la diversité des documents à numériser, l’extraction d’informations peut s’avérer parfois complexe. C’est pourquoi la classification et l’indexation des documents sont très souvent réalisées manuellement. Ces travaux de recherche ont permis de fournir différentes solutions en fonction des connaissances relatives aux images que possède l’utilisateur ayant en charge l’annotation des documents. Le premier apport de cette thèse est la mise en place d’une méthode permettant, de manière interactive, à un utilisateur de classer des images de documents dont la nature est inconnue. Le second apport de ces travaux est la proposition d’une technique de recherche d’images de documents par l’exemple basée sur l’extraction et la mise en correspondance de points d’intérêt. Le dernier apport de cette thèse est l’élaboration d’une méthode de classification d’images de documents utilisant les techniques de sacs de mots visuels. / The aim of this research is to contribute to the document image classification problem. More specifically, these studies address the issues faced by digitizing companies, whose objective is to provide a digital version of paper documents together with information relating to them. Given the diversity of documents, information extraction can be complex. This is why the classification and indexing of documents are often performed manually. This research provides several solutions based on the knowledge of the images that the user has. The first contribution of this thesis is a method for interactively classifying document images, where the content of the documents and the classes are unknown.
The second contribution of this work is a new technique for document image retrieval from a single example of the searched-for document. This technique is based on the extraction and matching of interest points. The last contribution of this thesis is a method for classifying document images using bag-of-visual-words techniques.
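The interest-point matching behind retrieval by example can be sketched as follows. This is an illustrative nearest-neighbour matcher with Lowe's ratio test, not the thesis's actual implementation; a real system would match high-dimensional SIFT/SURF-style descriptors rather than the toy 2-D vectors shown here:

```python
from math import sqrt

def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_descriptors(query, candidate, ratio=0.75):
    """Match each query descriptor to its nearest candidate descriptor,
    keeping only matches that pass Lowe's ratio test (nearest distance
    must be clearly smaller than the second-nearest)."""
    matches = []
    for qi, qdesc in enumerate(query):
        dists = sorted((euclidean(qdesc, cdesc), ci)
                       for ci, cdesc in enumerate(candidate))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((qi, dists[0][1]))
    return matches
```

A candidate document image can then be scored by the number of surviving matches, with the best-scoring images returned first.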
|
42 |
A Content-Based Image Retrieval System for Fish Taxonomy
Teng, Fei, 22 May 2006 (has links)
It is estimated that less than ten percent of the world's species have been discovered and described. The main reason for the slow pace of new species description is that the science of taxonomy, as traditionally practiced, can be very laborious: taxonomists have to manually gather and analyze data from large numbers of specimens and identify the smallest subset of external body characters that uniquely diagnoses the new species as distinct from all its known relatives. The pace of data gathering and analysis can be greatly increased by information technology. In this paper, we propose a content-based image retrieval system for taxonomic research. The system can identify representative body shape characters of known species based on digitized landmarks and provide statistical clues for assisting taxonomists to identify new species or subspecies. The experiments on a taxonomic problem involving species of suckers in the genus Carpiodes demonstrate promising results.
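Comparing digitized landmark configurations, as described above, can be illustrated with a simplified shape distance that normalizes away translation and scale (a real morphometrics pipeline would also align rotation, e.g. via Procrustes superimposition; this sketch is an assumption, not the system's method):

```python
def normalize(landmarks):
    """Center a landmark configuration at its centroid and scale it
    to unit size, removing translation and scale differences."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    centered = [(x - cx, y - cy) for x, y in landmarks]
    scale = sum(x * x + y * y for x, y in centered) ** 0.5 or 1.0
    return [(x / scale, y / scale) for x, y in centered]

def shape_distance(a, b):
    """Sum of squared differences between two normalized configurations
    with corresponding landmarks."""
    return sum((xa - xb) ** 2 + (ya - yb) ** 2
               for (xa, ya), (xb, yb) in zip(normalize(a), normalize(b)))
```

Specimens of the same species should cluster at small mutual distances, giving the statistical clues mentioned in the abstract.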
|
43 |
A tree grammar-based visual password scheme
Okundaye, Benjamin, January 2016 (has links)
A thesis submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Doctor of Philosophy. Johannesburg, August 31, 2015. / Visual password schemes can be considered as an alternative to alphanumeric
passwords. Studies have shown that alphanumeric passwords
can, amongst others, be eavesdropped, shoulder surfed, or
guessed, and are susceptible to brute force automated attacks. Visual
password schemes use images, in place of alphanumeric characters,
for authentication. For example, users of visual password schemes either
select images (Cognometric) or points on an image (Locimetric)
or attempt to redraw their password image (Drawmetric), in order
to gain authentication. Visual passwords are limited by the so-called
password space, i.e., by the size of the alphabet from which users can
draw to create a password and by susceptibility to stealing of pass-images
by someone looking over one's shoulder, referred to as shoulder
surfing in the literature. The use of automatically generated highly
similar abstract images defeats shoulder surfing and means that an almost
unlimited pool of images is available for use in a visual password
scheme, thus also overcoming the issue of limited potential password
space.
This research investigated visual password schemes. In particular,
this study looked at the possibility of using tree picture grammars to
generate abstract graphics for use in a visual password scheme. In this
work, we also took a look at how humans determine similarity of abstract
computer generated images, referred to as perceptual similarity
in the literature. We drew on the psychological idea of similarity and
matched that as closely as possible with a mathematical measure of
image similarity, using Content Based Image Retrieval (CBIR) and
tree edit distance measures. To this end, an online similarity survey
involving 661 respondents and 50 images was conducted, in which
respondents ordered answer images by their similarity to question
images. The survey images were also compared with eight
state-of-the-art, computer-based similarity measures to determine how closely
they model perceptual similarity. Since all the images were generated
with tree grammars, the most popular measure of tree similarity, the
tree edit distance, was also used to compare the images. Eight different
types of tree edit distance measures were used in order to cover
the broad range of tree edit distance and tree edit distance approximation
methods. All the computer based similarity methods were
then correlated with the online similarity survey results, to determine
which ones more closely model perceptual similarity. The results were
then analysed in the light of some modern psychological theories of
perceptual similarity.
This work represents a novel approach to the Passfaces type of visual
password schemes using dynamically generated pass-images and their
highly similar distractors, instead of static pictures stored in an online
database. The results of the online survey were then accurately
modelled using the most suitable tree edit distance measure, in order
to automate the determination of similarity of our generated distractor
images. The information gathered from our various experiments
was then used in the design of a prototype visual password scheme.
The generated images were similar, but not identical, in order to defeat
shoulder surfing. This approach overcomes the following problems
with this category of visual password schemes: shoulder surfing,
bias in image selection, selection of easy to guess pictures and infrastructural
limitations like large picture databases, network speed and
database security issues. The resulting prototype developed is highly
secure, resilient to shoulder surfing and easy for humans to use, and
overcomes the aforementioned limitations in this category of visual
password schemes.
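The correlation step described above (computer-based measures against the online survey ordering) can be sketched with a Spearman rank correlation. This is one standard choice for comparing rankings and is an illustrative assumption, not necessarily the statistic used in the study; ties are assumed absent:

```python
def rank(values):
    """Return 1-based ranks of the values (assumes distinct values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for pos, i in enumerate(order):
        r[i] = pos + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: 1 means identical orderings,
    -1 means exactly reversed orderings."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

A measure whose scores correlate strongly with the human ordering is a better model of perceptual similarity.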
|
44 |
Análise e avaliação de técnicas de interação humano-computador para sistemas de recuperação de imagens por conteúdo baseadas em estudo de caso / Evaluating human-computer interaction techniques for content-based image retrieval systems through a case study
Filardi, Ana Lúcia, 30 August 2007 (links)
A recuperação de imagens baseada em conteúdo, amplamente conhecida como CBIR (do inglês Content-Based Image Retrieval), é um ramo da área da computação que vem crescendo muito nos últimos anos e vem contribuindo com novos desafios. Sistemas que utilizam tais técnicas propiciam o armazenamento e manipulação de grandes volumes de dados e imagens e processam operações de consultas de imagens a partir de características visuais extraídas automaticamente por meio de métodos computacionais. Esses sistemas devem prover uma interface de usuário visando uma interação fácil, natural e atraente entre o usuário e o sistema, permitindo que o usuário possa realizar suas tarefas com segurança, de modo eficiente, eficaz e com satisfação. Desse modo, o design da interface firma-se como um elemento fundamental para o sucesso de sistemas CBIR. Contudo, dentro desse contexto, a interface do usuário ainda é um elemento constituído de pouca pesquisa e desenvolvimento. Um dos obstáculos para eficácia de design desses sistemas consiste da necessidade em prover aos usuários uma interface de alta qualidade para permitir que o usuário possa consultar imagens similares a uma dada imagem de referência e visualizar os resultados. Para atingir esse objetivo, este trabalho visa analisar a interação do usuário em sistemas de recuperação de imagens por conteúdo e avaliar sua funcionalidade e usabilidade, aplicando técnicas de interação humano-computador que apresentam bons resultados em relação à performance de sistemas com grande complexidade, baseado em um estudo de caso aplicado à medicina. / Content-based image retrieval (CBIR) is a challenging area of computer science that has been growing at a very fast pace in recent years. CBIR systems employ techniques for extracting features from the images, composing the feature vectors, and storing them together with the images in a database management system, allowing indexing and querying. CBIR systems deal with large volumes of images.
Therefore, the feature vectors are extracted by automatic methods. These systems allow querying the images by content, processing similarity queries, which inherently demands user interaction. Consequently, CBIR systems must pay attention to the user interface, aiming at providing friendly, intuitive and attractive interaction, leading the user to do the tasks efficiently, getting the desired results, and feeling safe and fulfilled. From the points highlighted above, we can state that human-computer interaction (HCI) is a key element of a CBIR system. However, there is still little research on HCI for CBIR systems. One of the requirements of HCI for CBIR is to provide a high-quality interface that allows the user to search for images similar to a given query image and to display the results properly, allowing further interaction. The present dissertation analyzes user interaction in CBIR systems especially suited to medical applications, evaluating their usability by applying HCI techniques. To do so, a case study was employed, and the results are presented.
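The similarity query described above reduces, in its simplest form, to a k-nearest-neighbour search over stored feature vectors. The image names and vectors below are invented for illustration; a production CBIR system would use indexed access methods rather than a linear scan:

```python
def knn_query(db, query_vec, k=3):
    """Return the names of the k images whose feature vectors are
    closest (squared Euclidean distance) to the query vector."""
    scored = sorted(db.items(),
                    key=lambda item: sum((a - b) ** 2
                                         for a, b in zip(item[1], query_vec)))
    return [name for name, _ in scored[:k]]
```

The returned ranking is what the user interface then has to present clearly, which is exactly where the HCI concerns of this dissertation enter.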
|
45 |
Visão computacional : indexação automatizada de imagens / Computer vision : automated indexing of images
Ferrugem, Anderson Priebe, January 2004 (links)
O avanço tecnológico atual está permitindo que as pessoas recebam cada vez mais informações visuais dos mais diferentes tipos, nas mais variadas mídias. Esse aumento fantástico está obrigando os pesquisadores e as indústrias a imaginar soluções para o armazenamento e recuperação deste tipo de informação, pois nossos computadores ainda utilizam, apesar dos grandes avanços nessa área, um sistema de arquivos imaginado há décadas, quando era natural trabalhar com informações meramente textuais. Agora, nos deparamos com novos problemas: Como encontrar uma paisagem específica em um banco de imagens, em que trecho de um filme aparece um cavalo sobre uma colina, em que parte da fotografia existe um gato, como fazer um robô localizar um objeto em uma cena, entre outras necessidades. O objetivo desse trabalho é propor uma arquitetura de rede neural artificial que permita o reconhecimento de objetos genéricos e de categorias em banco de imagens digitais, de forma que se possa recuperar imagens específicas a partir da descrição da cena fornecida pelo usuário. Para que esse objetivo fosse alcançado, foram utilizadas técnicas de Visão Computacional e Processamento de Imagens na etapa de extração de feições de baixo nível e de Redes Neurais (Mapas Auto-Organizáveis de Kohonen) na etapa de agrupamento de classes de objetos. O resultado final desse trabalho pretende ser um embrião para um sistema de reconhecimento de objetos mais genérico, que possa ser estendido para a criação de índices de forma automática ou semi-automática em grandes bancos de imagens. / Current technological progress allows people to receive more and more visual information of many different types, in a wide variety of media. This huge increase in image availability forces researchers and industry to propose efficient solutions for image storage and retrieval.
Despite the extraordinary advances in computational power, file systems have remained essentially the same for decades, from a time when it was natural to deal only with textual information. Nowadays, new problems confront us in this field. For instance: how can we find a specific landscape in an image database, in which scene of a movie a horse appears on a hill, in which part of a photograph there is a cat, or how can a robot locate an object in a scene, among other queries. The objective of this work is to propose an artificial neural network (ANN) architecture that performs the recognition of generic objects and object categories in a digital image database. With this implementation, it becomes possible to do image retrieval through the user's scene description. To achieve our goal, we used computer vision and image processing techniques for low-level feature extraction and neural networks (namely Kohonen's self-organizing maps) in the object class clustering phase. The main result of this work aims to be a seed for a more generic object recognition system, which can be extended to automatic or semi-automatic index creation in huge image databases.
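The Kohonen self-organizing map used in the clustering step can be sketched in miniature. This toy one-dimensional SOM over 2-D inputs, with made-up parameters, only illustrates the best-matching-unit update and neighbourhood pull; the thesis would have used far larger maps over real image features:

```python
import random

def train_som(data, n_units, epochs=50, lr=0.5, radius=1, seed=0):
    """Train a tiny one-dimensional Kohonen map on small feature vectors."""
    rng = random.Random(seed)
    units = [list(rng.choice(data)) for _ in range(n_units)]
    for _ in range(epochs):
        for vec in data:
            # find the best-matching unit (BMU) for this input
            bmu = min(range(n_units),
                      key=lambda u: sum((units[u][d] - vec[d]) ** 2
                                        for d in range(len(vec))))
            # pull the BMU and its map neighbours toward the input
            for u in range(n_units):
                if abs(u - bmu) <= radius:
                    for d in range(len(vec)):
                        units[u][d] += lr * (vec[d] - units[u][d])
    return units

def assign(units, vec):
    """Map an input vector to the index of its best-matching unit."""
    return min(range(len(units)),
               key=lambda u: sum((units[u][d] - vec[d]) ** 2
                                 for d in range(len(vec))))
```

After training, images whose feature vectors map to the same unit fall into the same object-class cluster, which is the basis for the automated indexing proposed above.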
|
46 |
Vad säger bilden? : En utvärdering av återvinningseffektiviteten i ImBrowse / What can an Image tell? : An Evaluation of the Retrieval Performance in ImBrowse
Henrysson, Jennie; Johansson, Kristina; Juhlin, Charlotte, January 2006 (links)
The aim of this master's thesis is to evaluate the performance of the content-based image retrieval system ImBrowse from a semantic point of view. Evaluation of retrieval performance is a known problem in content-based image retrieval (CBIR): there are many different methods for measuring the performance of CBIR systems, but no common way of performing the evaluation. The main focus is on image retrieval with regard to the extraction of visual features from the image, considered at three semantic levels. The thesis also tries to elucidate the semantic gap, i.e., the problem that arises when the system's extraction of visual features from an image and the user's interpretation of that same information do not correspond. The method is based on similar methods used in evaluation studies of CBIR systems: the thesis evaluates ImBrowse's feature descriptors for 30 topics at three semantic levels and compares the descriptors' performance based on our relevance assessments. For the computation of the results, precision at DCV = 20 is used. The results are presented in tables and a chart. The conclusion from this evaluation is that the retrieval effectiveness, from a general point of view, did not meet the semantic level of our relevance-assessed topics. However, since the thesis does not have another system with the same search functions to compare against, it is difficult to draw a comprehensive conclusion from the results. / Uppsatsnivå: D
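The evaluation metric named above, precision at a document cut-off value (DCV), is simple to state in code. The image identifiers in the usage example are invented for illustration:

```python
def precision_at(retrieved, relevant, cutoff=20):
    """Precision at a document cut-off value (DCV): the fraction of
    the first `cutoff` retrieved items that are judged relevant."""
    top = retrieved[:cutoff]
    if not top:
        return 0.0
    return sum(1 for doc in top if doc in relevant) / len(top)
```

Averaging this value over the 30 topics, per descriptor, yields the kind of comparison table the thesis reports.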
|
47 |
Optimerad bildsökning : Bör vissa egenskaper prioriteras vid sökning efter en viss kategori av bilder? / Optimized image search : Should certain features be emphasized when searching for a specific image category?
Larsson, Carl, January 2009 (links)
As the information society becomes increasingly flooded with digital images, the need for efficient image retrieval systems increases as well. To handle the vast amounts of data involved, the indexing process needs to be run automatically, using content-based descriptors extracted directly from the digital image, such as colour composition, shape and texture features. These content-based image retrieval systems are often slow and cumbersome, and can appear confusing to an ordinary user who does not understand the underlying mechanisms. One step towards more efficient and user-friendly retrieval systems might be to adjust the weight placed on various descriptors depending on which image category is being searched for. The results of this thesis show that certain categories of digital images would benefit from having extra weight assigned to colour, texture or shape features when searching for images of that category.
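Category-dependent weighting of descriptors, as proposed above, can be sketched as a weighted sum of per-feature distances. The category names, weight profiles, and feature vectors below are all invented for illustration, not values from the thesis:

```python
def l2(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Hypothetical per-category weight profiles (illustrative values only):
# colour dominates for sunsets, shape dominates for logos.
PROFILES = {
    "sunsets": {"colour": 0.7, "texture": 0.2, "shape": 0.1},
    "logos":   {"colour": 0.1, "texture": 0.1, "shape": 0.8},
}

def combined_distance(query, image, profile):
    """Weighted sum of per-descriptor distances for a given category."""
    w = PROFILES[profile]
    return sum(w[f] * l2(query[f], image[f]) for f in w)
```

Switching the profile changes which images rank first, which is exactly the category-dependent behaviour the thesis investigates.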
|
48 |
Indexação e recuperação de imagens por cor e estrutura / Image indexing and retrieval by color and shape
Costa, Yandre Maldonado e Gomes da, January 2002 (links)
Este trabalho descreve um conjunto de técnicas para a recuperação de imagens baseada nos aspectos cromático e estrutural das mesmas. A abordagem aqui descrita utiliza mecanismos que permitem a preservação de informação espacial referente aos conteúdos extraídos da imagem de forma que a sua precisão possa ser ajustada de acordo com a necessidade da consulta. Um outro importante aspecto aqui considerado, é a possibilidade de se optar por um dos seguintes espaços de cores para a verificação de distâncias entre cores no momento da recuperação: RGB, L*u*v*, ou L*a*b*. Com estas diferentes possibilidades de espaços de cores, será verificada a influência que os mesmos podem provocar no processo de recuperação de imagens baseado em aspectos cromáticos. O conjunto de técnicas para a recuperação de imagens abordadas neste trabalho levou à construção do sistema RICE, um ambiente computacional através do qual pode-se realizar consultas a partir de um repositório de imagens. Para a verificação do desempenho dos diferentes parâmetros ajustáveis na recuperação de imagens aqui descrita e implementada no sistema RICE, foram utilizadas curvas de “Recall x Precision”. / This work describes a set of image retrieval techniques based on color and shape similarity. The approach presented here preserves spatial relationships of the contents extracted from the image, and its precision can be adjusted according to the needs of the query. Another important feature considered here is the possibility of choosing among the RGB, L*u*v*, and L*a*b* color spaces to compute color distances during the image retrieval operation. With these three options, the influence of each color space on the image retrieval process based on chromatic content is verified. The set of techniques for image retrieval described here led to the development of the RICE system, a computational environment for image retrieval by color and shape similarity.
Furthermore, recall x precision graphs were used to verify the performance of the RICE system under several configurations of image retrieval.
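The idea of a color description that preserves spatial information can be sketched with a grid-partitioned, quantized histogram. This is a simplified illustration of the general technique, not the RICE system's actual descriptor; the grid size, bin count, and the L1 distance are assumptions:

```python
def grid_histogram(pixels, width, height, grid=2, bins=4):
    """Quantized RGB histogram per grid cell, preserving coarse
    spatial layout. `pixels` is a row-major list of (r, g, b) tuples."""
    step = 256 // bins
    hists = [[0] * (bins ** 3) for _ in range(grid * grid)]
    for i, (r, g, b) in enumerate(pixels):
        x, y = i % width, i // width
        cell = (y * grid // height) * grid + (x * grid // width)
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hists[cell][idx] += 1
    return hists

def hist_distance(h1, h2):
    """L1 distance between two grid histograms."""
    return sum(abs(a - b) for c1, c2 in zip(h1, h2) for a, b in zip(c1, c2))
```

Because bins are kept per cell, two images with the same overall colors but different layouts are still distinguished; a finer grid raises the spatial precision, matching the adjustable precision mentioned in the abstract.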
|
49 |
Improved Scoring Models for Semantic Image Retrieval Using Scene Graphs
Conser, Erik Timothy, 28 September 2017 (links)
Image retrieval via a structured query is explored by Johnson et al. [7]. The query is structured as a scene graph, and a graphical model is generated from the scene graph's object, attribute, and relationship structure. Inference is performed on the graphical model with candidate images, and the energy results are used to rank the best matches. In [7], scene graph objects that are not in the set of recognized objects are not represented in the graphical model. This work proposes and tests two approaches for modeling the unrecognized objects in order to leverage the attribute and relationship models and improve image retrieval performance.
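One simple reading of "modeling the unrecognized objects" is to give out-of-vocabulary scene graph objects a generic fallback score instead of dropping them from the model. The scoring scheme, object names, and fallback value below are invented for illustration and are not the thesis's actual approach:

```python
def score_image(scene_graph_objects, detector_scores, fallback=0.1):
    """Sum per-object match scores for a candidate image; objects
    outside the recognized vocabulary contribute a generic fallback
    score rather than being omitted entirely."""
    return sum(detector_scores.get(obj, fallback)
               for obj in scene_graph_objects)
```

Keeping the unrecognized node in the model also lets its attribute and relationship edges contribute to the overall energy, which is the gain the abstract describes.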
|
50 |
Topics in Content Based Image Retrieval : Fonts and Color Emotions
Solli, Martin, January 2009 (links)
Two novel contributions to content-based image retrieval are presented and discussed. The first is a search engine for font recognition. The intended usage is search in very large font databases. The input to the search engine is an image of a text line, and the output is the name of the font used when printing the text. After pre-processing and segmentation of the input image, a local approach is used, where features are calculated for individual characters. The method is based on eigenimages calculated from edge-filtered character images, which enables compact feature vectors that can be computed rapidly. A system for visualizing the entire font database is also proposed. Applying geometry-preserving linear and non-linear manifold learning methods, the structure of the high-dimensional feature space is mapped to a two-dimensional representation, which can be reorganized into a grid-based display. The performance of the search engine and the visualization tool is illustrated with a large database containing more than 2700 fonts.
The second contribution is the inclusion of color-based, emotion-related properties in image retrieval. The color emotion metric used is derived from psychophysical experiments and uses three scales: activity, weight and heat. It was originally designed for single-color combinations and later extended to include pairs of colors. A modified approach to statistical analysis of color emotions in images, involving transformations of ordinary RGB histograms, is used for image classification and retrieval. The methods are very fast in feature extraction, and descriptor vectors are very short. This is essential in our application, where the intended use is search in huge image databases containing millions or billions of images. The proposed method is evaluated in psychophysical experiments, using both category scaling and interval scaling. The results show that people in general perceive color emotions for multi-colored images in similar ways, and that observer judgments correlate with derived values.
Both the font search engine and the emotion-based retrieval system are implemented in publicly available search engines. User statistics, gathered over periods of 20 and 14 months respectively, are presented and discussed.
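The histogram-based color emotion analysis can be sketched as a pixel-weighted average of per-color emotion values. The per-color functions below are crude placeholders invented for this sketch; the actual scales come from psychophysical experiments and are not reproduced here:

```python
def toy_emotion(r, g, b):
    """Placeholder per-colour values on the activity/weight/heat scales.
    These formulas are illustrative stand-ins, NOT the published metric."""
    activity = (max(r, g, b) - min(r, g, b)) / 255.0  # rough colourfulness
    weight = 1.0 - (r + g + b) / (3 * 255.0)          # darker feels heavier
    heat = (r - b) / 255.0                            # warm-vs-cool axis
    return activity, weight, heat

def image_emotion(histogram):
    """Given a histogram mapping (r, g, b) -> pixel count, return the
    pixel-weighted mean (activity, weight, heat) for the whole image."""
    total = sum(histogram.values())
    sums = [0.0, 0.0, 0.0]
    for colour, count in histogram.items():
        for i, v in enumerate(toy_emotion(*colour)):
            sums[i] += v * count
    return tuple(s / total for s in sums)
```

Because only the (short) histogram is touched, feature extraction stays fast, which matches the scalability requirement stated in the abstract.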
|