  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Proposta de atualização de cadastro urbano a partir de detecção de alterações em imagens QUICK BIRD tomadas em diferentes épocas

Souza, Gabriel Gustavo Barros de. January 2009
Advisor: Amilton Amorim / Committee: Maria de Lourdes Bueno Trindade Galo / Committee: Alzir Felippe Buffara Antunes / Resumo: A atualização cadastral de área urbana é uma das questões mais importantes a ser considerada no planejamento municipal. Por esta área tratar de uma riqueza de detalhes acentuada, quando comparada as área rurais e de expansão urbana, torna-se difícil traçar uma metodologia de atualização de dados cadastrais que possa ser generalizada às áreas urbanas dos municípios. Isso não apenas em metodologia, mas para atender as necessidades e realidades que se deseja atualizar no Cadastro. Neste trabalho é apresentada uma proposta de atualização cadastral de área urbana a partir da utilização de imagens de satélite de alta resolução espacial (Quick Bird). São empregados, para isso, alguns métodos e técnicas nos processos de utilização das imagens adotadas. As imagens utilizadas abrangem a área teste, definida no município de Presidente Prudente. Para a detecção das alterações a serem atualizadas no banco de dados cadastrais foram utilizadas imagens pancromáticas e multiespectrais de épocas diferentes e empregaram-se técnicas de classificação de imagens para identificar e descrever visualmente os tipos de alvos alterados. De acordo com um limiar adotado, a partir das imagens e processos descritos, as alterações identificadas foram atualizadas no banco de dados cadastrais. As implicações para a seqüência adotada são apresentadas e discutidas nos capítulos desta pesquisa. / Abstract: Updating the urban cadastre is one of the most important issues in municipal planning. Urban areas contain far more detail than rural and urban-expansion areas, which makes it difficult to devise a cadastral updating methodology that generalizes across the urban areas of different municipalities. The demands of public administration and the reality of each city must be considered throughout the process, including the Cadastre.
This work presents an approach to updating the urban cadastre using high-spatial-resolution satellite imagery (Quick Bird). Several methods and techniques are employed in processing the adopted images. The images cover the test area, defined in the municipality of Presidente Prudente. Panchromatic and multispectral images from different dates were used to detect the changes to be updated in the cadastral database, and image classification techniques were used to identify and visually describe the types of changed targets. According to an adopted threshold, and based on the images and processes described, the identified changes were updated in the cadastral database. The implications of the adopted sequence are presented and discussed in the chapters of this work. / Master's degree
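The abstract above does not say how its threshold is applied, so the following is only a minimal sketch of one common reading: flag a cadastral parcel for updating when the share of pixels that changed between two dates exceeds a threshold. Simple grey-level differencing is swapped in here for the thesis's classification-based detection, and the names `changed_fraction` and `parcels_to_update`, the parcel ids, the pixel values and both tolerance values are hypothetical.

```python
def changed_fraction(before, after, pixel_tol):
    """Fraction of pixels whose grey level differs by more than pixel_tol."""
    total = 0
    changed = 0
    for row_before, row_after in zip(before, after):
        for b, a in zip(row_before, row_after):
            total += 1
            if abs(a - b) > pixel_tol:
                changed += 1
    return changed / total if total else 0.0


def parcels_to_update(parcels, pixel_tol=30, area_threshold=0.25):
    """Return ids of parcels whose changed fraction exceeds area_threshold.

    parcels maps a parcel id to a (before, after) pair of equally sized
    grey-level rasters. Both thresholds are illustrative assumptions.
    """
    flagged = []
    for parcel_id, (before, after) in parcels.items():
        if changed_fraction(before, after, pixel_tol) > area_threshold:
            flagged.append(parcel_id)
    return flagged


# Toy 2x2 rasters: the first parcel is stable, the second gained a bright structure.
parcels = {
    "lot-001": ([[100, 102], [101, 99]], [[100, 103], [100, 98]]),
    "lot-002": ([[100, 102], [101, 99]], [[200, 210], [101, 99]]),
}
print(parcels_to_update(parcels))
```

In practice the two images would first need co-registration and radiometric normalization, which this sketch leaves out.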
72

Building and Using Knowledge Models for Semantic Image Annotation / Construction et utilisation de modèles à base de connaissance pour l’annotation sémantique des images

Bannour, Hichem 08 February 2013
Cette thèse propose une nouvelle méthodologie pour la construction et l’utilisation de modèles à base de connaissances pour l'annotation automatique d'images. Plus précisément, nous proposons dans un premier lieu des approches pour la construction automatique de modèles de connaissances explicites et structurés, à savoir des hiérarchies sémantiques et des ontologies multimédia adaptées pour l'annotation d'images. Ainsi, nous proposons une approche pour la construction automatique de hiérarchies sémantiques. Notre approche est basée sur une nouvelle mesure « sémantico-visuelle » entre concepts et un ensemble de règles qui permettent de relier les concepts les plus apparentés jusqu'à l'aboutissement à la hiérarchie finale. Ensuite, nous proposons de construire des modèles de connaissances plus riches en terme de sémantique et qui modélisent donc d'autres types de relations entre les concepts de l’image. Par conséquent, nous proposons une nouvelle approche pour la construction automatique d'une ontologie multimédia qui modélise non seulement les relations de subsomption, mais aussi les relations spatiales et contextuelles entre les concepts de l'image. L'ontologie proposée est adaptée pour raisonner sur la cohérence de l’annotation d'images. Afin d'évaluer l'efficacité des modèles de connaissances construits, nous proposons de les utiliser par la suite dans un cadre d'annotation d'images. Nous proposons donc une approche, basée sur la structure des hiérarchies sémantiques, pour la classification hiérarchique d'images. Puis, nous proposons une approche générique, combinant des techniques d'apprentissage automatique et le raisonnement ontologique flou, afin de produire des annotations d’images sémantiquement pertinentes. Des évaluations empiriques de nos approches ont montré une amélioration significative de la précision des annotations d'images. 
/ This dissertation proposes a new methodology for building and using structured knowledge models for automatic image annotation. Specifically, our first proposals deal with the automatic building of explicit and structured knowledge models, such as semantic hierarchies and multimedia ontologies, dedicated to image annotation. Thereby, we propose a new approach for building semantic hierarchies faithful to image semantics. Our approach is based on a new image-semantic similarity measure between concepts and on a set of rules that allow connecting the concepts with higher relatedness till the building of the final hierarchy. Afterwards, we propose to go further in the modeling of image semantics through the building of explicit knowledge models that incorporate richer semantic relationships between image concepts. Therefore, we propose a new approach for automatically building multimedia ontologies consisting of subsumption relationships between concepts, and also other semantic relationships such as contextual and spatial relations. Fuzzy description logics are used as a formalism to represent our ontology and to deal with the uncertainty and the imprecision of concept relationships. In order to assess the effectiveness of the built structured knowledge models, we propose subsequently to use them in a framework for image annotation. We propose therefore an approach, based on the structure of semantic hierarchies, to effectively perform hierarchical image classification. Furthermore, we propose a generic approach for image annotation combining machine learning techniques, such as hierarchical image classification, and fuzzy ontological-reasoning in order to achieve a semantically relevant image annotation. Empirical evaluations of our approaches have shown significant improvement in the image annotation accuracy.
73

Recherche de motifs fréquents dans une base de cartes combinatoires / Frequent pattern discovery in combinatorial maps databases

Gosselin, Stéphane 24 October 2011
Une carte combinatoire est un modèle topologique qui permet de représenter les subdivisions de l’espace en cellules et les relations d’adjacences et d’incidences entre ces cellules en n dimensions. Cette structure de données est de plus en plus utilisée en traitement d’images, mais elle manque encore d’outils pour les analyser. Notre but est de définir de nouveaux outils pour les cartes combinatoires nD. Nous nous intéressons plus particulièrement à l’extraction de sous-cartes fréquentes dans une base de cartes. Nous proposons deux signatures qui sont également des formes canoniques de cartes combinatoires. Ces signatures ont chacune leurs avantages et leurs inconvénients. La première permet de décider de l’isomorphisme entre deux cartes en temps linéaire, en contrepartie le coût de stockage en mémoire est quadratique en la taille de la carte. La seconde signature a un coût de stockage en mémoire linéaire en la taille de la carte, cependant le temps de calcul de l’isomorphisme est quadratique. Elles sont utilisables à la fois pour des cartes connexes, non connexes, valuées ou non valuées. Ces signatures permettent de représenter une base de cartes combinatoires et de rechercher un élément de manière efficace. De plus, le temps de recherche ne dépend pas du nombre de cartes présent dans la base. Ensuite, nous formalisons le problème de recherche de sous-cartes fréquentes dans une base de cartes combinatoires nD. Nous implémentons deux algorithmes pour résoudre ce problème. Le premier algorithme extrait les sous-cartes fréquentes par une approche en largeur tandis que le second utilise une approche en profondeur. Nous comparons les performances de ces deux algorithmes sur des bases de cartes synthétiques. Enfin, nous proposons d’utiliser les motifs fréquents dans une application de classification d’images. Chaque image est décrite par une carte qui est transformée en un vecteur représentant le nombre d’occurrences des motifs fréquents. 
À partir de ces vecteurs, nous utilisons des techniques classiques de classification définies sur les espaces vectoriels. Nous proposons des expérimentations en classification supervisée et non supervisée sur deux bases d’images. / A combinatorial map is a topological model that represents subdivisions of space into cells, together with the adjacency and incidence relations between those cells, in n dimensions. This data structure is increasingly used in image processing, but it still lacks analysis tools. Our goal is to define new tools for nD combinatorial maps. We are particularly interested in extracting frequent submaps from a database of maps. We define two combinatorial map signatures: the first has quadratic space complexity and decides isomorphism with a new map in linear time, whereas the second has linear space complexity and decides isomorphism in quadratic time. Both can be used for connected or non-connected maps, labelled or unlabelled. These signatures make it possible to search efficiently for a map in a database. Moreover, the search time does not depend on the number of maps in the database. We then formalize the problem of finding frequent submaps in a database of nD combinatorial maps and implement two algorithms to solve it. The first extracts the frequent submaps with a breadth-first approach and the second with a depth-first approach. We compare the performance of the two algorithms on synthetic databases of maps. Finally, we propose to use the frequent patterns in an image classification application. Each image is described by a map that is transformed into a vector counting the occurrences of the frequent patterns. On these vectors we apply standard classification techniques defined on vector spaces. We report experiments in supervised and unsupervised classification on two image databases.
74

Análise de imagens multiespectrais através de redes complexas / Multispectral image analysis through complex networks

Leonardo Felipe dos Santos Scabini 26 July 2018
Imagens multiespectrais estão presentes na grande maioria de dispositivos de imageamento atuais, desde câmeras pessoais até microscópios, telescópios e satélites. No entanto, grande parte dos trabalhos em análise de texturas e afins propõem abordagens monocromáticas, que muitas vezes consideram apenas níveis de cinza. Nesse contexto e considerando o aumento da capacidade dos computadores atuais, o uso da informação espectral deve ser considerada na construção de modelos melhores. Ultimamente redes neurais convolucionais profundas pré-treinadas tem sido usadas em imagens coloridas de 3 canais, porém são limitadas a apenas esse formato e computam muitas convoluções, o que demanda por hardware específico (GPU). Esses fatos motivaram esse trabalho, que propõem técnicas para a modelagem e caracterização de imagens multiespectrais baseadas em redes complexas, que tem se mostrado uma ferramenta eficiente em trabalhos anteriores e possui complexidade computacional similar à métodos tradicionais. São introduzidas duas abordagens para aplicação em imagens coloridas de três canais, denominadas Rede Multicamada (RM) e Rede Multicamada Direcionada (RMD). Esses métodos modelam todos os canais da imagem de forma conjunta, onde as redes possuem conexões intra e entre canais, de forma parecida ao processamento oponente de cor do sistema visual humano. Experimentos em cinco bases de textura colorida mostram a proposta RMD supera vários métodos da literatura no geral, incluindo redes convolucionais e métodos tradicionais integrativos. Além disso, as propostas demonstraram alta robustez a diferentes espaços de cor (RGB, LAB, HSV e I1I2I3), enquanto que outros métodos oscilam de base para base. Também é proposto um método para caracterizar imagens multiespectrais de muitos canais, denominado Rede Direcionada de Similaridade Angular (RDSA). 
Nessa proposta, cada pixel multiespectral é considerado como um vetor de dimensão equivalente à quantidade de canais da imagem e o peso das arestas representa sua similaridade do cosseno, apontando para o pixel de maior valor absoluto. Esse método é aplicado em um conjunto de imagens de microscopia por fluorescência de 32 canais, em um experimento para identificar variações na estrutura foliar do espécime Jacaranda Caroba submetidos à diferentes condições. O método RDSA obtém as maiores taxas de acerto de classificação nesse conjunto de dados, com 91,9% usando o esquema de validação cruzada Leave-one-out e 90,5(±1,1)% com 10-pastas, contra 81,8% e 84,7(±2,2) da rede convolucional VGG16. / Multispectral images are produced by the vast majority of current imaging devices, from personal cameras to microscopes, telescopes and satellites. However, much of the work in texture analysis and related areas proposes monochromatic approaches, which often consider only gray levels. In this context, and given the increased capacity of current computers, spectral information should be used to build better models. Lately, pre-trained deep convolutional neural networks have been applied to 3-channel color images, but they are limited to that format and compute many convolutions, which demands specific hardware (GPU). These facts motivated this work, which proposes techniques for modeling and characterizing multispectral images based on complex networks, an approach that has proved efficient in previous work and has computational complexity similar to that of traditional methods. Two approaches are introduced for 3-channel color images, called Multilayer Network (RM) and Directed Multilayer Network (RMD). These methods model all channels of the image jointly: the networks have intra- and inter-channel connections, similar to the opponent color processing of the human visual system.
Experiments on five color texture datasets show that the RMD proposal outperforms several methods from the literature overall, including convolutional networks and traditional integrative methods. In addition, the proposals demonstrated high robustness to different color spaces (RGB, LAB, HSV and I1I2I3), while other methods oscillate from dataset to dataset. A method is also proposed to characterize multispectral images with many channels, called Directed Network of Angular Similarity (RDSA). In this proposal, each multispectral pixel is treated as a vector whose dimension equals the number of channels of the image, and the weight of each edge is the cosine similarity of the pixels it joins, with the edge pointing to the pixel of greatest absolute value. This method is applied to a set of 32-channel fluorescence microscopy images in an experiment to identify variations in the leaf structure of Jacaranda Caroba specimens under different conditions. The RDSA method obtains the highest classification rates on this dataset, with 91.9% under the Leave-one-out cross-validation scheme and 90.5(±1.1)% with 10 folds, against 81.8% and 84.7(±2.2) for the convolutional network VGG16.
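The edge rule described for RDSA can be illustrated with a small sketch. This is not the thesis's implementation: it only computes the cosine similarity between two pixel vectors and orients the edge toward the pixel with the larger Euclidean norm, which is an assumed reading of "greatest absolute value". The function names and the toy 3-channel pixels are hypothetical.

```python
import math


def norm(vec):
    """Euclidean norm of a pixel vector (one component per spectral channel)."""
    return math.sqrt(sum(x * x for x in vec))


def cosine_similarity(u, v):
    """Cosine of the angle between two pixel vectors; 0.0 if either is all zeros."""
    nu, nv = norm(u), norm(v)
    if nu == 0 or nv == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def directed_edge(p, q, pixels):
    """Return (source, target, weight) for the pair of pixel ids p and q.

    The weight is the cosine similarity; the edge points to the pixel with
    the larger norm (assumption). Ties keep the order in which p and q were given.
    """
    weight = cosine_similarity(pixels[p], pixels[q])
    if norm(pixels[q]) >= norm(pixels[p]):
        return (p, q, weight)
    return (q, p, weight)


# Toy 3-channel pixels; the thesis works with 32-channel images.
pixels = {"a": [1, 0, 0], "b": [2, 0, 0], "c": [0, 3, 0]}
print(directed_edge("a", "b", pixels))
print(directed_edge("c", "b", pixels))
```

Cosine similarity ignores overall intensity, so pixels "a" and "b" are maximally similar even though one is twice as bright; the edge direction is what carries the intensity ordering.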
75

Remote sensing and machine learning applied to soil use detection in caatinga bioma / Aprendizado De Máquina Na Detecção Do Uso Do Solo No Bioma Caatinga Via Sensoriamento Remoto

Beatriz Fernandes Simplício Sousa 06 March 2009
Conselho Nacional de Desenvolvimento Científico e Tecnológico / To manage natural resources adequately in a fragile environment such as the Caatinga, one must know its properties and spatial distribution. This work proposes an approach to classify LANDSAT-5 satellite images. These images, covering a semiarid region in the municipality of Iguatu, Ceará, Brazil, were classified to detect the Caatinga biome using two types of classifiers based on machine learning: Multi Layer Perceptron (MLP) and Support Vector Machine (SVM). The statistical Maximum Likelihood classifier was also applied for comparison with the other two methods. Five classes were defined for the classification: agriculture, water, anthropized areas, herbaceous shrub Caatinga (CHA) and dense arboreal Caatinga (CAD). The MLP tests varied the number of neurons in the intermediate layer. The SVM tests varied the parameter σ of the Gaussian function and the penalization parameter (C). Performance was analyzed using Global Accuracy, Specific Accuracy and the Kappa coefficient, computed from the confusion matrix, which was generated for each method by comparing the classification with GPS-georeferenced ground control points (ground truth). The MLP method performed best in the test with 12 neurons in the intermediate layer, with Global Accuracy and Kappa values of 82.14% and 0.76, respectively. The SVM method performed best in the test with C=1000 and σ=2, with Global Accuracy and Kappa values of 86.03% and 0.77, respectively. The Maximum Likelihood classifier classified 81.2% of the pixels correctly (Global Accuracy), with a Kappa coefficient of 0.73. The Specific Accuracy values, which make it possible to analyze performance class by class, were above 70% for every class. A total area of 576 km² was classified.
Of the two Caatinga classes considered, herbaceous shrub Caatinga (CHA) is the predominant one. Given the experimental results, both the SVM and MLP methods, which are based on machine learning, show satisfactory performance in classifying the Caatinga biome. / O manejo adequado dos recursos naturais em ambientes frágeis, como o da Caatinga, requer o conhecimento de suas propriedades e distribuição espacial. Desta forma, o presente trabalho propõe uma abordagem para a classificação de imagens do satélite LANDSAT-5, correspondente a uma região semiárida localizada no município de Iguatu no Estado do Ceará, objetivando detectar o bioma da Caatinga por meio de dois tipos de classificadores baseados em aprendizado de máquina: o método baseado em Perceptrons de Múltiplas Camadas-MLP (do inglês Multi Layer Perceptron) e o método Máquinas de Vetores de Suporte-SVM (do inglês Support Vector Machine). O classificador estatístico da máxima verossimilhança, por ser amplamente utilizado na literatura, também foi aplicado à área em estudo para que o desempenho dos métodos propostos fosse comparado aos destes. Cinco classes foram definidas para a classificação, a saber: agricultura, antropizada, água, caatinga herbácea arbustiva (CHA) e caatinga arbórea densa (CAD). Para o método MLP, foram realizados testes variando a quantidade de neurônios na camada intermediária. Já os testes para o método SVM consistiram em variar o parâmetro σ da função gaussiana e o parâmetro de penalização (C). A eficiência dos métodos foi analisada por meio dos coeficientes de Exatidão Global, Exatidão Específica e de Kappa calculados por meio dos dados da matriz de confusão. Esta, por sua vez, foi gerada para cada método a partir da comparação entre a classificação e os pontos georreferenciados com aparelho GPS (correspondentes à verdade terrestre).
O método MLP apresentou melhor desempenho para o teste em que 12 neurônios foram atribuídos à camada intermediária, com valores de Exatidão Global e de Kappa de 82,14% e 0,76, respectivamente. Já o método SVM apresentou melhor performance para o teste com C=1000 e σ=2 no qual se obteve valores de 86,03% e 0,77 para os coeficientes de Exatidão Global e Kappa, respectivamente. O valor de Exatidão Global para o classificador estatístico da máxima verossimilhança permitiu concluir que 81,2% dos pixels foram classificados corretamente e o coeficiente de Kappa para este método foi de 0,73. Os valores dos coeficientes de Exatidão Específica, que proporcionam analisar o desempenho dos métodos em cada classe, foram superiores a 70%. A área total classificada foi de 576 km² e, dentre as duas classes consideradas para o bioma Caatinga, a predominante é a do tipo caatinga herbácea arbustiva (CHA). Assim, por meio dos resultados experimentais obtidos, pode-se afirmar que os métodos SVM e MLP, baseados em aprendizado de máquina, apresentaram desempenho satisfatório para a classificação do bioma Caatinga.
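The Global Accuracy and Kappa figures quoted in this abstract are both derived from a confusion matrix. As a rough illustration, the sketch below computes them with the standard definitions (observed agreement, and Cohen's kappa as agreement corrected for chance). The 2-class matrix is invented for the example and is not the thesis's data; the function names are hypothetical.

```python
def global_accuracy(cm):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (global accuracy); p_e is the agreement
    expected by chance, computed from the row and column totals.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_o = global_accuracy(cm)
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_e = sum(row_totals[k] * col_totals[k] for k in range(n)) / (total * total)
    return (p_o - p_e) / (1 - p_e)


# Invented example: rows are reference classes, columns are predicted classes.
confusion = [
    [45, 5],
    [10, 40],
]
print(global_accuracy(confusion), cohen_kappa(confusion))
```

Kappa is lower than the raw accuracy whenever some agreement would be expected by chance alone, which is why the abstract reports both.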
76

Investigação do uso de imagens de sensor de sensoriamento remoto hiperespectral e com alta resolução espacial no monitoramento da condição de uso de pavimentos rodoviários. / Investigation of use hyperspectral and high spatial resolution images from remote sensing in pavement surface condition monitoring.

Marcos Ribeiro Resende 24 September 2010
Segundo a Agência Nacional de Transportes Terrestres (ANTT) em seu Anuário Estatístico dos Transportes Terrestres AETT (2008), o Brasil em todo o seu território possui 211.678 quilômetros de rodovias pavimentadas. O valor de serventia do pavimento diminui com o passar do tempo por dois fatores principais: o tráfego e as intempéries (BERNUCCI et al., 2008). Monitorar a condição de uso de toda a extensão das rodovias brasileiras é tarefa dispendiosa e demorada. A investigação de novas técnicas que permitam o levantamento da condição dos pavimentos de forma ágil e automática é parte da pesquisa deste trabalho. Nos últimos anos, um número crescente de imagens de alta resolução espacial tem surgido no mercado mundial com o aparecimento dos novos satélites e sensores aeroembarcados de sensoriamento remoto. Da mesma forma, imagens multiespectrais e até mesmo hiperespectrais estão sendo disponibilizadas comercialmente e para pesquisa científica. Neste trabalho são utilizadas imagens hiperespectrais de sensor digital aeroembarcado. Uma metodologia para identificação automática dos pavimentos asfaltados e classificação das principais ocorrências dos defeitos do asfalto foi desenvolvida. A primeira etapa da metodologia é a identificação do asfalto na imagem, utilizando uma classificação híbrida baseada inicialmente em pixel e depois refinada por objetos foi possível a extração da informação de asfalto das imagens disponíveis. A segunda etapa da metodologia é a identificação e classificação das ocorrências dos principais defeitos nos pavimentos flexíveis que são observáveis nas imagens de alta resolução espacial. Esta etapa faz uso intensivo das novas técnicas de classificação de imagens baseadas em objetos. O resultado final é a geração de índices da condição do pavimento, a partir das imagens, que possam ser comparados com os indicadores da qualidade da superfície do pavimento já normatizados pelos órgãos competentes no país. 
/ According to the Statistical Survey of Land Transportation, AETT (2008), of the National Agency of Land Transportation (ANTT), Brazil has 211,678 kilometers of paved roads in its territory. The pavement Present Serviceability Ratio (PSR) decreases over time because of two main factors: traffic and weather (BERNUCCI et al., 2008). Monitoring the condition of all Brazilian roads is an expensive and time-consuming task. The investigation of new techniques that allow a quick and automatic survey of pavement condition is part of this research. In recent years, an increasing number of high-spatial-resolution images has appeared on the world market with the advent of new remote sensing satellites and airborne sensors. Similarly, multispectral and even hyperspectral imagery is becoming available commercially and for scientific research. Hyperspectral images from a digital airborne sensor are used in this work. A methodology was developed for the automatic identification of asphalt pavement and for the classification of the main asphalt defects. The first step of the methodology is the identification of asphalt in the image: a hybrid classification, initially pixel-based and then refined by objects, made it possible to extract the asphalt information from the available images. The second step is the identification and classification of the main defects of flexible pavement surfaces that are observable in high-spatial-resolution imagery. This step makes intensive use of new object-based image classification techniques. The final result is the generation, from the images, of pavement surface condition indices that can be compared with the pavement surface quality indicators already standardized by the competent agencies in the country.
77

On the Construction of an Automatic Traffic Sign Recognition System

Jonsson, Fredrik January 2017
This thesis proposes an automatic road sign recognition system, including all steps from the initial detection of road signs from a digital image to the final recognition step that determines the class of the sign. We develop a Bayesian approach for image segmentation in the detection step using colour information in the HSV (Hue, Saturation and Value) colour space. The image segmentation uses a probability model which is constructed based on manually extracted data on colours of road signs collected from real images. We show how the colour data is fitted using mixture multivariate normal distributions, where for the case of parameter estimation Gibbs sampling is used. The fitted models are then used to find the (posterior) probability of a pixel colour to belong to a road sign using the Bayesian approach. Following the image segmentation, regions of interest (ROIs) are detected by using the Maximally Stable Extremal Region (MSER) algorithm, followed by classification of the ROIs using a cascade of classifiers. Synthetic images are used in training of the classifiers, by applying various random distortions to a set of template images constituting most road signs in Sweden, and we demonstrate that the construction of such synthetic images provides satisfactory recognition rates. We focus on a large set of the signs on the Swedish road network, including almost 200 road signs. We use classification models such as the Support Vector Machine (SVM), and Random Forest (RF), where for features we use Histogram of Oriented Gradients (HOG).
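The Bayesian segmentation step described above can be sketched as follows. This is a deliberately reduced illustration, not the thesis's model: it uses a univariate Gaussian mixture on a single colour value rather than multivariate mixtures over HSV, the mixture parameters are hand-picked instead of being estimated by Gibbs sampling, and the circular nature of hue is ignored. All names, parameters and the prior are hypothetical.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, components):
    """Density of a Gaussian mixture given as (weight, mean, std) triples."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in components)


def posterior_sign(x, sign_mix, background_mix, prior_sign):
    """Posterior probability that a pixel with colour value x belongs to a sign.

    Bayes' rule: P(sign | x) = p(x | sign) P(sign) / p(x), with p(x) summed
    over the two hypotheses (sign, background).
    """
    sign_term = mixture_pdf(x, sign_mix) * prior_sign
    background_term = mixture_pdf(x, background_mix) * (1.0 - prior_sign)
    evidence = sign_term + background_term
    return sign_term / evidence if evidence > 0 else 0.0


# Hand-picked illustrative parameters on a colour value scaled to [0, 1].
SIGN_MIX = [(0.6, 0.95, 0.03), (0.4, 0.60, 0.04)]
BACKGROUND_MIX = [(1.0, 0.30, 0.20)]
PRIOR_SIGN = 0.1

for value in (0.95, 0.60, 0.30):
    print(value, posterior_sign(value, SIGN_MIX, BACKGROUND_MIX, PRIOR_SIGN))
```

Thresholding this posterior per pixel would give a binary mask on which a region detector such as MSER could then operate.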
78

Classifying Environmental Sounds with Image Networks

Boddapati, Venkatesh January 2017
Context. Environmental Sound Recognition, unlike Speech Recognition, is still at an early stage in its use of Deep Learning methods. Sound can be converted into images by extracting spectrograms and similar representations. Object recognition from images using deep Convolutional Neural Networks (CNNs) is an actively developing and promising area. Here the same technique is studied and applied to image representations of sound. Objectives. This study investigates the best accuracy achievable on a sound classification task with existing deep CNNs by comparing data pre-processing parameters. A novel method of combining different features into a single image is also proposed and its effect tested. Lastly, the performance of an existing network that fuses convolutional and recurrent neural architectures is tested on the selected datasets. Methods. Experiments were conducted to analyze the effect of data pre-processing parameters on the best achievable accuracy with two CNNs. A further experiment determined whether the proposed method of feature combination is beneficial. Finally, an experiment tested the performance of the combined network. Results. GoogLeNet had the highest classification accuracy: 73% on the 50-class dataset and 90-93% on the 10-class datasets. The sampling rate and frame length that produced these scores were 16 kHz with 40 ms and 8 kHz with 50 ms, respectively. The proposed combination of features does not improve the classification accuracy. The fused CRNN could not achieve high accuracy on the selected datasets. Conclusions. Deep networks designed for object recognition can be used successfully to classify environmental sounds, and the pre-processing parameter values that give the best accuracy were determined.
The novel method of feature combination does not significantly improve accuracy compared with spectrograms alone. The fused network, which learns spatial and temporal features from spectral images, performs poorly in the classification task compared with the convolutional network alone.
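To make the pre-processing parameters concrete, the sketch below converts the reported sampling rates and frame lengths into frame sizes in samples and splits a signal into frames, which is the first step before computing a spectrogram. The hop length and the function names are assumptions; the abstract does not state how much the frames overlap.

```python
def frame_length_samples(sample_rate_hz, frame_ms):
    """Number of samples in one analysis frame of frame_ms milliseconds."""
    return round(sample_rate_hz * frame_ms / 1000)


def split_into_frames(signal, frame_len, hop_len):
    """Split a sequence of samples into full frames taken every hop_len samples.

    A trailing partial frame is dropped. hop_len is an assumed parameter.
    """
    last_start = len(signal) - frame_len
    return [signal[s:s + frame_len] for s in range(0, last_start + 1, hop_len)]


# Frame sizes for the two parameter pairs reported in the abstract.
print(frame_length_samples(16000, 40))
print(frame_length_samples(8000, 50))

# Toy signal of 1000 samples, 400-sample frames, 50% overlap (assumed).
frames = split_into_frames(list(range(1000)), 400, 200)
print(len(frames))
```

Each frame would then be windowed and passed through a Fourier transform, and the resulting magnitude columns stacked side by side form the spectrogram image fed to the network.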
79

Application of the German Traffic Sign Recognition Benchmark on the VGG16 network using transfer learning and bottleneck features in Keras

Persson, Siri January 2018
Convolutional Neural Networks (CNNs) are successful tools in image classification. CNNs are inspired by the animal visual cortex, using a connectivity pattern similar to that between its neurons. The purpose of this thesis is to create a classifier, using transfer learning, that classifies images of traffic signs from the German Traffic Sign Recognition Benchmark (GTSRB) with good accuracy, and to improve the performance further by tuning the hyperparameters. The pre-trained CNN used is the VGG16 network from the paper "Very deep convolutional networks for large-scale image recognition". The VGG16 network reached an accuracy of 74.5% for the hyperparameter set with a learning rate of 1e-6, a batch size of 15 and a dropout rate of 0.3. The conclusion was that transfer learning with bottleneck features is a good tool for building a classifier when only a small amount of training data is available, and that the results could probably be improved further by using more real data or data augmentation, for both training and testing, and by tuning more of the network's hyperparameters.
80

Handling imperfections for multimodal image annotation / Gestion des imperfections pour l’annotation multimodale d’images

Znaidia, Amel 11 February 2014
La présente thèse s’intéresse à l’annotation multimodale d’images dans le contexte des médias sociaux. Notre objectif est de combiner les modalités visuelles et textuelles (tags) afin d’améliorer les performances d’annotation d’images. Cependant, ces tags sont généralement issus d’une indexation personnelle, fournissant une information imparfaite et partiellement pertinente pour un objectif de description du contenu sémantique de l’image. En outre, en combinant les scores de prédiction de différents classifieurs appris sur les différentes modalités, l’annotation multimodale d’image fait face à leurs imperfections: l’incertitude, l’imprécision et l’incomplétude. Dans cette thèse, nous considérons que l’annotation multimodale d’image est soumise à ces imperfections à deux niveaux : niveau représentation et niveau décision. Inspiré de la théorie de fusion de l’information, nous concentrons nos efforts dans cette thèse sur la définition, l’identification et la prise en compte de ces aspects d’imperfections afin d’améliorer l’annotation d’images. / This thesis deals with multimodal image annotation in the context of social media. We seek to take advantage of textual (tags) and visual information in order to enhance the image annotation performances. However, these tags are often noisy, overly personalized and only a few of them are related to the semantic visual content of the image. In addition, when combining prediction scores from different classifiers learned on different modalities, multimodal image annotation faces their imperfections (uncertainty, imprecision and incompleteness). Consequently, we consider that multimodal image annotation is subject to imperfections at two levels: the representation and the decision. Inspired from the information fusion theory, we focus in this thesis on defining, identifying and handling imperfection aspects in order to improve image annotation.
