311 |
Agrupamento de sequências de miRNA utilizando aprendizado não-supervisionado baseado em grafos
Kasahara, Viviani Akemi 12 August 2016
Submitted by Izabel Franco (izabel-franco@ufscar.br) on 2016-10-11T17:36:54Z / Approved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-10-21T13:03:21Z (GMT) / Made available in DSpace on 2016-10-21T13:03:34Z (GMT)
No. of bitstreams: 1
DissVAK.pdf: 4608619 bytes, checksum: 3022034b9035e4e8caf1195902d24581 (MD5)
Previous issue date: 2016-08-12 / No funding received / Cluster analysis organizes a collection of patterns into clusters based on similarity,
which is determined from properties of the data. Clustering techniques are useful in a
variety of knowledge domains such as biotechnology, computer vision, document retrieval and
many others. An interesting area of biology involves microRNAs (miRNAs), non-coding RNA
molecules approximately 22 nucleotides long that play important roles in gene regulation.
Clustering miRNA sequences can help to explore and understand them, because sequences
belonging to the same cluster have similar biological functions. This research work
investigates seven graph-based unsupervised clustering algorithms, which fall into three
categories: algorithms based on regions of influence, algorithms based on minimum spanning
trees, and spectral algorithms. To assess the contribution of the proposed algorithms, data
from miRNA families stored in the online miRBase database were used in the experiments. The
results of these experiments were presented, analysed and evaluated using clustering
validation indexes as well as visual analysis.
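As a rough illustration of the minimum-spanning-tree category named in this abstract, the sketch below clusters short strings by running Kruskal's algorithm and stopping once `k` components remain, which (up to ties) is the same as building the spanning tree and deleting its `k-1` heaviest edges. It is not taken from the dissertation: the `hamming` distance, the function names, the choice of `k`, and the toy sequences are all assumptions made for the example.

```python
from itertools import combinations


def hamming(a, b):
    """Count mismatched positions; a length difference counts as mismatches."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def mst_clusters(items, k, dist=hamming):
    """Group items into k clusters by stopping Kruskal's algorithm early."""
    parent = list(range(len(items)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges of the complete graph, lightest first.
    edges = sorted(
        (dist(items[i], items[j]), i, j)
        for i, j in combinations(range(len(items)), 2)
    )
    components = len(items)
    for _, i, j in edges:
        if components <= k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two different components
            parent[ri] = rj
            components -= 1

    groups = {}
    for idx, item in enumerate(items):
        groups.setdefault(find(idx), []).append(item)
    return sorted(groups.values())


# Made-up toy strings, not real miRBase entries.
toy = ["UGAGGUAG", "UGAGGUAC", "UGAGGUUG", "CCCGAAAU", "CCCGAAAA"]
print(mst_clusters(toy, k=2))
```

A real pipeline would likely replace `hamming` with an alignment-based or edit distance, since miRNA sequences differ in length.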
|
312 |
Machine learning via dynamical processes on complex networks / Aprendizado de máquina via processos dinâmicos em redes complexas
Thiago Henrique Cupertino 20 December 2013
Extracting useful knowledge from data sets is a key concept in modern information systems. Consequently, the need for efficient techniques to extract the desired knowledge has been growing over time. Machine learning is a research field dedicated to the development of techniques capable of enabling a machine to "learn" from data. Many techniques have been proposed so far, but open issues remain, especially in interdisciplinary research. In this thesis, we explore the advantages of network data representation to develop machine learning techniques based on dynamical processes on networks. The network representation unifies the structure, dynamics and functions of the system it represents, and is thus capable of capturing the spatial, topological and functional relations of the data sets under analysis. We develop network-based techniques for the three machine learning paradigms: supervised, semi-supervised and unsupervised. In the supervised paradigm, the random walk dynamical process is used to characterize the access of unlabeled data to data classes, yielding a new heuristic we call ease of access. We also propose a classification technique that combines the high-level view of the data, via network topological characterization, and the low-level relations, via similarity measures, in a general framework. Still in the supervised setting, the modularity and Katz centrality network measures are applied to classify multiple observation sets, and an evolving network construction method is applied to the dimensionality reduction problem. The semi-supervised paradigm is covered by extending the ease of access heuristic to the cases in which just a few labeled data samples and many unlabeled samples are available. A semi-supervised technique based on interacting forces is also proposed, for which we provide parameter heuristics and stability analysis via a Lyapunov function.
Finally, an unsupervised network-based technique uses the concepts of pinning control and consensus time from dynamical processes to derive a similarity measure used to cluster data. The data is represented by a connected and sparse network in which nodes are dynamical elements. Simulations on benchmark data sets and comparisons to well-known machine learning techniques are provided for all proposed techniques. Advantages of network data representation and dynamical processes for machine learning are highlighted in all cases.
|
313 |
Self-Organizing Neural Visual Models to Learn Feature Detectors and Motion Tracking Behaviour by Exposure to Real-World Data
Yogeswaran, Arjun January 2018
Advances in unsupervised learning and deep neural networks have led to increased performance in a number of domains, and to the ability to draw strong comparisons between the biological method of self-organization conducted by the brain and computational mechanisms. This thesis aims to use real-world data to tackle two areas in the domain of computer vision which have biological equivalents: feature detection and motion tracking.
The aforementioned advances have allowed efficient learning of feature representations directly from large sets of unlabeled data instead of using traditional handcrafted features. The first part of this thesis evaluates such representations by comparing regularization and preprocessing methods which incorporate local neighbouring information during training on a single-layer neural network. The networks are trained and tested on the Hollywood2 video dataset, as well as the static CIFAR-10, STL-10, COIL-100, and MNIST image datasets. Inducing topography, or simply blurring images with Gaussian filters, during training produces better discriminative features, as evidenced by a consistent and notable increase in classification results. In the visual domain, invariant features are desirable so that objects can be classified despite transformations. Most of the compared methods are found to produce more invariant features; however, classification accuracy does not correlate with invariance.
The second, and paramount, contribution of this thesis is a biologically-inspired model to explain the emergence of motion tracking behaviour in early development using unsupervised learning. The model’s self-organization is biased by an original concept called retinal constancy, which measures how similar visual contents are between successive frames. In the proposed two-layer deep network, when exposed to real-world video, the first layer learns to encode visual motion, and the second layer learns to relate that motion to gaze movements, which it perceives and creates through bi-directional nodes. This is unique because it uses general machine learning algorithms, and their inherent generative properties, to learn from real-world data. It also implements a biological theory and learns in a fully unsupervised manner. An analysis of its parameters and limitations is conducted, and its tracking performance is evaluated. Results show that this model is able to successfully follow targets in real-world video, despite being trained without supervision on real-world video.
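The abstract describes retinal constancy only as a measure of how similar visual contents are between successive frames and does not give a formula. The sketch below shows one possible measure of that kind, offered purely as an assumption: one minus the normalized mean absolute pixel difference. The function names, the grayscale list-of-lists frame format, and the `max_value` parameter are hypothetical and are not the thesis's definition.

```python
def frame_similarity(prev, curr, max_value=255):
    """Return a score in [0, 1]; 1.0 means the two frames are identical.

    Frames are equal-sized 2D lists of grayscale intensities.
    Assumed measure: 1 - (mean absolute difference / max_value).
    """
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += abs(p - c)
            count += 1
    if count == 0:
        raise ValueError("frames must not be empty")
    return 1.0 - total / (count * max_value)


def similarity_trace(frames):
    """Score every pair of successive frames in a clip."""
    return [frame_similarity(a, b) for a, b in zip(frames, frames[1:])]


# Tiny 1x2 "frames": unchanged content, then a full intensity flip.
clip = [[[0, 255]], [[0, 255]], [[255, 0]]]
print(similarity_trace(clip))
```

A score like this could serve as a training signal that is high when gaze movement keeps the visual content stable, which is the role the abstract assigns to retinal constancy.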
|
314 |
Machine learning in complex networks: modeling, analysis, and applications / Aprendizado de máquina em redes complexas: modelagem, análise e aplicações
Thiago Christiano Silva 13 December 2012
Machine learning is a research area whose main purpose is to develop computational methods capable of learning from previously acquired experience. Although a large number of machine learning techniques have been proposed and successfully applied in real systems, many challenging issues still need to be addressed. In recent years, interest in techniques based on complex networks (large-scale graphs with nontrivial connection patterns) has been increasing. This is explained by the inherent advantages of the complex network representation, which is able to capture the spatial, topological and functional relations of the data. In this work, we investigate the new features and possible advantages offered by complex networks in the machine learning domain. We show that the network-based approach brings interesting features to supervised, semisupervised, and unsupervised learning. Specifically, we reformulate a previously proposed particle competition technique for both unsupervised and semisupervised learning using a stochastic nonlinear dynamical system. Moreover, an analytical treatment is supplied, which enables one to predict the behavior of the proposed technique. In addition, data reliability issues are explored in semisupervised learning. This matter has practical importance and has received little attention in the literature. To validate these techniques on real problems, simulations on broadly accepted databases are conducted. We also propose a hybrid supervised classification technique that combines both low and high orders of learning. The low-level term can be implemented by any classification technique, while the high-level term is realized by extracting features of the underlying network constructed from the input data.
Thus, the former classifies the test instances by their physical features, while the latter measures the compliance of the test instances with the pattern formation of the data. Our study shows that the proposed technique not only can realize classification according to the semantic meaning of the data, but also is able to improve the performance of traditional classification techniques. Finally, it is expected that this study will contribute, in a relevant manner, to the machine learning area.
|
315 |
Entity-centric representations in deep learning
Assouel, Rim 08 1900
Humans' incredible capacity to model the complexity of the physical world is possible because they cast this complexity as the composition of simpler entities and rules to process them. Extensive work in cognitive science indeed shows that human perception and reasoning ability is structured around objects. Motivated by this observation, a growing number of recent works have focused on entity-centric approaches to representation learning and their potential to facilitate downstream tasks.
In the first contribution, we show how an entity-centric approach to learning a transition model allows us to extract meaningful visual entities and to learn transition rules that achieve better compositional generalization.
In the second contribution, we show how an entity-centric approach to generating graphs allows us to design a model for conditional graph generation that permits direct optimisation of the graph properties. We investigate the performance of our model in a prototype-based molecular graph generation task. In this task, called lead optimization in drug discovery, we wish to adjust a few physico-chemical properties of a molecule that has proven effective in vitro in order to make a drug out of it.
|
316 |
Deep learning of representations and its application to computer vision
Goodfellow, Ian 04 1900
No description available.
|
317 |
Designing Regularizers and Architectures for Recurrent Neural Networks
Krueger, David 01 1900
No description available.
|
318 |
Towards deep semi supervised learning
Pezeshki, Mohammad 05 1900
No description available.
|
319 |
Feedforward deep architectures for classification and synthesis
Warde-Farley, David 08 1900
No description available.
|
320 |
An Effective Framework of Autonomous Driving by Sensing Road/motion Profiles
Zheyuan Wang (11715263) 22 November 2021
With more and more videos taken from dash cams on thousands of cars, retrieving these videos and searching them for important information is a daunting task. The purpose of this work is to mine key road and vehicle motion attributes from a large-scale driving video data set for traffic analysis, sensing algorithm development and autonomous driving test benchmarks. Current sensing and control of autonomous cars is based on full-view identification, which makes it difficult to maintain a high processing frequency on a fast-moving vehicle, since more and more computation is needed to cope with changes in the driving environment.
A big challenge in video data mining is how to deal with huge amounts of data. We use a compact representation called the road profile system to visualize the road environment in long 2D images. It reduces each video frame to one line, thereby compressing a video clip into a single image. This dimensionality reduction method has several advantages. First, the data size is compressed hundreds of times, because each frame of the video becomes one line of the image. The useful information in the driving video is nevertheless preserved, and motion information is represented more intuitively. Because of this reduction, identification is computationally more efficient than with the full-view method, which makes real-time identification on the road possible. Second, the data is easier to visualize: compressing three-dimensional video data into two-dimensional data is more conducive to visualizing and comparing the data. Third, continuously changing attributes are easier to show and capture. Because two-dimensional data is more convenient to visualize, the position, color and size of the same object across a few frames are easier to compare and capture. In many cases, the trouble caused by tracking and matching can also be eliminated. Three autonomous driving tasks are addressed using the road profile images.
The first application is road edge detection under different weather and road appearance, for road following in autonomous driving; it uses the road profile image and the linearity profile image of the road profile system. This work mines naturalistic driving video to study the appearance of roads, covering large-scale road data and its variations. A large number of naturalistic driving videos were mined to sample the light-sensitive area for the color feature distribution. The effective road contour image is extracted from long driving videos, greatly reducing the amount of video data. The weather and lighting type can then be identified. For each weather and lighting condition, salient features are identified at the edge of the road to distinguish the road edge.
The second application is detecting vehicle interactions in driving videos, using the motion profile image of the road profile system. This work uses the visual actions recorded in driving videos taken by a dashboard camera to identify these interactions. The motion profile images of the video are filtered at key locations, reducing the complexity of object detection, depth sensing, target tracking and motion estimation. The purpose of this reduction is to support decisions on vehicle actions such as lane changing, vehicle following, and cut-in handling.
The third application is motion planning based on vehicle interactions and driving video. Noting that a car mostly travels in a straight line, we identify a few sample lines in the view to constantly scan the road, vehicles, and environment, generating only a portion of the entire video data. Avoiding redundant data processing, we perform semantic segmentation on the streaming road profile images. We plan the vehicle's path and motion using the smallest data set that contains all information necessary for driving.
The results are obtained efficiently, and the accuracy is acceptable. The results can be used for driving video mining, traffic analysis, driver behavior understanding, etc.
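The frame-to-line reduction described in this abstract can be illustrated with a minimal sketch. It assumes the simplest possible sampling, one fixed pixel row copied from every frame; the abstract does not say how the line is chosen or whether several rows are averaged, so `road_profile`, `compression_factor`, and the `row` parameter are hypothetical names for illustration only.

```python
def road_profile(frames, row):
    """Stack one pixel row from every frame into a 2D image (time x width).

    frames: list of 2D lists (height x width) of pixel values.
    row: index of the scan line sampled from each frame (assumed fixed).
    """
    return [list(frame[row]) for frame in frames]


def compression_factor(frames):
    """Ratio of pixels in the clip to pixels in its profile image."""
    height = len(frames[0])
    return height  # each frame of `height` rows contributes a single row


# Two toy 2x2 frames; sampling row 1 keeps one line per frame.
clip = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
print(road_profile(clip, row=1))
print(compression_factor(clip))
```

Under this assumption the reduction factor equals the frame height, which is consistent with the abstract's statement that the data is compressed hundreds of times for typical video resolutions.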
|