• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 201
  • 21
  • 18
  • 9
  • 5
  • 4
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • Tagged with
  • 335
  • 335
  • 124
  • 113
  • 84
  • 81
  • 81
  • 65
  • 64
  • 63
  • 58
  • 49
  • 48
  • 48
  • 46
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
301

[pt] ESTIMAÇÃO DA TENSÃO MECÂNICA USANDO ONDAS ULTRASSÔNICAS GUIADAS E MACHINE LEARNING / [en] MECHANICAL STRESS ESTIMATION USING GUIDED ULTRASONIC WAVES AND MACHINE LEARNING

CHRISTIAN DEYVI VILLARES HOLGUIN 11 July 2022 (has links)
[pt] Devido ao efeito acoustoelástico, as Ondas guiadas ultrassônicas (UGWs) têm sido usadas para estimar a tensão mecânica com baixo custo de forma não destrutiva. O Aprendizado de maquina (ML) tem sido aplicado para mapear formas complexas de ondas para estimar a tensão mecânica, embora aspectos importantes como precisão e consumo computacional não tenham sido explorados. Na literatura também não há muito trabalho sobre o uso do aprendizado não supervisionado para a rotulagem automática de amostras com diferentes estados de tensão. Portanto, esta tese apresenta duas abordagens: i) a abordagem supervisionada propõe uma metodologia de modelagem de dados que otimiza a precisão e a implementação computacional, para a estimação da tensão baseada em UGWs em tempo real e ii) a abordagem não supervisionada compara estruturas não supervisionadas para rotular um pequeno conjunto de dados de acordo com o estado de tensão. Para o primeiro, foram avaliados modelos de aprendizagem superficial e profunda com redução de dimensionalidade, estes modelos são criados e testados usando um procedimento de hold-out Monte-Carlo para avaliar sua robustez. Os resultados mostram que, utilizando modelos superficiais e Análise de componentes principais (PCA), foi obtida uma melhoria de precisão e no consumo de hardware em comparação com o estado da arte com modelos de redes neurais profundas. Para o segundo, métodos de redução de dimensionalidade: PCA e t-distributed stochastic neighbor embedding (t-SNE), são usados para extrair características de sinais UGWs. As características são usadas para agrupar as amostras em estados de baixa, média e alta tensão. Uma análise qualitativa e quantitativa dos resultados foi realizada, considerando a análise de métricas para agrupamento, o PCA realizou o melhor agrupamento, qualitativamente, mostrando menos sobreposição en grupos do que t-SNE. As duas abordagens utilizadas nesta tese, conseguiram extrair características significativas que ajudam tanto na estimativa quanto tanto na rotulagem de dados, contribuindo para a criação de modelos de ML mais eficientes e no problema de interpretação de UGWs. / [en] Due to the acoustoelastic effect, Ultrasonic Guided Waves (UGWs) have been used to estimate mechanical stress in a non-expensive and nondestructively fashion. Machine Learning (ML) has been applied to map complex waveforms to stress estimates, though important aspects, such as accuracy and hardware consumption, have not been explored. Previously in the literature, there are also not many works on the use of unsupervised learning for automatic labeling of samples with different stress states. Therefore, this thesis presents two approaches, (i) the supervised approach aims to propose a data modeling methodology that optimizes accuracy and computational implementation, for real-time ultrasonic based stress estimation and (ii) the unsupervised approach aims at comparing unsupervised frameworks to label a small dataset according to the stress state. For the former, shallow and deep learning models with dimensionality reduction were evaluated, these models are created and tested using a Monte-Carlo holdout procedure to evaluate their robustness under different stress conditions. The results show that, using shallow models and Principal Component Analysis (PCA), an accuracy improvement and hardware consumption as compared to the state of the art reported with deep neural network models were obtained. For the latter, dimensionality reduction methods: PCA and t-distributed stochastic neighbor embedding (t-SNE), are used to extract features from UGWs signals with different stress levels. The features are used to group the samples into low, medium and high stress states. A qualitative and quantitative analysis of the results was performed. Considering the analysis of metrics for clustering, PCA performed the best clustering, qualitatively, showing less overlapping of clusters than t-SNE. The two approaches used in this thesis, managed to extract meaningful features which helped in both estimation and stress labeling, contributing to the creation of more efficient ML models and in the problem of interpreting UGWs.
302

AUTOMATED EVALUATION OF NEUROLOGICAL DISORDERS THROUGH ELECTRONIC HEALTH RECORD ANALYSIS

Md Rakibul Islam Prince (18771646) 03 September 2024 (has links)
<p dir="ltr">Neurological disorders present a considerable challenge due to their variety and diagnostic complexity especially for older adults. Early prediction of the onset and ongoing assessment of the severity of these disease conditions can allow timely interventions. Currently, most of the assessment tools are time-consuming, costly, and not suitable for use in primary care. To reduce this burden, the present thesis introduces passive digital markers for different disease conditions that can effectively automate the severity assessment and risk prediction from different modalities of electronic health records (EHR). The focus of the first phase of the present study in on developing passive digital markers for the functional assessment of patients suffering from Bipolar disorder and Schizophrenia. The second phase of the study explores different architectures for passive digital markers that can predict patients at risk for dementia. The functional severity PDM uses only a single EHR modality, namely medical notes in order to assess the severity of the functioning of schizophrenia, bipolar type I, or mixed bipolar patients. In this case, the input of is a single medical note from the electronic medical record of the patient. This note is submitted to a hierarchical BERT model which classifies at-risk patients. A hierarchical attention mechanism is adopted because medical notes can exceed the maximum allowed number of tokens by most language models including BERT. The functional severity PDM follows three steps. First, a sentence-level embedding is produced for each sentence in the note using a token-level attention mechanism. Second, an embedding for the entire note is constructed using a sentence-level attention mechanism. Third, the final embedding is classified using a feed-forward neural network which estimates the impairment level of the patient. When used prior to the onset of the disease, this PDM is able to differentiate between severe and moderate functioning levels with an AUC of 76%. Disease-specific severity assessment PDMs are only applicable after the onset of the disease and have AUCs of nearly 85% for schizophrenia and bipolar patients. The dementia risk prediction PDM considers multiple EHR modalities including socio-demographic data, diagnosis codes and medical notes. Moreover, the observation period and prediction horizon are varied for a better understanding of the practical limitations of the model. This PDM is able to identify patients at risk of dementia with AUCs ranging from 70% to 92% as the observation period approaches the index date. The present study introduces methodologies for the automation of important clinical outcomes such as the assessment of the general functioning of psychiatric patients and the prediction of risk for dementia using only routine care data.</p>
303

Improving sampling, optimization and feature extraction in Boltzmann machines

Desjardins, Guillaume 12 1900 (has links)
L’apprentissage supervisé de réseaux hiérarchiques à grande échelle connaît présentement un succès fulgurant. Malgré cette effervescence, l’apprentissage non-supervisé représente toujours, selon plusieurs chercheurs, un élément clé de l’Intelligence Artificielle, où les agents doivent apprendre à partir d’un nombre potentiellement limité de données. Cette thèse s’inscrit dans cette pensée et aborde divers sujets de recherche liés au problème d’estimation de densité par l’entremise des machines de Boltzmann (BM), modèles graphiques probabilistes au coeur de l’apprentissage profond. Nos contributions touchent les domaines de l’échantillonnage, l’estimation de fonctions de partition, l’optimisation ainsi que l’apprentissage de représentations invariantes. Cette thèse débute par l’exposition d’un nouvel algorithme d'échantillonnage adaptatif, qui ajuste (de fa ̧con automatique) la température des chaînes de Markov sous simulation, afin de maintenir une vitesse de convergence élevée tout au long de l’apprentissage. Lorsqu’utilisé dans le contexte de l’apprentissage par maximum de vraisemblance stochastique (SML), notre algorithme engendre une robustesse accrue face à la sélection du taux d’apprentissage, ainsi qu’une meilleure vitesse de convergence. Nos résultats sont présent ́es dans le domaine des BMs, mais la méthode est générale et applicable à l’apprentissage de tout modèle probabiliste exploitant l’échantillonnage par chaînes de Markov. Tandis que le gradient du maximum de vraisemblance peut-être approximé par échantillonnage, l’évaluation de la log-vraisemblance nécessite un estimé de la fonction de partition. Contrairement aux approches traditionnelles qui considèrent un modèle donné comme une boîte noire, nous proposons plutôt d’exploiter la dynamique de l’apprentissage en estimant les changements successifs de log-partition encourus à chaque mise à jour des paramètres. Le problème d’estimation est reformulé comme un problème d’inférence similaire au filtre de Kalman, mais sur un graphe bi-dimensionnel, où les dimensions correspondent aux axes du temps et au paramètre de température. Sur le thème de l’optimisation, nous présentons également un algorithme permettant d’appliquer, de manière efficace, le gradient naturel à des machines de Boltzmann comportant des milliers d’unités. Jusqu’à présent, son adoption était limitée par son haut coût computationel ainsi que sa demande en mémoire. Notre algorithme, Metric-Free Natural Gradient (MFNG), permet d’éviter le calcul explicite de la matrice d’information de Fisher (et son inverse) en exploitant un solveur linéaire combiné à un produit matrice-vecteur efficace. L’algorithme est prometteur: en terme du nombre d’évaluations de fonctions, MFNG converge plus rapidement que SML. Son implémentation demeure malheureusement inefficace en temps de calcul. Ces travaux explorent également les mécanismes sous-jacents à l’apprentissage de représentations invariantes. À cette fin, nous utilisons la famille de machines de Boltzmann restreintes “spike & slab” (ssRBM), que nous modifions afin de pouvoir modéliser des distributions binaires et parcimonieuses. Les variables latentes binaires de la ssRBM peuvent être rendues invariantes à un sous-espace vectoriel, en associant à chacune d’elles, un vecteur de variables latentes continues (dénommées “slabs”). Ceci se traduit par une invariance accrue au niveau de la représentation et un meilleur taux de classification lorsque peu de données étiquetées sont disponibles. Nous terminons cette thèse sur un sujet ambitieux: l’apprentissage de représentations pouvant séparer les facteurs de variations présents dans le signal d’entrée. Nous proposons une solution à base de ssRBM bilinéaire (avec deux groupes de facteurs latents) et formulons le problème comme l’un de “pooling” dans des sous-espaces vectoriels complémentaires. / Despite the current widescale success of deep learning in training large scale hierarchical models through supervised learning, unsupervised learning promises to play a crucial role towards solving general Artificial Intelligence, where agents are expected to learn with little to no supervision. The work presented in this thesis tackles the problem of unsupervised feature learning and density estimation, using a model family at the heart of the deep learning phenomenon: the Boltzmann Machine (BM). We present contributions in the areas of sampling, partition function estimation, optimization and the more general topic of invariant feature learning. With regards to sampling, we present a novel adaptive parallel tempering method which dynamically adjusts the temperatures under simulation to maintain good mixing in the presence of complex multi-modal distributions. When used in the context of stochastic maximum likelihood (SML) training, the improved ergodicity of our sampler translates to increased robustness to learning rates and faster per epoch convergence. Though our application is limited to BM, our method is general and is applicable to sampling from arbitrary probabilistic models using Markov Chain Monte Carlo (MCMC) techniques. While SML gradients can be estimated via sampling, computing data likelihoods requires an estimate of the partition function. Contrary to previous approaches which consider the model as a black box, we provide an efficient algorithm which instead tracks the change in the log partition function incurred by successive parameter updates. Our algorithm frames this estimation problem as one of filtering performed over a 2D lattice, with one dimension representing time and the other temperature. On the topic of optimization, our thesis presents a novel algorithm for applying the natural gradient to large scale Boltzmann Machines. Up until now, its application had been constrained by the computational and memory requirements of computing the Fisher Information Matrix (FIM), which is square in the number of parameters. The Metric-Free Natural Gradient algorithm (MFNG) avoids computing the FIM altogether by combining a linear solver with an efficient matrix-vector operation. The method shows promise in that the resulting updates yield faster per-epoch convergence, despite being slower in terms of wall clock time. Finally, we explore how invariant features can be learnt through modifications to the BM energy function. We study the problem in the context of the spike & slab Restricted Boltzmann Machine (ssRBM), which we extend to handle both binary and sparse input distributions. By associating each spike with several slab variables, latent variables can be made invariant to a rich, high dimensional subspace resulting in increased invariance in the learnt representation. When using the expected model posterior as input to a classifier, increased invariance translates to improved classification accuracy in the low-label data regime. We conclude by showing a connection between invariance and the more powerful concept of disentangling factors of variation. While invariance can be achieved by pooling over subspaces, disentangling can be achieved by learning multiple complementary views of the same subspace. In particular, we show how this can be achieved using third-order BMs featuring multiplicative interactions between pairs of random variables.
304

Machine learning via dynamical processes on complex networks / Aprendizado de máquina via processos dinâmicos em redes complexas

Cupertino, Thiago Henrique 20 December 2013 (has links)
Extracting useful knowledge from data sets is a key concept in modern information systems. Consequently, the need of efficient techniques to extract the desired knowledge has been growing over time. Machine learning is a research field dedicated to the development of techniques capable of enabling a machine to \"learn\" from data. Many techniques have been proposed so far, but there are still issues to be unveiled specially in interdisciplinary research. In this thesis, we explore the advantages of network data representation to develop machine learning techniques based on dynamical processes on networks. The network representation unifies the structure, dynamics and functions of the system it represents, and thus is capable of capturing the spatial, topological and functional relations of the data sets under analysis. We develop network-based techniques for the three machine learning paradigms: supervised, semi-supervised and unsupervised. The random walk dynamical process is used to characterize the access of unlabeled data to data classes, configuring a new heuristic we call ease of access in the supervised paradigm. We also propose a classification technique which combines the high-level view of the data, via network topological characterization, and the low-level relations, via similarity measures, in a general framework. Still in the supervised setting, the modularity and Katz centrality network measures are applied to classify multiple observation sets, and an evolving network construction method is applied to the dimensionality reduction problem. The semi-supervised paradigm is covered by extending the ease of access heuristic to the cases in which just a few labeled data samples and many unlabeled samples are available. A semi-supervised technique based on interacting forces is also proposed, for which we provide parameter heuristics and stability analysis via a Lyapunov function. Finally, an unsupervised network-based technique uses the concepts of pinning control and consensus time from dynamical processes to derive a similarity measure used to cluster data. The data is represented by a connected and sparse network in which nodes are dynamical elements. Simulations on benchmark data sets and comparisons to well-known machine learning techniques are provided for all proposed techniques. Advantages of network data representation and dynamical processes for machine learning are highlighted in all cases / A extração de conhecimento útil a partir de conjuntos de dados é um conceito chave em sistemas de informação modernos. Por conseguinte, a necessidade de técnicas eficientes para extrair o conhecimento desejado vem crescendo ao longo do tempo. Aprendizado de máquina é uma área de pesquisa dedicada ao desenvolvimento de técnicas capazes de permitir que uma máquina \"aprenda\" a partir de conjuntos de dados. Muitas técnicas já foram propostas, mas ainda há questões a serem reveladas especialmente em pesquisas interdisciplinares. Nesta tese, exploramos as vantagens da representação de dados em rede para desenvolver técnicas de aprendizado de máquina baseadas em processos dinâmicos em redes. A representação em rede unifica a estrutura, a dinâmica e as funções do sistema representado e, portanto, é capaz de capturar as relações espaciais, topológicas e funcionais dos conjuntos de dados sob análise. Desenvolvemos técnicas baseadas em rede para os três paradigmas de aprendizado de máquina: supervisionado, semissupervisionado e não supervisionado. O processo dinâmico de passeio aleatório é utilizado para caracterizar o acesso de dados não rotulados às classes de dados configurando uma nova heurística no paradigma supervisionado, a qual chamamos de facilidade de acesso. Também propomos uma técnica de classificação de dados que combina a visão de alto nível dos dados, por meio da caracterização topológica de rede, com relações de baixo nível, por meio de medidas de similaridade, em uma estrutura geral. Ainda no aprendizado supervisionado, as medidas de rede modularidade e centralidade Katz são aplicadas para classificar conjuntos de múltiplas observações, e um método de construção evolutiva de rede é aplicado ao problema de redução de dimensionalidade. O paradigma semissupervisionado é abordado por meio da extensão da heurística de facilidade de acesso para os casos em que apenas algumas amostras de dados rotuladas e muitas amostras não rotuladas estão disponíveis. É também proposta uma técnica semissupervisionada baseada em forças de interação, para a qual fornecemos heurísticas para selecionar parâmetros e uma análise de estabilidade mediante uma função de Lyapunov. Finalmente, uma técnica não supervisionada baseada em rede utiliza os conceitos de controle pontual e tempo de consenso de processos dinâmicos para derivar uma medida de similaridade usada para agrupar dados. Os dados são representados por uma rede conectada e esparsa na qual os vértices são elementos dinâmicos. Simulações com dados de referência e comparações com técnicas de aprendizado de máquina conhecidas são fornecidos para todas as técnicas propostas. As vantagens da representação de dados em rede e de processos dinâmicos para o aprendizado de máquina são evidenciadas em todos os casos
305

Machine learning in complex networks: modeling, analysis, and applications / Aprendizado de máquina em redes complexas: modelagem, análise e aplicações

Silva, Thiago Christiano 13 December 2012 (has links)
Machine learning is evidenced as a research area with the main purpose of developing computational methods that are capable of learning with their previously acquired experiences. Although a large amount of machine learning techniques has been proposed and successfully applied in real systems, there are still many challenging issues, which need be addressed. In the last years, an increasing interest in techniques based on complex networks (large-scale graphs with nontrivial connection patterns) has been verified. This emergence is explained by the inherent advantages provided by the complex network representation, which is able to capture the spatial, topological and functional relations of the data. In this work, we investigate the new features and possible advantages offered by complex networks in the machine learning domain. In fact, we do show that the network-based approach really brings interesting features for supervised, semisupervised, and unsupervised learning. Specifically, we reformulate a previously proposed particle competition technique for both unsupervised and semisupervised learning using a stochastic nonlinear dynamical system. Moreover, an analytical analysis is supplied, which enables one to predict the behavior of the proposed technique. In addition to that, data reliability issues are explored in semisupervised learning. Such matter has practical importance and is found to be of little investigation in the literature. With the goal of validating these techniques for solving real problems, simulations on broadly accepted databases are conducted. Still in this work, we propose a hybrid supervised classification technique that combines both low and high orders of learning. The low level term can be implemented by any classification technique, while the high level term is realized by the extraction of features of the underlying network constructed from the input data. Thus, the former classifies the test instances by their physical features, while the latter measures the compliance of the test instances with the pattern formation of the data. Our study shows that the proposed technique not only can realize classification according to the semantic meaning of the data, but also is able to improve the performance of traditional classification techniques. Finally, it is expected that this study will contribute, in a relevant manner, to the machine learning area / Aprendizado de máquina figura-se como uma área de pesquisa que visa a desenvolver métodos computacionais capazes de aprender com a experiência. Embora uma grande quantidade de técnicas de aprendizado de máquina foi proposta e aplicada, com sucesso, em sistemas reais, existem ainda inúmeros problemas desafiantes que necessitam ser explorados. Nos últimos anos, um crescente interesse em técnicas baseadas em redes complexas (grafos de larga escala com padrões de conexão não triviais) foi verificado. Essa emergência é explicada pelas inerentes vantagens que a representação em redes complexas traz, sendo capazes de capturar as relações espaciais, topológicas e funcionais dos dados. Nesta tese, serão investigadas as possíveis vantagens oferecidas por redes complexas quando utilizadas no domínio de aprendizado de máquina. De fato, será mostrado que a abordagem por redes realmente proporciona melhorias nos aprendizados supervisionado, semissupervisionado e não supervisionado. Especificamente, será reformulada uma técnica de competição de partículas para o aprendizado não supervisionado e semissupervisionado por meio da utilização de um sistema dinâmico estocástico não linear. Em complemento, uma análise analítica de tal modelo será desenvolvida, permitindo o entendimento evolucional do modelo no tempo. Além disso, a questão de confiabilidade de dados será investigada no aprendizado semissupervisionado. Tal tópico tem importância prática e é pouco estudado na literatura. Com o objetivo de validar essas técnicas em problemas reais, simulações computacionais em bases de dados consagradas pela literatura serão conduzidas. Ainda nesse trabalho, será proposta uma técnica híbrica de classificação supervisionada que combina tanto o aprendizado de baixo como de alto nível. O termo de baixo nível pode ser implementado por qualquer técnica de classificação tradicional, enquanto que o termo de alto nível é realizado pela extração das características de uma rede construída a partir dos dados de entrada. Nesse contexto, aquele classifica as instâncias de teste segundo qualidades físicas, enquanto que esse estima a conformidade da instância de teste com a formação de padrões dos dados. Os estudos aqui desenvolvidos mostram que o método proposto pode melhorar o desempenho de técnicas tradicionais de classificação, além de permitir uma classificação de acordo com o significado semântico dos dados. Enfim, acredita-se que este estudo possa gerar contribuições relevantes para a área de aprendizado de máquina.
306

Consensus and analia: new challenges in detection and management of security vulnerabilities in data networks

Corral Torruella, Guiomar 10 September 2009 (has links)
A mesura que les xarxes passen a ser un element integral de les corporacions, les tecnologies de seguretat de xarxa es desenvolupen per protegir dades i preservar la privacitat. El test de seguretat en una xarxa permet identificar vulnerabilitats i assegurar els requisits de seguretat de qualsevol empresa. L'anàlisi de la seguretat permet reconèixer informació maliciosa, tràfic no autoritzat, vulnerabilitats de dispositius o de la xarxa, patrons d'intrusió, i extreure conclusions de la informació recopilada en el test. Llavors, on està el problema? No existeix un estàndard de codi obert ni un marc integral que segueixi una metodologia de codi obert per a tests de seguretat, la informació recopilada després d'un test inclou moltes dades, no existeix un patró exacte i objectiu sobre el comportament dels dispositius de xarxa ni sobre les xarxes i, finalment, el nombre de vulnerabilitats potencials és molt extens. El desafiament d'aquest domini resideix a tenir un gran volum de dades complexes, on poden aparèixer diagnòstics inconsistents. A més, és un domini no supervisat on no s'han aplicat tècniques d'aprenentatge automàtic anteriorment. Per això cal una completa caracterització del domini. Consensus és l'aportació principal d'aquesta tesi: un marc integrat que inclou un sistema automatitzat per millorar la realització de tests en una xarxa i l'anàlisi de la informació recollida. El sistema automatitza els mecanismes associats a un test de seguretat i minimitza la durada de l'esmentat test, seguint la metodologia OSSTMM. Pot ser usat en xarxes cablejades i sense fils. La seguretat es pot avaluar des d'una perspectiva interna, o bé externa a la pròpia xarxa. Es recopilen dades d'ordinadors, routers, firewalls i detectors d'intrusions. Consensus gestionarà les dades a processar per analistes de seguretat. Informació general i específica sobre els seus serveis, sistema operatiu, la detecció de vulnerabilitats, regles d'encaminament i de filtrat, la resposta dels detectors d'intrusions, la debilitat de les contrasenyes, i la resposta a codi maliciós o a atacs de denegació de servei són un exemple de les dades a emmagatzemar per cada dispositiu. Aquestes dades són recopilades per les eines de test incloses a Consensus.La gran quantitat de dades per cada dispositiu i el diferent número i tipus d'atributs que els caracteritzen, compliquen l'extracció manual d'un patró de comportament. Les eines de test automatitzades poden obtenir diferents resultats sobre el mateix dispositiu i la informació recopilada pot arribar a ser incompleta o inconsistent. En aquest entorn sorgeix la segona principal aportació d'aquesta tesi: Analia, el mòdul d'anàlisi de Consensus. Mentre que Consensus s'encarrega de recopilar dades sobre la seguretat dels dispositius, Analia inclou tècniques d'Intel·ligència Artificial per ajudar als analistes després d'un test de seguretat. Diferents mètodes d 'aprenentatge no supervisat s'han analitzat per ser adaptats a aquest domini. Analia troba semblances dins dels dispositius analitzats i l'agrupació dels esmentats dispositius ajuda als analistes en l'extracció de conclusions. Les millors agrupacions són seleccionades mitjançant l'aplicació d'índexs de validació. A continuació, el sistema genera explicacions sobre cada agrupació per donar una resposta més detallada als analistes de seguretat.La combinació de tècniques d'aprenentatge automàtic en el domini de la seguretat de xarxes proporciona beneficis i millores en la realització de tests de seguretat mitjançant la utilització del marc integrat Consensus i el seu sistema d'anàlisi de resultats Analia. / A medida que las redes pasan a ser un elemento integral de las corporaciones, las tecnologías de seguridad de red se desarrollan para proteger datos y preservar la privacidad. El test de seguridad en una red permite identificar vulnerabilidades y asegurar los requisitos de seguridad de cualquier empresa. El análisis de la seguridad permite reconocer información maliciosa, tráfico no autorizado, vulnerabilidades de dispositivos o de la red, patrones de intrusión, y extraer conclusiones de la información recopilada en el test. Entonces, ¿dónde está el problema? No existe un estándar de código abierto ni un marco integral que siga una metodología de código abierto para tests de seguridad, la información recopilada después de un test incluye muchos datos, no existe un patrón exacto y objetivo sobre el comportamiento de los dispositivos de red ni sobre las redes y, finalmente, el número de vulnerabilidades potenciales es muy extenso. El desafío de este dominio reside en tener un gran volumen de datos complejos, donde pueden aparecer diagnósticos inconsistentes. Además, es un dominio no supervisado donde no se han aplicado técnicas de aprendizaje automático anteriormente. Por ello es necesaria una completa caracterización del dominio.Consensus es la aportación principal de esta tesis: un marco integrado que incluye un sistema automatizado para mejorar la realización de tests en una red y el análisis de la información recogida. El sistema automatiza los mecanismos asociados a un test de seguridad y minimiza la duración de dicho test, siguiendo la metodología OSSTMM. Puede ser usado en redes cableadas e inalámbricas. La seguridad se puede evaluar desde una perspectiva interna, o bien externa a la propia red. Se recopilan datos de ordenadores, routers, firewalls y detectores de intrusiones. Consensus gestionará los datos a procesar por analistas de seguridad. Información general y específica sobre sus servicios, sistema operativo, la detección de vulnerabilidades, reglas de encaminamiento y de filtrado, la respuesta de los detectores de intrusiones, la debilidad de las contraseñas, y la respuesta a código malicioso o a ataques de denegación de servicio son un ejemplo de los datos a almacenar por cada dispositivo. Estos datos son recopilados por las herramientas de test incluidas en Consensus. La gran cantidad de datos por cada dispositivo y el diferente número y tipo de atributos que les caracterizan, complican la extracción manual de un patrón de comportamiento. Las herramientas de test automatizadas pueden obtener diferentes resultados sobre el mismo dispositivo y la información recopilada puede llegar a ser incompleta o inconsistente. En este entorno surge la segunda principal aportación de esta tesis: Analia, el módulo de análisis de Consensus. Mientras que Consensus se encarga de recopilar datos sobre la seguridad de los dispositivos, Analia incluye técnicas de Inteligencia Artificial para ayudar a los analistas después de un test de seguridad. Distintos métodos de aprendizaje no supervisado se han analizado para ser adaptados a este dominio. Analia encuentra semejanzas dentro de los dispositivos analizados y la agrupación de dichos dispositivos ayuda a los analistas en la extracción de conclusiones. Las mejores agrupaciones son seleccionadas mediante la aplicación de índices de validación. A continuación, el sistema genera explicaciones sobre cada agrupación para dar una respuesta más detallada a los analistas de seguridad.La combinación de técnicas de aprendizaje automático en el dominio de la seguridad de redes proporciona beneficios y mejoras en la realización de tests de seguridad mediante la utilización del marco integrado Consensus y su sistema de análisis de resultados Analia. / As networks become an integral part of corporations and everyone's lives, advanced network security technologies are being developed to protect data and preserve privacy. Network security testing is necessary to identify and report vulnerabilities, and also to assure enterprise security requirements. Security analysis is necessary to recognize malicious data, unauthorized traffic, detected vulnerabilities, intrusion data patterns, and also to extract conclusions from the information gathered in the security test. Then, where is the problem? There is no open-source standard for security testing, there is no integral framework that follows an open-source methodology for security testing, information gathered after a security test includes large data sets, there is not an exact and objective pattern of behavior among network devices or, furthermore, among data networks and, finally, there are too many potentially vulnerabilities. The challenge of this domain resides in having a great volume of data; data are complex and can appear inconsistent diagnostics. It is also an unsupervised domain where no machine learning techniques have been applied before. Thus a complete characterization of the domain is needed.Consensus is the main contribution of this thesis. Consensus is an integrated framework that includes a computer-aided system developed to help security experts during network testing and analysis. The system automates mechanisms related to a security assessment in order to minimize the time needed to perform an OSSTMM security test. This framework can be used in wired and wireless networks. Network security can be evaluated from inside or from outside the system. It gathers data of different network devices, not only computers but also routers, firewalls and Intrusion Detection Systems (IDS). Consensus manages many data to be processed by security analysts after an exhaustive test. General information, port scanning data, operating system fingerprinting, vulnerability scanning data, routing and filtering rules, IDS response, answer to malicious code, weak passwords reporting, and response to denial of service attacks can be stored for each tested device. This data is gathered by the automated testing tools that have been included in Consensus.The great amount of data for every device and the different number and type of attributes complicates a manually traffic pattern finding. The automated testing tools can obtain different results, incomplete or inconsistent information. Then data obtained from a security test can be uncertain, approximate, complex and partial true. In this environment arises the second main contribution of this thesis: Analia, the data analysis module of Consensus. Whereas Consensus gathers security data, Analia includes Artificial Intelligence to help analysts after a vulnerability assessment. Unsupervised learning has been analyzed to be adapted to this domain. Analia finds resemblances within tested devices and clustering aids analysts in the extraction of conclusions. Afterwards, the best results are selected by applying cluster validity indices. Then explanations of clustering results are included to give a more comprehensive response to security analysts.The combination of machine learning techniques in the network security domain provides benefits and improvements when performing security assessments with the Consensus framework and processing its results with Analia.
307

Sur quelques problèmes non-supervisés impliquant des séries temporelles hautement dèpendantes

Khaleghi, Azadeh 18 November 2013 (has links) (PDF)
Cette thèse est consacrée à l'analyse théorique de problèmes non supervisés impliquant des séries temporelles hautement dépendantes. Plus particulièrement, nous abordons les deux problèmes fondamentaux que sont le problème d'estimation des points de rupture et le partitionnement de séries temporelles. Ces problèmes sont abordés dans un cadre extrêmement général oùles données sont générées par des processus stochastiques ergodiques stationnaires. Il s'agit de l'une des hypothèses les plus faibles en statistiques, comprenant non seulement, les hypothèses de modèles et les hypothèses paramétriques habituelles dans la littérature scientifique, mais aussi des hypothèses classiques d'indépendance, de contraintes sur l'espace mémoire ou encore des hypothèses de mélange. En particulier, aucune restriction n'est faite sur la forme ou la nature des dépendances, de telles sortes que les échantillons peuvent être arbitrairement dépendants. Pour chaque problème abordé, nous proposons de nouvelles méthodes non paramétriques et nous prouvons de plus qu'elles sont, dans ce cadre, asymptotiquement consistantes. Pour l'estimation de points de rupture, la consistance asymptotique se rapporte à la capacité de l'algorithme à produire des estimations des points de rupture qui sont asymptotiquement arbitrairement proches des vrais points de rupture. D'autre part, un algorithme de partitionnement est asymptotiquement consistant si le partitionnement qu'il produit, restreint à chaque lot de séquences, coïncides, à partir d'un certain temps et de manière consistante, avec le partitionnement cible. Nous montrons que les algorithmes proposés sont implémentables efficacement, et nous accompagnons nos résultats théoriques par des évaluations expérimentales. L'analyse statistique dans le cadre stationnaire ergodique est extrêmement difficile. De manière générale, il est prouvé que les vitesses de convergence sont impossibles à obtenir. Dès lors, pour deux échantillons générés indépendamment par des processus ergodiques stationnaires, il est prouvé qu'il est impossible de distinguer le cas où les échantillons sont générés par le même processus de celui où ils sont générés par des processus différents. Ceci implique que des problèmes tels le partitionnement de séries temporelles sans la connaissance du nombre de partitions ou du nombre de points de rupture ne peut admettre de solutions consistantes. En conséquence, une tâche difficile est de découvrir les formulations du problème qui en permettent une résolution dans ce cadre général. La principale contribution de cette thèse est de démontrer (par construction) que malgré ces résultats d'impossibilités théoriques, des formulations naturelles des problèmes considérés existent et admettent des solutions consistantes dans ce cadre général. Ceci inclut la démonstration du fait que le nombre de points de rupture corrects peut être trouvé, sans recourir à des hypothèses plus fortes sur les processus stochastiques. Il en résulte que, dans cette formulation, le problème des points de rupture peut être réduit à du partitionnement de séries temporelles. Les résultats présentés dans ce travail formulent les fondations théoriques pour l'analyse des données séquentielles dans un espace d'applications bien plus large.
308

Improving sampling, optimization and feature extraction in Boltzmann machines

Desjardins, Guillaume 12 1900 (has links)
L’apprentissage supervisé de réseaux hiérarchiques à grande échelle connaît présentement un succès fulgurant. Malgré cette effervescence, l’apprentissage non-supervisé représente toujours, selon plusieurs chercheurs, un élément clé de l’Intelligence Artificielle, où les agents doivent apprendre à partir d’un nombre potentiellement limité de données. Cette thèse s’inscrit dans cette pensée et aborde divers sujets de recherche liés au problème d’estimation de densité par l’entremise des machines de Boltzmann (BM), modèles graphiques probabilistes au coeur de l’apprentissage profond. Nos contributions touchent les domaines de l’échantillonnage, l’estimation de fonctions de partition, l’optimisation ainsi que l’apprentissage de représentations invariantes. Cette thèse débute par l’exposition d’un nouvel algorithme d'échantillonnage adaptatif, qui ajuste (de fa ̧con automatique) la température des chaînes de Markov sous simulation, afin de maintenir une vitesse de convergence élevée tout au long de l’apprentissage. Lorsqu’utilisé dans le contexte de l’apprentissage par maximum de vraisemblance stochastique (SML), notre algorithme engendre une robustesse accrue face à la sélection du taux d’apprentissage, ainsi qu’une meilleure vitesse de convergence. Nos résultats sont présent ́es dans le domaine des BMs, mais la méthode est générale et applicable à l’apprentissage de tout modèle probabiliste exploitant l’échantillonnage par chaînes de Markov. Tandis que le gradient du maximum de vraisemblance peut-être approximé par échantillonnage, l’évaluation de la log-vraisemblance nécessite un estimé de la fonction de partition. Contrairement aux approches traditionnelles qui considèrent un modèle donné comme une boîte noire, nous proposons plutôt d’exploiter la dynamique de l’apprentissage en estimant les changements successifs de log-partition encourus à chaque mise à jour des paramètres. Le problème d’estimation est reformulé comme un problème d’inférence similaire au filtre de Kalman, mais sur un graphe bi-dimensionnel, où les dimensions correspondent aux axes du temps et au paramètre de température. Sur le thème de l’optimisation, nous présentons également un algorithme permettant d’appliquer, de manière efficace, le gradient naturel à des machines de Boltzmann comportant des milliers d’unités. Jusqu’à présent, son adoption était limitée par son haut coût computationel ainsi que sa demande en mémoire. Notre algorithme, Metric-Free Natural Gradient (MFNG), permet d’éviter le calcul explicite de la matrice d’information de Fisher (et son inverse) en exploitant un solveur linéaire combiné à un produit matrice-vecteur efficace. L’algorithme est prometteur: en terme du nombre d’évaluations de fonctions, MFNG converge plus rapidement que SML. Son implémentation demeure malheureusement inefficace en temps de calcul. Ces travaux explorent également les mécanismes sous-jacents à l’apprentissage de représentations invariantes. À cette fin, nous utilisons la famille de machines de Boltzmann restreintes “spike & slab” (ssRBM), que nous modifions afin de pouvoir modéliser des distributions binaires et parcimonieuses. Les variables latentes binaires de la ssRBM peuvent être rendues invariantes à un sous-espace vectoriel, en associant à chacune d’elles, un vecteur de variables latentes continues (dénommées “slabs”). Ceci se traduit par une invariance accrue au niveau de la représentation et un meilleur taux de classification lorsque peu de données étiquetées sont disponibles. Nous terminons cette thèse sur un sujet ambitieux: l’apprentissage de représentations pouvant séparer les facteurs de variations présents dans le signal d’entrée. Nous proposons une solution à base de ssRBM bilinéaire (avec deux groupes de facteurs latents) et formulons le problème comme l’un de “pooling” dans des sous-espaces vectoriels complémentaires. / Despite the current widescale success of deep learning in training large scale hierarchical models through supervised learning, unsupervised learning promises to play a crucial role towards solving general Artificial Intelligence, where agents are expected to learn with little to no supervision. The work presented in this thesis tackles the problem of unsupervised feature learning and density estimation, using a model family at the heart of the deep learning phenomenon: the Boltzmann Machine (BM). We present contributions in the areas of sampling, partition function estimation, optimization and the more general topic of invariant feature learning. With regards to sampling, we present a novel adaptive parallel tempering method which dynamically adjusts the temperatures under simulation to maintain good mixing in the presence of complex multi-modal distributions. When used in the context of stochastic maximum likelihood (SML) training, the improved ergodicity of our sampler translates to increased robustness to learning rates and faster per epoch convergence. Though our application is limited to BM, our method is general and is applicable to sampling from arbitrary probabilistic models using Markov Chain Monte Carlo (MCMC) techniques. While SML gradients can be estimated via sampling, computing data likelihoods requires an estimate of the partition function. Contrary to previous approaches which consider the model as a black box, we provide an efficient algorithm which instead tracks the change in the log partition function incurred by successive parameter updates. Our algorithm frames this estimation problem as one of filtering performed over a 2D lattice, with one dimension representing time and the other temperature. On the topic of optimization, our thesis presents a novel algorithm for applying the natural gradient to large scale Boltzmann Machines. Up until now, its application had been constrained by the computational and memory requirements of computing the Fisher Information Matrix (FIM), which is square in the number of parameters. The Metric-Free Natural Gradient algorithm (MFNG) avoids computing the FIM altogether by combining a linear solver with an efficient matrix-vector operation. The method shows promise in that the resulting updates yield faster per-epoch convergence, despite being slower in terms of wall clock time. Finally, we explore how invariant features can be learnt through modifications to the BM energy function. We study the problem in the context of the spike & slab Restricted Boltzmann Machine (ssRBM), which we extend to handle both binary and sparse input distributions. By associating each spike with several slab variables, latent variables can be made invariant to a rich, high dimensional subspace resulting in increased invariance in the learnt representation. When using the expected model posterior as input to a classifier, increased invariance translates to improved classification accuracy in the low-label data regime. We conclude by showing a connection between invariance and the more powerful concept of disentangling factors of variation. While invariance can be achieved by pooling over subspaces, disentangling can be achieved by learning multiple complementary views of the same subspace. In particular, we show how this can be achieved using third-order BMs featuring multiplicative interactions between pairs of random variables.
309

Representation Learning for Visual Data

Dumoulin, Vincent 09 1900 (has links)
No description available.
310

Agrupamento de sequências de miRNA utilizando aprendizado não-supervisionado baseado em grafos

Kasahara, Viviani Akemi 12 August 2016 (has links)
Submitted by Izabel Franco (izabel-franco@ufscar.br) on 2016-10-11T17:36:54Z No. of bitstreams: 1 DissVAK.pdf: 4608619 bytes, checksum: 3022034b9035e4e8caf1195902d24581 (MD5) / Approved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-10-21T13:03:21Z (GMT) No. of bitstreams: 1 DissVAK.pdf: 4608619 bytes, checksum: 3022034b9035e4e8caf1195902d24581 (MD5) / Approved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-10-21T13:03:27Z (GMT) No. of bitstreams: 1 DissVAK.pdf: 4608619 bytes, checksum: 3022034b9035e4e8caf1195902d24581 (MD5) / Made available in DSpace on 2016-10-21T13:03:34Z (GMT). No. of bitstreams: 1 DissVAK.pdf: 4608619 bytes, checksum: 3022034b9035e4e8caf1195902d24581 (MD5) Previous issue date: 2016-08-12 / Não recebi financiamento / Cluster analysis is the organization of a collection of patterns into clusters based on similarity which is determined by using properties of data. Clustering techniques can be useful in a variety of knowledge domains such as biotechnology, computer vision, document retrieval and many others. An interesting area of biology involves the concept of microRNAs (miRNAs) that are approximately 22 nucleotide-long non-coding RNA molecules that play important roles in gene regulation. Clustering miRNA sequences can help to understand and explore sequences belonging to the same cluster that has similar biological functions. This research work investigates and explores seven unsupervised clustering algorithms based on graphs that can be divided into three categories: algorithm based on region of influence, algorithm based on minimum spanning tree and spectral algorithm. To assess the contribution of the proposed algorithms, data from miRNA families stored in the online miRBase database were used in the conducted experiments. The results of these experiments were presented, analysed and evaluated using clustering validation indexes as well as visual analysis. / A análise de agrupamento é uma organização de coleção de padrões em grupos, baseando-se na similaridade das propriedades pertencentes aos dados. A técnica de agrupamento pode ser utilizado em muitas áreas de conhecimento como biotecnologia, visão computacional, recuperação de documentos, entre outras. Uma área interessante da biologia envolve o conceito de microRNAs (miRNAs), que são moléculas não-codificadas de RNA com aproximadamente 22 nucleotídeos e que desempenham um papel importante na regulação dos genes. O agrupamento de sequências de miRNA podem ajudar em sua exploração e entendimento, pois as sequências que pertencem ao mesmo grupo possuem uma função biológica similar. Esse trabalho explora e investiga sete algoritmos de agrupamentos não-supervisionados baseados em grafos que podem ser divididos em três categorias: algoritmos baseados em região de influência, algoritmos baseados em árvore spanning minimal e algoritmo espectral. Para avaliar a contribuição dos algoritmos propostos, os experimentos conduzidos utilizaram os dados das famílias de miRNAs disponíveis no banco de dados denominado miRBase. Os resultados dos experimentos foram apresentados, analisados e avaliados usando índices de validação de agrupamento e análise visual.

Page generated in 0.1551 seconds