• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 15
  • 6
  • 2
  • 1
  • 1
  • 1
  • Tagged with
  • 34
  • 34
  • 34
  • 7
  • 6
  • 6
  • 6
  • 5
  • 5
  • 4
  • 4
  • 4
  • 4
  • 4
  • 4
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Functional data mining with multiscale statistical procedures

Lee, Kichun 01 July 2010 (has links)
Hurst exponent and variance are two quantities that often characterize real-life, highfrequency observations. We develop the method for simultaneous estimation of a timechanging Hurst exponent H(t) and constant scale (variance) parameter C in a multifractional Brownian motion model in the presence of white noise based on the asymptotic behavior of the local variation of its sample paths. We also discuss the accuracy of the stable and simultaneous estimator compared with a few selected methods and the stability of computations that use adapted wavelet filters. Multifractals have become popular as flexible models in modeling real-life data of high frequency. We developed a method of testing whether the data of high frequency is consistent with monofractality using meaningful descriptors coming from a wavelet-generated multifractal spectrum. We discuss theoretical properties of the descriptors, their computational implementation, the use in data mining, and the effectiveness in the context of simulations, an application in turbulence, and analysis of coding/noncoding regions in DNA sequences. The wavelet thresholding is a simple and effective operation in wavelet domains that selects the subset of wavelet coefficients from a noised signal. We propose the selection of this subset in a semi-supervised fashion, in which a neighbor structure and classification function appropriate for wavelet domains are utilized. The decision to include an unlabeled coefficient in the model depends not only on its magnitude but also on the labeled and unlabeled coefficients from its neighborhood. The theoretical properties of the method are discussed and its performance is demonstrated on simulated examples.
32

Computational Analysis of Flow Cytometry Data

Irvine, Allison W. 12 July 2013 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / The objective of this thesis is to compare automated methods for performing analysis of flow cytometry data. Flow cytometry is an important and efficient tool for analyzing the characteristics of cells. It is used in several fields, including immunology, pathology, marine biology, and molecular biology. Flow cytometry measures light scatter from cells and fluorescent emission from dyes which are attached to cells. There are two main tasks that must be performed. The first is the adjustment of measured fluorescence from the cells to correct for the overlap of the spectra of the fluorescent markers used to characterize a cell’s chemical characteristics. The second is to use the amount of markers present in each cell to identify its phenotype. Several methods are compared to perform these tasks. The Unconstrained Least Squares, Orthogonal Subspace Projection, Fully Constrained Least Squares and Fully Constrained One Norm methods are used to perform compensation and compared. The fully constrained least squares method of compensation gives the overall best results in terms of accuracy and running time. Spectral Clustering, Gaussian Mixture Modeling, Naive Bayes classification, Support Vector Machine and Expectation Maximization using a gaussian mixture model are used to classify cells based on the amounts of dyes present in each cell. The generative models created by the Naive Bayes and Gaussian mixture modeling methods performed classification of cells most accurately. These supervised methods may be the most useful when online classification is necessary, such as in cell sorting applications of flow cytometers. Unsupervised methods may be used to completely replace manual analysis when no training data is given. Expectation Maximization combined with a cluster merging post-processing step gives the best results of the unsupervised methods considered.
33

Renforcement de la sécurité à travers les réseaux programmables

Abou El Houda, Zakaria 09 1900 (has links)
La conception originale d’Internet n’a pas pris en compte les aspects de sécurité du réseau; l’objectif prioritaire était de faciliter le processus de communication. Par conséquent, de nombreux protocoles de l’infrastructure Internet exposent un ensemble de vulnérabilités. Ces dernières peuvent être exploitées par les attaquants afin de mener un ensemble d’attaques. Les attaques par déni de service distribué (Distributed Denial of Service ou DDoS) représentent une grande menace et l’une des attaques les plus dévastatrices causant des dommages collatéraux aux opérateurs de réseau ainsi qu’aux fournisseurs de services Internet. Les réseaux programmables, dits Software-Defined Networking (SDN), ont émergé comme un nouveau paradigme promettant de résoudre les limitations de l’architecture réseau actuelle en découplant le plan de contrôle du plan de données. D’une part, cette séparation permet un meilleur contrôle du réseau et apporte de nouvelles capacités pour mitiger les attaques par déni de service distribué. D’autre part, cette séparation introduit de nouveaux défis en matière de sécurité du plan de contrôle. L’enjeu de cette thèse est double. D’une part, étudier et explorer l’apport de SDN à la sécurité afin de concevoir des solutions efficaces qui vont mitiger plusieurs vecteurs d’attaques. D’autre part, protéger SDN contre ces attaques. À travers ce travail de recherche, nous contribuons à la mitigation des attaques par déni de service distribué sur deux niveaux (intra-domaine et inter-domaine), et nous contribuons au renforcement de l’aspect sécurité dans les réseaux programmables. / The original design of Internet did not take into consideration security aspects of the network; the priority was to facilitate the process of communication. Therefore, many of the protocols that are part of the Internet infrastructure expose a set of vulnerabilities that can be exploited by attackers to carry out a set of attacks. Distributed Denial-of-Service (DDoS) represents a big threat and one of the most devastating and destructive attacks plaguing network operators and Internet service providers (ISPs) in a stealthy way. Software defined networks (SDN), an emerging technology, promise to solve the limitations of the conventional network architecture by decoupling the control plane from the data plane. On one hand, the separation of the control plane from the data plane allows for more control over the network and brings new capabilities to deal with DDoS attacks. On the other hand, this separation introduces new challenges regarding the security of the control plane. This thesis aims to deal with various types of attacks including DDoS attacks while protecting the resources of the control plane. In this thesis, we contribute to the mitigation of both intra-domain and inter-domain DDoS attacks, and to the reinforcement of security aspects in SDN.
34

A nonparametric Bayesian perspective for machine learning in partially-observed settings

Akova, Ferit 31 July 2014 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Robustness and generalizability of supervised learning algorithms depend on the quality of the labeled data set in representing the real-life problem. In many real-world domains, however, we may not have full knowledge of the underlying data-generating mechanism, which may even have an evolving nature introducing new classes continually. This constitutes a partially-observed setting, where it would be impractical to obtain a labeled data set exhaustively defined by a fixed set of classes. Traditional supervised learning algorithms, assuming an exhaustive training library, would misclassify a future sample of an unobserved class with probability one, leading to an ill-defined classification problem. Our goal is to address situations where such assumption is violated by a non-exhaustive training library, which is a very realistic yet an overlooked issue in supervised learning. In this dissertation we pursue a new direction for supervised learning by defining self-adjusting models to relax the fixed model assumption imposed on classes and their distributions. We let the model adapt itself to the prospective data by dynamically adding new classes/components as data demand, which in turn gradually make the model more representative of the entire population. In this framework, we first employ suitably chosen nonparametric priors to model class distributions for observed as well as unobserved classes and then, utilize new inference methods to classify samples from observed classes and discover/model novel classes for those from unobserved classes. This thesis presents the initiating steps of an ongoing effort to address one of the most overlooked bottlenecks in supervised learning and indicates the potential for taking new perspectives in some of the most heavily studied areas of machine learning: novelty detection, online class discovery and semi-supervised learning.

Page generated in 0.2053 seconds