  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Statistical Theory for Adversarial Robustness in Machine Learning

Yue Xing (14142297) 21 November 2022 (has links)
Deep learning plays an important role in various disciplines, such as autonomous driving, information technology, manufacturing, medical studies, and financial studies. In the past decade, there have been fruitful studies on deep learning in which training and testing data are assumed to follow the same distribution. Recent studies reveal that these dedicated models are vulnerable to adversarial attacks: the predicted label may change even if the testing input carries an imperceptible perturbation. However, most existing studies aim to develop computationally efficient adversarial learning algorithms without a thorough understanding of their statistical properties. This dissertation aims to provide a theoretical understanding of adversarial training and to identify potential improvements in this area of research.
The first part of this dissertation focuses on the algorithmic stability of adversarial training. We reveal that the algorithmic stability of the vanilla adversarial training method is sub-optimal, and we study the effectiveness of a simple noise-injection method. While noise injection improves stability, it does not deteriorate the consistency of adversarial training.
The second part of this dissertation reveals a phase-transition phenomenon in adversarial training. When the attack strength increases, the training trajectory of adversarial training deviates from its natural counterpart, and consequently various properties of adversarial training differ from those of clean training. Adaptations in the training configuration and the neural network structure are essential to improve adversarial training.
The last part of this dissertation focuses on how artificially generated data improve adversarial training. It is observed that utilizing synthetic data improves adversarial robustness, even if the data are generated from the original training data, i.e., no extra information is introduced. We develop a theory to explain the reason behind this observation and propose further adaptations to utilize the generated data better.
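A minimal sketch of the training loop this abstract analyzes may help fix ideas: a PGD-style inner maximization followed by an outer gradient step, with optional Gaussian noise injection in the spirit of the first part. All names and hyperparameters below are illustrative assumptions, not the dissertation's code.

```python
import torch

def adversarial_training_step(model, loss_fn, optimizer, x, y,
                              eps=0.03, alpha=0.01, steps=10, noise_std=0.0):
    """One vanilla adversarial-training step with optional noise injection."""
    # Inner maximization: projected gradient ascent on the loss within
    # an L-infinity ball of radius eps around the clean input x.
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = loss_fn(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
        delta.requires_grad_(True)

    x_adv = x + delta.detach()
    if noise_std > 0:  # the simple noise-injection variant
        x_adv = x_adv + noise_std * torch.randn_like(x_adv)

    # Outer minimization: an ordinary gradient step on the adversarial loss.
    optimizer.zero_grad()
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```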
2

A data-centric framework for assessing environmental sustainability

Aiyshwariya Paulvannan Kanmani (7036478) 15 August 2019 (has links)
The necessity to sustain resources has risen in recent years, with a significant number of people affected by lack of access to essential resources. Framing policies that support environmental sustainability is necessary for addressing the issue, and effective policies require a framework that assesses and keeps track of sustainability. Conventional frameworks that support such policy-making rank countries based on a weighted sum of several environmental performance metrics; however, the selection and weighting of these metrics is often biased. This study proposes a new framework for assessing the environmental sustainability of countries by leveraging unsupervised learning. Specifically, the framework harnesses a clustering technique and tracks progress in terms of shifts within clusters over time. It is observed that, using the proposed framework, countries can identify specific ways to improve their progress towards environmental sustainability.
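The abstract does not name the clustering technique; the sketch below assumes k-means on standardized country metrics and tracks cluster shifts between two synthetic snapshots, purely to illustrate the framework's logic.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Rows are countries, columns are environmental performance metrics
# (e.g., air quality, water resources, biodiversity); values are synthetic.
rng = np.random.default_rng(0)
metrics_2010 = rng.normal(size=(50, 6))
metrics_2020 = metrics_2010 + rng.normal(scale=0.3, size=(50, 6))

# Fit clusters on the earlier snapshot so both years share one reference frame.
scaler = StandardScaler().fit(metrics_2010)
km = KMeans(n_clusters=4, n_init=10, random_state=0)
labels_2010 = km.fit_predict(scaler.transform(metrics_2010))
labels_2020 = km.predict(scaler.transform(metrics_2020))

# A shift in cluster membership is the framework's notion of progress/regress.
moved = np.flatnonzero(labels_2010 != labels_2020)
print(f"{moved.size} of 50 countries changed cluster between snapshots")
```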
3

Formal language for statistical inference of uncertain stochastic systems

Georgoulas, Anastasios-Andreas January 2016 (has links)
Stochastic models, in particular Continuous Time Markov Chains, are a commonly employed mathematical abstraction for describing natural or engineered dynamical systems. While the theory behind them is well-studied, their specification can be problematic in a number of ways. Firstly, the size and complexity of the model can make its description difficult without using a high-level language. Secondly, knowledge of the system is usually incomplete, leaving one or more parameters with unknown values, thus impeding further analysis. Sophisticated machine learning algorithms have been proposed for the statistically rigorous estimation and handling of this uncertainty; however, their applicability is often limited to systems with a finite state-space, and their use on high-level descriptions has not been considered. Similarly, high-level formal languages have long been used for describing and reasoning about stochastic systems, but they require a full specification; efforts to estimate parameters for such formal models have been limited to simple inference algorithms. This thesis explores how these two approaches can be brought together, drawing ideas from the probabilistic programming paradigm. We introduce ProPPA, a process algebra for the specification of stochastic systems with uncertain parameters. The language is equipped with a semantics, allowing a formal interpretation of models written in it. This is the first time that uncertainty has been incorporated into the syntax and semantics of a formal language, and we describe a new mathematical object capable of capturing this information. We provide a series of inference algorithms which can be automatically applied to ProPPA models without the need to write extra code. As part of these, we develop a novel inference scheme for infinite-state systems, based on random truncations of the state-space. The expressive power and inference capabilities of the framework are demonstrated in a series of small examples as well as a larger-scale case study. We also present a review of the state of the art in both machine learning and formal modelling with respect to stochastic systems. We close with a discussion of potential extensions of this work, and thoughts about different ways in which the fields of statistical machine learning and formal modelling can be further integrated.
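ProPPA itself is a process algebra with its own syntax and semantics; as a rough illustration of the underlying inference task — a continuous-time Markov chain with an uncertain rate parameter — the following sketch pairs a Gillespie simulator with naive ABC rejection sampling. Everything here (the birth-death model, the prior, the distance threshold) is an assumption for illustration, far simpler than the thesis's truncation-based scheme.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_birth_death(birth, death, x0=10, t_end=4.0):
    """Gillespie simulation of a birth-death CTMC; returns the final count."""
    x, t = x0, 0.0
    while x > 0:
        total = (birth + death) * x
        t += rng.exponential(1.0 / total)
        if t > t_end:
            break
        x += 1 if rng.random() < birth / (birth + death) else -1
    return x

# "Observed" data, simulated from a birth rate we then pretend is unknown.
observed = [simulate_birth_death(birth=0.9, death=0.5) for _ in range(20)]
target = np.mean(observed)

# ABC rejection: keep prior draws whose simulated summary is close to the data.
accepted = []
while len(accepted) < 100:
    theta = rng.uniform(0.1, 2.0)  # prior over the uncertain birth rate
    sims = [simulate_birth_death(theta, 0.5) for _ in range(20)]
    if abs(np.mean(sims) - target) < 0.15 * target:
        accepted.append(theta)

print(f"posterior mean of the birth rate: {np.mean(accepted):.2f}")  # ~0.9
```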
4

Latent feature networks for statistical relational learning

Khoshneshin, Mohammad 01 July 2012 (has links)
In this dissertation, I explored relational learning via latent variable models. Traditional machine learning algorithms cannot handle many learning problems where there is a need to model both relations and noise. Statistical relational learning approaches emerged to handle these applications by incorporating both the relations and the uncertainties in these problems. Latent variable models are one of the successful approaches for statistical relational learning. These models assume a latent variable for each entity, and the probability distribution over relationships between entities is then modeled via a function over the latent variables. One important example of relational learning via latent variables is text data modeling, where we are interested in modeling the relationship between words and documents. Latent variable models handle such data by assuming a latent variable for each word and each document; the co-occurrence value is defined as a function of these variables. For modeling co-occurrence data in general (and text data in particular), we proposed latent logistic allocation (LLA). LLA outperforms the state-of-the-art model --- latent Dirichlet allocation --- in text data modeling, document categorization, and information retrieval. We also proposed query-based visualization, which embeds documents relevant to a query in a 2-dimensional space. Additionally, I used latent variable models for other single-relational problems such as collaborative filtering and educational data mining. To move towards multi-relational learning via latent variable models, we proposed latent feature networks (LFN). Multi-relational learning approaches model multiple relationships simultaneously. LFN assumes a component for each relationship; each component is a latent variable model in which a latent variable is defined for each entity and the relationship is a function of the latent variables. However, if an entity participates in more than one relationship, it has a separate latent variable for each relationship. We used LFN for modeling two different problems: microarray classification and social network analysis with a side network. In the first application, LFN outperforms support vector machines --- the best propositional model for that application. In the second application, using the side information via LFN can drastically improve the link prediction task in a social network.
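As a toy instance of the latent-variable recipe described above — one latent vector per entity, with the relationship modeled as a function of the latents — the sketch below fits a logistic latent-feature model to a synthetic binary relation by gradient descent. It is a generic stand-in, not LLA or LFN.

```python
import numpy as np

rng = np.random.default_rng(2)
n_rows, n_cols, k = 100, 80, 5  # e.g. words x documents, 5 latent features

# Synthetic binary relation generated from ground-truth latent vectors.
U_true = rng.normal(size=(n_rows, k))
V_true = rng.normal(size=(n_cols, k))
R = (U_true @ V_true.T + rng.logistic(size=(n_rows, n_cols)) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One latent vector per entity; P(relation) is a function of the latents.
U = 0.1 * rng.normal(size=(n_rows, k))
V = 0.1 * rng.normal(size=(n_cols, k))
lr, reg = 0.5, 0.01
for _ in range(300):
    G = sigmoid(U @ V.T) - R  # gradient of the Bernoulli log-loss in logits
    U_new = U - lr * (G @ V / n_cols + reg * U)
    V_new = V - lr * (G.T @ U / n_rows + reg * V)
    U, V = U_new, V_new

P = np.clip(sigmoid(U @ V.T), 1e-9, 1 - 1e-9)
print(f"train log-loss: {-np.mean(R*np.log(P) + (1-R)*np.log(1-P)):.3f}")
```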
5

On the Modelling of Stochastic Gradient Descent with Stochastic Differential Equations

Leino, Martin January 2023 (has links)
Stochastic gradient descent (SGD) is arguably the most important algorithm used in optimization problems for large-scale machine learning. Its behaviour has been studied extensively from the viewpoint of mathematical analysis and probability theory; it is widely held that in the limit where the learning rate in the algorithm tends to zero, a specific stochastic differential equation becomes an adequate model of the dynamics of the algorithm. This study presents some of the research in this field by analyzing the application of a recently proven theorem to the problem of tensor principal component analysis. The results, originally discovered in an article by Gérard Ben Arous, Reza Gheissari and Aukosh Jagannath from 2022, illustrate how the phase diagram of functions of SGD differs in the high-dimensional regime from that of the classical fixed-dimensional setting.
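The correspondence the abstract describes can be seen in one dimension: SGD on a quadratic with noisy gradients, next to the Euler–Maruyama discretization of the Ornstein–Uhlenbeck SDE that models it in the small-learning-rate limit. The setting is a deliberately simple stand-in for the tensor PCA problem studied in the thesis.

```python
import numpy as np

rng = np.random.default_rng(3)
eta, n_steps = 0.01, 5000  # learning rate doubles as the SDE step size

# SGD on f(x) = x^2 / 2 with unit-variance gradient noise.
x, x_sgd = 2.0, np.empty(n_steps)
for i in range(n_steps):
    x -= eta * (x + rng.normal())  # noisy gradient: f'(x) + noise
    x_sgd[i] = x

# Small-learning-rate limit: dX = -X dt + sqrt(eta) dW, via Euler-Maruyama.
dt = eta
sigma = np.sqrt(eta)  # diffusion coefficient of the limiting SDE
x, x_sde = 2.0, np.empty(n_steps)
for i in range(n_steps):
    x += -x * dt + sigma * np.sqrt(dt) * rng.normal()
    x_sde[i] = x

# Both stationary variances should be close to eta / 2.
print(f"SGD var {x_sgd[2000:].var():.4f} | "
      f"SDE var {x_sde[2000:].var():.4f} | theory {eta / 2:.4f}")
```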
6

Bio-interfaced Nanolaminate Surface-enhanced Raman Spectroscopy Substrates

Nam, Wonil 30 March 2022 (has links)
Surface-enhanced Raman spectroscopy (SERS) is a powerful analytical technique that combines the molecular specificity of vibrational fingerprints offered by Raman spectroscopy with single-molecule detection sensitivity from the plasmonic hotspots of noble-metal nanostructures. Label-free SERS has attracted tremendous interest in bioanalysis over the last two decades due to minimal sample preparation, non-invasive measurement without water background interference, and multiplexing capability from the rich chemical information of narrow Raman bands. Nevertheless, significant challenges must be addressed before SERS can become a widely accepted technique in bio-related communities. In this dissertation, limitations from different aspects (performance, reliability, and analysis) are articulated against the state of the art, followed by how the works introduced here resolve them. For high SERS performance, substrates consisting of multiple vertically stacked metal-insulator-metal layers, named nanolaminates, were designed to simultaneously achieve high sensitivity and excellent uniformity, two properties previously deemed mutually exclusive. Two unique factors of nanolaminate SERS substrates were exploited for the improved reliability of label-free in situ classification using living cancer cells: background refractive index (RI) insensitivity from 1.30 to 1.60, covering extracellular components, and 3D protruding nanostructures that can generate a tight nano-bio interface (e.g., hotspot-cell coupling). Discrete nanolamination by a new nanofabrication process additionally provides optical transparency, offering backside excitation and thereby enabling label-free glucose sensing on a skin-phantom model. Towards reliable quantitative SERS analysis, an electronic Raman scattering (ERS) calibration method was developed. ERS from the metal is omnipresent in plasmonic constructs and experiences identical hotspot enhancements. Rigorous experimental results support that ERS can serve as an internal standard for the spatial and temporal calibration of SERS signals, with significant potential for complex samples by overcoming intrinsic limitations of state-of-the-art Raman tags. ERS calibration was successfully applied to label-free living-cell SERS datasets for classifying cancer subtypes and cellular drug responses. Furthermore, dual-recognition label-SERS with a digital assay revealed improved accuracy in quantitative dopamine analysis. An artificial-neural-network-based machine learning method was exploited to improve the interpretability of bioanalytical SERS for multiple living-cell responses. Finally, this dissertation provides future perspectives on designing bio-interfaced SERS devices for clinical translation, followed by guidance for SERS to become a standard analytical method that can compete with or complement existing technologies. / Doctor of Philosophy / In photonics, metals were long thought to be not very useful, except as mirrors. At length scales smaller than the wavelength, however, it has been realized that metallic structures can provide unique ways of manipulating light. Maxwell's equations show that an interface between a dielectric and a metal can support surface plasmons: collective oscillations of electrons that confine light. Surface-enhanced Raman spectroscopy (SERS) is a sensing technique that combines the enhanced local fields arising from plasmon excitation with the molecular fingerprint specificity of vibrational Raman spectroscopy.
The million-fold enhancement of Raman signals at hotspots has driven an explosion of research, producing a flood of publications over the last two decades with a broad spectrum of physical, chemical, and biological applications. Nevertheless, significant challenges must be addressed for SERS to become a widely accepted technique, especially in bio-related communities. In this dissertation, limitations from different aspects (performance, reliability, and analysis) are articulated against the state of the art, followed by how innovative strategies addressed them. Each chapter's approach combines five aspects: nanoplasmonics, nanofabrication, the nano-bio interface, cancer biology, and statistical machine learning. First, high-performance SERS substrates were designed to simultaneously achieve high sensitivity and excellent uniformity, two properties previously deemed mutually exclusive, by vertically stacking multiple metal-insulator-metal layers (i.e., nanolaminates). Their 3D protruding nanotopography and refractive-index-insensitive SERS response enabled label-free in situ classification of living cancer cells. A tweaked nanofabrication process produced discrete nanolamination with optical transparency, enabling label-free glucose sensing on a skin phantom. Towards reliable quantitative SERS analysis, an electronic Raman scattering (ERS) calibration method was developed that overcomes the intrinsic limitations of Raman tags; it was successfully applied to label-free living-cell SERS datasets for classifying cancer subtypes and cellular drug responses. Furthermore, dual-recognition label-SERS with a digital assay revealed improved accuracy in quantitative dopamine analysis. Advanced machine learning (an artificial neural network) was exploited to improve the interpretability of SERS bioanalysis for multiple cellular drug responses. Finally, this dissertation provides future perspectives from different angles, including SERS, biology, and statistics, for SERS to potentially become a standard analytical method that can compete with or complement existing technologies.
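The ERS calibration idea — the metal's electronic Raman background experiences the same hotspot enhancement as the analyte signal, so dividing by it cancels spot-to-spot variation — can be sketched on synthetic spectra. The spectral shapes and sampling points below are invented for illustration and do not reproduce the dissertation's protocol.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic SERS spectra from 5 hotspots: an analyte peak at 1000 cm^-1
# rides on a smooth ERS background, and both scale with the same
# hotspot enhancement factor, which varies from spot to spot.
wavenumbers = np.linspace(400, 1800, 700)
enhancement = rng.uniform(0.5, 2.0, size=5)
peak = np.exp(-0.5 * ((wavenumbers - 1000) / 8.0) ** 2)
ers_background = np.exp(-wavenumbers / 900.0)
spectra = np.outer(enhancement, 3.0 * peak + ers_background)
spectra += 0.01 * rng.normal(size=spectra.shape)

# Raw analyte peak heights scatter with the enhancement factor...
raw = spectra[:, np.argmin(np.abs(wavenumbers - 1000))]
# ...but dividing by the ERS intensity, sampled in a peak-free region,
# cancels the shared enhancement and calibrates the signal.
ers = spectra[:, np.argmin(np.abs(wavenumbers - 1600))]
calibrated = raw / ers

print(f"raw CV: {raw.std() / raw.mean():.2f}, "
      f"calibrated CV: {calibrated.std() / calibrated.mean():.2f}")
```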
7

Probabilistic Graphical Models for Prognosis and Diagnosis of Breast Cancer

KHADEMI, MAHMOUD 04 1900 (has links)
One in nine women is expected to be diagnosed with breast cancer during her life. In 2013, an estimated 23,800 Canadian women will be diagnosed with breast cancer and 5,000 will die of it. Making decisions about the treatment for a patient is difficult, since it depends on various clinical features, genomic factors, and the pathological and cellular classification of the tumor.
In this research, we propose a probabilistic graphical model for prognosis and diagnosis of breast cancer that can help medical doctors make better decisions about the best treatment for a patient. Probabilistic graphical models are suitable for making decisions under uncertainty from big data with missing attributes and noisy evidence.
Using the proposed model, we may enter the results of different tests (e.g. the estrogen and progesterone receptor tests and the HER2/neu test), microarray data, and clinical traits (e.g. the woman's age, general health, menopausal status, stage of cancer, and size of the tumor) into the model and answer the following questions. How likely is it that the cancer will spread in the body (distant metastasis)? What is the chance of survival? How likely is it that the cancer comes back (local or regional recurrence)? How promising is a treatment? For example, how likely are metastasis and recurrence for a new patient if a certain treatment, e.g. surgical removal, radiation therapy, hormone therapy, or chemotherapy, is applied? We can also classify various types of breast cancer using this model.
Previous work mostly relied on clinical data. In our opinion, since cancer is a genetic disease, the integration of genomic (microarray) and clinical data can improve the accuracy of the model for prognosis and diagnosis. However, increasing the number of variables may lead to poor results due to the curse of dimensionality and the small-sample-size problem. The microarray data is high-dimensional, consisting of around 25,000 variables per patient. Moreover, structure learning and parameter learning for probabilistic graphical models require a significant amount of computation, and the number of possible structures is super-exponential in the number of variables. For instance, there are more than 10^18 possible structures with just 10 variables.
We address these problems by applying manifold learning and dimensionality reduction techniques to improve the accuracy of the model. Extensive experiments using real-world data sets such as METRIC and NKI show the accuracy of the proposed method for classification and for predicting certain events, like recurrence and metastasis. / Master of Science (MSc)
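A hand-rolled miniature of the kind of query such a model supports: a small discrete Bayesian network with inference by enumeration. The structure and all probabilities are made up for illustration; the thesis's model is far larger and learned from data.

```python
# A toy discrete Bayesian network: Stage -> Metastasis <- Treatment,
# Metastasis -> Survival. All probabilities are invented for illustration.
p_stage = {"early": 0.6, "late": 0.4}
p_treat = {"chemo": 0.5, "hormone": 0.5}
p_met = {  # P(metastasis = yes | stage, treatment)
    ("early", "chemo"): 0.05, ("early", "hormone"): 0.10,
    ("late", "chemo"): 0.35, ("late", "hormone"): 0.50,
}
p_surv = {"yes": 0.55, "no": 0.92}  # P(5-year survival | metastasis)

def joint(stage, treat, met, surv):
    """Joint probability of one full assignment, via the factorization."""
    pm = p_met[(stage, treat)] if met == "yes" else 1 - p_met[(stage, treat)]
    ps = p_surv[met] if surv == "yes" else 1 - p_surv[met]
    return p_stage[stage] * p_treat[treat] * pm * ps

# Inference by enumeration:
# P(metastasis | stage = late, treatment = chemo, survived).
num = joint("late", "chemo", "yes", "yes")
den = sum(joint("late", "chemo", m, "yes") for m in ("yes", "no"))
print(f"P(metastasis | late, chemo, survived) = {num / den:.2f}")  # ~0.24
```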
8

Sélection de modèles parcimonieux pour l’apprentissage statistique en grande dimension / Model selection for sparse high-dimensional learning

Mattei, Pierre-Alexandre 26 October 2017 (has links)
Le déferlement numérique qui caractérise l’ère scientifique moderne a entraîné l’apparition de nouveaux types de données partageant une démesure commune : l’acquisition simultanée et rapide d’un très grand nombre de quantités observables. Qu’elles proviennent de puces ADN, de spectromètres de masse ou d’imagerie par résonance nucléaire, ces bases de données, qualifiées de données de grande dimension, sont désormais omniprésentes, tant dans le monde scientifique que technologique. Le traitement de ces données de grande dimension nécessite un renouvellement profond de l’arsenal statistique traditionnel, qui se trouve inadapté à ce nouveau cadre, notamment en raison du très grand nombre de variables impliquées. En effet, confrontée aux cas impliquant un plus grand nombre de variables que d’observations, une grande partie des techniques statistiques classiques est incapable de donner des résultats satisfaisants. Dans un premier temps, nous introduisons les problèmes statistiques inhérents aux modèles de données de grande dimension. Plusieurs solutions classiques sont détaillées et nous motivons le choix de l’approche empruntée au cours de cette thèse : le paradigme bayésien de sélection de modèles. Ce dernier fait ensuite l’objet d’une revue de littérature détaillée, en insistant sur plusieurs développements récents. Viennent ensuite trois chapitres de contributions nouvelles à la sélection de modèles en grande dimension. En premier lieu, nous présentons un nouvel algorithme pour la régression linéaire bayésienne parcimonieuse en grande dimension, dont les performances sont très bonnes, tant sur données réelles que simulées. Une nouvelle base de données de régression linéaire est également introduite : il s’agit de prédire la fréquentation du musée d’Orsay à l’aide de données Vélib. Ensuite, nous nous penchons sur le problème de la sélection de modèles pour l’analyse en composantes principales (ACP). En nous basant sur un résultat théorique nouveau, nous effectuons les premiers calculs exacts de vraisemblance marginale pour ce modèle. Cela nous permet de proposer deux nouveaux algorithmes pour l’ACP parcimonieuse : un premier, appelé GSPPCA, permettant d’effectuer de la sélection de variables, et un second, appelé NGPPCA, permettant d’estimer la dimension intrinsèque de données de grande dimension. Les performances empiriques de ces deux techniques sont extrêmement compétitives. Dans le cadre de données d’expression ADN notamment, l’approche de sélection de variables proposée permet de déceler sans supervision des ensembles de gènes particulièrement pertinents. / The numerical surge that characterizes the modern scientific era has led to the rise of new kinds of data united by one common immoderation: the simultaneous and rapid acquisition of a large number of measurable quantities. Whether coming from DNA microarrays, mass spectrometers, or nuclear magnetic resonance, these data, usually called high-dimensional, are now ubiquitous in the scientific and technological worlds. Processing such data calls for an important renewal of the traditional statistical toolset, which is unfit for frameworks involving a large number of variables. Indeed, when the number of variables exceeds the number of observations, most traditional statistics become inefficient. First, we give a brief overview of the statistical issues that arise with high-dimensional data. Several popular solutions are presented, and we present some arguments in favor of the approach utilized and advocated in this thesis: Bayesian model uncertainty.
This chosen framework is then the subject of a detailed review that insists on several recent developments. After these surveys come three original contributions to high-dimensional model selection. A new algorithm for high-dimensional sparse regression called SpinyReg is presented; it compares favorably to state-of-the-art methods on both real and synthetic data sets. A new data set for high-dimensional regression is also described: it involves predicting the number of visitors to the Orsay museum in Paris using bike-sharing data. We focus next on model selection for high-dimensional principal component analysis (PCA). Using a new theoretical result, we derive the first closed-form expression of the marginal likelihood of a PCA model. This allows us to propose two algorithms for model selection in PCA: the first, called globally sparse probabilistic PCA (GSPPCA), performs scalable variable selection, and the second, called normal-gamma probabilistic PCA (NGPPCA), estimates the intrinsic dimensionality of a high-dimensional data set. Both methods are competitive with other popular approaches. In particular, using unlabeled DNA microarray data, GSPPCA is able to select genes that are more biologically relevant than those found by several popular approaches.
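The closed-form-marginal-likelihood strategy that drives GSPPCA and NGPPCA can be illustrated in the simpler linear-regression setting, where Zellner's g-prior also yields an exact marginal likelihood; the sketch below scores every predictor subset and picks the best. This is an analogue of the thesis's approach, not its PCA derivation.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)

# Synthetic sparse regression: only predictors 0 and 3 are active.
n, p = 60, 6
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.5, size=n)

def log_marginal(Xs, y, g=None):
    """Log marginal likelihood under Zellner's g-prior (g = n by default),
    up to an additive constant shared by all models."""
    n, k = Xs.shape
    g = n if g is None else g
    beta_hat, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = np.sum((y - Xs @ beta_hat) ** 2)
    yty = y @ y
    return (-(k / 2) * np.log(1 + g)
            - (n / 2) * np.log(yty - (g / (1 + g)) * (yty - rss)))

# Exact Bayesian model selection: score every subset, keep the best.
subsets = (s for r in range(1, p + 1) for s in combinations(range(p), r))
best = max(subsets, key=lambda s: log_marginal(X[:, list(s)], y))
print("selected predictors:", list(best))  # expected: [0, 3]
```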
9

Computational Modeling for Censored Time to Event Data Using Data Integration in Biomedical Research

Choi, Ickwon 20 June 2011 (has links)
No description available.
10

Accurate telemonitoring of Parkinson's disease symptom severity using nonlinear speech signal processing and statistical machine learning

Tsanas, Athanasios January 2012 (has links)
This study focuses on the development of an objective, automated method to extract clinically useful information from sustained vowel phonations in the context of Parkinson’s disease (PD). The aim is twofold: (a) to differentiate PD subjects from healthy controls, and (b) to replicate the Unified Parkinson’s Disease Rating Scale (UPDRS) metric, which provides a clinical impression of PD symptom severity. This metric spans the range 0 to 176, where 0 denotes a healthy person and 176 total disability. Currently, UPDRS assessment requires the physical presence of the subject in the clinic, is subjective, relying on the clinical rater’s expertise, and is logistically costly for national health systems. Hence, the practical frequency of symptom tracking is typically confined to once every several months, hindering recruitment for large-scale clinical trials and under-representing the true time scale of PD fluctuations. We develop a comprehensive framework to analyze speech signals by: (1) extracting novel, distinctive signal features, (2) using robust feature selection techniques to obtain a parsimonious subset of those features, and (3a) differentiating PD subjects from healthy controls, or (3b) determining UPDRS using powerful statistical machine learning tools. Towards this aim, we also investigate 10 existing fundamental frequency (F_0) estimation algorithms to determine the most useful algorithm for this application, and propose a novel ensemble F_0 estimation algorithm which leads to a 10% improvement in accuracy over the best individual approach. Moreover, we propose novel feature selection schemes which are shown to be very competitive against widely used, more complex schemes. We demonstrate that we can successfully differentiate PD subjects from healthy controls with 98.5% overall accuracy, and also provide rapid, objective, and remote replication of UPDRS assessment with clinically useful accuracy (approximately 2 UPDRS points from the clinicians’ estimates), using only simple, self-administered, and non-invasive speech tests. The findings of this study strongly support the use of speech signal analysis as an objective basis for practical clinical decision support tools in the context of PD assessment.
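A schematic of the three-stage pipeline described above (feature extraction, selection, statistical learning), run on synthetic stand-ins for the dysphonia measures; all feature names, dimensions, and model choices below are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(6)

# Stand-ins for dysphonia measures (jitter, shimmer, noise-to-harmonics
# ratio, F_0 statistics, ...): 40 features, few of which carry signal.
n, d = 300, 40
X = rng.normal(size=(n, d))
updrs = (20 + 15 * X[:, 0] - 10 * X[:, 1] + 5 * X[:, 2]
         + rng.normal(scale=4, size=n)).clip(0, 176)

# Parsimonious feature selection, then a nonlinear regressor for UPDRS.
model = make_pipeline(
    SelectKBest(f_regression, k=8),
    RandomForestRegressor(n_estimators=300, random_state=0),
)
mae = -cross_val_score(model, X, updrs, cv=5,
                       scoring="neg_mean_absolute_error").mean()
print(f"cross-validated MAE: {mae:.1f} UPDRS points")
```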
