91

BLINDED EVALUATIONS OF EFFECT SIZES IN CLINICAL TRIALS: COMPARISONS BETWEEN BAYESIAN AND EM ANALYSES

Turkoz, Ibrahim January 2013 (has links)
Clinical trials are major and costly undertakings for researchers. Planning a clinical trial involves careful selection of the primary and secondary efficacy endpoints. The 2010 draft FDA guidance on adaptive designs acknowledges possible study design modifications, such as the selection and/or ordering of secondary endpoints, in addition to sample size re-estimation. It is essential for the integrity of a double-blind clinical trial that the individual treatment allocation of patients remains unknown. Methods have been proposed for re-estimating the sample size of clinical trials, without unblinding treatment arms, for both categorical and continuous outcomes. Procedures that allow blinded estimation of the treatment effect, using knowledge of trial operational characteristics, have also been suggested in the literature. Clinical trials are designed to evaluate the effects of one or more treatments on multiple primary and secondary endpoints. The multiplicity issues that arise when there is more than one endpoint require careful consideration to control the Type I error rate, and a wide variety of multiplicity approaches are available to ensure that the probability of making a Type I error stays within acceptable pre-specified bounds. The widely used fixed-sequence gate-keeping procedures require prospective ordering of the null hypotheses for secondary endpoints. This prospective ordering is often based on a number of untested assumptions about expected treatment differences, the assumed population variance, and estimated dropout rates. We wish to update the ordering of the null hypotheses based on estimated standardized treatment effects. We show how to do so while the study is ongoing, without unblinding the treatments, without losing the validity of the testing procedure, and while maintaining the integrity of the trial. Our simulations show that we can reliably order the standardized treatment effect, also known as the signal-to-noise ratio, even though we are unable to estimate the unstandardized treatment effect. To estimate the treatment difference in a blinded setting, we must define a latent variable substituting for the unknown treatment assignment. Approaches that employ the EM algorithm to estimate treatment differences in blinded settings do not provide reliable conclusions about the ordering of the null hypotheses. We developed Bayesian approaches that enable us to order the secondary null hypotheses; these are based on posterior estimation of signal-to-noise ratios. We demonstrate with simulation studies that our Bayesian algorithms perform better than their EM-algorithm counterparts for ordering effect sizes. Introducing informative priors for the latent variables, in settings where the EM algorithm has been used, typically improves the accuracy of parameter estimation in effect size ordering. We illustrate our method with a secondary analysis of a longitudinal study of depression. / Statistics
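
To make the latent-variable idea concrete, here is a minimal sketch, not the author's code, of blinded EM estimation for a two-arm trial: treatment assignment is the latent variable, and the pooled blinded outcomes are modelled as a two-component Gaussian mixture. The 1:1 allocation, common variance, and all numbers are illustrative assumptions.

```python
import numpy as np

def blinded_em(y, n_iter=500, tol=1e-8):
    """EM for the blinded mixture y ~ 0.5*N(mu0, s2) + 0.5*N(mu1, s2).
    Returns the estimated standardized effect (mu1 - mu0) / s."""
    mu0, mu1 = np.quantile(y, [0.25, 0.75])     # crude initial split
    s2 = np.var(y)
    for _ in range(n_iter):
        # E-step: posterior probability that each patient is in arm 1
        d0 = np.exp(-0.5 * (y - mu0) ** 2 / s2)
        d1 = np.exp(-0.5 * (y - mu1) ** 2 / s2)
        w = d1 / (d0 + d1)
        # M-step: update the arm means and the pooled variance
        mu0n = np.sum((1 - w) * y) / np.sum(1 - w)
        mu1n = np.sum(w * y) / np.sum(w)
        s2n = np.mean((1 - w) * (y - mu0n) ** 2 + w * (y - mu1n) ** 2)
        done = abs(mu0n - mu0) + abs(mu1n - mu1) < tol
        mu0, mu1, s2 = mu0n, mu1n, s2n
        if done:
            break
    return (mu1 - mu0) / np.sqrt(s2)

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0.0, 1.0, 150), rng.normal(0.5, 1.0, 150)])
print(blinded_em(y))   # labels, and hence the sign, stay unidentifiable
```

Consistent with the abstract's point, the component labels (and hence the sign of the effect) are not identifiable without unblinding, and such EM estimates are unreliable for ordering hypotheses, which is what motivates the thesis's Bayesian posterior estimation of the signal-to-noise ratio.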
92

Bayesian Non-parametric Models for Time Series Decomposition

Granados-Garcia, Guillermo 05 January 2023 (has links)
The standard approach to analyzing brain electrical activity is to examine the spectral density function (SDF) and identify frequency bands, defined a priori, that make the most substantial relative contributions to the overall variance of the signal. A limitation of this approach is that the precise frequency and bandwidth of oscillations are not uniform across cognitive demands; thus, these bands should not be set arbitrarily in any analysis. To overcome this limitation, we propose three Bayesian non-parametric models for time series decomposition: data-driven approaches that identify (i) the number of prominent spectral peaks, (ii) the frequency peak locations, and (iii) their corresponding bandwidths (the spread of power around the peaks). The standardized SDF is represented as a Dirichlet process mixture based on a kernel derived from second-order autoregressive processes, which completely characterizes the location (peak) and scale (bandwidth) parameters. A Metropolis-Hastings-within-Gibbs algorithm is developed for sampling from the posterior distribution of the mixture parameters of each model. Simulation studies demonstrate the robustness and performance of the proposed methods. The methods were applied to local field potential (LFP) activity from the hippocampus of laboratory rats across different conditions in a non-spatial sequence memory experiment, to identify the most prominent frequency bands and examine the link between specific patterns of brain oscillatory activity and trial-specific cognitive demands. The second application studies 61 EEG channels from two subjects performing a visual recognition task, to discover frequency-specific oscillations present across brain zones. The third application extends the model to characterize data from 10 alcoholics and 10 controls across three experimental conditions and 30 trials. The proposed models provide a framework for condensing the oscillatory behavior of populations across different tasks, isolating the fundamental components of interest and offering the practitioner several perspectives of analysis.
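
As a sketch of the kernel construction described above (an illustrative parameterization, not the thesis code): an AR(2) process with complex reciprocal roots of modulus r at frequency psi has a spectral peak near psi whose width is governed by r, so the standardized kernel carries exactly the location and bandwidth parameters the abstract mentions.

```python
import numpy as np

def ar2_kernel(freq, psi, r):
    """Standardized AR(2) spectral kernel: peak near `psi` (cycles/sample),
    bandwidth set by the root modulus `r` in (0, 1); r -> 1 narrows the peak."""
    phi1 = 2.0 * r * np.cos(2.0 * np.pi * psi)
    phi2 = -r ** 2
    z = np.exp(-2j * np.pi * freq)
    sdf = 1.0 / np.abs(1.0 - phi1 * z - phi2 * z ** 2) ** 2
    return sdf / (sdf.sum() * (freq[1] - freq[0]))   # integrate to 1

freq = np.linspace(0.0, 0.5, 512)
# A toy two-peak standardized SDF as a weighted mixture of kernels,
# with weights playing the role of the Dirichlet process mixture weights
mix = 0.6 * ar2_kernel(freq, psi=0.10, r=0.95) + 0.4 * ar2_kernel(freq, psi=0.30, r=0.80)
```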
93

An Efficient Implementation of a Robust Clustering Algorithm

Blostein, Martin January 2016 (has links)
Clustering and classification are fundamental problems in statistics and machine learning, with a broad range of applications. A common approach is the Gaussian mixture model, which assumes that each cluster or class arises from a distinct Gaussian distribution. This thesis studies a robust, high-dimensional extension of the Gaussian mixture model that automatically detects outliers and noise, along with a computationally efficient implementation thereof. The contaminated Gaussian distribution is a robust elliptical distribution that allows for automatic detection of "bad points", and is used to robustify the usual factor analysis model. In turn, the mixtures of contaminated Gaussian factor analyzers (MCGFA) algorithm allows robust, high-dimensional clustering, classification, and detection of bad points. A family of MCGFA models is created through the introduction of different constraints on the covariance structure. A new, efficient implementation of the algorithm is presented, along with an account of its development. The fast implementation permits thorough testing of the MCGFA algorithm, and its performance is compared to two natural competitors: parsimonious Gaussian mixture models (PGMM) and mixtures of modified t factor analyzers (MMtFA). The algorithms are tested systematically on simulated and real data. / Thesis / Master of Science (MSc)
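
A minimal sketch of the contaminated Gaussian idea, assuming a fixed contamination proportion alpha and inflation factor eta (in the MCGFA algorithm these are estimated per component rather than fixed):

```python
import numpy as np
from scipy.stats import multivariate_normal

def bad_point_prob(x, mu, sigma, alpha=0.05, eta=10.0):
    """Posterior probability that x is a 'bad point' under the contaminated
    Gaussian (1 - alpha) * N(mu, Sigma) + alpha * N(mu, eta * Sigma);
    alpha is the contamination proportion, eta > 1 inflates the covariance."""
    sigma = np.asarray(sigma, float)
    good = (1.0 - alpha) * multivariate_normal.pdf(x, mu, sigma)
    bad = alpha * multivariate_normal.pdf(x, mu, eta * sigma)
    return bad / (good + bad)

mu, sigma = np.zeros(2), np.eye(2)
print(bad_point_prob([0.1, 0.2], mu, sigma))   # typical point: low probability
print(bad_point_prob([4.0, 4.0], mu, sigma))   # outlier: probability near 1
```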
94

Mixture models for ROC curve and spatio-temporal clustering

Cheam, Amay SM January 2016 (has links)
Finite mixture models have had a profound impact on the history of statistics, contributing to the modelling of heterogeneous populations, generalizing distributional assumptions, and, lately, providing a convenient framework for classification and clustering. A novel approach, via the Gaussian mixture distribution, is introduced for modelling receiver operating characteristic (ROC) curves. Because the resulting functional form has no closed-form expression, Monte Carlo methods are employed. This approach performs excellently compared to existing methods when applied to real data. In practice, data are often non-normal, atypical, or skewed, so it is natural to introduce non-Gaussian distributions to better fit them. Two non-Gaussian mixtures, based on the t distribution and the skew-t distribution, are proposed and applied to real data. A novel mixture is then presented to cluster spatial and temporal data. The proposed model defines each mixture component as a mixture of autoregressive polynomials with logistic links. The new model performs significantly better than the best-known model-based clustering techniques when applied to real data. / Thesis / Doctor of Philosophy (PhD)
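
A hedged sketch of the Monte Carlo step, with hypothetical mixture parameters rather than anything fitted to the thesis data: once the healthy and diseased score distributions are each modelled as Gaussian mixtures, the ROC curve and AUC can be approximated by sampling.

```python
import numpy as np

rng = np.random.default_rng(7)

def sample_gmm(weights, means, sds, n, rng):
    """Draw n samples from a univariate Gaussian mixture."""
    comp = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(np.asarray(means)[comp], np.asarray(sds)[comp])

# Hypothetical biomarker scores: controls and cases, each a two-component mixture
x0 = sample_gmm([0.7, 0.3], [0.0, 1.0], [1.0, 0.5], 20000, rng)   # controls
x1 = sample_gmm([0.4, 0.6], [1.0, 2.5], [1.0, 0.8], 20000, rng)   # cases

# Empirical ROC over a grid of thresholds: FPR on controls, TPR on cases
thr = np.quantile(np.concatenate([x0, x1]), np.linspace(0.0, 1.0, 200))
fpr = np.array([(x0 > t).mean() for t in thr])
tpr = np.array([(x1 > t).mean() for t in thr])

# Monte Carlo AUC as P(case score > control score)
i, j = rng.integers(0, 20000, 50000), rng.integers(0, 20000, 50000)
print((x1[i] > x0[j]).mean())
```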
95

Bayesian Infinite Mixture Models for Gene Clustering and Simultaneous Context Selection Using High-Throughput Gene Expression Data

Freudenberg, Johannes M. January 2009 (has links)
No description available.
96

Computational Study of Calmodulin’s Ca2+-dependent Conformational Ensembles

Westerlund, Annie M. January 2018 (has links)
Ca2+ and calmodulin play important roles in many physiologically crucial pathways. The conformational landscape of calmodulin is intriguing: conformational changes allow it to bind target proteins, while binding Ca2+ yields population shifts within the landscape. Thus, target proteins become Ca2+-sensitive upon calmodulin binding. Calmodulin regulates more than 300 target proteins, and its mutations are linked to lethal disorders. The mechanisms underlying Ca2+ and target-protein binding are complex and pose interesting questions. Such questions are typically addressed with experiments, which fail to provide simultaneous molecular and dynamical insight. In this thesis, questions on binding mechanisms are probed with molecular dynamics simulations together with tailored unsupervised learning and data analysis. In Paper 1, a free energy landscape estimator based on Gaussian mixture models with cross-validation was developed and used to evaluate the efficiency of regular molecular dynamics compared to temperature-enhanced molecular dynamics. This comparison revealed interesting properties of the free energy landscapes, highlighting different behaviors of the Ca2+-bound and unbound calmodulin conformational ensembles. In Paper 2, spectral clustering was used to shed light on Ca2+ and target-protein binding. With these tools, it was possible to characterize differences in target-protein binding depending on the Ca2+ state as well as on N-terminal or C-terminal lobe binding. This work invites data-driven analysis into the field of biomolecular molecular dynamics, provides further insight into calmodulin's Ca2+ and target-protein binding, and serves as a stepping-stone towards a complete understanding of calmodulin's Ca2+-dependent conformational ensembles. / QC 20180912
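
The Paper 1 estimator can be sketched as follows, with simulated two-dimensional collective variables standing in for the MD data: the number of mixture components is chosen by cross-validation and the landscape is F(x) = -kT log p(x), here in units of kT. This illustrates the idea, not the published implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import GridSearchCV

# Hypothetical 2-D collective-variable samples from two metastable states
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 0.3, (500, 2)),
               rng.normal([2, 1], 0.5, (500, 2))])

# Cross-validated choice of the number of mixture components
gm = GridSearchCV(GaussianMixture(), {"n_components": range(1, 8)}, cv=5).fit(X)
best = gm.best_estimator_

# Free energy surface: F(x) = -kT * log p(x), in units of kT
grid = np.mgrid[-1:3:100j, -1:2:100j].reshape(2, -1).T
F = -best.score_samples(grid)     # score_samples returns the log-density
F -= F.min()                      # shift so the global minimum sits at 0
```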
97

Extensions of D-Optimal Minimal Designs for Mixture Models

Li, Yanyan January 2014 (has links)
The purpose of mixture experiments is to explore the optimum blends of mixture components that will provide desirable response characteristics in finished products. D-optimal minimal designs have been considered for a variety of mixture models, including Scheffé's linear, quadratic, and cubic models. Usually, these D-optimal designs are minimally supported, since they have exactly as many design points as the number of parameters; thus, they lack the degrees of freedom to perform lack-of-fit tests. Moreover, the majority of the design points in D-optimal minimal designs lie on the boundary of the design simplex: its vertices, edges, or faces. In this dissertation, extensions of the D-optimal minimal designs are developed that allow additional interior points in the design space, enabling prediction over the entire response surface. First, extensions of the D-optimal minimal designs for two commonly used second-degree mixture models are considered. Second, the methodology for adding interior points is generalized to arbitrary mixture models, and a new strategy for adding multiple interior points for symmetric mixture models is proposed. Compared with standard mixture designs, the proposed extended D-optimal minimal design provides higher power for the lack-of-fit tests with comparable D-efficiency. / Statistics
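
As an illustration of the D-optimality comparison (a sketch under the Scheffé quadratic model for three components, not the dissertation's code), one can compute log det(X'X) for a minimally supported design and for its extension with an interior centroid point:

```python
import numpy as np
from itertools import combinations

def scheffe_quadratic(points):
    """Model matrix for the Scheffé quadratic mixture model in q components:
    the terms x_i plus all cross-products x_i * x_j (no intercept)."""
    pts = np.asarray(points, float)
    cross = [pts[:, i] * pts[:, j]
             for i, j in combinations(range(pts.shape[1]), 2)]
    return np.column_stack([pts] + cross)

def d_criterion(points):
    """log det(X'X); larger means a more informative design."""
    X = scheffe_quadratic(points)
    return np.linalg.slogdet(X.T @ X)[1]

# q = 3: minimal support = 3 vertices + 3 edge midpoints (6 points, 6 params)
minimal = [(1, 0, 0), (0, 1, 0), (0, 0, 1),
           (.5, .5, 0), (.5, 0, .5), (0, .5, .5)]
extended = minimal + [(1/3, 1/3, 1/3)]       # add the overall centroid
print(d_criterion(minimal), d_criterion(extended))
```

The extra interior point buys a degree of freedom for the lack-of-fit test and interior prediction, which is the trade-off the dissertation quantifies against D-efficiency.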
98

Extending Growth Mixture Models and Handling Missing Values via Mixtures of Non-Elliptical Distributions

Wei, Yuhong January 2017 (has links)
Growth mixture models (GMMs) are used to model intra-individual change and inter-individual differences in change, and to detect underlying group structure in longitudinal studies. These models are regularly fitted under the assumption of normality, an assumption that is frequently invalid. To this end, this thesis focuses on the development of novel non-elliptical growth mixture models to better fit real data. Two non-elliptical growth mixture models, via the multivariate skew-t distribution and the generalized hyperbolic distribution, are developed and applied to simulated and real data. These two models are then extended to accommodate missing values, which are near-ubiquitous in real data. Recently, finite mixtures of non-elliptical distributions have flourished, facilitating flexible clustering of data featuring longer tails and asymmetry; however, real data often contain missing values, so work in this direction is also pursued. A novel approach, via mixtures of generalized hyperbolic distributions and mixtures of multivariate skew-t distributions, is presented to handle missing values in the mixture model-based clustering context. To increase parsimony, families of mixture models are developed by imposing constraints on the component scale matrices whenever missing data occur. Next, a mixture of generalized hyperbolic factor analyzers model is proposed to cluster high-dimensional data with different patterns of missing values. Two missingness indicator matrices are also introduced to ease the computational burden. The algorithms used for parameter estimation are presented, and the performance of the methods is illustrated on simulated and real data. / Thesis / Doctor of Philosophy (PhD)
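
A crude two-stage stand-in for the growth mixture idea, on simulated data: fit each subject's linear growth curve, then cluster the (intercept, slope) estimates. The thesis instead fits the mixture to the trajectories directly, with skew-t or generalized hyperbolic components and model-based handling of missing values; this sketch only illustrates "a mixture over growth trajectories".

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
t = np.arange(5.0)                                  # 5 common time points
n = 200
group = rng.integers(0, 2, n)                       # latent class
inter = np.where(group == 0, 10.0, 14.0) + rng.normal(0, 1.0, n)
slope = np.where(group == 0, 0.5, -0.8) + rng.normal(0, 0.2, n)
Y = inter[:, None] + slope[:, None] * t + rng.normal(0, 0.5, (n, len(t)))

# Stage 1: per-subject least-squares growth curves (intercept, slope)
T = np.column_stack([np.ones_like(t), t])
coef = np.linalg.lstsq(T, Y.T, rcond=None)[0].T     # shape (n, 2)

# Stage 2: mixture model over the growth coefficients recovers the classes
labels = GaussianMixture(n_components=2, random_state=0).fit_predict(coef)
```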
99

Estimating Veterans' Health Benefit Grants Using the Generalized Linear Mixed Cluster-Weighted Model with Incomplete Data

Deng, Xiaoying January 2018 (has links)
The poverty rate among veterans in the US has increased over the past decade, according to the U.S. Department of Veterans Affairs (2015). It is therefore crucial for veterans living below the poverty level to receive sufficient benefit grants, and a study on prudently managing health benefit grants for veterans may help government and policy-makers make appropriate decisions and investments. The purpose of this research is to find an underlying group structure for the veterans' benefit grants dataset and then to estimate the benefit grants sought using incomplete data. The generalized linear mixed cluster-weighted model, based on mixture models, is applied by grouping similar observations into the same cluster. The resulting estimates of the veterans' benefit grants sought will provide a reference for future public policy. / Thesis / Master of Science (MSc)
100

Variational Approximations and Other Topics in Mixture Models

Dang, Sanjeena 24 August 2012 (has links)
Mixture model-based clustering has become an increasingly popular data analysis technique since its introduction almost fifty years ago. Families of mixture models are said to arise when the component parameters, usually the component covariance matrices, are decomposed and a number of constraints are imposed. Within the family setting, it is necessary to choose the member of the family, i.e., the appropriate covariance structure, in addition to the number of mixture components. To date, the Bayesian information criterion (BIC) has proved most effective for this model selection process, and the expectation-maximization (EM) algorithm has been used predominantly for parameter estimation. We deviate from the EM-BIC rubric, using variational Bayes approximations for parameter estimation and the deviance information criterion (DIC) for model selection. The variational Bayes approach alleviates some of the computational complexities associated with the EM algorithm. We apply this approach to the best-known family of Gaussian mixture models, the Gaussian parsimonious clustering models (GPCM), which have an eigen-decomposed covariance structure. Cluster-weighted modelling (CWM) is another flexible statistical framework for modelling local relationships in heterogeneous populations on the basis of weighted combinations of local models. In particular, we extend cluster-weighted models to include an underlying latent factor structure of the independent variable, resulting in a novel family of models known as parsimonious cluster-weighted factor analyzers; here the EM-BIC rubric is utilized for parameter estimation and model selection. Some work on a mixture of multivariate t-distributions is also presented, with a linear model for the mean and a modified Cholesky-decomposed covariance structure leading to a novel family of mixture models. In addition to model-based clustering, these models are used for model-based classification, i.e., semi-supervised clustering. Parameters are estimated using the EM algorithm, and an approach to model selection other than the BIC is also considered. / NSERC PGS-D
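
The contrast between the EM-BIC rubric and a variational alternative can be sketched with scikit-learn on toy data. This is only a stand-in: the thesis works with the GPCM covariance family and the DIC, whereas sklearn's BayesianGaussianMixture is a convenient off-the-shelf variational Gaussian mixture.

```python
import numpy as np
from sklearn.mixture import GaussianMixture, BayesianGaussianMixture

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 1, (300, 2)), rng.normal(4, 1, (300, 2))])

# EM-BIC rubric: fit G = 1..6 by EM, pick the model minimizing the BIC
bics = {g: GaussianMixture(n_components=g, random_state=0).fit(X).bic(X)
        for g in range(1, 7)}
g_best = min(bics, key=bics.get)

# Variational alternative: one fit with a generous upper bound on G;
# the variational posterior shrinks the weights of unneeded components
vb = BayesianGaussianMixture(n_components=6, random_state=0).fit(X)
print(g_best, np.round(vb.weights_, 3))
```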
