11.
A RESAMPLING BASED APPROACH IN EVALUATION OF DOSE-RESPONSE MODELS. Fu, Min. January 2014.
In this dissertation, we propose a computational approach using a resampling based permutation test as an alternative to MCP-Mod (a hybrid framework integrating the multiple comparison procedure and the modeling technique) and gMCP-Mod (generalized MCP-Mod) [11], [29] in the step of identifying significant dose-response signals via model selection. We name our proposed approach RMCP-Mod or gRMCP-Mod correspondingly. RMCP-Mod/gRMCP-Mod transforms the drug dose comparisons into a dose-response model selection problem via multiple hypothesis testing, an area where little extended research has been done, and solves it using resampling based multiple testing procedures [38]. The proposed approach avoids the prior dose-response knowledge known as "guesstimates" used in the model selection step of the MCP-Mod/gMCP-Mod framework, and therefore reduces the uncertainty in identifying significant models. When a new drug is being developed to treat patients with a specified disease, one of the key steps is to discover an optimal drug dose or doses that would produce the desired clinical effect with an acceptable level of toxicity. In order to find such a dose or doses (different doses may be able to produce the same or better clinical effect with similar acceptable toxicity), the underlying dose-response signals need to be identified and thoroughly examined through statistical analyses. A dose-response signal refers to the fact that a drug has different clinical effects at many quantitative dose levels. Statistically speaking, the dose-response signal is a numeric relationship curve (shape) between drug doses and the clinical effects in quantitative measures. It has often been a challenge to find correct and accurate efficacy and/or safety dose-response signals that would best describe the dose-effect relationship via conventional statistical methods, because the conventional methods tend to either focus on a fixed, small number of quantitative dosages or evaluate multiple pre-defined dose-response models without Type I error control. In searching for more efficient methods, a framework combining the multiple comparisons procedure (MCP) and model-based (Mod) techniques, acronymed MCP-Mod, was developed by F. Bretz, J. C. Pinheiro, and M. Branson [11] to handle normally distributed, homoscedastic dose-response observations. Subsequently, a generalized version named gMCP-Mod, which can additionally deal with binary, count, or time-to-event dose-response data as well as repeated measurements over time, was developed by J. C. Pinheiro, B. Bornkamp, E. Glimm and F. Bretz [29]. MCP-Mod/gMCP-Mod uses the "guesstimates" in the MCP step to pre-specify parameters of the candidate models; however, in situations where prior knowledge of the dose-response information is difficult to obtain, uncertainty can be introduced into the model selection process, affecting the correctness of the model identification. Throughout the evaluation of its application to hypothetical and real study examples, as well as simulation comparisons to MCP-Mod/gMCP-Mod, our proposed approach, RMCP-Mod/gRMCP-Mod, appears to be a viable method for practice, although further improvements and research are still needed for applications to broader dose-response data types. / Statistics
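As a rough illustration of the resampling idea described above, the sketch below runs a permutation test for the presence of any dose-response signal. It is not the dissertation's RMCP-Mod procedure: the two candidate shapes, the maximum-correlation statistic, and the function name are all illustrative assumptions.

```python
import numpy as np

def dose_response_permutation_test(doses, y, n_perm=5000, seed=0):
    """Permutation test for any dose-response signal.

    The observed statistic is the maximum absolute correlation between
    the response and a set of candidate dose-response shapes (here a
    linear and a saturating Emax-like shape -- illustrative choices,
    not the dissertation's candidate set)."""
    rng = np.random.default_rng(seed)
    d = np.asarray(doses, dtype=float)
    shapes = np.column_stack([
        d,                               # linear shape
        d / (d + np.median(d[d > 0])),   # Emax-like shape, ED50 = median dose
    ])
    # standardize shapes so the correlations are comparable across shapes
    shapes = (shapes - shapes.mean(0)) / shapes.std(0)

    def stat(resp):
        r = (resp - resp.mean()) / resp.std()
        return np.max(np.abs(shapes.T @ r) / len(r))

    t_obs = stat(np.asarray(y, dtype=float))
    t_perm = np.array([stat(rng.permutation(np.asarray(y, dtype=float)))
                       for _ in range(n_perm)])
    return t_obs, (1 + np.sum(t_perm >= t_obs)) / (n_perm + 1)
```

Because the maximum over candidate shapes is recomputed on every permuted data set, the resulting p-value controls the familywise Type I error over the candidate set without requiring any "guesstimate" of model parameters.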
12.
High-Dimensional Functional Graphs and Inference for Unknown Heterogeneous Populations. Chen, Han. 21 November 2024.
In this dissertation, we develop innovative methods for analyzing high-dimensional, heterogeneous functional data, focusing specifically on uncovering hidden patterns and network structures within such complex data. We utilize functional graphical models (FGMs) to explore the conditional dependence structure among random elements. We mainly focus on the following three research projects.
The first project combines the strengths of FGMs with finite mixture of regression models (FMR) to overcome the challenges of estimating conditional dependence structures from heterogeneous functional data. This novel approach facilitates the discovery of latent patterns, proving particularly advantageous for analyzing complex datasets, such as brain imaging studies of autism spectrum disorder (ASD). Through numerical analysis of both simulated data and real-world ASD brain imaging, we demonstrate the effectiveness of our methodology in uncovering complex dependencies that traditional methods may miss due to their homogeneous data assumptions.
Secondly, we address the challenge of variable selection within FMR in high-dimensional settings by proposing a joint variable selection technique. This technique employs a penalized expectation-maximization (EM) algorithm that leverages shared structures across regression components, thereby enhancing the efficiency of identifying relevant predictors and improving the predictive ability. We further expand this concept to mixtures of functional regressions, employing a group lasso penalty for variable selection in heterogeneous functional data.
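A minimal sketch of the penalized EM idea for a finite mixture of linear regressions is given below, assuming scalar responses and an ordinary lasso penalty in the M-step; the dissertation's method uses a group lasso and extends to functional regressions, so the function name, tuning constants, and penalty choice here are all illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

def penalized_em_fmr(X, y, K=2, lam=0.1, n_iter=100, seed=0):
    """EM for a K-component mixture of linear regressions (numpy arrays
    X: n x p, y: n) with an L1 penalty on each component's coefficients."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    betas = rng.normal(scale=0.1, size=(K, p))
    sigma2 = np.full(K, y.var())
    mix = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities = posterior component probabilities
        dens = np.stack([
            mix[k] / np.sqrt(2 * np.pi * sigma2[k])
            * np.exp(-(y - X @ betas[k]) ** 2 / (2 * sigma2[k]))
            for k in range(K)], axis=1)
        resp = dens / np.clip(dens.sum(axis=1, keepdims=True), 1e-300, None)
        # M-step: weighted lasso per component, via sqrt-weight rescaling
        for k in range(K):
            w = resp[:, k]
            sw = np.sqrt(w)
            fit = Lasso(alpha=lam, fit_intercept=False, max_iter=5000)
            fit.fit(X * sw[:, None], y * sw)
            betas[k] = fit.coef_
            res = y - X @ betas[k]
            sigma2[k] = (w * res ** 2).sum() / w.sum()
            mix[k] = w.mean()
    return betas, mix, sigma2, resp
```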
Lastly, we recognize the limitations of existing methods in testing the equality of multiple functional graphs and develop a novel, permutation-based testing procedure. This method provides a robust, distribution-free approach to comparing network structures across different functional variables, as illustrated through simulation studies and functional magnetic resonance imaging (fMRI) analysis for ASD.
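The following sketch conveys the permutation logic for testing the equality of two networks, simplified to scalar measurements per node (a functional-data version would first reduce each curve to basis scores). The ridge-regularized partial-correlation graph and the max-edge-difference statistic are illustrative choices, not the dissertation's exact procedure.

```python
import numpy as np

def graph_equality_perm_test(A, B, n_perm=2000, seed=0):
    """Permutation test of H0: both groups share the same network.

    A, B: (n_a x p) and (n_b x p) arrays of p node measurements per
    subject. Edge weights are partial correlations from a ridge-
    regularized covariance; the statistic is the largest absolute
    edge difference between the two estimated graphs."""
    rng = np.random.default_rng(seed)

    def edges(S):
        C = np.cov(S, rowvar=False) + 0.1 * np.eye(S.shape[1])  # ridge for p > n
        P = np.linalg.inv(C)
        d = np.sqrt(np.diag(P))
        return -P / np.outer(d, d)           # partial correlation matrix

    def stat(S1, S2):
        return np.max(np.abs(edges(S1) - edges(S2)))

    t_obs = stat(A, B)
    pooled, n_a = np.vstack([A, B]), A.shape[0]
    t_perm = np.empty(n_perm)
    for i in range(n_perm):
        idx = rng.permutation(pooled.shape[0])   # shuffle group labels
        t_perm[i] = stat(pooled[idx[:n_a]], pooled[idx[n_a:]])
    return t_obs, (1 + (t_perm >= t_obs).sum()) / (n_perm + 1)
```

Shuffling subjects between groups is valid under the null of identical networks, which is what makes the procedure distribution-free.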
Together, these research projects provide a comprehensive framework for functional data analysis, significantly advancing the estimation of network structures, functional variable selection, and testing of functional graph equality. This methodology holds great promise for enhancing our understanding of heterogeneous functional data and its practical applications. / Doctor of Philosophy / This study introduces innovative techniques for analyzing complex, high-dimensional functional data, such as functional magnetic resonance imaging (fMRI) data from the brain. Our goal is to reveal underlying patterns and network connections, particularly in the context of autism spectrum disorder (ASD). In functional data, we treat each signal curve from various locations as a single data point. These datasets are characterized by high dimensionality, with the number of model parameters exceeding the sample size.
We employ functional graphical models (FGMs) to investigate the conditional dependencies among data elements. Our approach combines FGMs with finite mixture of regression models (FMR), allowing us to uncover hidden patterns that traditional methods assuming homogeneity might miss. Additionally, we introduce a new method for selecting relevant variables in high-dimensional regression contexts. This method enhances prediction accuracy by utilizing shared information among regression components.
Furthermore, we develop a robust testing framework to facilitate the comparison of network structures between groups without relying on distribution assumptions. This enables precise evaluations of functional graphs.
Hence, our research contributes to a deeper understanding of complex, diverse functional data, paving the way for novel insights across various fields.
13.
Detection of Change in Some Special Regression Models (Postupy pro detekci změny v některých speciálních regresních modelech). Exnarová, Petra. January 2013.
Title: Detection of change in some special regression models Author: Bc. Petra Exnarová Department: Department of Probability and Mathematical Statistics Supervisor: Prof. RNDr. Marie Hušková, DrSc. Abstract: The presented thesis deals with testing for change in three special cases of change-point analysis. The first of them is the case of continuous change in linear regression (the so-called broken-line model); the other two are related to change in the parameters of discrete-valued distributions - the simple case of Bernoulli distributed variables is studied first, and the approach is then generalized to the case of the Multinomial distribution. Both situations of known and unknown change point are described for all three cases. Besides approximation using limit theorems, the bootstrap method and the permutation test are described for all studied cases as well. The comparison of critical values obtained by the different approaches for the particular tests, and a small power analysis, are done using simulations. Keywords: change-point analysis, broken-line model, discrete distribution, bootstrap, permutation test
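As an illustration of the permutation approach in the simplest of the three cases, the sketch below tests for a single change in the success probability of a Bernoulli sequence using a CUSUM-type statistic; the thesis's exact statistics and its bootstrap variants may differ.

```python
import numpy as np

def bernoulli_changepoint_perm_test(x, n_perm=5000, seed=0):
    """Permutation test of H0: constant success probability, against a
    single change at an unknown time. x is a 0/1 sequence. The statistic
    is the maximum, over candidate change points k, of the weighted
    absolute difference between the sample proportions before and after k."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)

    def stat(s):
        c = np.cumsum(s)
        k = np.arange(1, n)                  # candidate change points
        p1 = c[:-1] / k                      # proportion before k
        p2 = (c[-1] - c[:-1]) / (n - k)      # proportion after k
        w = np.sqrt(k * (n - k) / n)         # variance-stabilizing weight
        return np.max(w * np.abs(p1 - p2))

    t_obs = stat(x)
    t_perm = np.array([stat(rng.permutation(x)) for _ in range(n_perm)])
    return t_obs, (1 + (t_perm >= t_obs).sum()) / (n_perm + 1)
```

Under the null of a constant parameter the observations are exchangeable, so permuting the sequence gives an exact reference distribution for the maximum statistic.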
14.
The Impact of Midbrain Cauterize Size on Auditory and Visual Responses' Distribution. Zhang, Yan. 20 April 2009.
This thesis presents several statistical analyses from a cooperative project with Dr. Pallas and Yuting Mao of the Biology Department of Georgia State University. The research concerns the impact of the cauterize size of animals' midbrain on auditory and visual responses in the brain. Besides some commonly used statistical analysis methods, such as MANOVA and the frequency test, a unique combination of the permutation test, the Kolmogorov-Smirnov test, and the Wilcoxon rank sum test is applied to our non-parametric data. Simulation results show that the permutation test we used has very good power and fits the needs of this study. The results statistically confirm part of the Biology Department's hypothesis and support a more complete understanding of the experiments and their potential for helping patients with Acquired Brain Injury.
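A sketch of how the three non-parametric tests can be combined on two response samples is shown below, using SciPy; the pairing of tests and the permutation statistic are illustrative rather than the thesis's exact choices.

```python
import numpy as np
from scipy import stats

def compare_response_distributions(a, b, n_perm=10000, seed=0):
    """Compare two response samples with three non-parametric tests:
    a permutation test on the mean difference, the two-sample
    Kolmogorov-Smirnov test, and the Wilcoxon rank sum test.
    Returns the three p-values."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a, float), np.asarray(b, float)

    # permutation test on the absolute difference in means
    t_obs = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        count += abs(perm[:len(a)].mean() - perm[len(a):].mean()) >= t_obs
    p_perm = (count + 1) / (n_perm + 1)

    p_ks = stats.ks_2samp(a, b).pvalue      # sensitive to shape differences
    p_rank = stats.ranksums(a, b).pvalue    # sensitive to location shifts
    return p_perm, p_ks, p_rank
```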
15.
The Impact of the COVID-19 Lockdown on Urban Air Quality: A Machine Learning Approach. Bobba, Srinivas. January 2021.
SARS-CoV-2, the virus responsible for the current pandemic of COVID-19 disease, was first reported from Wuhan, China, on 31 December 2019. Since then, to prevent its propagation around the world, a set of rapid and strict countermeasures has been taken. While most researchers around the world initiated studies on the effect of the COVID-19 lockdown on air quality and concluded that pollution was reduced, the most reliable methods for quantifying the reduction of pollutants in the air are still under debate. In this study, we analyzed how COVID-19 lockdown procedures impacted the air quality in selected cities around the world, namely New Delhi, Diepkloof, Wuhan, and London. The results show that the air quality index (AQI) improved by 43% in New Delhi, 18% in Wuhan, 15% in Diepkloof, and 12% in London during the initial lockdown from 19 March 2020 to 31 May 2020, compared to the four-year pre-lockdown period. Furthermore, the concentrations of four main pollutants, i.e., NO2, CO, SO2, and PM2.5, were analyzed before and during the lockdown in India. The quantification of the pollution drop is supported by statistical measures such as the ANOVA test and the permutation test. Overall, decreases of 58%, 61%, 18%, and 55% are observed in NO2, CO, SO2, and PM2.5 concentrations, respectively. To check whether a change in weather played any role in the pollution level reduction, we analyzed how weather factors are correlated with pollutants using a correlation matrix. Finally, machine learning regression models are constructed to assess the lockdown impact on air quality in India by incorporating weather data. Gradient boosting performed well in predicting the drop in PM2.5 concentration in individual cities in India, with the feature importance ranking of the regression models supported by the correlation of the features with PM2.5. This study concludes that the COVID-19 lockdown had a significant effect on the natural environment and on air quality improvement.
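The sketch below illustrates the analysis pipeline described above: correlating weather factors with PM2.5 and ranking feature importances from a gradient boosting regressor. The file name and column names are hypothetical placeholders, not the study's actual data.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Hypothetical data layout: daily weather drivers plus a lockdown flag.
df = pd.read_csv("delhi_air_weather.csv")   # assumed file, not the study's
features = ["temperature", "humidity", "wind_speed", "pressure", "lockdown"]
X, y = df[features], df["pm25"]

# correlation of each factor with PM2.5
print(df[features + ["pm25"]].corr()["pm25"].sort_values())

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = GradientBoostingRegressor(n_estimators=300, max_depth=3, random_state=0)
model.fit(X_tr, y_tr)
print("R^2 on held-out data:", model.score(X_te, y_te))

# feature importance ranking, to be compared with the correlations above
for name, imp in sorted(zip(features, model.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```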
16.
Neurofunctional Correlates of the Out-of-the-Loop Phenomenon: Impacts on Performance Monitoring (Corrélats neuro-fonctionnels du phénomène de sortie de boucle : impacts sur le monitoring des performances). Somon, Bertille. 4 December 2018.
The ongoing technological mutations occurring in aeronautics have profoundly changed the interactions between humans and machines. Systems are more and more complex, automated, and opaque. Several tragedies have reminded us that the supervision of those systems by human operators is still a challenge. In particular, evidence shows that automation has driven operators away from the control loop of the system, creating an out-of-the-loop phenomenon (OOL). This phenomenon is characterized by a decrease in situation awareness and vigilance, but also by complacency and over-reliance towards automated systems. These difficulties have been shown to result in a degradation of the operator's performance. Thus, the OOL phenomenon is a major issue for improving today's human-machine interactions. Even though it has been studied for several decades, the OOL is still difficult to characterize, and even more so to predict. The aim of this thesis is to define how cognitive neuroscience theories, such as performance monitoring, can be used to better characterize the OOL phenomenon and the operator's state, particularly through physiological measures. Consequently, we have used electroencephalographic (EEG) activity to try to identify markers and/or precursors of the supervision activity during system monitoring. As a first step, we evaluated the error detection or performance monitoring activity through standard laboratory tasks with varying levels of difficulty. We performed two EEG studies allowing us to show that: (i) the performance monitoring activity emerges in fronto-central regions both for the detection of our own errors (ERN-Pe and FRN-P300 components) and during the supervision of another agent (N2-P3 complex), be it a human agent or an automated system, and (ii) the performance monitoring activity is significantly decreased by increasing task difficulty. These results led us to develop another experiment to assess the brain activity associated with system supervision in an ecological environment, resembling everyday aeronautical system monitoring. Thanks to adapted signal processing techniques (e.g., trial-by-trial time-frequency decomposition), we were able to show that there is: (i) a fronto-central θ activity time-locked to the system's decisions, similar to the one obtained in laboratory conditions, (ii) a decrease in overall supervision activity time-locked to the system's decisions over the course of the task, and (iii) a specific decrease of monitoring activity for errors. In this thesis, several EEG measures and statistical analyses have been adapted to the constraints of ecological tasks. As a perspective, we have developed a final study aiming at characterizing the degradation of the supervision activity during the OOL; finding precise markers of this degradation would make it possible to detect the phenomenon, or even better, to anticipate it.
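A minimal sketch of a trial-by-trial theta-band power measure is given below, using a band-pass filter plus Hilbert envelope as a stand-in for the thesis's time-frequency decomposition; the function name, band limits, and filter order are illustrative.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def theta_power_per_trial(epochs, fs, band=(4.0, 8.0)):
    """Trial-by-trial theta-band power of fronto-central EEG epochs.

    epochs: array (n_trials, n_samples), one epoch per system decision;
    fs: sampling rate in Hz. Band-pass filtering followed by the Hilbert
    envelope is one simple way to obtain a time-frequency measure; the
    thesis's actual pipeline may differ."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, epochs, axis=1)       # zero-phase band-pass
    envelope = np.abs(hilbert(filtered, axis=1))    # analytic amplitude
    return (envelope ** 2).mean(axis=1)             # mean power per trial

# Example: test for a decline of monitoring activity across the session
# power = theta_power_per_trial(epochs, fs=500.0)
# slope = np.polyfit(np.arange(len(power)), power, 1)[0]  # negative = decline
```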
17.
Some Advanced Semiparametric Single-index Modeling for Spatially-Temporally Correlated Data. Mahmoud, Hamdy F. F. 9 October 2014.
Semiparametric modeling is a hybrid of parametric and nonparametric modeling in which some functional forms are known and others are unknown. In this dissertation, we have made several contributions to semiparametric modeling based on the single index model, related to the following three topics: the first is to propose a model for detecting change points simultaneously with estimating the unknown function; the second is to develop two models for spatially correlated data; and the third is to further develop two models for spatially-temporally correlated data.
To address the first topic, we propose a single index change point model as a unified approach that simultaneously estimates the nonlinear relationship and detects change points, adjusting for several other covariates. We estimate the unknown function nonparametrically using kernel smoothing and also provide a permutation based testing procedure to detect multiple change points. We establish the asymptotic properties of the permutation-based testing procedure. The advantage of our approach is demonstrated using the mortality data of Seoul, Korea from January 2000 to December 2007.
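A bare-bones sketch of single index estimation with kernel smoothing, in the spirit of the first topic, is given below (no change-point component and no covariate adjustment); the bandwidth, optimizer, and function names are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def nw_smooth(index, y, h, loo=False):
    """Nadaraya-Watson kernel estimate of E[y | index] at the observed
    index values, Gaussian kernel with bandwidth h."""
    d = (index[:, None] - index[None, :]) / h
    w = np.exp(-0.5 * d ** 2)
    if loo:
        np.fill_diagonal(w, 0.0)   # leave-one-out to avoid self-fitting
    return (w @ y) / w.sum(axis=1)

def fit_single_index(X, y, h=0.3):
    """Estimate the index direction beta in y = g(X beta) + error by
    minimizing the leave-one-out residual sum of squares of the kernel
    fit; beta is normalized to unit length for identifiability. A sketch
    only: practical estimators also select h, e.g. by cross-validation."""
    def rss(beta):
        beta = beta / np.linalg.norm(beta)
        return np.sum((y - nw_smooth(X @ beta, y, h, loo=True)) ** 2)

    p = X.shape[1]
    res = minimize(rss, x0=np.ones(p) / np.sqrt(p), method="Nelder-Mead")
    beta = res.x / np.linalg.norm(res.x)
    return beta, nw_smooth(X @ beta, y, h)
```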
On the second topic, we propose two semiparametric single index models for spatially correlated data. One additively separates the nonparametric function and the spatially correlated random effects, while the other does not separate them. We estimate these two models using two algorithms based on the Markov Chain Expectation Maximization algorithm. Our approaches are compared using simulations, suggesting that the semiparametric single index nonadditive model provides more accurate estimates of spatial correlation. The advantage of our approach is demonstrated using the mortality data of six cities in Korea from January 2000 to December 2007.
The third topic involves proposing two semiparametric single index models for spatially and temporally correlated data. In our first model, the nonparametric function is separable from the spatially and temporally correlated random effects; we refer to it as the "semiparametric spatio-temporal separable single index model (SSTS-SIM)". The second model does not separate the nonparametric function from the spatially correlated random effects but does separate the time random effects; we refer to it as the "semiparametric nonseparable single index model (SSTN-SIM)". Two algorithms based on the Markov Chain Expectation Maximization algorithm are introduced to simultaneously estimate parameters, spatial effects, and time effects. The proposed models are then applied to the mortality data of six major cities in Korea. Our results suggest that SSTN-SIM is more flexible than SSTS-SIM because it can estimate various nonparametric functions, while SSTS-SIM enforces similar nonparametric curves. SSTN-SIM also provides better estimation and prediction. / Ph. D.
18.
A comparative study of permutation procedures. Van Heerden, Liske. 30 November 1994.
The unique problems encountered when analyzing weather data sets - that is, measurements taken while conducting a meteorological experiment - have forced statisticians to reconsider conventional analysis methods and investigate permutation test procedures. The problems encountered when analyzing weather data sets are simulated in a Monte Carlo study, and the results of the parametric and permutation t-tests are compared with regard to significance level, power, and average confidence interval length. Seven population distributions are considered: three are variations of the normal distribution, and the others are the gamma, the lognormal, the rectangular, and empirical distributions. The normal distribution contaminated with zero measurements is also simulated. In those simulated situations in which the variances are unequal, the permutation test procedure was performed using other test statistics, namely the Scheffé, Welch, and Behrens-Fisher test statistics. / Mathematical Sciences / M. Sc. (Statistics)
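The sketch below shows the structure of such a Monte Carlo comparison for one of the study's criteria, the attained significance level, under any chosen population distribution; the sample sizes and replication counts are illustrative.

```python
import numpy as np
from scipy import stats

def compare_tests(sampler, n=15, n_sim=2000, n_perm=1000, alpha=0.05, seed=0):
    """Monte Carlo comparison of the parametric two-sample t-test and a
    permutation t-test. `sampler(rng, n)` draws one sample of size n for
    each group; under H0 both groups come from the same distribution, so
    the rejection rates estimate the attained significance levels."""
    rng = np.random.default_rng(seed)
    rej_t = rej_perm = 0
    for _ in range(n_sim):
        a, b = sampler(rng, n), sampler(rng, n)
        res = stats.ttest_ind(a, b)            # parametric t-test
        rej_t += res.pvalue < alpha
        # permutation test on the same t statistic
        t_obs = abs(res.statistic)
        pooled = np.concatenate([a, b])
        exceed = sum(
            abs(stats.ttest_ind(p[:n], p[n:]).statistic) >= t_obs
            for p in (rng.permutation(pooled) for _ in range(n_perm)))
        rej_perm += (exceed + 1) / (n_perm + 1) < alpha
    return rej_t / n_sim, rej_perm / n_sim

# e.g., a skewed population such as the lognormal:
# print(compare_tests(lambda rng, n: rng.lognormal(size=n)))
```

Replacing the lognormal sampler with a contaminated normal, gamma, or rectangular sampler reproduces the kind of distribution-by-distribution comparison the study describes.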
19.
A Comparison Between Two Regularization Methods for Discriminant Analysis and Hypothesis Testing (兩種正則化方法用於假設檢定與判別分析時之比較). Li, Deng-Yao (李登曜). Date unknown.
High dimensionality causes many problems in statistical analysis. For instance, consider the testing of hypotheses about multivariate regression models. Suppose that the dimension of the multivariate response is larger than the number of observations; then the sample covariance matrix is not invertible. Since the inverse of the sample covariance matrix is often needed when computing the usual likelihood ratio test statistic (under normality), the matrix singularity makes it difficult to implement the test. The singularity of the sample covariance matrix is also a problem in classification when the linear discriminant analysis (LDA) or the quadratic discriminant analysis (QDA) is used.
Different regularization methods have been proposed to deal with the singularity of the sample covariance matrix for different purposes. Warton (2008) proposed a regularization procedure for testing, and Friedman (1989) proposed a regularization procedure for classification. Is it true that Warton's regularization works better for testing and Friedman's regularization works better for classification? To answer this question, some simulation studies are conducted and the results are presented in this thesis.
It is found that neither regularization method is superior to the other.
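A sketch of the two regularization ideas, as they are commonly presented, is given below; the exact estimators and tuning-parameter selection used in the thesis may differ.

```python
import numpy as np

def warton_shrinkage(S, gamma):
    """Warton (2008)-style ridge regularization: shrink the correlation
    matrix toward the identity, then rescale back to a covariance;
    gamma in (0, 1], with gamma = 1 giving no shrinkage."""
    d = np.sqrt(np.diag(S))
    R = S / np.outer(d, d)                       # correlation matrix
    R_reg = gamma * R + (1 - gamma) * np.eye(len(S))
    return np.outer(d, d) * R_reg

def friedman_shrinkage(S_k, S_pooled, n_k, n, lam, gamma):
    """Friedman (1989) regularized discriminant analysis: shrink a class
    covariance toward the pooled covariance (lam), then toward a scaled
    identity (gamma). S_k and S_pooled are covariance estimates."""
    p = len(S_k)
    S_lam = (((1 - lam) * n_k * S_k + lam * n * S_pooled)
             / ((1 - lam) * n_k + lam * n))
    return (1 - gamma) * S_lam + gamma * np.trace(S_lam) / p * np.eye(p)
```

Either regularized estimate can replace the singular sample covariance matrix in the likelihood ratio statistic or in the LDA/QDA discriminant functions when the dimension exceeds the sample size, which is the setting the simulations compare.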