1

Addressing the Variable Selection Bias and Local Optimum Limitations of Longitudinal Recursive Partitioning with Time-Efficient Approximations

January 2019 (has links)
abstract: Longitudinal recursive partitioning (LRP) is a tree-based method for longitudinal data. It takes a sample of individuals that were each measured repeatedly across time and splits them based on a set of covariates such that individuals with similar trajectories become grouped together into nodes. LRP does this by fitting a mixed-effects model to each node every time it is partitioned and extracting the deviance, which serves as the measure of node purity. LRP is implemented using the classification and regression tree (CART) algorithm, which suffers from a variable selection bias and does not guarantee reaching a global optimum. Additionally, fitting mixed-effects models to each potential split only to extract the deviance and discard the rest of the information is computationally intensive. Therefore, in this dissertation, I address the high computational demand, the variable selection bias, and the local optimum limitation. I propose three approximation methods that reduce the computational demand of LRP and, at the same time, allow for a straightforward extension to recursive partitioning algorithms that do not have a variable selection bias and can reach the global optimum solution. In the three proposed approximations, a mixed-effects model is fit to the full data, and the growth curve coefficients for each individual are extracted. Then, (1) a principal component analysis is fit to the set of coefficients and the principal component score is extracted for each individual, (2) a one-factor model is fit to the coefficients and the factor score is extracted, or (3) the coefficients are summed. Each method leaves every individual with a single score that represents the growth curve trajectory. Because the outcome is now a single score per individual, any tree-based method may be used to partition the data and group the individuals. Once the individuals are assigned to their final nodes, a mixed-effects model is fit to each terminal node with the individuals belonging to it. I conduct a simulation study in which I show that the approximation methods achieve the proposed goals while maintaining a level of out-of-sample prediction accuracy similar to that of LRP. I then illustrate and compare the methods using an applied data set. / Dissertation/Thesis / Doctoral Dissertation Psychology 2019
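The approximation pipeline described in this abstract can be sketched in a few lines. The code below is only a rough illustration of variant (1), the principal component score, under assumed column names (id, time, y) and individual-level covariates; it is not the dissertation's implementation, and using only the random-effect deviations in step 2 is a simplification.

# Rough sketch of approximation (1): mixed-effects coefficients -> PCA score -> tree.
# Column names (id, time, y) and covariate_cols are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeRegressor

def approximate_lrp(data: pd.DataFrame, covariate_cols, max_depth=3):
    # 1. Fit one mixed-effects growth model to the full data (random intercept and slope).
    model = smf.mixedlm("y ~ time", data, groups=data["id"], re_formula="~time").fit()

    # 2. Extract each individual's growth-curve coefficients. The individual-specific
    #    random effects (deviations from the fixed effects) suffice here, since the
    #    fixed effects are the same constant for every individual.
    ids = sorted(data["id"].unique())
    coefs = np.vstack([model.random_effects[g].to_numpy() for g in ids])

    # 3. Reduce each individual's coefficients to a single trajectory score (PCA variant).
    scores = PCA(n_components=1).fit_transform(coefs).ravel()

    # 4. Any standard tree method can now partition individuals on their covariates.
    X = data.groupby("id")[list(covariate_cols)].first().loc[ids]
    tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, scores)

    # 5. Leaf membership; a mixed-effects model would then be refit within each leaf.
    leaves = pd.Series(tree.apply(X), index=ids, name="node")
    return tree, leaves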
2

Residuals in the growth curve model with applications to the analysis of longitudinal data

HUANG, WEILIANG January 2012 (has links)
Statistical models often rely on several assumptions, including distributional assumptions on outcome variables and relational assumptions where we model the relationship between outcomes and independent variables. Further assumptions are also made depending on the complexity of the data and the model being used. Model diagnostics is, therefore, a crucial component of any model fitting problem. Residuals play important roles in model diagnostics. Residuals are not only used to check the adequacy of model fit, but they are also excellent tools for validating model assumptions and identifying outliers and influential observations. Residuals in univariate models are studied extensively and are routinely used for model diagnostics. In multivariate models residuals are not commonly used to assess model fit, although a few approaches have been proposed to check multivariate normality. However, in the analysis of longitudinal data, the resulting residuals are correlated and are not normally distributed. It is, therefore, not clear how ordinary residuals can be used for model diagnostics. Under a sufficiently large sample size, a transformation of the ordinary residuals has been proposed to check the normality assumption. The transformation is based solely on removing correlation among the residuals. However, we show that these transformed residuals fail in the presence of model mis-specification. In this thesis, we investigate residuals in the analysis of longitudinal data. We consider ordinary residuals, Fitzmaurice's transformed (uncorrelated) residuals, as well as von Rosen's decomposed residuals. Using simulation studies, we show how the residuals behave under multivariate normality and when this assumption is violated. We also investigate their properties under correctly fitted as well as wrongly fitted models. Finally, we propose new residuals by transforming von Rosen's decomposed residuals. We show that these residuals perform better than Fitzmaurice's transformed residuals in the presence of model mis-specification. We illustrate our approach using two real data sets. / Master of Science (MSc)
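Since the abstract hinges on "a transformation based solely on removing correlation among the residuals," a generic sketch of that idea may help. The Cholesky-based decorrelation below is a standard device and is not necessarily the exact transformation analysed in the thesis; the covariance matrix used here is simulated for illustration.

# Generic sketch: decorrelating correlated residuals with a Cholesky factor.
import numpy as np

def decorrelate_residuals(resid: np.ndarray, sigma: np.ndarray) -> np.ndarray:
    """resid: (n, p) within-subject residuals; sigma: (p, p) estimated covariance."""
    L = np.linalg.cholesky(sigma)          # sigma = L @ L.T
    return resid @ np.linalg.inv(L).T      # rows become (approximately) uncorrelated

# Example: residuals simulated with an AR(1)-like covariance.
rng = np.random.default_rng(0)
p = 4
sigma = 0.7 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
resid = rng.multivariate_normal(np.zeros(p), sigma, size=500)
transformed = decorrelate_residuals(resid, sigma)
print(np.round(np.corrcoef(transformed, rowvar=False), 2))  # close to the identity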
3

Inference for Generalized Multivariate Analysis of Variance (GMANOVA) Models and High-dimensional Extensions

Jana, Sayantee 11 1900 (has links)
A Growth Curve Model (GCM) is a multivariate linear model used for analyzing longitudinal data with short to moderate time series. It is a special case of the Generalized Multivariate Analysis of Variance (GMANOVA) model. Analysis using the GCM involves comparison of mean growth among different groups. The classical GCM, however, has some limitations, including its distributional assumptions, the assumption of an identical degree of polynomial for all groups, and the requirement of a sample size larger than the number of time points. In this thesis, we relax some of the assumptions of the traditional GCM and develop appropriate inferential tools for its analysis, with the aim of reducing bias, improving precision, and gaining power, as well as overcoming the limitations of high dimensionality. Existing methods for estimating the parameters of the GCM assume that the underlying distribution of the error terms is multivariate normal. In practical problems, however, we often come across skewed data, and hence estimation techniques developed under the normality assumption may not be optimal. Simulation studies conducted in this thesis, in fact, show that existing methods are sensitive to the presence of skewness in the data: estimators suffer increased bias and mean squared error (MSE) when the normality assumption is violated. Methods appropriate for skewed distributions are, therefore, required. In this thesis, we relax the distributional assumption of the GCM and provide estimators for the mean and covariance matrices of the GCM under the multivariate skew normal (MSN) distribution. An estimator for the additional skewness parameter of the MSN distribution is also provided. The estimators are derived using the expectation maximization (EM) algorithm, and extensive simulations are performed to examine the performance of the estimators. Comparisons with existing estimators show that our estimators perform better when the underlying distribution is multivariate skew normal. An illustration using a real data set is also provided, wherein triglyceride levels from the Framingham Heart Study are modelled over time. The GCM assumes an equal degree of polynomial for each group. Therefore, when group means follow polynomials of different shapes, the GCM fails to accommodate this difference in one model. We consider an extension of the GCM, wherein mean responses from different groups can have different shapes, represented by polynomials of different degree. Such a model is referred to as the Extended Growth Curve Model (EGCM). We extend our work on the GCM to the EGCM and develop estimators for the mean and covariance matrices under MSN errors. We adopt the Restricted Expectation Maximization (REM) algorithm, which is based on the multivariate Newton-Raphson (NR) method and Lagrangian optimization. However, the multivariate NR method, and hence the existing REM algorithm, is applicable to vector parameters, whereas the parameters of interest in this study are matrices. We therefore extend the NR approach to matrix parameters, which consequently allows us to extend the REM algorithm to matrix parameters. The performance of the proposed estimators was examined using extensive simulations, and a motivating real data example is provided to illustrate the application of the proposed estimators. Finally, this thesis deals with high-dimensional applications of the GCM.
Existing methods for the GCM are developed under the assumption of 'small p, large n' (n >> p) and are not appropriate for analyzing high-dimensional longitudinal data because of the singularity of the sample covariance matrix. In previous work, we used the Moore-Penrose generalized inverse to overcome this challenge. However, that method has some limitations around near singularity, when p is close to n. In this thesis, a Bayesian framework is used to derive a test of a linear hypothesis on the mean parameter of the GCM that is applicable in high-dimensional situations. Extensive simulations are performed to investigate the performance of the test statistic and establish its optimality characteristics. Results show that the test performs well under different conditions, including in the near-singularity zone. Sensitivity of the test to mis-specification of the parameters of the prior distribution is also examined empirically. A numerical example is provided to illustrate the usefulness of the proposed method in practical situations. / Thesis / Doctor of Philosophy (PhD)
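For readers unfamiliar with the model behind several of these entries, one common parameterization of the Growth Curve (GMANOVA) model and its classical maximum likelihood estimator is sketched below; conventions differ between authors, so the symbols are for orientation only.
\[
  \mathbf{X}_{p \times n} = \mathbf{A}_{p \times q}\,\mathbf{B}_{q \times k}\,\mathbf{C}_{k \times n} + \mathbf{E},
\]
where the columns of $\mathbf{E}$ are independent $N_p(\mathbf{0}, \boldsymbol{\Sigma})$ errors, $\mathbf{X}$ holds the $p$ repeated measurements on $n$ individuals, $\mathbf{A}$ is the within-individual (time) design matrix, $\mathbf{C}$ is the between-individual (group) design matrix, and $\mathbf{B}$ contains the growth curve parameters. Assuming $\mathbf{A}$ and $\mathbf{C}$ have full rank, the classical maximum likelihood estimator of $\mathbf{B}$ is
\[
  \widehat{\mathbf{B}} = (\mathbf{A}^{\top}\mathbf{S}^{-1}\mathbf{A})^{-1}\mathbf{A}^{\top}\mathbf{S}^{-1}\mathbf{X}\mathbf{C}^{\top}(\mathbf{C}\mathbf{C}^{\top})^{-1},
  \qquad
  \mathbf{S} = \mathbf{X}\bigl(\mathbf{I}_n - \mathbf{C}^{\top}(\mathbf{C}\mathbf{C}^{\top})^{-1}\mathbf{C}\bigr)\mathbf{X}^{\top},
\]
which also shows why the classical theory needs $n$ larger than $p$: the matrix $\mathbf{S}$ must be invertible.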
4

Linear Discriminant Analysis with Repeated Measurements

Skinner, Evelina January 2019 (has links)
The classification of observations based on repeated measurements performed on the same subject over a given period of time or under different conditions is a common procedure in many disciplines, such as medicine, psychology and environmental studies. In this thesis the repeated measurements follow the Growth Curve model and are classified using linear discriminant analysis. The aim of the thesis is to examine both the effect of missing data on classification accuracy and the effect of additional data on classification robustness. The results indicate that an increasing amount of missing data leads to a progressive decline in classification accuracy. With regard to the effect of additional data on classification robustness, the results show a less predictable effect, which can only be characterised as a general tendency towards improved robustness.
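As background for the classification rule referred to above, the textbook two-group linear discriminant rule (with sample means, a pooled covariance estimate and prior probabilities $\pi_1, \pi_2$ plugged in) allocates an observation $\mathbf{x}$ to group 1 when
\[
  (\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)^{\top}\,\mathbf{S}_{\mathrm{pooled}}^{-1}\Bigl(\mathbf{x} - \tfrac{1}{2}(\bar{\mathbf{x}}_1 + \bar{\mathbf{x}}_2)\Bigr) \;\ge\; \log\frac{\pi_2}{\pi_1}.
\]
The thesis adapts this idea to repeated measurements structured by the Growth Curve model, so the rule above is general background rather than the exact classifier studied.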
5

Bilinear Gaussian Radial Basis Function Networks for classification of repeated measurements

Sjödin Hällstrand, Andreas January 2020 (has links)
The Growth Curve Model is a bilinear statistical model which can be used to analyse several groups of repeated measurements. Normally the Growth Curve Model is defined in such a way that the permitted sampling frequency of the repeated measurements is limited by the number of observed individuals in the data set. In this thesis, we examine the possibilities of utilizing measurements sampled at high frequency to increase classification accuracy for real-world data. That is, we look at the case where the regular Growth Curve Model is not defined because of the relationship between the sampling frequency and the number of observed individuals. When working with this high-frequency data, we develop a new method of basis selection for the regression analysis which yields what we call a Bilinear Gaussian Radial Basis Function Network (BGRBFN), which we then compare to more conventional polynomial and trigonometrical functional bases. Finally, we examine whether Tikhonov regularization can be used to further increase the classification accuracy in the high-frequency data case. Our findings suggest that the BGRBFN performs better than the conventional methods in both classification accuracy and functional approximability. The results also suggest that both high-frequency data and, in addition, Tikhonov regularization can be used to increase classification accuracy.
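A stripped-down sketch of the two ingredients named in this abstract, a Gaussian radial basis over time and Tikhonov (ridge) regularization, is given below. The centres, kernel width and penalty are illustrative assumptions, and the thesis's BGRBFN is a bilinear extension of this univariate idea rather than the code shown.

# Generic sketch: Gaussian radial basis over time + Tikhonov (ridge) regularization.
import numpy as np

def gaussian_rbf_design(t: np.ndarray, centres: np.ndarray, width: float) -> np.ndarray:
    """Design matrix Phi with Phi[i, j] = exp(-(t_i - c_j)^2 / (2 * width^2))."""
    return np.exp(-(t[:, None] - centres[None, :]) ** 2 / (2.0 * width ** 2))

def ridge_fit(Phi: np.ndarray, y: np.ndarray, lam: float) -> np.ndarray:
    """Tikhonov-regularized least squares: (Phi'Phi + lam*I)^{-1} Phi'y."""
    k = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(k), Phi.T @ y)

# A densely sampled trajectory (many time points per individual is exactly the regime
# where the ordinary Growth Curve Model breaks down).
t = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * t) + 0.1 * np.random.default_rng(1).normal(size=t.size)
centres = np.linspace(0.0, 1.0, 15)
Phi = gaussian_rbf_design(t, centres, width=0.08)
coef = ridge_fit(Phi, y, lam=1e-3)
print("fitted RMSE:", np.sqrt(np.mean((Phi @ coef - y) ** 2)))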
6

Depressive Symptoms Trajectories Following Child Death in Later Life: Variation by Race-Ethnicity

Mellencamp, Kagan Alexander 13 August 2019 (has links)
No description available.
7

The unweighted mean estimator in a Growth Curve model

Karlsson, Emil January 2016 (has links)
The field of statistics is becoming increasingly important as the amount of data in the world grows. This thesis studies the Growth Curve model in multivariate statistics, a model that is not widely used. One difference compared with the linear model is that its Maximum Likelihood Estimators are more complicated, which makes the model more difficult to use and to interpret and may be a reason for its limited adoption. From this perspective, this thesis compares the traditional mean estimator for the Growth Curve model with the unweighted mean estimator. The unweighted mean estimator is simpler than the regular MLE. It is proven that the unweighted estimator is in fact the MLE under certain conditions, and examples of when this occurs are discussed. In a more general setting, this thesis presents conditions under which the unweighted estimator has a smaller covariance matrix than the MLE, and also presents confidence intervals and hypothesis tests based on these inequalities.
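In the GCM notation sketched after the GMANOVA entry above, the comparison in this thesis is, roughly, between the weighted MLE and an unweighted counterpart that simply drops the weighting by the residual matrix $\mathbf{S}$ (symbols as before, shown only for orientation):
\[
  \widehat{\mathbf{B}}_{\mathrm{MLE}} = (\mathbf{A}^{\top}\mathbf{S}^{-1}\mathbf{A})^{-1}\mathbf{A}^{\top}\mathbf{S}^{-1}\mathbf{X}\mathbf{C}^{\top}(\mathbf{C}\mathbf{C}^{\top})^{-1},
  \qquad
  \widehat{\mathbf{B}}_{\mathrm{U}} = (\mathbf{A}^{\top}\mathbf{A})^{-1}\mathbf{A}^{\top}\mathbf{X}\mathbf{C}^{\top}(\mathbf{C}\mathbf{C}^{\top})^{-1}.
\]
The thesis characterizes the covariance structures under which the two coincide and when the unweighted version has the smaller covariance matrix.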
8

Decision Trees for Classification of Repeated Measurements

Holmberg, Julianna January 2024 (has links)
Classification of data from repeated measurements is useful in various disciplines, for example medicine. This thesis explores how classification trees (CART) can be used for classifying repeated measures data. The reader is introduced to variations of the CART algorithm that can be used for classifying such data, and the performance of these algorithms is tested on a data set that can be modelled using bilinear regression. The performance is compared with that of a classification rule based on linear discriminant analysis. It is found that while the performance of the CART algorithm can be satisfactory, using linear discriminant analysis is more reliable for achieving good results.
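A generic version of the comparison named in this abstract can be sketched as follows, with each subject's repeated measurements flattened into one feature vector and synthetic two-group trajectory data standing in for the thesis's bilinear-regression data set.

# Generic sketch: classification tree (CART) vs. linear discriminant analysis on
# repeated measurements; data below are synthetic and purely illustrative.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
n_per_group, p = 60, 8
t = np.linspace(0.0, 1.0, p)

# Two groups whose mean trajectories differ in slope, with within-subject noise.
X = np.vstack([
    1.0 + 0.5 * t + rng.normal(scale=0.4, size=(n_per_group, p)),
    1.0 + 1.5 * t + rng.normal(scale=0.4, size=(n_per_group, p)),
])
y = np.repeat([0, 1], n_per_group)

for name, clf in [("CART", DecisionTreeClassifier(max_depth=3, random_state=0)),
                  ("LDA", LinearDiscriminantAnalysis())]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.2f}")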
9

The Growth Curve Model for High Dimensional Data and its Application in Genomics

Jana, Sayantee 04 1900 (has links)
Recent advances in technology have allowed researchers to collect high-dimensional biological data simultaneously. In genomic studies, for instance, measurements from tens of thousands of genes are taken from individuals across several experimental groups. In time course microarray experiments, gene expression is measured at several time points for each individual across the whole genome, resulting in massive amounts of data. In such experiments, researchers are faced with two types of high-dimensionality. The first is global high-dimensionality, which is common to all genomic experiments. The global high-dimensionality arises because inference is being done on tens of thousands of genes, resulting in multiplicity. This challenge is often dealt with using statistical methods for multiple comparison, such as the Bonferroni correction or the false discovery rate (FDR). We refer to the second type of high-dimensionality as gene-specific high-dimensionality, which arises in time course microarray experiments because, in such experiments, the sample size is often smaller than the number of time points ($n < p$). In this thesis, we use the growth curve model (GCM), which is a generalized multivariate analysis of variance (GMANOVA) model, and propose a moderated test statistic for testing a special case of the general linear hypothesis, which is especially useful for identifying genes that are expressed. We use the trace test for the GCM and modify it so that it can be used in high-dimensional situations. We consider two types of moderation: the Moore-Penrose generalized inverse and Stein's shrinkage estimator of $S$. We performed extensive simulations to show the performance of the moderated test and compared the results with the original trace test. We calculated the empirical level and power of the test under many scenarios. Although the focus is on hypothesis testing, we also provide a moderated maximum likelihood estimator for the parameter matrix and assess its performance by investigating the bias and mean squared error of the estimator, comparing the results with those of the maximum likelihood estimators. Since the parameters are matrices, we consider distance measures in both power and level comparisons, as well as when investigating bias and mean squared error. We also illustrate our approach using time course microarray data taken from a study on lung cancer. We were able to filter out 1053 genes as non-noise genes from a pool of 22,277 genes, which is approximately 5% of the total number of genes. This is in line with results from most biological experiments, where around 5% of genes are found to be differentially expressed. / Master of Science (MSc)
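The two moderations mentioned in this abstract address the same obstacle, a singular (or nearly singular) sample covariance matrix $\mathbf{S}$ when $n \le p$. Schematically, and with the shrinkage written in one typical form (the exact estimator used in the thesis may differ),
\[
  \mathbf{S}^{-1} \;\rightarrow\; \mathbf{S}^{+} \quad \text{(Moore--Penrose generalized inverse)}
  \qquad \text{or} \qquad
  \mathbf{S}^{-1} \;\rightarrow\; \mathbf{S}_{\lambda}^{-1}, \quad
  \mathbf{S}_{\lambda} = (1-\lambda)\,\mathbf{S} + \lambda\,\frac{\operatorname{tr}(\mathbf{S})}{p}\,\mathbf{I}_p, \;\; 0 < \lambda \le 1,
\]
where $\mathbf{S}_{\lambda}$ is positive definite, and hence invertible, even when $\mathbf{S}$ is not.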
10

Emotion Regulation Difficulties and Couple Relationship Quality across Eight Sessions: A Dyadic Analysis with Latent Growth Curve Modeling

Xu, Min 25 July 2023 (has links) (PDF)
While previous studies have demonstrated significant associations between partners' emotion regulation and overall well-being, few studies provide knowledge on emotion regulation in clinical couples, especially across couple therapy sessions. With 168 heterosexual couples who attended at least the initial eight sessions of couple therapy, the current study was designed to examine the intra- and inter-personal effects of emotion regulation difficulties (i.e., lack of awareness, nonacceptance, limited strategies, and impulsivity) on the development of relationship quality over the course of couple therapy. The results of the current study provide a few important findings. First, gender differences exist in emotion regulation and relationship quality. Second, across couple therapy sessions, relationship quality improves and partners change at a similar rate. Third, emotion regulation difficulties have intra- and inter-personal effects on the starting scores and rates of change of relationship quality over the course of couple therapy. Regarding the clinical implications of these findings, I specifically discuss that clinicians working with couples may find it beneficial to utilize couple therapy modalities with a specific emphasis on the emotion regulation process, such as Greenberg's version of emotionally focused couple therapy. The need for future studies with more diverse samples is also discussed.
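For orientation only, intra-personal (actor) and inter-personal (partner) effects on the starting level and rate of change can be written in a simplified latent growth curve form; the symbols below are illustrative and do not reproduce the study's dyadic specification:
\[
  y_{it} = \eta_{0i} + \eta_{1i}\lambda_t + \varepsilon_{it}, \qquad
  \eta_{0i} = \alpha_0 + \gamma_{0a} x_i^{\mathrm{own}} + \gamma_{0p} x_i^{\mathrm{partner}} + \zeta_{0i}, \qquad
  \eta_{1i} = \alpha_1 + \gamma_{1a} x_i^{\mathrm{own}} + \gamma_{1p} x_i^{\mathrm{partner}} + \zeta_{1i},
\]
where $\eta_{0i}$ and $\eta_{1i}$ are the latent intercept (starting score) and slope (rate of change) of relationship quality for partner $i$, and $x^{\mathrm{own}}$ and $x^{\mathrm{partner}}$ are one's own and one's partner's emotion regulation difficulty scores.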
