1 |
Longitudinal analysis on AQI in 3 main economic zones of China
Wu, Kailin 09 October 2014
In modern China, air pollution has become a major environmental problem. Over the last 2 years, air pollution, as measured by PM 2.5 (particulate matter), has been getting worse. My report carries out a longitudinal data analysis of the air quality index (AQI) in 3 main economic zones in China. Longitudinal data, or repeated measures data, can be viewed as multilevel data with repeated measurements nested within individuals. I arrive at some conclusions about why the 3 areas have different AQI, which is mainly attributed to factors such as population, GDP, temperature, and humidity, and to other factors such as whether the area is inland or by the sea. The residual variance is partitioned into a between-zone component (the variance of the zone-level residuals) and a within-zone component (the variance of the city-level residuals). The zone residuals represent unobserved zone characteristics that affect AQI. In this report, the model building mainly follows the bottom-up sequence described by West et al. (2007) and the treatment of non-linear situations in Singer and Willett (2003). This report also compares the quartic curve model with the piecewise growth model for these data. The final model I reached is a piecewise model with time-level and zone-level predictors and with temperature-by-time interactions.
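The between-zone and within-zone partition described in this abstract can be written out for the simplest case. The following is a minimal sketch, assuming an intercept-only two-level model with independent zone and city residuals and no time trend or predictors (the report's final model is richer). The symbols $y_{ij}$, $u_j$, $e_{ij}$ and the variance partition coefficient $\rho$ are notation introduced here for illustration and are not taken from the report.

```latex
\begin{align*}
y_{ij} &= \beta_0 + u_j + e_{ij}, \qquad u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2), \\
\operatorname{Var}(y_{ij}) &= \sigma_u^2 + \sigma_e^2, \\
\rho &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}.
\end{align*}
```

Here $y_{ij}$ stands for the AQI of city $i$ in zone $j$, $\sigma_u^2$ is the between-zone component, $\sigma_e^2$ is the within-zone component, and $\rho$ is the share of residual variance attributable to unobserved zone characteristics.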
|
2 |
High resolution linkage and association study of quantitative trait loci
Jung, Jeesun 01 November 2005
As a large number of single nucleotide polymorphisms (SNPs) and microsatellite markers are available, high resolution mapping employing multiple markers or multiple-allele markers is an important step in identifying quantitative trait loci (QTL) of complex human diseases. For many complex diseases, quantitative phenotype values contain more information than dichotomous traits do.
Much research has been done on high resolution mapping using linkage and linkage disequilibrium information. The most commonly employed approaches for mapping QTL are pedigree-based linkage analysis and population-based association analysis. As one of the methods dealing with multiple-allele markers, mixed models are developed to carry out family-based association studies using information on the alleles transmitted and not transmitted from one parent to offspring. For multiple markers, variance component models are proposed to perform association study and linkage analysis simultaneously. Linkage analysis provides suggestive linkage based on a broad chromosome region and is robust to population admixture. On the other hand, allelic association due to linkage disequilibrium (LD) usually operates over very short genetic distances but is affected by population stratification. Combining both approaches plays a synergistic role in overcoming their limitations and in increasing the efficiency and effectiveness of gene mapping.
|
3 |
Models estadístics en avaluació educativa: les proves d'accés a la universitat [Statistical models in educational assessment: the university entrance examinations]
Cuxart i Jardí, Anna 26 November 1998
The thesis lies at the intersection of two scientific fields, statistics and pedagogy. Its objective is to investigate statistical models and analysis strategies that may be useful for monitoring complex assessment systems. It is motivated by the need to analyze, from the perspective of statistical science, the Pruebas de Aptitud para el Acceso a la Universidad (PAAU), the aptitude tests that regulate university admission in Spain. The validity and reliability of the COU (Curso de Orientación Universitaria) and PAAU examinations have received special attention throughout the research. The main sources of variation in these marks are also analyzed in detail: the differences between secondary schools and the marking process of the PAAU tests. In the Introduction, after summarizing the characteristics of the PAAU assessment system and discussing the role of statistics in the treatment of educational data, the specific objectives of the thesis are established in light of existing needs and of the research carried out to date. Chapter 1 illustrates the differences between the COU examinations and the PAAU tests and studies the association between the two scores. Modeling the variation of the individual PAAU mark by means of random coefficient regression models makes it possible to reveal (and measure) the differences between secondary schools in the standards used in COU. This first chapter contains a detailed introduction to random coefficient models, also called multilevel models, which are later applied in Chapters 2 and 4 in the form of multivariate variance component models. The second chapter, in an approach that complements the previous one, focuses on the study of the COU and PAAU means of each school and on the covariance structure between them. A notable result is the application to selecting the most efficient combination. Chapter 3 is devoted entirely to the quality of the marking system of the PAAU examinations. The modeling presented has made it possible to evaluate the impact of the markers in terms of the variance due to differences in the degree of severity and the variance generated by inconsistency. Obtaining the data required designed experiments. These experiments, which have revealed a number of weak points in the system, should be carried out systematically every year as part of a strategy for improving the quality of the process. Chapter 4 studies the covariance of the set of PAAU marks at both the student level and the school level, offering new elements of reflection on the validity of these tests. Chapter 5 summarizes the application of several proposals of the thesis to the first sitting of the PAAU-LOGSE tests. Chapter 6 includes the conclusions of the thesis as well as a series of proposals for monitoring and improving the overall quality of the system.
|
4 |
Maximum Likelihood Estimators of the Variance Components Based on the Q-Reduced Model
Lee, K. R., Kapadia, C. H. 01 January 1988
In a variance component model, (Formula presented.), Pukelsheim (1981) proved that the non-negative and unbiased estimation of the variance components σ(Formula presented.), j=1, …, c, entails a transformation of the original model to Q(Formula presented.) (called the Q-reduced model). The maximum likelihood (ML) approach based on the likelihood of Q(Formula presented.) (denoted Q-ML) is considered and applied to an incomplete block design (IBD) model. The Q-ML estimators of variance components and are shown to be more efficient in the mean squared error sense than the non-negative MINQUEs (minimum norm quadratic unbiased estimators) in the IBD. The effect of using Q-ML estimators of the variance components to estimate the variance ratio in the combined estimator of the treatment contrast is also considered.
|
5 |
Multivariate and Structural Equation Models for Family Data
Morris, Nathan J. 13 October 2009
No description available.
|
6 |
Hidden Variance in Multiple Mini-Interview Scores
Zaidi, Nikki 09 June 2015
No description available.
|
7 |
Time-Varying Coefficient Models for Recurrent Events
Liu, Yi 14 November 2018
I have developed time-varying coefficient models for recurrent event data to evaluate the temporal profiles of recurrence rate and covariate effects. There are three major parts in this dissertation. The first two parts propose a mixed Poisson process model with gamma frailties for single-type recurrent events. The third part proposes a Bayesian joint model based on multivariate log-normal frailties for multi-type recurrent events. In the first part, I propose an approach based on penalized B-splines to obtain smooth estimates of both the time-varying coefficients and the log baseline intensity. An EM algorithm is developed for parameter estimation. One issue with this approach is that the estimating procedure is conditional on smoothing parameters, which have to be selected by cross-validation or by optimizing a certain performance criterion. The procedure can be computationally demanding with a large number of time-varying coefficients. In the second part, to achieve objective estimation of the smoothing parameters, I propose a mixed-model representation approach for penalized splines. Spline coefficients are treated as random effects, and smoothing parameters are estimated as variance components. An EM algorithm embedded with a penalized quasi-likelihood approximation is developed to estimate the model parameters. The third part proposes a Bayesian joint model with time-varying coefficients for multi-type recurrent events. Bayesian penalized splines are used to estimate the time-varying coefficients and the log baseline intensity. One challenge in Bayesian penalized splines is that the smoothness of a spline fit is considerably sensitive to the subjective choice of hyperparameters. I establish a procedure to objectively determine the hyperparameters through a robust prior specification. A Markov chain Monte Carlo procedure based on Metropolis-adjusted Langevin algorithms is developed to sample from the high-dimensional distribution of spline coefficients. The procedure includes a joint sampling scheme to achieve better convergence and mixing properties. Simulation studies in the second and third parts have confirmed satisfactory model performance in estimating time-varying coefficients under different curvature and event rate conditions. The models in the second and third parts were applied to data from a commercial truck driver naturalistic driving study. The application results reveal that drivers with 7 hours or less of sleep prior to a shift have a significantly higher intensity after 8 hours of on-duty driving and that their intensity remains higher after taking a break. In addition, the results show drivers' self-selection on sleep time, total driving hours in a shift, and breaks. These applications provide crucial insight into the impact of sleep time on driving performance for commercial truck drivers and highlight the on-road safety implications of insufficient sleep and breaks while driving. This dissertation provides flexible and robust tools to evaluate the temporal profile of intensity for recurrent events. / PHD /
The overall objective of this dissertation is to develop models to evaluate the time-varying profiles of event occurrences and the time-varying effects of risk factors on event occurrences. There are three major parts in this dissertation. The first two parts are designed for a single event type. They are based on approaches in which the whole model is conditional on a certain kind of tuning parameter. The value of this tuning parameter has to be pre-specified by users and influences the model results. Instead of pre-specifying the value, I develop an approach that obtains an objective estimate of the optimal value of the tuning parameter and the model results simultaneously. The third part proposes a model for multi-type events. One challenge is that the model results are considerably sensitive to the subjective choice of hyperparameters. I establish a procedure to objectively determine the hyperparameters. Simulation studies have confirmed satisfactory model performance in estimating the temporal profiles of both event occurrences and effects of risk factors. The models were applied to data from a commercial truck driver naturalistic driving study. The results reveal that drivers with 7 hours or less of sleep prior to a shift have a significantly higher intensity after 8 hours of on-duty driving and that their driving risk remains higher after taking a break. In addition, the results show drivers' self-selection on sleep time, total driving hours in a shift, and breaks. These applications provide crucial insight into the impact of sleep time on driving performance for commercial truck drivers and highlight the on-road safety implications of insufficient sleep and breaks while driving. This dissertation provides flexible and robust tools to evaluate the temporal profile of both event occurrences and effects of risk factors.
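The idea of estimating smoothing parameters as variance components, mentioned in this abstract, can be illustrated in a simpler setting. The following is a sketch for a Gaussian response only, not the Poisson process frailty model of the dissertation. It assumes the spline basis has been parameterized so that the roughness penalty is a plain ridge penalty on the spline coefficients; the notation ($X$, $Z$, $\beta$, $u$, $\lambda$) is introduced here for illustration.

```latex
\begin{align*}
y &= X\beta + Zu + \varepsilon, \qquad u \sim N(0, \sigma_u^2 I), \quad \varepsilon \sim N(0, \sigma_\varepsilon^2 I), \\
(\hat\beta, \hat u) &= \arg\min_{\beta,\, u} \; \lVert y - X\beta - Zu \rVert^2 + \lambda\, u^\top u, \qquad \lambda = \frac{\sigma_\varepsilon^2}{\sigma_u^2}.
\end{align*}
```

Under these assumptions, the penalized least squares fit coincides with the best linear unbiased predictor of the mixed model, so estimating $\sigma_u^2$ and $\sigma_\varepsilon^2$ (for example by restricted maximum likelihood) fixes the smoothing parameter $\lambda$ without cross-validation.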
|
8 |
Reproductive traits and sex ratio bias in the dwarf willow Salix herbacea
Zhao, Minchun January 2024
Sex ratio is very important for the evolution of dioecious plants because it can influence their reproductive success. Sex ratio bias is common among reproductive individuals. Studying reproductive traits can help to understand possible mechanisms that influence the generation and maintenance of sex ratio bias. However, few studies have reported the relationship between reproductive traits and sex ratio bias. We investigated 29 full-sib families of the dwarf willow Salix herbacea L. S. herbacea exhibits an overall female sex ratio bias but also strong variation in sex ratio among families. We used variance component analysis to investigate at which morphological level the variation in reproductive traits (cumulative catkin number over four growth periods, annual catkin number in the fourth growth period, flower number, ovule number) arose. We used mixed models to test the influence of family, sex, and the sex-by-family interaction on reproductive traits. We also tested the correlation between sex ratio and reproductive traits. Our results suggest that genetic factors can influence the degree of sexual dimorphism of S. herbacea in the different families. Flowers from families with higher sex ratios had more ovules; sex ratio and ovule number co-varied across families.
|
9 |
Elucidating and Mapping Heat Tolerance in Wild Tetraploid Wheat (Triticum turgidum L.)
Ali, Mohamed Badry Mohamed December 2010
Identifying reliable screening tools and characterizing tolerant germplasm sources are essential for developing wheat (Triticum aestivum L.) varieties suited for the hot areas of the world. Our objective was to evaluate heat tolerance of promising wild tetraploid wheat (Triticum turgidum L.) accessions that could be used as sources of heat tolerance in common and durum wheat (Triticum durum) breeding programs.
We screened 109 wild tetraploid wheat accessions collected by the International Center for Agricultural Research in the Dry Areas (ICARDA) from the hottest wheat-growing areas in Africa and Asia, as well as two common wheat checks, for their response to heat stress by measuring damage to the thylakoid membranes, flag leaf temperature depression (FLTD), and spike temperature depression (STD) during exposure to heat stress for 16 days beginning at anthesis. Measurements were taken on the day of anthesis and then 4, 8, 12, and 16 days post anthesis (DPA) under controlled optimum and heat-stress conditions. Individual kernel weight (IKW) and heat susceptibility index (HSI) measurements were also obtained. Prolonged exposure to heat stress was associated with increased damage to thylakoid membranes, as indicated by the high ratio of constant fluorescence (O) to peak variable fluorescence (P).
A positive and significant correlation was found between O/P ratio and both FLTD and STD under heat-stress conditions. A negative and significant correlation was found between FLTD and HSI and between STD and HSI based on the second and third measurements (4 and 8 DPA). Correlations obtained after the third measurement were not significant because heat stress accelerated maturity and senescence.
For a pedigree-based mapping strategy, a family approach was then developed by crossing and backcrossing heat-tolerant and heat-susceptible germplasm. A set of 800 lines resulting from the pedigree-based family approach was phenotyped using FLTD, chlorophyll content, and yield and its components under heat stress. Genotyping of these lines was accomplished using simple sequence repeat (SSR) markers. Some QTLs associated with heat stress tolerance were identified. This study identified potential heat-tolerant wild tetraploid wheat germplasm and QTL conditioning heat tolerance that can be incorporated into wheat breeding programs to improve cultivated common and durum wheat.
|
10 |
Some Advances in Classifying and Modeling Complex Data
Zhang, Angang 16 December 2015
In the statistical methodology of data analysis, two of the most commonly used techniques are classification and regression modeling. As scientific technology progresses rapidly, complex data often occur and require novel classification and regression modeling methodologies suited to the data structure. In this dissertation, I mainly focus on developing a few approaches for analyzing data with complex structures.
Classification problems commonly occur in many areas such as biomedicine, marketing, sociology, and image recognition. Among various classification methods, linear classifiers have been widely used because of their computational advantages and ease of implementation and interpretation compared with non-linear classifiers. Specifically, linear discriminant analysis (LDA) is one of the most important methods in the family of linear classifiers.
High dimensional data, in which the number of variables p is larger than the number of observations n, occur more and more frequently and call for advanced classification techniques.
In Chapter 2, I propose a novel sparse LDA method which generalizes LDA through a regularized approach for the two-class classification problem.
The proposed method can achieve accurate classification at an attractive computational cost, which makes it suitable for high dimensional data with p>n.
In Chapter 3, I deal with classification when the data complexity lies in non-random missing responses in the training data set. An appropriate classification method needs to be developed accordingly. Specifically, I consider the "reject inference problem" for the application of fraud detection for online business. In online business, to prevent fraudulent transactions, suspicious transactions are rejected with unknown fraud status, yielding training data with selectively missing responses. A two-stage modeling approach using logistic regression is proposed to enhance the efficiency and accuracy of fraud detection.
Besides the classification problem, data from designed experiments in scientific areas often have complex structures. Many experiments are conducted with multiple variance sources. To increase the accuracy of the statistical modeling, the model needs to be able to accommodate more than one error term. In Chapter 4, I propose a variance component mixed model for data from a nanomaterial experiment that incorporates the between-group, within-group, and within-subject variance components in a single model. To adjust for possible systematic error introduced during the experiment, adjustment terms can be added. Specifically, a group adaptive forward and backward selection (GFoBa) procedure is designed to select the significant adjustment terms. / Ph. D.
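The three variance sources named for Chapter 4 (between group, within group, within subject) can be illustrated with classical ANOVA estimators for a balanced nested design. This is a minimal sketch under assumptions that the abstract does not state: balanced data, no fixed-effect adjustment terms, and method-of-moments estimation standing in for whatever estimation method the dissertation uses. The function name `nested_variance_components` and the toy numbers are hypothetical.

```python
from statistics import mean


def nested_variance_components(data):
    """Method-of-moments estimates for a balanced two-stage nested design.

    data[g][s] holds the replicate measurements for subject s in group g.
    Every group must have the same number of subjects and every subject
    the same number of replicates (at least two of each).
    """
    a, b, n = len(data), len(data[0]), len(data[0][0])
    if a < 2 or b < 2 or n < 2:
        raise ValueError("need at least 2 groups, 2 subjects per group, 2 replicates")
    if any(len(grp) != b or any(len(reps) != n for reps in grp) for grp in data):
        raise ValueError("balanced data required")

    subj_means = [[mean(reps) for reps in grp] for grp in data]
    grp_means = [mean(sm) for sm in subj_means]
    grand = mean(grp_means)

    # Mean squares for groups, subjects within groups, and replicates.
    ms_group = b * n * sum((gm - grand) ** 2 for gm in grp_means) / (a - 1)
    ms_subject = n * sum(
        (sm - gm) ** 2 for sms, gm in zip(subj_means, grp_means) for sm in sms
    ) / (a * (b - 1))
    ms_error = sum(
        (x - sm) ** 2
        for grp, sms in zip(data, subj_means)
        for reps, sm in zip(grp, sms)
        for x in reps
    ) / (a * b * (n - 1))

    # Equate mean squares to their expectations; truncate negatives at zero.
    return {
        "between_group": max((ms_group - ms_subject) / (b * n), 0.0),
        "within_group": max((ms_subject - ms_error) / n, 0.0),
        "within_subject": ms_error,
    }


# Hypothetical toy data: 2 groups x 2 subjects x 2 replicates.
toy = [[[1, 3], [5, 7]], [[11, 13], [15, 17]]]
components = nested_variance_components(toy)
print(components)
```

Truncating negative estimates at zero is one common convention; a likelihood-based mixed model fit would handle unbalanced data and additional fixed effects, which this sketch does not.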
|