121 |
A review of "longitudinal study" in developmental psychology. Finley, Emily H., 01 January 1972 (has links)
The purpose of this library research thesis is to review the "longitudinal study" in terms of its problems and present use. A preliminary search of the literature on the longitudinal method revealed problems centering on two areas: (1) the definition of "longitudinal study" and (2) practical problems of the method itself. The purpose of this thesis, then, is to explore the following questions through a search of books and journals:
1. How can “longitudinal study” be defined?
2. What problems are inherent in the study of the same individuals over time and how can these problems be solved?
A third question which emerges from these two is:
3. How is “longitudinal study” being used today?
This thesis differentiates traditional longitudinal study from other methods of study: the cross-sectional study, the time-lag study, the experimental study, the retrospective study, and the study from records. Each of these methods of study is reviewed according to its unique problems and best uses and compared with the longitudinal study. Finally, the traditional longitudinal study is defined as a study which: (1) examines individual change under natural conditions not controlled by the experimenter, (2) proceeds over time from the present to the future by measuring the same individuals repeatedly, and (3) retains the individuality of the data in analyses.
Some problem areas of longitudinal study are delineated which are either unique to this method or especially difficult. The following problems related to planning the study are reviewed: definition of study objectives, selection of the method of study, statistical methods, cost, post hoc analysis and replication of the study, the time factor in longitudinal study, and the problem of allowing variables to operate freely. Cultural shift and attrition are especially emphasized. The dilemma posed by sample selection, with its related problems of randomization and generalizability of the study, is examined together with the problems of repeated measurements and selection of control groups. These problems are illustrated with studies from the literature.
Not only are these problems delineated, but considerable evidence is shown that we have already started to accumulate data that will permit their solution. This paper presents a number of studies which have considered these problems separately or as a side issue of a study on some other topic. Some recommendations for further research in problem areas are suggested.
At the same time that this thesis notes differentiation of the longitudinal study from other studies, it also notes integration of results of longitudinal studies with results of other studies. The tenet adopted here is: scientific knowledge is cumulative and not dependent on one crucial experiment.
Trends in recent longitudinal studies are found to be toward more strict observance of scientific protocols and toward limitation of time and objectives of the study. When objectives of the study are well defined and time is limited to only enough for specified change to take place, many of the problems of longitudinal study are reduced to manageable proportions.
Although modern studies are of improved quality, longitudinal method is not being sufficiently used today to supply the demand for this type of data. Longitudinal study is necessary to answer some of the questions in developmental psychology. We have no alternative but to continue to develop this important research tool.
|
122 |
Hypothesis Testing for High-Dimensional Regression Under Extreme Phenotype Sampling of Continuous Traits. January 2018 (has links)
Extreme phenotype sampling (EPS) is a widely used design for identifying candidate genetic factors that contribute to the variation of quantitative traits. By enriching the signal in the extreme phenotypic samples within the top and bottom percentiles, EPS can boost study power compared with random sampling at the same sample size. Existing statistical methods for EPS data test variants/regions individually. However, many disorders are caused by multiple genetic factors, so it is critical to model the effects of genetic factors simultaneously, which may increase the power of current genetic studies and identify novel disease-associated genetic factors under EPS. The challenge of simultaneous analysis of genetic data is that the number of genetic factors (p ~ 10,000) is typically greater than the sample size (n ~ 1,000) in a single study. The standard linear model is inappropriate for this p > n problem because the design matrix is rank deficient. An alternative solution is to apply a penalized regression method, the least absolute shrinkage and selection operator (LASSO).
LASSO can deal with this high-dimensional (p > n) problem by forcing certain regression coefficients to be zero. Although the application of LASSO in genetic studies under random sampling has been widely studied, its statistical inference and testing under EPS remain unknown. We propose a novel sparse model (EPS-LASSO) with a hypothesis test for high-dimensional regression under EPS, based on a decorrelated score function, to investigate genetic associations, including gene expression and rare variant analyses. A comprehensive simulation shows that EPS-LASSO outperforms existing methods, with superior power when the effects are large and with stable type I error and FDR control. Together with a real data analysis of a genetic study of obesity, our results indicate that EPS-LASSO is an effective method for EPS data analysis that can account for correlated predictors. / Chao Xu
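The EPS-plus-LASSO idea above can be sketched on toy data. This is a hedged illustration only: the sample sizes, the 10% cutoffs, and scikit-learn's plain `Lasso` standing in for the thesis's EPS-LASSO procedure are all assumptions, not the author's method.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Toy pool: more predictors (p) than eventual samples (n), as in the abstract.
n_pool, p = 500, 800
X = rng.standard_normal((n_pool, p))
beta = np.zeros(p)
beta[:5] = 1.0                              # five truly associated predictors
y = X @ beta + rng.standard_normal(n_pool)

# Extreme phenotype sampling: keep only the top and bottom 10% of the trait.
lo, hi = np.quantile(y, [0.10, 0.90])
keep = (y <= lo) | (y >= hi)
X_eps, y_eps = X[keep], y[keep]             # ~100 samples, 800 predictors

# LASSO handles p > n by shrinking most coefficients to exactly zero.
model = Lasso(alpha=0.2).fit(X_eps, y_eps)
selected = np.flatnonzero(model.coef_)
print("predictors kept:", selected.size)
```

The decorrelated-score inference step of EPS-LASSO has no off-the-shelf equivalent here; the sketch only shows how extreme-tail selection and L1 shrinkage combine in the p > n setting.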
|
123 |
Goodness-of-Fit and Change-Point Tests for Functional Data. Gabrys, Robertas, 01 May 2010 (has links)
A test for independence and identical distribution of functional observations is proposed in this thesis. To reduce dimension, the curves are projected on the most important functional principal components, and a test statistic based on lagged cross-covariances of the resulting vectors is constructed. We show that this dimension reduction step introduces asymptotically negligible terms, i.e., the projections behave asymptotically as iid vector-valued observations. A complete asymptotic theory based on correlations of random matrices, functional principal component expansions, and Hilbert space techniques is developed. The test statistic has a chi-square asymptotic null distribution.
Two inferential tests for error correlation in the functional linear model are put forward. To construct them, finite-dimensional residuals are computed in two different ways, and then their autocorrelations are suitably defined. From these autocorrelation matrices, two quadratic forms are constructed whose limiting distributions are chi-squared with known numbers of degrees of freedom (different for the two forms).
A test for detecting a change point in the mean of functional observations is developed. The null distribution of the test statistic is asymptotically pivotal with a well-known asymptotic distribution. A comprehensive asymptotic theory for the estimation of a change point in the mean function of functional observations is developed.
The procedures developed in this thesis can be readily computed using the R package fda. All theoretical insights obtained in this thesis are confirmed by simulations and illustrated with real-life data examples.
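The projection-then-portmanteau construction described in the first paragraph can be sketched as follows. This is a simplified illustration on simulated iid curves; the normalization and the number of components are assumptions, not the thesis's exact statistic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# N iid "curves", each observed on a grid of 50 points (the null hypothesis).
N, grid, d, H = 200, 50, 3, 3
curves = rng.standard_normal((N, grid))

# Dimension reduction: scores on the d leading principal components.
centered = curves - curves.mean(axis=0)
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
scores = centered @ Vt[:d].T                      # N x d score vectors

# Portmanteau-type statistic from the lag-1..H autocovariances of the scores.
var = scores.var(axis=0)
Q = 0.0
for h in range(1, H + 1):
    C = scores[:-h].T @ scores[h:] / N            # lag-h cross-covariance, d x d
    Q += N * np.sum(C**2 / np.outer(var, var))

# Under iid curves, Q is approximately chi-square with H*d*d degrees of freedom.
pval = stats.chi2.sf(Q, H * d * d)
print("p-value:", round(float(pval), 3))
```

Large Q (small p-value) would indicate serial dependence among the curves; for iid simulated data the statistic should stay near its chi-square reference.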
|
124 |
Sheaf Theory as a Foundation for Heterogeneous Data Fusion. Mansourbeigi, Seyed M-H, 01 December 2018 (has links)
A major impediment to scientific progress in many fields is the inability to make sense of the huge amounts of data that have been collected via experiment or computer simulation. This dissertation provides tools to visualize, represent, and analyze the collection of sensors and data all at once in a single combinatorial geometric object. Encoding and translating heterogeneous data into a common language are modeled by supporting objects. In this methodology, the behavior of the system is studied via related geometric objects, based on the detection of noise in the system, possible failures in data exchange, and recognition of redundant or complementary sensors. Applications of the constructed methodology are described by two case studies: one from wildfire threat monitoring and the other from air traffic monitoring. Both cases are distributed (spatial and temporal) information systems. The systems deal with temporal and spatial fusion of heterogeneous data obtained from multiple sources, where the schema, availability, and quality vary. The behavior of both systems is explained thoroughly in terms of the detection of failures in the systems and the recognition of redundant and complementary sensors. A comparison between the methodology in this dissertation and alternative methods is described to further verify the validity of the sheaf theory method. The method is seen to have lower computational complexity in both space and time.
|
125 |
Topological Data Analysis of Properties of Four-Regular Rigid Vertex Graphs. Conine, Grant Mcneil, 24 June 2014 (has links)
Homologous DNA recombination and rearrangement have been modeled with a class of four-regular rigid vertex graphs called assembly graphs, which can also be represented by double occurrence words. Various invariants have been suggested for these graphs, some based on the structure of the graphs and some biologically motivated.
In this thesis we use a novel method of data analysis based on a technique known as partial-clustering analysis and an algorithm known as Mapper to examine the relationships between these invariants. We introduce some of the basic machinery of topological data analysis, including the construction of simplicial complexes on a data set, clustering analysis, and the workings of the Mapper algorithm. We define assembly graphs and three specific invariants of these graphs: assembly number, nesting index, and genus range. We apply Mapper to the set of all assembly graphs with up to 6 vertices and compare the relationships between these three properties. We make several observations based on the results of the analysis. We conclude with some suggestions for further research based upon our findings.
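A minimal version of the Mapper pipeline mentioned above (cover the range of a filter function with overlapping intervals, cluster each preimage, and connect clusters that share points) might look like this. The circle data, the interval and overlap counts, and the single-linkage cutoff are illustrative assumptions, not the thesis's setup.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(2)

# Noisy circle: Mapper on it should recover a loop of overlapping clusters.
theta = rng.uniform(0, 2 * np.pi, 300)
pts = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.standard_normal((300, 2))

# Filter function: first coordinate. Cover its range with overlapping intervals.
f = pts[:, 0]
n_intervals, overlap = 5, 0.3
lo, hi = f.min(), f.max()
width = (hi - lo) / n_intervals
nodes = []                                   # each node: a set of point indices
for i in range(n_intervals):
    a = lo + i * width - overlap * width
    b = lo + (i + 1) * width + overlap * width
    idx = np.flatnonzero((f >= a) & (f <= b))
    if idx.size < 2:
        continue
    # Cluster the preimage; each cluster becomes one Mapper node.
    Z = linkage(pts[idx], method="single")
    labels = fcluster(Z, t=0.3, criterion="distance")
    for lab in np.unique(labels):
        nodes.append(set(idx[labels == lab]))

# Edge between any two nodes whose point sets overlap.
edges = [(i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))
         if nodes[i] & nodes[j]]
print("nodes:", len(nodes), "edges:", len(edges))
```

On circular data the middle preimages split into two arcs each, so the resulting node-edge graph forms a cycle, mirroring the topology of the input.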
|
126 |
Techniques to handle missing values in a factor analysis. Turville, Christopher, University of Western Sydney, Faculty of Informatics, Science and Technology, January 2000 (has links)
A factor analysis typically involves a large collection of data, and it is common for some of the data to be unrecorded. This study investigates the ability of several techniques to handle missing values in a factor analysis: complete cases only, all available cases, imputing means, an iterative component method, singular value decomposition, and the EM algorithm. A data set representative of that used for a factor analysis is simulated. Some of these data are then randomly removed to represent missing values, and the performance of the techniques is investigated over a wide range of conditions. Several criteria are used to assess the abilities of the techniques to handle missing values in a factor analysis. Overall, no one technique performs best for all of the conditions studied. The EM algorithm is generally the most effective technique, except when ill-conditioned matrices are present or when computing time is of concern. Some theoretical concerns are introduced regarding the effects that changes in the correlation matrix have on the loadings of a factor analysis. A complicated expression is derived showing that the change in factor loadings resulting from a change in the elements of a correlation matrix involves components of the eigenvectors and eigenvalues. / Doctor of Philosophy (PhD)
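The contrast the study draws between naive and structure-aware imputation can be sketched with scikit-learn. As assumptions: `IterativeImputer` stands in for an EM-style method, and the simulated loadings and missingness rate are invented for illustration.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, SimpleImputer

rng = np.random.default_rng(3)

# Two latent factors generating six observed variables (a factor-analysis setup).
n = 400
factors = rng.standard_normal((n, 2))
loadings = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.2],
                     [0.1, 0.9], [0.2, 0.8], [0.0, 0.7]])
X_full = factors @ loadings.T + 0.3 * rng.standard_normal((n, 6))

# Delete 15% of the entries completely at random.
X = X_full.copy()
mask = rng.random(X.shape) < 0.15
X[mask] = np.nan

# Mean imputation ignores the correlation structure between variables;
# iterative imputation regresses each variable on the others (EM-flavoured).
X_mean = SimpleImputer(strategy="mean").fit_transform(X)
X_iter = IterativeImputer(random_state=0).fit_transform(X)

err_mean = float(np.abs(X_mean[mask] - X_full[mask]).mean())
err_iter = float(np.abs(X_iter[mask] - X_full[mask]).mean())
print("mean imputation error:", round(err_mean, 2))
print("iterative imputation error:", round(err_iter, 2))
```

Because the six variables share common factors, the structure-aware imputer recovers the deleted entries more accurately than column means, which is the pattern the thesis reports for the EM algorithm.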
|
127 |
Multi-angular hyperspectral data and its influences on soil and plant property measurements: spectral mapping and functional data analysis approach. Sugianto, Biological, Earth & Environmental Science, UNSW, January 2006 (has links)
This research investigates the spectral reflectance characteristics of soil and vegetation using multi-angular and single view hyperspectral data. The question of the thesis is: "How much information can be obtained from multi-angular hyperspectral remote sensing in comparison with single view angle hyperspectral remote sensing of soil and vegetation?" This question is addressed by analysing multi-angular and single view angle hyperspectral remote sensing using data from field, airborne, and spaceborne hyperspectral sensors. Spectral mapping, spectral indices, and Functional Data Analysis (FDA) are used to analyse the data. Spectral mapping has been successfully used to distinguish features of soil and cotton with hyperspectral data. Traditionally, spectral mapping is based on collecting endmembers of pure pixels and using these as training areas for supervised classification. There are, however, limitations in the use of these algorithms when applied to multi-angular images, as the reflectance of a single ground unit will differ at each angle. Classifications using six-class endmembers identified from single angle imagery were assessed using multi-angular Compact High Resolution Imaging Spectrometer (CHRIS) imagery, as well as a set of vegetation indices. The results showed no significant difference between the angles. Low nutrient content in the soil produced lower vegetation index values, and more nutrients increased the index values. This research introduces FDA as an image processing tool for multi-angular hyperspectral imagery of soil and cotton, using basis functions for functional principal component analysis (fPCA) and functional linear modelling. FDA has advantages over conventional statistical analysis because it does not assume the errors in the data are independent and uncorrelated.
Investigations showed that B-splines with 20 basis functions gave the best fit for multi-angular soil spectra collected using the spectroradiometer and the satellite-mounted CHRIS. Cotton spectra collected from greenhouse plants using a spectroradiometer needed 30 basis functions to fit the model, while 20 basis functions were sufficient for cotton spectra extracted from CHRIS. Functional principal component analysis (fPCA) of multi-angular soil spectra shows that the first functional principal component explained a minimum of 92.5% of the variance of field soil spectra across different azimuth and zenith angles, and 93.2% from CHRIS for the same target. For cotton, more than 93.6% of the variance in the greenhouse trial and 70.6% in the CHRIS data were explained by the first functional principal component. Conventional analysis of multi-angular hyperspectral data showed that significant differences exist between soil spectra acquired at different azimuth and zenith angles: the forward scan direction provides higher spectral reflectance than the backward direction. However, most multi-angular hyperspectral data analysed as functional data show no significant difference from nadir, except for small parts of the wavelength range of the cotton spectra from CHRIS. There is also no significant difference for soil spectra analysed as functional data collected in the field, although there was some difference for soil spectra extracted from CHRIS. Overall, the results indicate that multi-angular hyperspectral data provide only a very small amount of additional information when used for conventional analyses.
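The "20 basis functions" fit reported above can be reproduced in spirit with SciPy on a synthetic spectrum (the knot placement and noise level are assumptions, not the thesis's data). For a cubic B-spline, the number of basis functions equals the number of interior knots plus the degree plus one, so 20 basis functions require 16 interior knots.

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

rng = np.random.default_rng(4)

# A smooth synthetic "spectrum" over wavelength, observed with noise.
wavelength = np.linspace(400.0, 2500.0, 500)
true_curve = np.exp(-((wavelength - 1400.0) / 500.0) ** 2) \
    + 0.2 * np.sin(wavelength / 150.0)
spectrum = true_curve + 0.02 * rng.standard_normal(wavelength.size)

# Cubic B-spline basis with 20 basis functions -> 16 interior knots.
degree, n_basis = 3, 20
interior = np.linspace(wavelength[0], wavelength[-1], n_basis - degree + 1)[1:-1]
knots = np.r_[[wavelength[0]] * (degree + 1),
              interior,
              [wavelength[-1]] * (degree + 1)]
spline = make_lsq_spline(wavelength, spectrum, knots, k=degree)

rmse = float(np.sqrt(np.mean((spectrum - spline(wavelength)) ** 2)))
print("coefficients:", spline.c.size, "rmse:", round(rmse, 3))
```

The 20 fitted coefficients are exactly the kind of finite-dimensional representation on which fPCA and functional linear models then operate.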
|
128 |
Statistical physics of neural networks and combinatorial optimization. Krauth, Werner, 14 June 1989 (has links) (PDF)
In the first part we study learning and recall in single-layer neural networks (the Hopfield model). We propose a learning algorithm capable of optimizing the 'stability', a parameter describing the quality of the representation of a pattern in the network. For random patterns, this algorithm attains the theoretical bound of Gardner. We then study the dynamical importance of the stability and of a parameter describing the symmetry of the coupling matrix. Next, we treat the case where the couplings can take only two values (inhibitory, excitatory). For this model we establish upper bounds on the capacity by a numerical calculation, and we propose an analytical solution. The second part of the thesis is devoted to a detailed study, from the point of view of statistical physics, of the traveling salesman problem. We study the special case of a random connection matrix. We present the theory of this problem (following the replica method) and compare it with the results of an extensive numerical study.
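The maximal-stability learning idea in the first part can be sketched with a MinOver-style rule. As assumptions: this simplified rule, the network size, and the iteration count are illustrative, not the algorithm exactly as proposed in the thesis.

```python
import numpy as np

rng = np.random.default_rng(5)

# An N-neuron perceptron storing P random binary patterns with target outputs.
N, P = 100, 20
patterns = rng.choice([-1.0, 1.0], size=(P, N))
targets = rng.choice([-1.0, 1.0], size=P)

# MinOver-style rule: at every step, reinforce the pattern whose aligned
# field is currently smallest, driving up the worst-case stability.
w = np.zeros(N)
for _ in range(5000):
    fields = targets * (patterns @ w)
    mu = int(np.argmin(fields))            # least stable pattern right now
    w += targets[mu] * patterns[mu] / N    # Hebbian push toward storing it

# Stability of each pattern: its aligned field divided by the weight norm.
stabilities = targets * (patterns @ w) / np.linalg.norm(w)
print("minimal stability:", round(float(stabilities.min()), 2))
```

Well below capacity (here P/N = 0.2), all patterns end up with positive stability, and the minimum approaches the optimal margin the thesis analyses via Gardner's bound.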
|
129 |
Mining for Lung Cancer Biomarkers in Plasma Metabolomics Data. Johnsson, Anna, January 2010 (has links)
Lung cancer is the cancer form with the highest mortality worldwide, and in addition the survival rate of lung cancer is very low: only 15% of patients are alive five years after diagnosis. More research is needed to understand the biology of lung cancer and thus make it possible to discover the disease at an early stage. Early diagnosis leads to an increased chance of survival. In this thesis, 179 lung cancer and 116 control samples of blood serum were analyzed for identification of metabolomic biomarkers. The control samples were derived from patients with benign lung diseases. Data were obtained from GC/TOF-MS analysis and analyzed with the multivariate analysis methods PCA and OPLS/OPLS-DA. This thesis investigates how best to pre-treat and analyze the data in order to discover biomarkers. One part of the aim was to give directions for how to select samples from a biobank for further biological validation of suspected biomarkers. Models for different stages of lung cancer versus control samples were computed and validated. The most influential metabolites in the models were selected, and confoundings with other clinical characteristics such as gender and hemoglobin levels were studied. 13 lung cancer biomarkers were identified and validated by raw data and by new OPLS models based solely upon the biomarkers. In summary, the identified biomarkers separate fairly well between control samples and late-stage lung cancer, but are poor at separating early lung cancer from control samples. The recommendation is to select controls and late-stage lung cancer samples from the biobank for further confirmation of the biomarkers.
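The unsupervised part of the workflow above (autoscaling followed by projection of a case/control metabolite matrix) can be sketched as follows. The feature counts and effect sizes are invented, and plain PCA stands in for the OPLS/OPLS-DA models used in the thesis.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(6)

# Toy metabolite matrix: 60 controls, 60 cases, 200 features; a subset of
# features is shifted in the cases and plays the role of biomarkers.
n_per, p = 60, 200
controls = rng.standard_normal((n_per, p))
cases = rng.standard_normal((n_per, p))
cases[:, :20] += 2.0                         # 20 "biomarker" features

X = np.vstack([controls, cases])
X = (X - X.mean(axis=0)) / X.std(axis=0)     # autoscaling before projection

scores = PCA(n_components=2).fit_transform(X)
# Largest gap between the class means along the two leading components:
gap = float(np.abs(scores[:n_per].mean(axis=0) - scores[n_per:].mean(axis=0)).max())
print("class-mean gap in score space:", round(gap, 1))
```

When the biomarker shift is strong, the two groups separate along a leading component; with weak shifts (the early-stage situation the thesis describes) the separation collapses into the noise.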
|
130 |
Exploring factors affecting math achievement using large scale assessment results in Saskatchewan. Lai, Hollis, 16 September 2008
Current research suggests that a high level of confidence and a low level of anxiety are predictive of higher math achievement. Previous research has found that, compared to students from other provinces, Saskatchewan students have a higher level of confidence and a lower level of anxiety for learning math, but still tend to achieve lower math scores. The data suggest that there may be unique factors affecting math learning for students in Saskatchewan. The purpose of the study is to determine the factors that may affect Saskatchewan students' math achievement. Exploratory factor analyses and regression methods were employed to investigate possible traits that aid students in achieving higher math scores. Results from a 2007 math assessment administered to grade 5 students in Saskatchewan were used for the current study. The goal of the study was to provide a better understanding of the factors and trends unique to mathematics achievement of students in Saskatchewan.

Using results from a province-wide math assessment and an accompanying questionnaire administered to grade five students across public schools in Saskatchewan (n=11,279), the present study found statistical significance for three factors that previous studies support as influencing differences in math achievement: (1) confidence in math, (2) parental involvement in math, and (3) extracurricular participation in math. The three aforementioned factors were found to be related to math achievement as predicted by the Assessment for Learning (AFL) program in Saskatchewan, although these findings are qualified by the small amount of variance accounted for in the regression model (r2 = .084). Furthermore, a multivariate analysis of variance indicated that gender and school location have effects on students' math achievement scores.
Although a large amount of measurement error in the questionnaire (and consequently the low variance accounted for by the regression model) limited the scope and implications of the model, future implications and improvements are discussed.
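The regression step described above, where predictors are statistically significant yet r-squared stays small, can be sketched numerically. The factor weights and sample size here are invented; this is not the AFL data.

```python
import numpy as np

rng = np.random.default_rng(7)

# Three questionnaire-style factor scores weakly predicting a math score,
# leaving most of the variance unexplained.
n = 2000
factors = rng.standard_normal((n, 3))
weights = np.array([0.2, 0.15, 0.1])          # deliberately weak effects
score = factors @ weights + rng.standard_normal(n)

# Ordinary least squares with an intercept, then the coefficient of
# determination r^2 = 1 - residual variance / total variance.
X = np.c_[np.ones(n), factors]
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
resid = score - X @ beta
r2 = float(1 - resid.var() / score.var())
print("r-squared:", round(r2, 3))
```

With a large n, even effects this weak are detectably nonzero, yet r-squared stays below 0.1, matching the pattern the thesis reports (r2 = .084).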
|