Global ETD Search

1	Topics in functional data analysis with biological applications Li, Yehua 02 June 2009 Functional data analysis (FDA) is an active field of statistics in which the primary objects of study are curves. My dissertation consists of two innovative applications of functional data analysis in biology. The data that motivated the research broadened the scope of FDA and demanded new methodology. I develop new nonparametric estimation methods and focus on developing large-sample theory for the proposed estimators. The first project is motivated by a colon carcinogenesis study, the goal of which is to study the function of a protein (p27) in colon cancer development. In this study, a number of colonic crypts (units) were sampled from each rat (subject) at random locations along the colon, and then repeated measurements of the protein expression level were made on each cell (subunit) within the selected crypts. In this problem, measurements within each crypt can be viewed as a function, since the measurements can be indexed by the cell locations. The functions from the same subject are spatially correlated along the colon, and my goal is to estimate this correlation function using nonparametric methods. We use this data set as motivation and propose a kernel estimator of the correlation function in a more general framework. We establish pointwise asymptotic normality of the proposed estimator when the number of subjects is fixed and the number of units within each subject goes to infinity. Based on the asymptotic theory, we propose a weighted block bootstrap method for making inferences about the correlation function, where the weights account for the inhomogeneity of the distribution of the unit locations. Simulation studies are also provided to illustrate the numerical performance of the proposed method.
My second project concerns lipoprotein profile data, where the goal is to use lipoprotein profile curves to predict the cholesterol level in human blood. Again, motivated by the data, we consider a more general problem: the functional linear model (Ramsay and Silverman, 1997) with a functional predictor and a scalar response. There is literature developing different methods for this model; however, there is little theory to support the methods. Therefore, we focus more on the theoretical properties of this model. There is other contemporary theoretical work on methods based on principal component regression. Our work differs in that we base our method on a roughness penalty approach and consider the more realistic scenario in which the functional predictor is observed only at discrete points. To reduce the difficulty of the theoretical derivations, we restrict the functions to satisfy a periodic boundary condition and develop an asymptotic convergence rate for this problem in Chapter III. A more general result based on splines is a topic for future research, which I discuss in Chapter IV. Functional Data Analysis Nonparametric statistics
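For orientation, the scalar-on-function linear model and a roughness-penalized least-squares criterion are commonly written as below. This is the standard textbook form and not necessarily the exact formulation used in the dissertation; the symbols $\alpha$, $\beta$, $\lambda$, the unit interval, the grid points $t_j$, and the second-derivative penalty are notational assumptions made here for illustration.

```latex
\begin{align}
Y_i &= \alpha + \int_0^1 X_i(t)\,\beta(t)\,dt + \varepsilon_i,
      \qquad i = 1,\dots,n, \\
(\hat\alpha,\hat\beta) &= \arg\min_{\alpha,\beta}\;
      \sum_{i=1}^{n}\Bigl(Y_i - \alpha - \int_0^1 X_i(t)\,\beta(t)\,dt\Bigr)^{2}
      + \lambda \int_0^1 \bigl(\beta''(t)\bigr)^{2}\,dt .
\end{align}
```

When each predictor is observed only at grid points $t_1,\dots,t_m$, one common choice (again an assumption, not taken from the abstract) is to replace the integral by the Riemann sum $\frac{1}{m}\sum_{j=1}^{m} X_i(t_j)\,\beta(t_j)$.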
2	Functional data analysis with application to MS and cervical vertebrae data Yaraee, Kate Unknown Date No description available. Functional data analysis Multiple Sclerosis Cervical Vertebrae
3	Spectral Density Function Estimation with Applications in Clustering and Classification Chen, Tianbo 03 March 2019 The spectral density function (SDF) plays a critical role in spatio-temporal data analysis, where the data are analyzed in the frequency domain. Although many methods have been proposed for SDF estimation, real-world applications in many research fields, such as neuroscience and environmental science, call for better methodologies. In this thesis, we focus on the spectral density functions for time series and spatial data, develop new estimation algorithms, and use the estimators as features for clustering and classification purposes. The first topic is motivated by clustering electroencephalogram (EEG) data in the spectral domain. To identify synchronized brain regions that share similar oscillations and waveforms, we develop two robust clustering methods based on the functional data ranking of the estimated SDFs. The two proposed clustering methods use different dissimilarity measures, and their performance is examined by simulation studies in which two types of contamination are included to show the robustness. We apply the methods to two sets of resting-state EEG data collected from a male college student. Then, we propose an efficient collective estimation algorithm for a group of SDFs. We use two sets of basis functions to represent the SDFs for dimension reduction, and the scores (the coefficients of the basis), estimated by maximizing the penalized Whittle likelihood, are then used for clustering the SDFs in a much lower dimension. For spatial data, an additional penalty is applied to the likelihood to encourage the spatial homogeneity of the clusters. The proposed methods are applied to cluster the EEG data and the soil moisture data. Finally, we propose a parametric estimation method for the quantile spectrum. We approximate the quantile spectrum by the ordinary spectral density of an AR process at each quantile level.
The AR coefficients are estimated by solving the Yule-Walker equations using the Levinson algorithm. Numerical results from simulation studies show that the proposed method outperforms other conventional smoothing techniques. We build a convolutional neural network (CNN) to classify the estimated quantile spectra of the earthquake data in Oklahoma and achieve 99.25% accuracy on the test sets, which is 1.25% higher than using ordinary periodograms. spectral analysis functional data analysis clustering classification
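The Yule-Walker step mentioned in the abstract above can be illustrated with the classical Levinson-Durbin recursion. The following is a minimal sketch for ordinary autocovariances, not the thesis's quantile-spectrum code; the function name `levinson_durbin` and its interface are hypothetical, chosen only for this example.

```python
def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for an AR(order) model.

    r     : autocovariances r[0], ..., r[order] (r[0] > 0)
    order : AR order p

    Returns (phi, sigma2) for the model
        x_t = phi[0]*x_{t-1} + ... + phi[p-1]*x_{t-p} + e_t,
    where sigma2 is the innovation variance.
    """
    phi = []          # AR coefficients of the current order
    sigma2 = r[0]     # prediction error variance of the order-0 model
    for k in range(1, order + 1):
        # Reflection coefficient (partial autocorrelation at lag k).
        acc = r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        kappa = acc / sigma2
        # Update the lower-order coefficients, then append the new one.
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        # Each step shrinks the prediction error variance.
        sigma2 *= 1.0 - kappa * kappa
    return phi, sigma2


# Autocovariances of an AR(1) process with coefficient 0.5 and unit variance:
# fitting an AR(2) model recovers the coefficient and a zero second lag.
print(levinson_durbin([1.0, 0.5, 0.25], 2))  # ([0.5, 0.0], 0.75)
```

The recursion solves the order-p system in O(p^2) operations by reusing the order-(p-1) solution, which is why it is usually preferred to a general linear solver for this Toeplitz system.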
4	Goodness-of-Fit and Change-Point Tests for Functional Data Gabrys, Robertas 01 May 2010 A test for independence and identical distribution of functional observations is proposed in this thesis. To reduce dimension, curves are projected on the most important functional principal components. Then a test statistic based on lagged cross-covariances of the resulting vectors is constructed. We show that this dimension reduction step introduces asymptotically negligible terms, i.e. the projections behave asymptotically as iid vector-valued observations. A complete asymptotic theory based on correlations of random matrices, functional principal component expansions, and Hilbert space techniques is developed. The test statistic has a chi-square asymptotic null distribution. Two inferential tests for error correlation in the functional linear model are put forward. To construct them, finite-dimensional residuals are computed in two different ways, and then their autocorrelations are suitably defined. From these autocorrelation matrices, two quadratic forms are constructed whose limiting distributions are chi-squared with known numbers of degrees of freedom (different for the two forms). A test for detecting a change point in the mean of functional observations is developed. The null distribution of the test statistic is asymptotically pivotal with a well-known asymptotic distribution. A comprehensive asymptotic theory for the estimation of a change point in the mean function of functional observations is developed. The procedures developed in this thesis can be readily computed using the R package fda. All theoretical insights obtained in this thesis are confirmed by simulations and illustrated by real-life data examples. Correlation test Functional data analysis Goodness-of-fit test Statistics and Probability
5	Multi-angular hyperspectral data and its influences on soil and plant property measurements: spectral mapping and functional data analysis approach Sugianto, ., Biological, Earth & Environmental Science, UNSW January 2006 This research investigates the spectral reflectance characteristics of soil and vegetation using multi-angular and single view hyperspectral data. The question of the thesis is "How much information can be obtained from multi-angular hyperspectral remote sensing in comparison with single view angle hyperspectral remote sensing of soil and vegetation?" This question is addressed by analysing multi-angular and single view angle hyperspectral remote sensing using data from the field, airborne and space borne hyperspectral sensors. Spectral mapping, spectral indices and Functional Data Analysis (FDA) are used to analyse the data. Spectral mapping has been successfully used to distinguish features of soil and cotton with hyperspectral data. Traditionally, spectral mapping is based on collecting endmembers of pure pixels and using these as training areas for supervised classification. There are, however, limitations in the use of these algorithms when applied to multi-angular images, as the reflectance of a single ground unit will differ at each angle. Classifications using six-class endmembers identified using single angle imagery were assessed using multi-angular Compact High Resolution Imaging Spectrometer (CHRIS) imagery, as well as a set of vegetation indices. The results showed no significant difference between the angles. Low nutrient content in the soil produced lower vegetation index values, and more nutrients increased the index values. This research introduces FDA as an image processing tool for multi-angular hyperspectral imagery of soil and cotton, using basis functions for functional principal component analysis (fPCA) and functional linear modelling.
FDA has advantages over conventional statistical analysis because it does not assume the errors in the data are independent and uncorrelated. Investigations showed that B-splines with 20 basis functions gave the best fit for multi-angular soil spectra collected using the spectroradiometer and the satellite-mounted CHRIS. Cotton spectra collected from greenhouse plants using a spectroradiometer needed 30 basis functions to fit the model, while 20 basis functions were sufficient for cotton spectra extracted from CHRIS. Functional principal component analysis (fPCA) of multi-angular soil spectra shows that the first functional principal component explained a minimum of 92.5% of the variance of field soil spectra for different azimuth and zenith angles and 93.2% from CHRIS for the same target. For cotton, more than 93.6% of the variance in the greenhouse trial and 70.6% in the CHRIS data were explained by the first functional principal component. Conventional analysis of multi-angular hyperspectral data showed that significant differences exist between soil spectra acquired at different azimuth and zenith angles. The forward scan direction of the zenith angle provides higher spectral reflectance than the backward direction. However, most multi-angular hyperspectral data analysed as functional data show no significant difference from nadir, except for small parts of the wavelength range of cotton spectra using CHRIS. There is also no significant difference for soil spectra analysed as functional data collected from the field, although there was some difference for soil spectra extracted from CHRIS. Overall, the results indicate that multi-angular hyperspectral data provides only a very small amount of additional information when used for conventional analyses. Multi-angular CHRIS hyperspectral functional data analysis spectral mapping reflectance
6	Functional Chemometrics: Automated Spectral Smoothing with Spatially Adaptive Splines Fernandes, Philip Manuel 02 October 2012 Functional data analysis (FDA) is a demonstrably effective, practical, and powerful method of data analysis, yet it remains virtually unheard of outside of academic circles and has almost no exposure to industry. FDA adds to the milieu of statistical methods by treating functions of one or more independent variables as data objects, analogous to the way in which discrete points are the data objects we are familiar with in conventional statistics. The first step in functional analysis is to “functionalize” the data, or convert discrete points into a system represented most often by continuous functions. Choosing the type of functions to use is data-dependent and often straightforward; for example, Fourier series lend themselves well to periodic systems, while splines offer great flexibility in approximating more irregular trends, such as chemical spectra. This work explores the question of how B-splines can be rapidly and reliably used to denoise infrared chemical spectra, a difficult problem not only because of the many parameters involved in generating a spline fit, but also due to the disparate nature of spectra in terms of shape and noise intensity. Automated selection of spline parameters is required to support high-throughput analysis, and the heteroscedastic nature of such spectra presents challenges for existing techniques. The heuristic knot placement algorithm of Li et al. (2005) for 1D object contours is extended to spectral fitting by optimizing the denoising step for a range of spectral types and signal/noise ratios, using the following criteria: robustness to types of spectra and noise conditions, parsimony of knots, low computational demand, and ease of implementation in high-throughput settings. Pareto-optimal filter configurations are determined using simulated data from factorial experimental designs.
The improved heuristic algorithm uses wavelet transforms and provides improved performance in robustness, parsimony of knots and the quality of functional regression models used to correlate real spectral data with chemical composition. In practical applications, functional principal component regression models yielded similar or significantly improved results when compared with their discrete partial least squares counterparts. / Thesis (Master, Chemical Engineering) -- Queen's University, 2012-10-01
7	On the statistical analysis of functional data arising from designed experiments Sirski, Monica 10 April 2012 We investigate various methods for testing whether two groups of curves are statistically significantly different, with the motivation to apply the techniques to the analysis of data arising from designed experiments. We propose a set of tests based on pairwise differences between individual curves. Our objective is to compare the power and robustness of a variety of tests, including a collection of permutation tests, a test based on the functional principal component scores, the adaptive Neyman test and the functional F test. We illustrate the application of these tests in the context of a designed 2^4 factorial experiment with a case study using data provided by NASA. We apply the methods for comparing curves to this factorial data by dividing the data into two groups by each effect (A, B, ..., ABCD) in turn. We carry out a large simulation study investigating the power of the tests in detecting contamination, location, and shift effects on unimodal and monotone curves. We conclude that the permutation test using the mean of the pairwise differences in L1 norm has the best overall power performance and is a robust test statistic applicable in a wide variety of situations. The advantage of using a permutation test is that it is an exact, distribution-free test that performs well overall when applied to functional data. This test may be extended to more than two groups by constructing test statistics based on averages of pairwise differences between curves from the different groups and, as such, is an important building block for larger experiments and more complex designs. functional data analysis design of experiments permutation test power analysis
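A permutation test of the kind described in the abstract above can be sketched as follows. This is an illustrative reading, not the thesis's implementation: it assumes the statistic is the mean L1 distance over all between-group pairs of curves, that curves are sampled on a common equally spaced grid, and that the p-value is approximated by Monte Carlo sampling of permutations. All function names are hypothetical.

```python
import random


def l1_distance(f, g, dt=1.0):
    """Approximate L1 distance between two curves sampled on a common grid."""
    return sum(abs(a - b) for a, b in zip(f, g)) * dt


def mean_pairwise_l1(group_a, group_b):
    """Mean L1 distance over all between-group pairs of curves."""
    total = sum(l1_distance(f, g) for f in group_a for g in group_b)
    return total / (len(group_a) * len(group_b))


def permutation_test(group_a, group_b, n_perm=999, seed=0):
    """Monte Carlo permutation p-value for a difference between two groups."""
    rng = random.Random(seed)
    observed = mean_pairwise_l1(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_perm):
        # Reassign group labels at random, keeping the group sizes fixed.
        rng.shuffle(pooled)
        stat = mean_pairwise_l1(pooled[:n_a], pooled[n_a:])
        if stat >= observed:
            exceed += 1
    # Counting the observed labelling itself keeps the p-value valid
    # and strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Two clearly separated groups of flat curves, five curves each.
low = [[0.0 + 0.01 * i] * 10 for i in range(5)]
high = [[5.0 + 0.01 * i] * 10 for i in range(5)]
stat, p_value = permutation_test(low, high)
print(stat, p_value)
```

No distributional assumption on the curves enters the calculation; only the exchangeability of group labels under the null hypothesis is used.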
10	Handling Sparse and Missing Data in Functional Data Analysis: A Functional Mixed-Effects Model Approach January 2016 This paper investigates a relatively new analysis method for longitudinal data in the framework of functional data analysis. This approach treats longitudinal data as so-called sparse functional data. The first section of the paper introduces functional data and the general ideas of functional data analysis. The second section discusses the analysis of longitudinal data in the context of functional data analysis, while considering the unique characteristics of longitudinal data, in particular sparseness and missing data. The third section introduces functional mixed-effects models that can handle these unique characteristics of sparseness and missingness. The next section discusses a preliminary simulation study conducted to examine the performance of a functional mixed-effects model under various conditions. An extended simulation study was carried out to evaluate the estimation accuracy of a functional mixed-effects model. Specifically, the accuracy of the estimated trajectories was examined under various conditions, including different types of missing data and varying levels of sparseness. / Dissertation/Thesis / Masters Thesis Psychology 2016 Psychology Functional Data Analysis Longitudinal Data Mixed Models
