Global ETD Search

31	Classification in Functional Data Analysis : Applications on Motion Data Kröger, Viktor January 2021 Anterior cruciate knee ligament injuries are common and well known, especially amongst athletes. These injuries often require surgeries and long rehabilitation programs, and can lead to function loss and re-injuries (Marshall et al., 1977). This work aims to explore the possibility of applying supervised classification on knee functionality, using different types of models, and testing different divisions of classes. The data used is gathered through a performance test, where individuals perform one-leg hops with motion sensors attached to their bodies. The obtained data represents the position over time, and is considered functional data. With functional data analysis (FDA), a process can be analysed as a continuous function of time, instead of being reduced to finite data points. FDA includes many useful tools, but also some challenges. A functional observation can for example be differentiated, a handy tool not found in the multivariate tool-box. The speed, and acceleration, can then be calculated from the obtained data. How to define "similarity" is, on the other hand, not as obvious as with points. In this work, an FDA-approach is taken on classifying knee kinematic data, from a long-term follow-up study on knee ligament injuries. This work studies kernel functional classifiers, and k-nearest neighbours models, and performs significance tests on the model accuracy, using re-sampling methods. Additionally, depending on how similarity is defined, the models can distinguish different features of the data. Attempts at utilising more information through ensemble methods do not exceed the single models they are built from. Further, it is shown that classification on optimised sub-domains can be superior to classifiers using the full domain, in terms of predictive power. 
/ Anterior cruciate ligament injuries are common and well-known injuries, especially among athletes. The injuries often require surgery and long rehabilitation programs, and can lead to functional impairment and re-injury (Marshall et al., 1977). The aim of this work is to explore the possibility of classifying knees by functionality, where the outcome is known. This is done by using different types of models and by testing different divisions into groups. The data used were collected during a performance test in which individuals hopped on one leg with motion sensors on their bodies. The collected data represent position over time and are treated as functional data. With functional data analysis (FDA), a process can be analysed as a continuous function of time instead of being reduced to a finite number of data points. FDA contains many useful tools, but also challenges. A functional observation can, for example, be differentiated, a handy tool not found in the multivariate toolbox. The velocity and acceleration can then be computed from the collected data. How "similarity" is defined, on the other hand, is not as obvious as with point data. In this work, FDA is used to classify knee motion data from a long-term follow-up study of anterior cruciate ligament injuries. This work studies both functional kernel classifiers and k-nearest-neighbour methods, and performs significance tests of model accuracy through resampling. Furthermore, the models can distinguish different features of the data, depending on how proximity is defined. Ensemble methods are used in an attempt to exploit more of the information, but do not manage to outperform any of the individual models that make up the ensemble. It is also shown that classification on optimised sub-domains can give higher explanatory power than classifiers that use the full domain. 
Classification Functional Data Analysis FDA motion data ligament injury Mathematics Matematik Probability Theory and Statistics Sannolikhetsteori och statistik
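The abstract above notes that the notion of "similarity" between curves drives what a k-nearest-neighbours classifier can distinguish, and that curves can be differentiated before comparison. The following is a minimal sketch of that idea, not the thesis's implementation: the function names, the L2 distance, the finite-difference derivative and the synthetic sine curves are all assumptions made for illustration.

```python
import numpy as np

def l2_distance(f, g, t):
    """Approximate the L2 distance between two curves sampled on the grid t."""
    diff2 = (f - g) ** 2
    # Trapezoidal rule written out explicitly.
    return np.sqrt(np.sum(0.5 * (diff2[1:] + diff2[:-1]) * np.diff(t)))

def knn_predict(train_curves, train_labels, new_curve, t, k=3, use_derivative=False):
    """Classify new_curve by majority vote among its k nearest training curves."""
    if use_derivative:
        # Compare velocities instead of positions (finite-difference derivative).
        train_curves = np.gradient(train_curves, t, axis=1)
        new_curve = np.gradient(new_curve, t)
    dists = np.array([l2_distance(c, new_curve, t) for c in train_curves])
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_labels[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Synthetic toy data (hypothetical, not the knee-kinematics data of the thesis).
t = np.linspace(0.0, 1.0, 101)
rng = np.random.default_rng(0)
group0 = np.array([np.sin(2 * np.pi * t) + rng.normal(0, 0.05, t.size) for _ in range(10)])
group1 = np.array([np.sin(3 * np.pi * t) + rng.normal(0, 0.05, t.size) for _ in range(10)])
curves = np.vstack([group0, group1])
labels = np.array([0] * 10 + [1] * 10)

new_curve = np.sin(3 * np.pi * t)
pred_position = knn_predict(curves, labels, new_curve, t, k=3)
pred_velocity = knn_predict(curves, labels, new_curve, t, k=3, use_derivative=True)
```

Switching `use_derivative` changes only the distance, which is the sense in which the choice of similarity selects the features the classifier responds to.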
32	An investigation into Functional Linear Regression Modeling Essomba, Rene Franck January 2015 Functional data analysis, commonly known as FDA, refers to the analysis of information on curves or functions. Key aspects of FDA include the choice of smoothing techniques, data reduction, model evaluation, functional linear modeling and forecasting methods. FDA is applicable in numerous fields such as Bioscience, Geology, Psychology, Sports Science, Econometrics, Meteorology, etc. The main objective of this dissertation is to focus more specifically on Functional Linear Regression Modelling (FLRM), which is an extension of Multivariate Linear Regression Modeling. The problem of constructing a Functional Linear Regression model with functional predictors and a functional response variable is considered in great detail. Discretely observed data for each variable involved in the modelling are expressed as smooth functions using Fourier, B-spline and Gaussian bases. The Functional Linear Regression Model is estimated by the Least Squares method, the Maximum Likelihood method and, more thoroughly, the Penalized Maximum Likelihood method. A central issue when fitting Functional Regression models is the choice of a suitable model criterion as well as the number of basis functions and an appropriate smoothing parameter. Four different types of model criteria are reviewed: the Generalized Cross-Validation, the Generalized Information Criterion, the modified Akaike Information Criterion and the Generalized Bayesian Information Criterion. Each of these methods is applied to a dataset and contrasted based on their respective results. Mathematical Statistics Functional Data Analysis Basis Expansion Functional Regression Smoothing Techniques
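The entry above describes turning discretely observed data into smooth functions through a basis expansion, with the smoothing parameter chosen by a criterion such as Generalized Cross-Validation. A minimal sketch of that step for a single curve follows; it assumes a Fourier basis on [0, 1], a second-derivative roughness penalty and penalized least squares rather than the dissertation's penalized maximum likelihood, and all names and the synthetic data are hypothetical.

```python
import numpy as np

def fourier_basis(t, n_harmonics):
    """Design matrix with columns 1, sin(2*pi*k*t), cos(2*pi*k*t), k = 1..n_harmonics."""
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.sin(2 * np.pi * k * t))
        cols.append(np.cos(2 * np.pi * k * t))
    return np.column_stack(cols)

def roughness_penalty(n_harmonics):
    """Integrated squared second derivative on [0, 1]; diagonal for this basis."""
    diag = [0.0]
    for k in range(1, n_harmonics + 1):
        diag += [0.5 * (2 * np.pi * k) ** 4] * 2
    return np.diag(diag)

def smooth(y, B, P, lam):
    """Penalized least squares fit; returns fitted values and the GCV score."""
    hat = B @ np.linalg.solve(B.T @ B + lam * P, B.T)
    fitted = hat @ y
    n = y.size
    gcv = n * np.sum((y - fitted) ** 2) / (n - np.trace(hat)) ** 2
    return fitted, gcv

# Synthetic noisy curve (assumed example data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200, endpoint=False)
truth = np.sin(2 * np.pi * t) + 0.5 * np.cos(4 * np.pi * t)
y = truth + rng.normal(0, 0.2, t.size)

B = fourier_basis(t, n_harmonics=10)
P = roughness_penalty(n_harmonics=10)
lambdas = 10.0 ** np.arange(-10, 1)
scores = [smooth(y, B, P, lam)[1] for lam in lambdas]
best_lam = lambdas[int(np.argmin(scores))]
fitted, _ = smooth(y, B, P, best_lam)
rmse = np.sqrt(np.mean((fitted - truth) ** 2))
```

The same grid search could be run over the number of basis functions as well; here only the smoothing parameter varies to keep the sketch short.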
33	Functional principal component and factor analysis of spatially correlated data Liu, Chong 22 January 2016 While multivariate data analysis is concerned with data in the form of random vectors, functional data analysis goes one big step farther, focusing on data that are infinite-dimensional, such as curves, shapes and images. We focus on functional data that are measured over time across multiple subjects. The first part of the thesis focuses on spatially correlated functional data. This correlation is modeled by correlating functional principal component scores. We propose a Spatial Principal Analysis by Conditional Expectation framework to explicitly estimate spatial correlations and reconstruct individual curves. This approach works even when the observed data per curve are extremely sparse. Assuming spatial stationarity, empirical between-curve correlations are calculated as the ratio of eigenvalues of the smoothed covariance surface Cov(X_i(s), X_i(t)) and cross-covariance surface Cov(X_i(s), X_j(t)). Then a parametric spatial correlation model is employed to fit empirical correlations. Finally, principal component scores are estimated to reconstruct the sparsely observed curves. This framework could naturally accommodate arbitrary covariance structures, but there is an enormous reduction in computation if one can assume the separability of temporal and spatial components. We propose hypothesis tests to examine the separability and isotropy effect of spatial correlation. Simulation studies and applications to empirical data show improvements in curve reconstruction using our framework over the method where curves are assumed to be independent. In addition, asymptotic properties of the estimates are discussed in detail. In the second part of this work, we present a new approach to factor rotation for functional data. 
This is achieved by rotating the functional principal components toward a predefined space of periodic functions designed to decompose the total variation into components that are nearly-periodic and nearly-aperiodic with a predefined period. We show that the factor rotation can be obtained by the calculation of canonical correlations between appropriate spaces. Moreover, we demonstrate that our proposed rotations provide stable and interpretable results in the presence of highly complex covariance. This work is motivated by the goal of finding interpretable sources of variability in a gridded time series of vegetation index measurements obtained from remote sensing, and we demonstrate our methodology through the application of factor rotation of this data. Statistics Functional data analysis Functional factor rotation Functional principal component Spatial correlation
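The entry above builds on functional principal component scores to reconstruct curves. As a point of reference only, here is a minimal sketch of ordinary functional principal component analysis for densely and regularly sampled, independent curves. It is not the spatial conditional-expectation framework proposed in the thesis, which targets sparse and correlated data; the function name and the synthetic curves are assumptions.

```python
import numpy as np

def fpca(curves, n_components):
    """Plain FPCA on a common dense grid via the SVD of the centered data matrix."""
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # Rows of vt are discretized eigenfunctions of the sample covariance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    explained = singular_values[:n_components] ** 2 / np.sum(singular_values ** 2)
    return mean_curve, components, scores, explained

# Synthetic curves driven by two random scores (assumed example data).
t = np.linspace(0.0, 1.0, 50)
rng = np.random.default_rng(1)
a = rng.normal(size=(30, 1))
b = rng.normal(size=(30, 1))
curves = a * np.sin(2 * np.pi * t) + 0.5 * b * np.cos(2 * np.pi * t)

mean_curve, components, scores, explained = fpca(curves, n_components=2)
# Each curve is rebuilt from the mean plus its score-weighted components.
reconstructed = mean_curve + scores @ components
```

Because the toy curves are generated from two modes of variation, two components are enough to rebuild them; with sparse observations per curve the scores cannot be obtained by this direct projection.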
34	Empirical Properties of Functional Regression Models and Application to High-Frequency Financial Data Zhang, Xi 01 May 2013 Functional data analysis (FDA) has grown into a substantial field of statistical research, with new methodology, numerous useful applications and interesting novel theoretical developments. My dissertation focuses on the empirical properties of functional regression models and their application to financial data. We start by testing the empirical properties of forecasts with functional autoregressive models based on simulated and real data. We define intraday returns and consider their prediction from such returns on a market index. This is an extension of the Capital Asset Pricing Model to intraday data. Finally, we investigate multifactor functional models and assess their suitability for the prediction of intraday returns for various financial assets, including stock and commodity futures. Empirical Study Financial Data Functional Data Analysis Functional Regression Models Finance and Financial Management Statistics and Probability
35	Functional Data Models for Raman Spectral Data and Degradation Analysis Do, Quyen Ngoc 16 August 2022 Functional data analysis (FDA) studies data in the form of measurements over a domain as whole entities. Our first focus is on the post-hoc analysis with pairwise and contrast comparisons of the popular functional ANOVA model comparing groups of functional data. Existing contrast tests assume independent functional observations within a group. In reality, this assumption may not be satisfactory since functional data are often collected continually over time on a subject. In this work, we introduce a new linear contrast test that accounts for time dependency among functional group members. For a significant contrast test, it can be beneficial to identify the region of significant difference. In the second part, we propose a non-parametric regression procedure to obtain a locally sparse estimate of the functional contrast. Our work is motivated by a biomedical study using Raman spectroscopy to monitor hemodialysis treatment in near real time. With the contrast test and sparse estimation, practitioners can monitor the progress of hemodialysis within a session and identify important chemicals for dialysis adequacy monitoring. In the third part, we propose a functional data model for degradation analysis of functional data. Motivated by a degradation analysis application to rechargeable Li-ion batteries, we combine state-of-the-art functional linear models to produce fully functional predictions for curves on heterogeneous domains. Simulation studies and data analysis demonstrate the advantage of the proposed method over an existing aggregation-based method in predicting the degradation measure. / Doctor of Philosophy / Functional data analysis (FDA) studies complex data structures in the form of curves and shapes. Our work is motivated by two applications concerning data from Raman spectroscopy and a battery degradation study. 
Raman spectra of a liquid sample are curves with measurements over a domain of wavelengths that can identify chemical composition and whose values signify the constituent concentrations in the sample. We first propose a statistical procedure to test the significance of a functional contrast formed by spectra collected at the beginning and at later time points during a dialysis session. Then a follow-up procedure is developed to produce a sparse representation of the functional contrast with clearly identified zero and nonzero regions. The use of this method on contrasts formed by Raman spectra of used dialysate collected at different time points during hemodialysis sessions can be adapted for evaluating the treatment efficacy in real time. In a third project, we apply state-of-the-art methodologies from FDA to a degradation study of rechargeable Li-ion batteries. Our proposed methods produce fully functional predictions of voltage discharge curves, allowing flexibility in monitoring battery health. Functional data analysis degradation data analysis functional linear regression nonparametric regression Raman spectra Lithium-ion batteries
36	Synergistic Modeling of Advanced Manufacturing Processes with Functional Variables Sun, Hongyue 01 June 2017 Modern manufacturing needs to optimize the entire product lifecycle to satisfy customer needs. The advancement of sensing technologies has brought a data-rich environment for manufacturing and provides a great opportunity for real-time, proactive quality assurance. However, due to the lack of methods for analyzing heterogeneous types of data, the transformation of data into information and knowledge for effective decision making in manufacturing is still a challenging problem. In particular, functional variables can represent the in situ process conditions and rich product performance information, and are widely encountered in various manufacturing processes. In this dissertation, I focus on modeling manufacturing processes with in situ process (functional) variables, and on integrating these functional variables with other measured variables for manufacturing modeling. The modeling is explored by extracting informative features through the integration of multiple functional variables, functional variables and offline setting variables, and quantitative and qualitative quality variables. After an introduction in Chapter 1, three research tasks are investigated. First, a functional variable selection problem is studied in Chapter 2 to identify the significant functional variables as well as their features in a logistic regression model. A hierarchical non-negative garrote constrained estimation method is proposed. Second, the quality-process relationships for scalar offline setting variables, functional in situ process variables, and manufacturing quality responses are studied in Chapter 3. A functional graphical model that can integrate functional variables in a graphical model is proposed and investigated. 
Third, the quantitative and qualitative quality responses are jointly modeled with scalar offline setting variables and functional in situ process variables in Chapter 4. A functional quantitative and qualitative model is proposed and investigated. Finally, I summarize the research contribution and discuss future research directions in Chapter 5. The proposed methodologies have broad applications in manufacturing processes with functional variables, and are demonstrated in a crystal growth process with multiple functional variables (Chapter 2), a plasma spray process with multiple scalar and functional variables (Chapter 3), and an additive manufacturing process called fused deposition modeling with quantitative and qualitative quality responses (Chapter 4). / Ph. D. Functional Data Analysis In situ Process Variables Synergistic Modeling
37	Análise de dados funcionais aplicada ao estudo de repetitividade e reprodutividade : ANOVA das distâncias Pedott, Alexandre Homsi January 2010 This dissertation presents a method adapted from the repeatability and reproducibility study to analyze the capability and performance of measurement systems, in the context of functional data analysis. A functional datum is a response variable given by a collection of data points that form a profile or a curve. The adapted method contributes to advancing the state of the art in measurement system analysis. The proposed method is an alternative to traditional analysis methods which, when used incorrectly, can degrade the quality of products monitored through functional response variables. The proposed method involves adapting hypothesis tests and the one-way and two-way analysis of variance used in population comparisons to the evaluation of measurement systems. The proposed adaptation was based on the use of distances between curves. The Hausdorff distance was used as a measure of proximity between curves. The proposed adaptation of the analysis of variance consisted of three approaches. The adapted methods were applied to a simulated repeatability and reproducibility study. The study was structured to analyze scenarios in which the measurement system was approved and rejected. The proposed method was named ANOVA of the Distances. / This work presents a method to analyze a measurement system's performance in a functional data analysis context, based on repeatability and reproducibility studies. Functional data are a collection of data points organized as a profile or curve. The proposed method contributes to the state of the art on measurement system analysis. The method is an alternative to traditional methods often used mistakenly, leading to deterioration in the quality of products monitored through functional responses. 
In the proposed method we adapt hypothesis tests and one-way and two-way ANOVA to be used in measurement system analysis. The method is grounded on the use of distances between curves. For that matter the Hausdorff distance was chosen as a measure of proximity between curves. Three ANOVA approaches were proposed and applied in a simulated repeatability and reproducibility study. The study was structured to analyze scenarios in which the measurement system was approved or rejected. The proposed method was named ANOVA of the distances. Controle de qualidade Análise de dados funcionais Functional data analysis Measurement systems R & R studies ANOVA Functional ANOVA
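The entry above uses the Hausdorff distance as the measure of proximity between curves. A minimal sketch of that distance for two curves given as sampled points follows; treating each curve as a finite point set, the function name and the example profiles are assumptions for illustration, and the thesis's ANOVA adaptations are not reproduced here.

```python
import numpy as np

def hausdorff_distance(curve_a, curve_b):
    """Hausdorff distance between two curves given as (n, 2) arrays of sampled points."""
    # Pairwise Euclidean distances between every point of A and every point of B.
    diffs = curve_a[:, None, :] - curve_b[None, :, :]
    pairwise = np.sqrt(np.sum(diffs ** 2, axis=-1))
    # Largest distance from a point of A to its nearest point of B, and vice versa.
    a_to_b = pairwise.min(axis=1).max()
    b_to_a = pairwise.min(axis=0).max()
    return max(a_to_b, b_to_a)

# Two hypothetical measured profiles: the second is the first shifted up by 0.1.
x = np.linspace(0.0, 1.0, 51)
profile_1 = np.column_stack([x, x ** 2])
profile_2 = np.column_stack([x, x ** 2 + 0.1])

d = hausdorff_distance(profile_1, profile_2)
```

For these two profiles the distance equals the vertical shift of 0.1, attained at the endpoints of the curves. Distances of this kind, computed between repeated measurements, are the scalar quantities a distance-based analysis of variance would then work with.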
38	Estimação de modelos geoestatísticos com dados funcionais usando ondaletas / Estimation of Geostatistical Models with Functional Data using Wavelets Sassi, Gilberto Pereira 03 March 2016 With the recent advance in computational power, the sampling of spatially indexed curves has grown, mainly in ecological, atmospheric and environmental data, which has led to the adaptation of geostatistical methods to the context of Functional Data Analysis. The aim of this work is to study kriging methods for functional data, adapting the spatial interpolation methods of geostatistics. More precisely, for a pointwise weakly stationary and isotropic functional dataset, we wish to estimate a curve at an unmonitored point in space, seeking unbiased estimators with minimum mean squared error. We present three approaches to approximating a curve at an unmonitored site, prove results that simplify the optimization problem posed by the search for optimal unbiased estimators, implement the models in MATLAB using wavelets, which are better suited to capturing localized behaviour, and compare the three models through simulation studies. We illustrate the methods with two real datasets: daily mean temperature data from the Canadian maritime provinces (New Brunswick, Nova Scotia and Prince Edward Island) collected at 82 stations in the year 2000, and data from CETESB (Companhia Ambiental do Estado de São Paulo) on the PM10 air quality index at 22 weather stations in the metropolitan region of the city of São Paulo, collected in 2014. / The advance of computational power in recent decades has generated a considerable increase in datasets of spatially indexed curves, mainly in ecological, atmospheric and environmental data, which has led to adaptations of geostatistics to the context of Functional Data Analysis. 
The goal of this work is to adapt the kriging methods from geostatistical analysis to the framework of Functional Data Analysis. More precisely, we interpolate a curve at an unvisited location, searching for an unbiased estimator with minimum mean square error for a pointwise weakly stationary and isotropic functional dataset. We introduce three different approaches to estimate a curve at an unvisited location, we prove some results simplifying the optimization problem posed by the optimality of these estimators, we implement the three models in MATLAB using wavelets and we compare them by simulation. We illustrate the ideas using two datasets: a real climatic dataset from the Canadian maritime provinces (New Brunswick, Nova Scotia and Prince Edward Island), sampled in the year 2000 at 82 weather stations and consisting of daily mean temperatures, and data from CETESB (environmental agency of the state of São Paulo, Brazil), sampled at 22 weather stations in the metropolitan region of São Paulo city in the year 2014 and consisting of the air quality index PM10. Análise de dados funcionais Estatística espacial Functional Data Analysis Geoestatística Geostatistics Krigagem Kriging MATLAB MATLAB Ondaletas Spatial Statistics Wavelets
39	Analytics for Novel Consumer Insights (A Three Essay Dissertation) Shrivastava, Utkarsh 03 July 2018 Both literature and practice have investigated how the vast amount of ever-increasing customer information can inform marketing strategy and decision making. However, customer data is often susceptible to modeling bias and misleading findings due to various factors including sample selection and unobservable variables. The available analytics toolkit has continued to develop, but in the age of nearly perfect information, customer decision making has also evolved. The dissertation addresses some of the challenges in deriving valid and useful consumer insights from customer data in the digital age. The first study addresses the limitations of traditional customer purchase measures in accounting for dynamic temporal variations in the customer purchase history. The study proposes a new approach for representation and summarization of customer purchases to improve promotion forecasts. The method also accounts for the sample selection bias that arises due to biased selection of customers for the promotion. The second study investigates the impact of increasing internet penetration on consumer choices and their response to marketing actions. Using the case study of physician’s drug prescribing, the study identifies how marketers can misallocate resources at the regional level by not accounting for variations in internet penetration. The third paper develops a data-driven metric for measuring temporal variations in brand loyalty. Using a network representation of brand and customer, the study also investigates the spillover effects of manufacturer-related information shocks on the brand’s loyalty. Direct Promotion Functional Data Analysis Quasi-Experiment Control Function Approach Counterfactual Simulations Databases and Information Systems Marketing
40	Statistical computation and inference for functional data analysis Jiang, Huijing 09 November 2010 My doctoral research dissertation focuses on two aspects of functional data analysis (FDA): FDA under spatial interdependence and FDA for multi-level data. The first part of my thesis focuses on developing modeling and inference procedures for functional data under spatial dependence. The methodology introduced in this part is motivated by a research study on inequities in accessibility to financial services. The first research problem in this part is concerned with a novel model-based method for clustering random time functions which are spatially interdependent. A cluster consists of time functions which are similar in shape. The time functions are decomposed into spatial global and time-dependent cluster effects using a semi-parametric model. We also assume that the clustering membership is a realization from a Markov random field. Under these model assumptions, we borrow information across curves from nearby locations, resulting in enhanced estimation accuracy of the cluster effects and of the cluster membership. In a simulation study, we assess the estimation accuracy of our clustering algorithm under a series of settings: small number of time points, high noise level and varying dependence structures. Over all simulation settings, the spatial-functional clustering method outperforms existing model-based clustering methods. In the case study presented in this project, we estimate and classify service accessibility patterns varying over a large geographic area (California and Georgia) and over a period of 15 years. The focus of this study is on financial services but it generally applies to any other service operation. The second research project of this part studies an association analysis of space-time varying processes, which is rigorous, computationally feasible and implementable with standard software. 
We introduce general measures to model different aspects of the temporal and spatial association between processes varying in space and time. Using a nonparametric spatiotemporal model, we show that the proposed association estimators are asymptotically unbiased and consistent. We complement the point association estimates with simultaneous confidence bands to assess the uncertainty in the point estimates. In a simulation study, we evaluate the accuracy of the association estimates with respect to the sample size as well as the coverage of the confidence bands. In the case study in this project, we investigate the association between service accessibility and income level. The primary objective of this association analysis is to assess whether there are significant changes in the income-driven equity of financial service accessibility over time and to identify potential under-served markets. The second part of the thesis discusses novel statistical methodology for analyzing multilevel functional data including a clustering method based on a functional ANOVA model and a spatio-temporal model for functional data with a nested hierarchical structure. In this part, I introduce and compare a series of clustering approaches for multilevel functional data. For brevity, I present the clustering methods for two-level data: multiple samples of random functions, each sample corresponding to a case and each random function within a sample/case corresponding to a measurement type. A cluster consists of cases which have similar within-case means (level-1 clustering) or similar between-case means (level-2 clustering). Our primary focus is to compare a model-based clustering method with more straightforward hard clustering methods. The clustering model is based on a multilevel functional principal component analysis. In a simulation study, we assess the estimation accuracy of our clustering algorithm under a series of settings: small vs. 
moderate number of time points, high noise level and small number of measurement types. We demonstrate the applicability of the clustering analysis to a real data set consisting of time-varying sales for multiple products sold by a large retailer in the U.S. My ongoing research work in multilevel functional data analysis is developing a statistical model for estimating temporal and spatial associations of a series of time-varying variables with an intrinsic nested hierarchical structure. This work has great potential in many real applications where the data are areal data collected from different data sources and over geographic regions of different spatial resolution. Service distribution equity Multi-level data Model-based clustering Spatio-temporal Functional data analysis Multilevel models (Statistics) Markov random fields
