141 |
STATISTICAL METHODS FOR SPECTRAL ANALYSIS OF NONSTATIONARY TIME SERIES
Bruce, Scott Alan (January 2018)
This thesis proposes novel methods to address specific challenges in analyzing the frequency- and time-domain properties of nonstationary time series data motivated by the study of electrophysiological signals. A new method is proposed for the simultaneous and automatic analysis of the association between the time-varying power spectrum and covariates. The procedure adaptively partitions the grid of time and covariate values into an unknown number of approximately stationary blocks and nonparametrically estimates local spectra within blocks through penalized splines. The approach is formulated in a fully Bayesian framework, in which the number and locations of partition points are random, and fit using reversible jump Markov chain Monte Carlo techniques. Estimation and inference averaged over the distribution of partitions allows for the accurate analysis of spectra with both smooth and abrupt changes. The new methodology is used to analyze the association between the time-varying spectrum of heart rate variability and self-reported sleep quality in a study of older adults serving as the primary caregiver for their ill spouse. Another method proposed in this dissertation develops a unique framework for automatically identifying bands of frequencies exhibiting similar nonstationary behavior. This proposal provides a standardized, unifying approach to constructing customized frequency bands for different signals under study across different settings. A frequency-domain, iterative cumulative sum procedure is formulated to identify frequency bands that exhibit similar nonstationary patterns in the power spectrum through time. A formal hypothesis testing procedure is also developed to test which, if any, frequency bands remain stationary. This method is shown to consistently estimate the number of frequency bands and the location of the upper and lower bounds defining each frequency band. 
This method is used to estimate frequency bands useful in summarizing the nonstationary behavior of full-night heart rate variability data. / Statistics
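The adaptive-partition idea in the abstract can be illustrated with a much simpler, non-Bayesian sketch: split a nonstationary series into fixed blocks and estimate a local spectrum in each. The thesis instead treats the number and location of breakpoints as random and fits them by reversible jump MCMC; the series, sampling rate, and block count below are invented for illustration.

```python
import numpy as np
from scipy.signal import periodogram

rng = np.random.default_rng(0)

# Toy nonstationary series: a 2 Hz oscillation in the first half,
# a 20 Hz oscillation in the second half, plus white noise.
fs = 100.0
t = np.arange(0, 20, 1 / fs)
x = np.where(t < 10,
             np.sin(2 * np.pi * 2.0 * t),
             np.sin(2 * np.pi * 20.0 * t)) + 0.3 * rng.standard_normal(t.size)

# Fixed partition into approximately stationary blocks (the thesis makes
# the partition adaptive and random rather than fixed in advance).
blocks = np.array_split(x, 4)
local_spectra = [periodogram(block, fs=fs) for block in blocks]

# The dominant frequency per block tracks the regime change.
peaks = [f[np.argmax(pxx)] for f, pxx in local_spectra]
```

A full implementation would smooth each local periodogram (e.g. with penalized splines, as in the thesis) and average over partitions rather than fixing one.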
|
142 |
Interpolants, Error Bounds, and Mathematical Software for Modeling and Predicting Variability in Computer Systems
Lux, Thomas Christian Hansen (23 September 2020)
Function approximation is a fundamental problem. This work presents applications of interpolants to modeling random variables; specifically, it studies the prediction of distributions of random variables, applied to computer system throughput variability. Existing approximation methods, including multivariate adaptive regression splines, support vector regressors, multilayer perceptrons, Shepard variants, and the Delaunay mesh, are investigated in the context of computer variability modeling. New approximation methods using Box splines, Voronoi cells, and Delaunay meshes for interpolating distributions of data with moderately high dimension are presented and compared with existing approaches. Novel theoretical error bounds are constructed for piecewise linear interpolants over functions with a Lipschitz continuous gradient. Finally, a mathematical software package that constructs monotone quintic spline interpolants for distribution approximation from data samples is proposed. / Doctor of Philosophy / It is common for scientists to collect data on something they are studying. Often scientists want to create a (predictive) model of that phenomenon based on the data, but the choice of how to model the data is a difficult one. This work proposes methods for modeling data that operate under very few assumptions and are broadly applicable across science. Finally, a software package is proposed that would allow scientists to better understand the true distribution of their data given relatively few observations.
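As a rough illustration of distribution approximation by monotone interpolants, the sketch below fits a monotone piecewise-cubic (PCHIP) interpolant to an empirical CDF. The thesis proposes monotone *quintic* splines, so PCHIP is only a stand-in, and the sample is synthetic.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

rng = np.random.default_rng(1)
sample = np.sort(rng.normal(size=200))

# Empirical CDF values at the sorted sample points.
ecdf = (np.arange(1, sample.size + 1) - 0.5) / sample.size

# Monotone piecewise-cubic interpolant of the empirical CDF; monotonicity
# guarantees the implied density is never negative.
cdf_hat = PchipInterpolator(sample, ecdf)

grid = np.linspace(sample[0], sample[-1], 500)
values = cdf_hat(grid)
```

The appeal of a monotone interpolant here is that it yields a valid distribution function from relatively few observations, which is the use case the plain-language abstract describes.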
|
143 |
Nonparametric procedures for process control when the control value is not specified
Park, Changsoon (January 1984)
In industrial production processes, control charts have been developed to detect changes in the parameters specifying the quality of the production so that some rectifying action can be taken to restore the parameters to satisfactory values. Examples of the control charts are the Shewhart chart and the cumulative sum control chart (CUSUM chart). In designing a control chart, the exact distribution of the observations, e.g. normal distribution, is usually assumed to be known. But, when there is not sufficient information in determining the distribution, nonparametric procedures are appropriate. In such cases, the control value for the parameter may not be given because of insufficient information.
To construct a control chart when the control value is not given, a standard sample must be obtained when the process is known to be under control so that the quality of the product can be maintained at the same level as that of the standard sample. For this purpose, samples of fixed size are observed sequentially, and at each time a sample is observed a two-sample nonparametric statistic is obtained from the standard sample and the sequentially observed sample. With these sequentially obtained statistics, the usual process control procedure can be done. The truncation point is applied to denote the finite run length or the time at which sufficient information about the distribution of the observations and/or the control value is obtained so that the procedure may be switched to a parametric procedure or a nonparametric procedure with a control value.
To lessen the difficulties caused by the dependence structure of the statistics, we use the fact that, conditioned on the standard sample, the statistics are i.i.d. random variables. Upper and lower bounds of the run length distribution are obtained for the Shewhart chart. A Brownian motion process is used to approximate the discrete time process of the CUSUM chart. The exact run length distribution of the approximated CUSUM chart is derived by using the inverse Laplace transform. Applying an appropriate correction to the boundary improves the approximation. / Ph. D.
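A minimal sketch of the charting idea described above, assuming a Shewhart-style rule with a Mann-Whitney two-sample statistic (one of many possible choices; the sample sizes, shift, and signal threshold below are invented for illustration):

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(2)

# Standard sample collected while the process is known to be in control.
standard = rng.normal(loc=0.0, size=100)

def monitor(samples, standard, alpha=0.005):
    """Signal at the first sequential sample whose two-sample
    Mann-Whitney test against the standard sample rejects."""
    for i, s in enumerate(samples):
        p = mannwhitneyu(standard, s, alternative='two-sided').pvalue
        if p < alpha:
            return i
    return None

# Sequential samples of fixed size 20; a mean shift occurs at sample 10.
in_control = [rng.normal(loc=0.0, size=20) for _ in range(10)]
out_of_control = [rng.normal(loc=1.5, size=20) for _ in range(10)]
signal_at = monitor(in_control + out_of_control, standard)
```

A CUSUM version would accumulate the standardized statistics over time instead of testing each sample separately, as the abstract's Brownian-motion approximation suggests.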
|
144 |
Non-parametric regression modelling of in situ fCO2 in the Southern Ocean
Pretorius, Wesley Byron (2012)
Thesis (MComm)--Stellenbosch University, 2012. / ENGLISH ABSTRACT: The Southern Ocean is a complex system, where the relationship between CO2 concentrations and their drivers varies intra- and inter-annually. Owing to the lack of readily available in situ data in the Southern Ocean, a modelling approach was required that could predict the CO2 concentration proxy variable, fCO2. This must be done using predictor variables available via remote measurements to ensure the usefulness of the model in the future. These predictor variables were sea surface temperature, log-transformed chlorophyll-a concentration, mixed layer depth and, at a later stage, altimetry. Initial exploratory analysis indicated that a non-parametric approach to the model should be taken. A parametric multiple linear regression model was developed for comparison with previous studies in the North Atlantic Ocean as well as with the results of the non-parametric approach. A non-parametric kernel regression model was then used to predict fCO2, and finally a combination of the parametric and non-parametric regression models, referred to as the mixed regression model, was developed. The results indicated, as expected from the exploratory analyses, that the non-parametric approach produced more accurate estimates based on an independent test data set. These more accurate estimates, however, were coupled with zero estimates caused by the curse of dimensionality. It was also found that the inclusion of salinity (not available remotely) improved the model, and altimetry was therefore chosen to attempt to capture this effect in the model. The mixed model displayed reduced errors, removed the zero estimates, and hence reduced the variance of the error rates. The results indicated that the mixed model is the best approach for predicting fCO2 in the Southern Ocean and that the inclusion of altimetry did improve the prediction accuracy.
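The kernel regression step can be sketched with a one-predictor Nadaraya-Watson estimator. The thesis uses several remotely sensed predictors and a data-driven bandwidth, so treat the single predictor, the response, and the bandwidth below as placeholders.

```python
import numpy as np

def nw_kernel_regression(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson estimator with a Gaussian kernel."""
    # Pairwise differences between evaluation and training points.
    d = x_eval[:, None] - x_train[None, :]
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    # Locally weighted average of the responses.
    return (w @ y_train) / w.sum(axis=1)

rng = np.random.default_rng(3)
# Toy stand-in: one predictor (think sea surface temperature) and a
# nonlinear response (think fCO2).
x = np.sort(rng.uniform(0, 10, size=300))
y = np.sin(x) + 0.2 * rng.standard_normal(x.size)

grid = np.linspace(0.5, 9.5, 50)
y_hat = nw_kernel_regression(x, y, grid, bandwidth=0.4)
```

With several predictors the weights multiply across dimensions, and regions with few nearby observations produce near-zero weight sums: the "zero estimates" and curse of dimensionality the abstract mentions.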
|
145 |
Estimação e comparação de curvas de sobrevivência sob censura informativa / Estimation and comparison of survival curves with informative censoring
Cesar, Raony Cassab Castro (10 July 2013)
The motivation for this research is a study undertaken at the Cancer Institute of the State of São Paulo (ICESP), comprising the follow-up of eight hundred and eight patients with advanced cancer. Each patient was followed from the first cancer-related admission to the intensive care unit (ICU) for a period of up to two years. The main objective of the study is to evaluate the survival time and quality of life of these patients through a quality-adjusted lifetime (QAL). According to Gelber et al. (1989), combining these two pieces of information into the QAL induces an informative censoring scheme; therefore, traditional methods of survival analysis, such as the Kaplan-Meier estimator (Kaplan and Meier, 1958) and the log-rank test (Peto and Peto, 1972), become inappropriate. For this reason, Zhao and Tsiatis (1997) and Zhao and Tsiatis (1999) proposed new estimators for the survival function, and Zhao and Tsiatis (2001) developed a test analogous to the log-rank test to compare two survival functions; all of these methods account for informative censoring. In this dissertation we critically evaluate these methods and apply them to estimate and test survival curves derived for QAL in the ICESP study. Finally, we propose an empirical method, based on bootstrap resampling, that generalizes the Zhao and Tsiatis test to more than two groups.
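For reference, the standard Kaplan-Meier estimator that the abstract starts from can be sketched in a few lines; note that it assumes noninformative censoring, which is exactly what fails for quality-adjusted lifetimes. The five subjects below are invented.

```python
import numpy as np

def kaplan_meier(times, events):
    """Standard Kaplan-Meier estimator of the survival function.
    Valid under noninformative censoring only; the Zhao-Tsiatis
    estimators replace it when censoring is informative."""
    order = np.argsort(times)
    times, events = times[order], events[order]
    n = times.size
    surv = 1.0
    out_t, out_s = [0.0], [1.0]
    for i, (t, d) in enumerate(zip(times, events)):
        if d:  # observed event; censored times only shrink the risk set
            surv *= 1.0 - 1.0 / (n - i)
            out_t.append(float(t))
            out_s.append(surv)
    return np.array(out_t), np.array(out_s)

# Times in months with event indicator (1 = death, 0 = censored).
km_t, km_s = kaplan_meier(np.array([2.0, 4.0, 5.0, 7.0, 9.0]),
                          np.array([1, 1, 0, 1, 0]))
```

Under informative censoring the censored subjects are not exchangeable with those still at risk, which biases this risk-set construction and motivates the weighted estimators the dissertation studies.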
|
146 |
A smoothing spline approach to nonlinear inference for time series
Pitrun, Ivet (January 2001)
Abstract not available
|
147 |
Nonparametric evolutionary clustering
Xu, Tianbing (2009)
Thesis (M.S.)--State University of New York at Binghamton, Thomas J. Watson School of Engineering and Applied Science, Department of Computer Science, 2009. / Includes bibliographical references.
|
148 |
Estimação não paramétrica da função de covariância para dados funcionais agregados / Nonparametric estimation of the covariance function for aggregated functional data
Ludwig, Guilherme Vieira Nunes (18 August 2018)
Orientadores: Nancy Lopes Garcia, Ronaldo Dias / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Previous issue date: 2011 / Abstract: The goal of this dissertation is to develop nonparametric estimators for the covariance function of aggregated functional data, which consist of linear combinations of functional data that cannot be sampled separately. Such methods must produce estimates that both separate the typical covariance of each subpopulation composing the data and are nonnegative definite functions. Under these restrictions, a class of nonstationary covariance functions was proposed, to which covariance estimation results for stationary processes can be readily extended. The developed methods were illustrated with an application to two real problems: the estimation of electric energy consumers' profiles as a function of the time of day, and the estimation of the transmittance of pure substances in infrared spectroscopy, from inspection of their mixtures, as a function of the light spectrum. / Mestre em Estatística
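A toy version of the basic ingredient, the pointwise empirical covariance surface of (non-aggregated) functional data together with a nonnegative-definiteness check, might look as follows. The aggregation and spline-smoothing machinery of the dissertation is not reproduced here, and the curves are synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)
grid = np.linspace(0, 1, 50)

# Toy functional sample: random-amplitude sinusoids plus noise
# (a stand-in for, e.g., individual consumers' load curves).
n_curves = 200
curves = (rng.normal(1.0, 0.3, (n_curves, 1)) * np.sin(2 * np.pi * grid)
          + 0.1 * rng.standard_normal((n_curves, grid.size)))

# Pointwise empirical covariance surface C(s, t) on the grid.
centered = curves - curves.mean(axis=0)
cov_surface = centered.T @ centered / (n_curves - 1)

# An empirical covariance matrix is automatically nonnegative definite;
# the dissertation enforces the same property for smoothed estimators
# built from aggregated observations.
eigvals = np.linalg.eigvalsh(cov_surface)
```

The hard part the dissertation addresses, and this sketch does not, is recovering one such surface per subpopulation when only linear combinations of the curves are observable.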
|
149 |
Uma metodologia semi-parametrica para IBNR (Incurred But Not Reported) / A semi-parametric methodology for IBNR (Incurred But Not Reported)
Nascimento, Fernando Ferraz do (17 March 2006)
Orientadores: Ronaldo Dias, Nancy Lopes Garcia / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Previous issue date: 2006 / Abstract: We compare several forecasting techniques for IBNR (Incurred But Not Reported) reserves from run-off triangle data, from the simplest, such as the chain-ladder and separation techniques, to more sophisticated ones based on log-normal models and the compound Poisson distribution. We also emphasize the need for nonparametric techniques, using a model that accounts for variable truncation. We show that, even with no information about the distribution of the data, the IBNR reserve can be estimated with smaller error and variability than with the usual techniques. For the comparisons, claims were simulated from a nonhomogeneous Poisson process, with dependence between the reporting time and the claim value. The comparison measure was the mean squared error (MSE) between the simulated values and the values predicted by each technique. The parametric approach, when the data come from a compound Poisson distribution, achieved the smallest MSE among all techniques; however, when there is no information about the distribution of the data, the mixed truncation technique was the best among the nonparametric ones. / Mestre em Estatística
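The chain-ladder technique that serves as the simplest baseline can be sketched directly; the cumulative triangle below is invented, and the thesis's simulation study and truncation-based estimators are not reproduced.

```python
import numpy as np

# Toy cumulative run-off triangle: rows = accident years, columns =
# development years; NaN marks the unobserved (IBNR) lower-right part.
tri = np.array([
    [100., 150., 175., 180.],
    [110., 168., 196., np.nan],
    [120., 180., np.nan, np.nan],
    [130., np.nan, np.nan, np.nan],
])

n = tri.shape[0]
# Chain-ladder development factors: column-sum ratios over the rows
# where both adjacent development years are observed.
factors = []
for j in range(n - 1):
    obs = ~np.isnan(tri[:, j + 1])
    factors.append(tri[obs, j + 1].sum() / tri[obs, j].sum())

# Complete the lower triangle by successively applying the factors.
full = tri.copy()
for i in range(n):
    for j in range(n - 1):
        if np.isnan(full[i, j + 1]):
            full[i, j + 1] = full[i, j] * factors[j]

# IBNR reserve per accident year: predicted ultimate claims minus the
# latest observed diagonal value.
latest = np.array([tri[i, n - 1 - i] for i in range(n)])
ibnr = full[:, -1] - latest
```

The methods compared in the thesis replace these deterministic factors with stochastic models of claim occurrence, reporting delay, and severity.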
|
150 |
Estimação não-parametrica para função de covariancia de processos gaussianos espaciais / Nonparametric estimation of the covariance function of spatial Gaussian processes
Gomes, José Clelto Barros (13 August 2018)
Orientador: Ronaldo Dias / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Previous issue date: 2009 / Abstract: The challenge in modeling spatial processes lies in describing the covariance structure of the phenomenon under study. A nonparametric estimator of the covariance function was constructed as a linear combination of B-spline basis functions. These bases are used frequently in the literature thanks to their compact support and fast computation, as well as their ability to create smooth and appropriate approximations. Positive definiteness of the proposed estimator was verified by means of Bochner's theorem. For the estimation of the covariance function, an algorithm was implemented that provides a fully automated procedure based on the number of basis functions. Numerical studies showed that the procedure is asymptotically consistent, while for small samples the constraints on covariance functions must be taken into account, so the fitting was carried out as constrained nonlinear optimization. The covariance functions used in the estimation were the powered exponential, Gaussian, cubic, spherical, rational quadratic, wave, and Matérn family; nested covariance functions were also estimated. Simulations were further performed to verify the behavior of the affinity and partial affinity, which measure how close the estimated function is to the true one. The estimates proved satisfactory. / Mestre em Estatística
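The Matérn family named above, and the Bochner-style positive-definiteness requirement, can be illustrated numerically; the parameters and evaluation grid below are arbitrary.

```python
import numpy as np
from scipy.special import gamma, kv

def matern(h, sigma2=1.0, rho=1.0, nu=1.5):
    """Matérn covariance as a function of distance h >= 0."""
    h = np.asarray(h, dtype=float)
    out = np.full(h.shape, sigma2)  # C(0) = sigma2
    pos = h > 0
    u = np.sqrt(2 * nu) * h[pos] / rho
    out[pos] = sigma2 * (2 ** (1 - nu) / gamma(nu)) * u ** nu * kv(nu, u)
    return out

# Gram matrix on a set of 1-D locations; a valid covariance function
# must make every such matrix nonnegative definite (Bochner's theorem
# characterizes exactly which functions qualify).
x = np.linspace(0, 5, 40)
K = matern(np.abs(x[:, None] - x[None, :]))
eigvals = np.linalg.eigvalsh(K)
```

For nu = 1.5 the Matérn function has the closed form sigma2 (1 + sqrt(3) h / rho) exp(-sqrt(3) h / rho), which gives a quick check on the Bessel-function implementation.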
|