11 |
Modélisation bayésienne avec des splines du comportement moyen d'un échantillon de courbes [Bayesian spline modelling of the mean behaviour of a sample of curves]
Merleau, James 08 1900
This thesis is about Bayesian functional data analysis in hydrology. The main objective is to model water flow data in a parsimonious fashion while still reproducing the statistical features of the data. Functional data analysis leads us to consider the water flow time series as functions to be modelled with a nonparametric method. First, the functions are registered in order to make them more homogeneous. With a more homogeneous sample of curves, we proceed to model their statistical features by relying on Bayesian regression splines in a fairly broad probabilistic framework. More specifically, we study a family of continuous distributions, which includes those of the exponential family, from which the data might have arisen. Furthermore, to have a flexible nonparametric modeling tool, we treat the interior knots, which define the basis elements of the regression splines, as random quantities. We then use MCMC with reversible jumps to explore the posterior distribution of the interior knots. To simplify the procedure in our general modeling context, we consider two approximations of the marginal distribution of the observations, namely one based on the Schwarz information criterion and another which relies on Laplace's approximation. In addition to modeling the central tendency of a sample of curves, we also propose a methodology to simultaneously model the central tendency and the dispersion of the curves in our general probabilistic framework. Finally, since we study several statistical distributions for the observations, we put forward an approach to determine the most adequate distributions for a given sample of curves.
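For reference, the two approximations named in this abstract are commonly written as below. The notation is an assumption made here for illustration and is not taken from the thesis: $y$ denotes the $n$ observations, $k$ a configuration of interior knots, $\theta_k$ the $d_k$ model parameters under that configuration, $\hat\theta_k$ the maximum likelihood estimate and $\tilde\theta_k$ the posterior mode.

```latex
% Schwarz (BIC) approximation of the log marginal likelihood
\log p(y \mid k) \;\approx\; \log p(y \mid \hat\theta_k, k) \;-\; \frac{d_k}{2}\,\log n

% Laplace approximation of the marginal likelihood
p(y \mid k) \;\approx\; p(y \mid \tilde\theta_k, k)\, p(\tilde\theta_k \mid k)\,
  (2\pi)^{d_k/2}\, \bigl|\tilde H_k\bigr|^{-1/2},
\qquad
\tilde H_k \;=\; -\,\nabla^2_{\theta}\,
  \log\bigl[\, p(y \mid \theta, k)\, p(\theta \mid k) \,\bigr]\Big|_{\theta = \tilde\theta_k}
```

Either expression replaces an integral over $\theta_k$ with a quantity computed at a single point, which is what makes the acceptance ratio of a reversible jump move over knot configurations cheap to evaluate.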
|
12 |
Bayesian Uncertainty Quantification for Large Scale Spatial Inverse Problems
Mondal, Anirban 2011 August 1900
We considered a Bayesian approach to nonlinear inverse problems in which the unknown quantity is a high-dimensional spatial field. The Bayesian approach contains a natural mechanism for regularization in the form of prior information, can incorporate information from heterogeneous sources, and provides a quantitative assessment of uncertainty in the inverse solution. The Bayesian setting casts the inverse solution as a posterior probability distribution over the model parameters. The Karhunen-Loève expansion and the discrete cosine transform (DCT) were used for dimension reduction of the random spatial field. Furthermore, we used a hierarchical Bayes model to inject multiscale data into the modeling framework. In this Bayesian framework, we have shown that the inverse problem is well-posed by proving that the posterior measure is Lipschitz continuous with respect to the data in the total variation norm. The need for multiple evaluations of the forward model on a high-dimensional spatial field (e.g. in the context of MCMC), together with the high dimensionality of the posterior, results in many computational challenges. We developed a two-stage reversible jump MCMC method which can screen out bad proposals in the first, inexpensive stage. Channelized spatial fields were represented by facies boundaries and variogram-based spatial fields within each facies. Using a level-set based approach, the shape of the channel boundaries was updated with dynamic data using a Bayesian hierarchical model in which the number of points representing the channel boundaries is assumed to be unknown. Statistical emulators on a large-scale spatial field were introduced to avoid the expensive likelihood calculation, which contains the forward simulator, at each iteration of the MCMC step. To build the emulator, the original spatial field was represented by a low-dimensional parameterization using the DCT, and the Bayesian approach to multivariate adaptive regression splines (BMARS) was then used to emulate the simulator. Various numerical results were presented by analyzing simulated as well as real data.
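The DCT-based dimension reduction mentioned in this abstract can be illustrated with a minimal sketch. It assumes a 2-D field stored as a list of rows and keeps only a square block of the lowest-frequency coefficients; the function names (`dct_matrix`, `reduce_field`, `reconstruct_field`) and the truncation rule are hypothetical choices made here, not the thesis's implementation. Pure Python is used so the sketch has no dependencies; in practice a library DCT routine would normally be preferred.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix C of size n x n (C times its transpose is the identity)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def reduce_field(field, keep):
    """Return the keep x keep block of lowest-frequency 2-D DCT coefficients."""
    c_rows = dct_matrix(len(field))
    c_cols = dct_matrix(len(field[0]))
    coeffs = matmul(matmul(c_rows, field), transpose(c_cols))
    return [row[:keep] for row in coeffs[:keep]]

def reconstruct_field(low, n_rows, n_cols):
    """Zero-pad the retained coefficients and apply the inverse 2-D DCT."""
    full = [[0.0] * n_cols for _ in range(n_rows)]
    for r, row in enumerate(low):
        for c, value in enumerate(row):
            full[r][c] = value
    return matmul(matmul(transpose(dct_matrix(n_rows)), full), dct_matrix(n_cols))

if __name__ == "__main__":
    n = 16
    # Hypothetical smooth test field: a Gaussian bump on a 16 x 16 grid.
    field = [[math.exp(-((i / n - 0.4) ** 2 + (j / n - 0.6) ** 2) / 0.1)
              for j in range(n)] for i in range(n)]
    low = reduce_field(field, 4)            # 16 parameters instead of 256
    approx = reconstruct_field(low, n, n)
    err = max(abs(a - b) for ra, rb in zip(approx, field) for a, b in zip(ra, rb))
    print("max abs reconstruction error with 4 x 4 coefficients:", err)
```

In an inverse-problem setting, the sampler or emulator would then work with the retained coefficients rather than with every grid cell, which is the point of the low-dimensional parameterization.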
|
13 |
An osteometric evaluation of age and sex differences in the long bones of South African children from the Western Cape
Stull, Kyra Elizabeth January 2013
The main goal of a forensic anthropological analysis of unidentified human remains is to
establish an accurate biological profile. The largest obstacle in the creation or validation of
techniques specific for subadults is the lack of large, modern samples. Techniques created for
subadults were mainly derived from antiquated North American or European samples and are thus
inapplicable to a modern South African population, as the techniques lack diversity and ignore
the secular trends in modern children. This research provides accurate and reliable methods to
estimate age and sex of South African subadults aged birth to 12 years from long bone lengths
and breadths, as no appropriate techniques exist.
Standard postcraniometric variables (n = 18) were collected from six long bones on 1380
(males = 804, females = 506) Lodox Statscan-generated radiographic images housed at the
Forensic Pathology Service, Salt River and the Red Cross War Memorial Children’s Hospital in
Cape Town, South Africa. Measurement definitions were derived from and/or follow studies in
fetal and subadult osteology and longitudinal growth studies. Radiographic images were
generated between 2007 and 2012; the majority of children (70%) were therefore born after 2000 and
reflect the modern population.
Because basis splines and multivariate adaptive regression splines (MARS) are
nonparametric, the 95% prediction intervals associated with each age-at-death model were
calculated with cross-validation. Numerous classification methods were employed, namely linear,
quadratic, and flexible discriminant analysis (FDA), logistic regression, naïve Bayes, and random
forests, to identify the method that consistently yielded the lowest error rates. Because some of
the multivariate subsets demonstrated small sample sizes, the classification accuracies were
bootstrapped to validate results. Both univariate and multivariate models were employed in the
age and sex estimation analyses.
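As an illustration of what bootstrapping a classification accuracy can look like, here is a minimal sketch of a percentile bootstrap. The abstract does not describe the exact procedure used, so the resampling scheme, the function name `bootstrap_accuracy`, its parameters, and the toy labels below are all assumptions made for this example.

```python
import random

def bootstrap_accuracy(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate and percentile bootstrap interval for classification accuracy.

    Cases are resampled with replacement; this is one common scheme and is an
    assumption here, not the procedure reported in the thesis.
    """
    rng = random.Random(seed)
    correct = [int(t == p) for t, p in zip(y_true, y_pred)]
    n = len(correct)
    point = sum(correct) / n

    stats = []
    for _ in range(n_boot):
        resample = [correct[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(resample) / n)
    stats.sort()

    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return point, lower, upper

if __name__ == "__main__":
    # Hypothetical labels for a small subset: 60 cases, 51 classified correctly.
    y_true = ["M"] * 30 + ["F"] * 30
    y_pred = ["M"] * 26 + ["F"] * 4 + ["F"] * 25 + ["M"] * 5
    point, lower, upper = bootstrap_accuracy(y_true, y_pred)
    print(f"accuracy = {point:.2f}, 95% bootstrap interval = [{lower:.2f}, {upper:.2f}]")
```

With small subsets, the width of the resulting interval gives a sense of how much a reported accuracy could move under resampling.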
Standard errors for the age estimation models were smaller in most of the multivariate
models with the exception of the univariate humerus, femur, and tibia diaphyseal lengths.
Univariate models provide narrower age estimates at the younger ages but the multivariate
models provide narrower age estimates at the older ages. Diaphyseal lengths did not demonstrate
any significant sex differences at any age, but diaphyseal breadths demonstrated significant sex
differences throughout the majority of the ages. Classification methods utilizing multivariate
subsets achieved the highest accuracies, which offer practical applicability in forensic
anthropology (81% to 90%). Whereas logistic regression yielded the highest classification
accuracies for univariate models, FDA yielded the highest classification accuracies for
multivariate models. This study is the first to successfully estimate subadult age and sex using an
extensive number of measurements, univariate and multivariate models, and robust statistical
analyses. The success of the current study is directly related to the large, modern sample size,
which ultimately captured a wider range of human variation than previously collected for
subadult diaphyseal dimensions. / Thesis (PhD)--University of Pretoria, 2013. / Anatomy / unrestricted
|
14 |
Comparison of the 1st and 2nd order Lee–Carter methods with the robust Hyndman–Ullah method for fitting and forecasting mortality rates
Willersjö Nyfelt, Emil January 2020
The 1st and 2nd order Lee–Carter methods were compared with the Hyndman–Ullah method with regard to goodness of fit and the ability to forecast mortality rates. Swedish population data from the Human Mortality Database were used. The robust estimation property of the Hyndman–Ullah method was also tested by including the Spanish flu and a hypothetical COVID-19 pandemic scenario. After presenting the three methods and making several comparisons between them, the thesis concludes that the Hyndman–Ullah method is overall the best of the three for the chosen dataset. Its robustness to mortality shocks was also confirmed.
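For orientation, the three models compared here are usually stated as follows. This is the standard textbook formulation, not a transcription from the thesis, and the symbols are assumptions: $m_{x,t}$ is the central death rate at age $x$ in year $t$, $a_x$ the average age profile, $b_x$ the age-specific sensitivity, and $k_t$ the period index.

```latex
% 1st order Lee-Carter
\log m_{x,t} = a_x + b_x k_t + \varepsilon_{x,t},
\qquad \sum_x b_x = 1, \quad \sum_t k_t = 0

% 2nd order Lee-Carter (one additional bilinear term)
\log m_{x,t} = a_x + b_x^{(1)} k_t^{(1)} + b_x^{(2)} k_t^{(2)} + \varepsilon_{x,t}

% Hyndman-Ullah (functional data formulation with K basis functions)
y_t(x_i) = f_t(x_i) + \sigma_t(x_i)\,\varepsilon_{t,i},
\qquad
f_t(x) = \mu(x) + \sum_{k=1}^{K} \beta_{t,k}\,\phi_k(x) + e_t(x)
```

In the Hyndman–Ullah formulation, $y_t(x_i)$ is the observed log mortality rate, $f_t$ a smoothed curve, and $\phi_k$ principal component basis functions; its robust variant estimates these components so that outlying years have less influence on the fit.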
|