Global ETD Search

191	HATLINK: a link between least squares regression and nonparametric curve estimation Einsporn, Richard L. January 1987 For both least squares and nonparametric kernel regression, prediction at a given regressor location is obtained as a weighted average of the observed responses. For least squares, the weights used in this average are a direct consequence of the form of the parametric model prescribed by the user. If the prescribed model is not exactly correct, then the resulting predictions and subsequent inferences may be misleading. On the other hand, nonparametric curve estimation techniques, such as kernel regression, obtain prediction weights solely on the basis of the distance of the regressor coordinates of an observation to the point of prediction. These methods therefore ignore information that the researcher may have concerning a reasonable approximate model. In overlooking such information, the nonparametric curve fitting methods often fit anomalous patterns in the data. This paper presents a method for obtaining an improved set of prediction weights by striking the proper balance between the least squares and kernel weighting schemes. The method is called "HATLINK," since the appropriate balance is achieved through a mixture of the hat matrices corresponding to the least squares and kernel fits. The mixing parameter is determined adaptively through cross-validation (PRESS) or by a version of the Cp statistic. Predictions obtained through the HATLINK procedure are shown through simulation studies to be robust to model misspecification by the researcher. It is also demonstrated that the HATLINK procedure can be used to perform many of the usual tasks of regression analysis, such as estimating the error variance, providing confidence intervals, testing for lack of fit of the user's prescribed model, and assisting in the variable selection process. 
In accomplishing all of these tasks, the HATLINK procedure provides a model-robust alternative to the standard model-based approach to regression. / Ph. D. LD5655.V856 1987.E467 Regression analysis Least squares Nonparametric statistics
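The idea of mixing two hat matrices and tuning the mixture by PRESS can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from the thesis: it assumes a single regressor, a straight-line least squares model, a Nadaraya-Watson fit with a Gaussian kernel and a fixed bandwidth, a convex mixture with weight `lam`, and a simple grid search. All function names are hypothetical.

```python
import math

def ols_hat(x):
    """Hat matrix of a straight-line least squares fit (intercept + slope)."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return [[1.0 / n + (xi - xbar) * (xj - xbar) / sxx for xj in x] for xi in x]

def kernel_hat(x, bandwidth):
    """Hat matrix of a Nadaraya-Watson fit with a Gaussian kernel (assumed)."""
    rows = []
    for xi in x:
        w = [math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xj in x]
        total = sum(w)
        rows.append([wj / total for wj in w])
    return rows

def loo_residuals(hat, y):
    """Leave-one-out residuals e_i / (1 - h_ii); exact for both fits above."""
    out = []
    for i, row in enumerate(hat):
        fitted = sum(h * yj for h, yj in zip(row, y))
        out.append((y[i] - fitted) / (1.0 - row[i]))
    return out

def choose_lambda(x, y, bandwidth, grid_size=101):
    """Grid search for the mixing weight that minimizes PRESS.

    Assumption: the leave-one-out prediction of the mixture is the same
    mixture of the two leave-one-out predictions.
    """
    r_ls = loo_residuals(ols_hat(x), y)
    r_k = loo_residuals(kernel_hat(x, bandwidth), y)
    best = None
    for k in range(grid_size):
        lam = k / (grid_size - 1)
        press = sum((lam * a + (1 - lam) * b) ** 2 for a, b in zip(r_ls, r_k))
        if best is None or press < best[1]:
            best = (lam, press)
    return best  # (lam, PRESS at lam)

def mixed_hat(x, bandwidth, lam):
    """Mixture lam * H_ls + (1 - lam) * H_kernel of the two hat matrices."""
    h_ls, h_k = ols_hat(x), kernel_hat(x, bandwidth)
    return [[lam * a + (1 - lam) * b for a, b in zip(ra, rb)]
            for ra, rb in zip(h_ls, h_k)]
```

With `lam = 1` the predictions are pure least squares and with `lam = 0` they are pure kernel regression; when the data follow the prescribed line closely, the search should push `lam` toward 1.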
192	Non-parametric regression modelling of in situ fCO2 in the Southern Ocean Pretorius, Wesley Byron 12 1900 Thesis (MComm)--Stellenbosch University, 2012. / ENGLISH ABSTRACT: The Southern Ocean is a complex system, where the relationship between CO2 concentrations and its drivers varies intra- and inter-annually. Due to the lack of readily available in situ data in the Southern Ocean, a model approach was required which could predict the CO2 concentration proxy variable, fCO2. This must be done using predictor variables available via remote measurements to ensure the usefulness of the model in the future. These predictor variables were sea surface temperature, log transformed chlorophyll-a concentration, mixed layer depth and at a later stage altimetry. Initial exploratory analysis indicated that a non-parametric approach to the model should be taken. A parametric multiple linear regression model was developed to use as a comparison to previous studies in the North Atlantic Ocean as well as to compare with the results of the non-parametric approach. A non-parametric kernel regression model was then used to predict fCO2 and finally a combination of the parametric and non-parametric regression models was developed, referred to as the mixed regression model. The results indicated, as expected from exploratory analyses, that the non-parametric approach produced more accurate estimates based on an independent test data set. These more accurate estimates, however, were coupled with zero estimates, caused by the curse of dimensionality. It was also found that the inclusion of salinity (not available remotely) improved the model and therefore altimetry was chosen to attempt to capture this effect in the model. The mixed model displayed reduced errors as well as removing the zero estimates and hence reducing the variance of the error rates. 
The results indicated that the mixed model is the best approach to use to predict fCO2 in the Southern Ocean and that altimetry's inclusion did improve the prediction accuracy. / AFRIKAANS SUMMARY: The Southern Ocean is a complex system in which the relationship between CO2 concentrations and their drivers varies intra- and inter-annually. A shortage of readily available in situ data from the Southern Ocean meant that a modelling approach was needed that could predict the CO2 concentration proxy variable, fCO2. This must be done using predictor variables available through remote measurements, to ensure the usefulness of the model in the future. These predictor variables included sea surface temperature, log-transformed chlorophyll-a concentration, mixed layer depth and, at a later stage, altimetry. An initial exploratory analysis indicated that a non-parametric approach to the data should be taken. A parametric multiple linear regression model was developed for comparison with the previous studies in the North Atlantic Ocean as well as with the results of the non-parametric approach. A non-parametric kernel regression model was then used to predict fCO2, and finally a combination of the parametric and non-parametric regression models was developed for the same purpose, referred to as the mixed regression model. The results showed, as expected from the exploratory analysis, that the non-parametric approach yields more accurate estimates, based on an independent test data set. These more accurate estimates were, however, accompanied by "zero" estimates caused by the curse of dimensionality. It was also found that the inclusion of salinity (not available via satellite) improves the model, and altimetry was therefore chosen in an attempt to capture this effect in the model. 
The mixed model showed smaller errors and removed the "zero" estimates, thereby reducing the variance of the error rates. The results thus showed that the mixed model is the best approach for estimating fCO2 in the Southern Ocean and that the inclusion of altimetry improves the accuracy of this estimate. Nonparametric regression Regression analysis Nonparametric statistics Carbon dioxide -- Antarctic Ocean
193	Statistical methods to study heterogeneity of treatment effects Taft, Lin H. 25 September 2015 Indiana University-Purdue University Indianapolis (IUPUI) / Randomized studies are designed to estimate the average treatment effect (ATE) of an intervention. Individuals may derive effects that differ quantitatively, or even qualitatively, from the ATE; this is called heterogeneity of treatment effect. It is important to detect the existence of heterogeneity in the treatment responses and to identify the different sub-populations. Two corresponding statistical methods will be discussed in this talk: a hypothesis testing procedure and a mixture-model based approach. The hypothesis testing procedure was constructed to test for the existence of a treatment effect in sub-populations. The test is nonparametric and can be applied to all types of outcome measures. A key innovation of this test is to build stochastic search into the test statistic to detect signals that may not be linearly related to the multiple covariates. Simulations were performed to compare the proposed test with existing methods. A power calculation strategy was also developed for the proposed test at the design stage. The mixture-model based approach was developed to identify and study the sub-populations with different treatment effects from an intervention. A latent binary variable was used to indicate whether or not a subject was in a sub-population with average treatment benefit. The mixture model combines a logistic formulation of the latent variable with proportional hazards models. The parameters in the mixture model were estimated by the EM algorithm. The properties of the estimators were then studied by simulation. Finally, all of the above methods were applied to a real randomized study in a low ejection fraction population that compared the Implantable Cardioverter Defibrillator (ICD) with conventional medical therapy in reducing total mortality. 
Bootstrap Heterogeneity Nonparametric Randomized trial Stochastic search Instrumental variables (Statistics) Nonparametric statistics Opportunity costs Quantitative research Qualitative research
194	Nonparametric statistical inference for functional brain information mapping Stelzer, Johannes 26 May 2014 An ever-increasing number of functional magnetic resonance imaging (fMRI) studies are now using information-based multi-voxel pattern analysis (MVPA) techniques to decode mental states. In doing so, they achieve a significantly greater sensitivity than with univariate analysis frameworks. The two most prominent MVPA methods for information mapping are searchlight decoding and classifier weight mapping. The new MVPA brain mapping methods, however, have also posed new challenges for analysis and statistical inference on the group level. In this thesis, I discuss why the usual procedure of performing t-tests on MVPA-derived information maps across subjects in order to produce a group statistic is inappropriate. I propose a fully nonparametric solution to this problem, which achieves higher sensitivity than the most commonly used t-based procedure. The proposed method is based on resampling methods and preserves the spatial dependencies in the MVPA-derived information maps. This makes it possible to incorporate a cluster size control for the multiple testing problem. Using a volumetric searchlight decoding procedure and classifier weight maps, I demonstrate the validity and sensitivity of the new approach using both simulated and real fMRI data sets. In comparison to the standard t-test procedure implemented in SPM8, the new results showed a higher sensitivity and spatial specificity. The second goal of this thesis is the comparison of the two widely used information mapping approaches: the searchlight technique and classifier weight mapping. Both methods take into account the spatially distributed patterns of activation in order to predict stimulus conditions; however, the searchlight method operates solely on the local scale. The searchlight decoding technique has furthermore been found to be prone to spatial inaccuracies. 
For instance, the spatial extent of informative areas is generally exaggerated, and their spatial configuration is distorted. In this thesis, I compare searchlight decoding with linear classifier weight mapping, both within the previously proposed nonparametric statistical framework, using a simulation and ultra-high-field 7T experimental data. It was found that the searchlight method led to spatial inaccuracies that are especially noticeable in high-resolution fMRI data. In contrast, the weight mapping method was more spatially precise, revealing both informative anatomical structures and the direction in which voxels contribute to the classification. By maximizing the spatial accuracy of ultra-high-field fMRI results, such global multivariate methods provide a substantial improvement for characterizing structure-function relationships. fMRI nonparametric statistics MVPA pattern classification functional magnetic resonance imaging multivariate searchlight decoding ddc:500
195	Contributions to robust methods in nonparametric frontier models Bruffaerts, Christopher 10 September 2014 Frontier models are currently widely used by economists, managers, and other decision-makers. In these models, the researcher's goal is to assign to production units (firms, hospitals, or universities, for example) a measure of their production efficiency. Do these units (denoted DMUs, for Decision-Making Units) make good use of their inputs and outputs? Do they use their full potential in the production process? The production set is the set of all combinations of inputs and outputs that are physically attainable in an economy. From this set, involving p inputs and q outputs, the efficiency of a production unit can be defined as a distance separating the DMU from the frontier of the production set. Given a sample of DMUs, the goal is to reconstruct this production frontier so that the efficiency of the DMUs can be evaluated against it. To this end, the researcher very often uses so-called classical methods such as Data Envelopment Analysis (DEA). Nowadays, statisticians have access to more and more data, which also means that they do not have the opportunity to inspect every observation in their database. Outliers may well slip into data sets without being noticed. In particular, frontier models are extremely sensitive to outliers, which can strongly influence the subsequent inference. 
To prevent certain observations from hindering a correct analysis, robust methods are used. Combining robustness with the efficiency evaluation problem is the general objective of this thesis. The first chapter sets the scene by presenting the existing literature in this area. The following four chapters are organized as scientific articles. Chapter 2 studies the robustness properties of a particular efficiency estimator. This estimator measures the distance between the DMU under analysis and the production frontier along a hyperbolic path passing through the unit. This very specific type of distance proves very useful for defining directional efficiency. Chapter 3 extends the first article to the case of directional efficiency. This type of distance generalizes all linear-type distances for evaluating the efficiency of a DMU. In addition to studying the robustness properties of the directional efficiency estimator, an outlier detection method is presented. It proves very useful for identifying influential production units in this multidimensional space (of dimension p+q). Chapter 4 presents inference methods for efficiencies in nonparametric frontier models. In particular, resampling methods such as the bootstrap or subsampling prove very useful. First, this article shows how to improve inference on efficiencies through subsampling and proves that using a robust efficiency estimator within resampling methods is not sufficient to obtain reliable inference. For this reason, the article then proposes a robust resampling method adapted to the efficiency evaluation problem. Finally, the last chapter is an empirical application. 
More precisely, this analysis concerns the research efficiency of public and private American universities. Classical and robust methods are used to show how all the tools studied previously can be applied in practice. In particular, this study makes it possible to examine the impact on the efficiency of American institutions of variables such as teaching, internationalization, or collaboration with industry. / Doctorate in sciences, statistics track / info:eu-repo/semantics/nonPublished Mathematics Exact and natural sciences Nonparametric statistics Resampling (Statistics) Distribution (Probability theory) Frontier models Nonparametric DEA Robustness
196	Gestion des actifs financiers : de l’approche Classique à la modélisation non paramétrique en estimation du DownSide Risk pour la constitution d’un portefeuille efficient / The Management of financial assets : from Classical Approach to the Nonparametric Modelling in the DownSide Risk Estimation in Order to Get an Optimal Portfolio Ben Salah, Hanene 23 November 2015 The portfolio optimization method based on minimizing the DownSide Risk was developed to remedy the shortcomings of the classical Markowitz method, whose assumption of normally distributed returns very often fails. In this thesis, we propose introducing nonparametric estimators of the conditional mean or median to replace the observed returns of a portfolio, or of the assets making up a portfolio, in the DownSide Risk setting. These estimators yield efficient frontiers that are smooth and easy to interpret. We develop iterative algorithms to solve the various optimization problems that produce optimal portfolios. We also propose a new risk measure, called conditional risk, which takes into account anticipations of the future values of the various returns. To define it, we use nonparametric predictors based on estimation of the conditional mean. Finally, we tested and validated all our methods on data from different markets and showed their performance and efficiency compared with classical methods. / The DownSide Risk (DSR) model for portfolio optimization makes it possible to overcome the drawbacks of the classical Mean-Variance model concerning the asymmetry of returns and the risk perception of investors. 
This optimization model deals with a positive definite matrix that is endogenous with respect to the portfolio weights and hence leads to a nonstandard optimization problem. To bypass this hurdle, we developed a new recursive minimization procedure that ensures convergence to the solution and gives a smooth portfolio efficient frontier. Our method consists in replacing all the returns by their nonparametric estimator counterparts using kernel mean or median regressions. This technique provides an effect similar to the case where an infinite number of observations is available. We also develop a new portfolio optimization model where the risks are measured through conditional variance or semivariance. This strategy allows us to take advantage of return predictions, which are obtained by nonparametric univariate methods. The prediction step uses kernel estimation of the conditional mean. Data from different markets are used to test and validate the proposed approaches, and results indicate better overall performance. Conditional Risk DownSide Risk Kernel Predictors Nonparametric Mean Estimation Nonparametric Median Estimation Semivariance 658.1
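As a rough illustration of the downside risk notion used in this abstract, the sketch below computes the semivariance of a portfolio's returns below a benchmark. It assumes the common definition (mean squared shortfall below a benchmark return) and does not reproduce the kernel-smoothed returns or the recursive minimization of the thesis. Function names and the default benchmark of 0 are hypothetical.

```python
def portfolio_returns(asset_returns, weights):
    """Per-period portfolio return; asset_returns is a list of per-period rows."""
    return [sum(w * r for w, r in zip(weights, period)) for period in asset_returns]

def downside_semivariance(returns, benchmark=0.0):
    """Mean squared shortfall of returns below the benchmark (assumed definition)."""
    shortfalls = [min(r - benchmark, 0.0) for r in returns]
    return sum(s * s for s in shortfalls) / len(returns)

# Two assets observed over four periods, equally weighted (made-up numbers).
history = [[0.10, 0.10], [-0.30, -0.10], [0.05, -0.05], [-0.10, -0.10]]
risk = downside_semivariance(portfolio_returns(history, [0.5, 0.5]))
```

Which periods count as shortfalls depends on the weights themselves, which is one way to see why the risk matrix in such a model is endogenous with respect to the portfolio.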
197	Methodologies for Missing Data with Range Regressions Stoll, Kevin Edward 24 April 2019 No description available. Statistics Missing Data Missing Response Nonparametric Range Regression Nonparametric Range Regression Propensity Score Ascendancy Average Rank Propensity Stratification Regression Bootstrap Missing at Random Double-Robust Consistency Almost Sure
198	Scalable Nonparametric L1 Density Estimation via Sparse Subtree Partitioning Sandstedt, Axel January 2023 We consider the construction of multivariate histogram estimators for any density f seeking to minimize its L1 distance to the true underlying density using arbitrarily large sample sizes. Theory for such estimators exists and the early stages of distributed implementations are available. Our main contributions are new algorithms which seek to optimise out unnecessary network communication taking place in the distributed stages of the construction of such estimators using sparse binary tree arithmetics. density estimation scalable density estimation nonparametric density estimation L1 L_1 anomaly detection regression analysis Probability Theory and Statistics Sannolikhetsteori och statistik
199	The Rasch Sampler Verhelst, Norman D., Hatzinger, Reinhold, Mair, Patrick 22 February 2007 The Rasch sampler is an efficient algorithm to sample binary matrices with given marginal sums. It is a Markov chain Monte Carlo (MCMC) algorithm. The program can handle matrices of up to 1024 rows and 64 columns. A special option allows sampling square matrices with given marginals and a fixed main diagonal, a problem prominent in social network analysis. In all cases the stationary distribution is uniform. The user has control over the serial dependency. (authors' abstract)
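To make the sampling problem concrete, the sketch below runs a simple Markov chain over binary matrices that keeps all row and column sums fixed. It is not the Rasch sampler's own algorithm. It uses the simpler, well-known checkerboard swap move as a stand-in, and the function names are hypothetical. Because each proposal is symmetric and rejected proposals leave the matrix unchanged, the chain's stationary distribution is uniform over the matrices it can reach.

```python
import random

def swap_step(matrix, rng):
    """Try one checkerboard swap; leaves the matrix unchanged if not applicable."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    r1, r2 = rng.sample(range(n_rows), 2)
    c1, c2 = rng.sample(range(n_cols), 2)
    a, b = matrix[r1][c1], matrix[r1][c2]
    c, d = matrix[r2][c1], matrix[r2][c2]
    # A 2x2 submatrix of the form [[1, 0], [0, 1]] or [[0, 1], [1, 0]]
    # can be flipped without changing any row or column sum.
    if a == d and b == c and a != b:
        matrix[r1][c1], matrix[r1][c2] = b, a
        matrix[r2][c1], matrix[r2][c2] = d, c

def sample_fixed_margins(matrix, n_steps, seed=None):
    """Run n_steps swap attempts from a copy of matrix and return the result."""
    rng = random.Random(seed)
    current = [row[:] for row in matrix]
    for _ in range(n_steps):
        swap_step(current, rng)
    return current

start = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
draw = sample_fixed_margins(start, n_steps=500, seed=1)
```

Taking more steps between retained draws reduces the dependence between successive samples, which is the kind of serial-dependency control the abstract mentions.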
200	Empirical Bayesian Smoothing Splines for Signals with Correlated Errors: Methods and Applications Rosales Marticorena, Luis Francisco 22 June 2016 No description available. 510 nonparametric statistics smoothing splines Demmler-Reinsch basis correlated errors Mathematik (PPN61756535X)
