  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Mean preservation in censored regression using preliminary nonparametric smoothing

Heuchenne, Cédric 18 August 2005
In this thesis, we consider the problem of estimating the regression function in location-scale regression models. This model assumes that the random vector (X,Y) satisfies Y = m(X) + s(X)e, where m(.) is an unknown location function (e.g. conditional mean, median, truncated mean, ...), s(.) is an unknown scale function, and e is independent of X. The response Y is subject to random right censoring, and the covariate X is completely observed. In the first part of the thesis, we assume that m(x) = E(Y|X=x) follows a polynomial model. A new estimation procedure for the unknown regression parameters is proposed, which extends the classical least squares procedure to censored data. The proposed method is inspired by the method of Buckley and James (1979) but, unlike that method, it is non-iterative thanks to a preliminary nonparametric estimation step. The asymptotic normality of the estimators is established. Simulations are carried out for both methods, and they show that the proposed estimators usually have smaller variance and smaller mean squared error than the Buckley-James estimators. In the second part, we suppose that m(.) = E(Y|.) belongs to some parametric class of regression functions. A new estimation procedure for the true, unknown vector of parameters is proposed, which extends the classical least squares procedure for nonlinear regression to the case where the response is subject to censoring. The proposed technique uses new 'synthetic' data points that are constructed by using a nonparametric relation between Y and X. The consistency and asymptotic normality of the proposed estimator are established, and the estimator is compared via simulations with an estimator proposed by Stute in 1999. In the third part, we study the nonparametric estimation of the regression function m(.). It is well known that the completely nonparametric estimator of the conditional distribution F(.|x) of Y given X=x suffers from inconsistency problems in the right tail (Beran, 1981), and hence the location function m(x) cannot be estimated consistently in a completely nonparametric way whenever m(x) involves the right tail of F(.|x) (as is the case, e.g., for the conditional mean). We propose two alternative estimators of m(x) that do not share these inconsistency problems. The idea is to make use of the assumed location-scale model in order to improve the estimation of F(.|x), especially in the right tail. We obtain the asymptotic properties of the two proposed estimators of m(x). Simulations show that the proposed estimators outperform the completely nonparametric estimator in many cases.
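The data structure described in this abstract can be made concrete with a small simulation. The sketch below only generates data from a location-scale model with random right censoring; it does not implement any of the estimators proposed in the thesis. The function name, the uniform covariate, the standard normal error, and the exponential censoring distribution are all illustrative assumptions.

```python
import random


def simulate_censored(n, m, s, censor_rate=0.2, seed=0):
    """Draw n triples (x, z, delta) from Y = m(X) + s(X) * e with right censoring.

    z = min(Y, C) is the observed response and delta = 1 if Y was observed
    (uncensored), 0 if it was censored by C.
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)         # covariate, completely observed (assumed uniform)
        e = rng.gauss(0.0, 1.0)           # error term, independent of x (assumed normal)
        y = m(x) + s(x) * e               # latent response from the location-scale model
        c = rng.expovariate(censor_rate)  # censoring variable (assumed exponential)
        sample.append((x, min(y, c), 1 if y <= c else 0))
    return sample
```

Calling `simulate_censored(200, lambda x: 1.0 + 2.0 * x, lambda x: 0.5 + x)` returns 200 triples in which only `z` and `delta` carry information about the response, which is the situation the thesis's estimators are designed for.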
42

Testing for spatial correlation and semiparametric spatial modeling of binary outcomes with application to aberrant crypt foci in colon carcinogenesis experiments

Apanasovich, Tatiyana Vladimirovna 01 November 2005
In an experiment to understand colon carcinogenesis, all animals were exposed to a carcinogen while half the animals were also exposed to radiation. Spatially, we measured the existence of aberrant crypt foci (ACF), namely morphologically changed colonic crypts that are known to be precursors of colon cancer development. The biological question of interest is whether the locations of these ACFs are spatially correlated: if so, this indicates that damage to the colon due to carcinogens and radiation is localized. Statistically, the data take the form of binary outcomes (corresponding to the existence of an ACF) on a regular grid. We develop score-type methods based upon the Matérn and conditionally autoregressive (CAR) correlation models to test for spatial correlation in such data, while allowing for nonstationarity. Because of a technical peculiarity of the score-type test, we also develop robust versions of the method. The methods are compared to a generalization of Moran's test for continuous outcomes, and are shown via simulation to have the potential for increased power. When applied to our data, the methods indicate the existence of spatial correlation, and hence indicate localization of damage. Assuming that there are correlations in the locations of the ACF, the questions are how great these correlations are, and whether the correlation structures differ when an animal is exposed to radiation. To understand the extent of the correlation, we cast the problem as a spatial binary regression, where binary responses arise from an underlying Gaussian latent process. We model the marginal probabilities of ACF semiparametrically, using fixed-knot penalized regression splines and single-index models. We fit the models using pairwise pseudolikelihood methods. Assuming that the underlying latent process is strongly mixing, known to be the case for many Gaussian processes, we prove asymptotic normality of the methods. The penalized regression splines have penalty parameters that must converge to zero asymptotically: we derive rates for these parameters that do and do not lead to an asymptotic bias, and we derive the optimal rate of convergence for them. Finally, we apply the methods to the data from our experiment.
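For orientation, the classical Moran's I statistic that the abstract uses as a point of comparison can be sketched for binary outcomes on a regular grid. This is the textbook statistic with rook (4-neighbour) adjacency and unit weights, which is an assumption made here; it is not the generalized Moran test or the score-type tests developed in the thesis.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid using rook (4-neighbour) adjacency."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    denom = sum((v - mean) ** 2 for row in grid for v in row)
    if denom == 0:
        raise ValueError("Moran's I is undefined for a constant grid")
    num = 0.0
    w_total = 0  # sum of all (unit) neighbour weights
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    num += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    w_total += 1
    return (n / w_total) * (num / denom)
```

On a 4x4 grid whose left half is 1 and right half is 0, the statistic is positive (about 0.67), reflecting clustering; on a 4x4 checkerboard it equals -1, reflecting perfect negative spatial association.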
44

Essays on Trade Agreements, Agricultural Commodity Prices and Unconditional Quantile Regression

Li, Na 03 January 2014
My dissertation consists of three essays in three different areas: international trade, agricultural markets, and nonparametric econometrics. The first and third essays are theoretical papers, while the second essay is empirical. In the first essay, I develop a political economy model of trade agreements in which the set of policy instruments is endogenously determined, providing a rationale for countervailing duties (CVDs). Trade-related policy intervention is assumed to be largely shaped in response to rent-seeking demand, as is often shown empirically. Consequently, the uncertainty arising during the lifetime of a trade agreement involves both economic and rent-seeking conditions. The latter approximates actual trade policy decisions more closely than the externality hypothesis and thus provides scope for empirical testing. The second essay tests whether normal mixture (NM) generalized autoregressive conditional heteroscedasticity (GARCH) models adequately capture the relevant properties of agricultural commodity prices. Volatility series were constructed for the weekly cash prices of ten agricultural commodities. NM-GARCH models allow for heterogeneous volatility dynamics among different market regimes. Both in-sample fit and out-of-sample forecasting tests confirm that the two-state NM-GARCH approach performs significantly better than the traditional normal GARCH model. For each commodity, it is found that an expected negative price change corresponds to a higher volatility persistence, while an expected positive price change arises in conjunction with a greater responsiveness of volatility. In the third essay, I propose an estimator for a nonparametric additive unconditional quantile regression model. Unconditional quantile regression can assess the possibly different impacts of covariates on different unconditional quantiles of a response variable. The proposed estimator does not require d-dimensional nonparametric regression and therefore avoids the curse of dimensionality. In addition, the estimator has an oracle property in the sense that the asymptotic distribution of each additive component is the same as in the case when all other components are known. Both numerical simulations and an empirical application suggest that the new estimator performs much better than alternatives. / the Canadian Agricultural Trade Policy and Competitiveness Research Network, the Structure and Performance of Agriculture and Agri-products Industry Network, and the Institute for the Advanced Study of Food and Agricultural Policy.
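The "traditional normal GARCH model" that serves as the baseline in the second essay is commonly taken to be GARCH(1,1), whose conditional variance follows sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1]. The sketch below computes that recursion only; it is not the two-state NM-GARCH model from the dissertation, and the function name, the choice of GARCH(1,1), and the initialization at the unconditional variance are assumptions made for illustration.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances of a GARCH(1,1) model for a series of returns.

    sigma2[0] is set to the unconditional variance omega / (1 - alpha - beta);
    sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1] afterwards.
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    if not returns:
        return []
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2
```

Here `alpha` governs how strongly volatility responds to a new shock and `alpha + beta` governs persistence, the two features the abstract contrasts across market regimes.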
45

Analyse de données fonctionnelles en télédétection hyperspectrale : application à l'étude des paysages agri-forestiers / Functional data analysis in hyperspectral remote sensing : application to the study of agri-forest landscape

Zullo, Anthony 19 September 2016
In hyperspectral imaging, each pixel is associated with a spectrum derived from the reflectance observed at d measurement points (i.e., wavelengths). We often face a situation where the sample size n is relatively small compared to the number d of variables. This phenomenon, called the "curse of dimensionality", is well known in multivariate statistics: the more d increases with respect to n, the more the performance of standard statistical methodologies degrades. Reflectance spectra incorporate in their spectral dimension a continuum that gives them a functional nature. A hyperspectrum can be modeled as a univariate function of wavelength, whose representation is a curve. Using functional methods on such data makes it possible to take into account functional aspects such as continuity and the order of the spectral bands, and to overcome the strong correlations arising from the fineness of the discretization grid. The main aim of this thesis is to assess the relevance of the functional approach in the field of hyperspectral remote sensing for statistical analysis. We focused on the nonparametric functional regression model, which covers supervised classification. First, the functional approach was compared with multivariate methods commonly used in remote sensing. The functional approach outperforms multivariate methods in difficult situations where a small training sample is combined with relatively homogeneous classes (that is, classes that are hard to discriminate). Second, an alternative to the functional approach for overcoming the curse of dimensionality was developed using a parsimonious model. By selecting a small number of measurement points, this model reduces the dimensionality of the problem while increasing the interpretability of the results. Third, we considered the almost systematic practical situation in which the functional data are contaminated. We proved that, for a fixed sample size, the finer the discretization, the better the prediction. In other words, the larger d is compared to n, the more effective the proposed functional statistical method is.
46

A relação entre o tamanho das propriedades agrícolas e a produtividade no Brasil: uma análise não paramétrica / The relationship between farm size and productivity in Brazil: a nonparametric analysis

Alexandre Amorim de Souza Ferreira 05 April 2018
Nonparametric kernel regression analysis disregards any influence of the functional forms commonly employed in parametric regression analyses, allowing the data to "speak for themselves." While parametric estimators are considered global, nonparametric kernel estimators use a sample of data near a given point (defined by the bandwidth) to fit the estimate, which allows a focus on local peculiarities of the data. Both analyses were applied to data from the 2006 Census of Agriculture conducted by IBGE, aggregated by municipality and into seventeen farm-size classes, to estimate a production function with the objective of establishing the relationship between farm size and the value of production per hectare (productivity). The observed relationship was inverse, but the local analysis carried out with the kernel estimators revealed a direct relationship between the production elasticities of the inputs and farm size, which does not justify a land redistribution policy aimed at increasing productivity. In addition, counterfactual graphical analyses (which held the inputs, except area, constant at their mean values) showed that the relationship is not linear, is not monotonic, and differs among regions, which poses a challenge for the design of land redistribution policies.
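The idea of estimating locally from nearby observations, with the bandwidth defining what counts as nearby, can be illustrated with a local-constant kernel regression. The sketch assumes a Nadaraya-Watson estimator with a Gaussian kernel and a single covariate; the abstract does not say which kernel estimator or kernel function the thesis uses, so both choices are assumptions.

```python
import math


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Local-constant kernel regression estimate of E[Y | X = x0].

    Each observation is weighted by a Gaussian kernel of its distance to x0,
    scaled by the bandwidth; the estimate is the weighted mean of the responses.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observation carries weight at x0; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total
```

A small bandwidth makes the fit follow local peculiarities of the data, while a large bandwidth pulls every estimate toward the overall mean of the responses.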
47

[en] SEMIPARAMETRIC POISSON-GAMMA MODELS: A ROUGHNESS PENALTY APPROACH / [pt] MODELO POISSON-GAMA SEMI-PARAMÉTRICO: UMA ABORDAGEM DE PENALIZAÇÃO POR RUGOSIDADE

WASHINGTON LEITE JUNGER 19 February 2004
This work extends the Poisson-Gamma models towards a more general specification, in which the linear predictor of the covariates is replaced by an additive predictor of generic functions of these covariates. As in generalized additive models (GAM), linear functions of the covariates are a particular case of the additive model, and natural cubic splines are used as smoothing functions. The semiparametric specification broadens the range of applications of this class of models. The semiparametric models are fitted by an iterative process that combines likelihood maximization and the backfitting algorithm. All the routines for model fitting and diagnostics are implemented in the R and C programming languages.
48

Least squares estimation for binary decision trees

Albrecht, Nadine 14 December 2020
In this thesis, a binary decision tree is used as an approximation of a nonparametric regression curve. The best-fitting decision tree is estimated from data via the least squares method. It is investigated how and under which conditions the estimator converges. These asymptotic results are then used to create asymptotic convergence regions.
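The least squares fit of the simplest binary decision tree, a single split with a constant prediction on each side, can be sketched as follows. This illustrates the general technique for one covariate and one split; the function name and the midpoint convention for the threshold are assumptions, and the thesis's estimator and its asymptotic theory are not reproduced here.

```python
def best_split(xs, ys):
    """Least squares fit of a one-split regression tree (a stump).

    Returns (threshold, left_mean, right_mean, sse), where observations with
    x below the threshold are predicted by left_mean and the rest by right_mean.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2  # midpoint convention
        if best is None or sse < best[3]:
            best = (threshold, left_mean, right_mean, sse)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best
```

For data with a single jump, such as `xs = [1, 2, 3, 7, 8, 9]` and `ys = [1, 1, 1, 3, 3, 3]`, the search places the threshold at 5.0 and fits the two levels exactly.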
49

Contribution à la régression non paramétrique avec un processus erreur d'autocovariance générale et application en pharmacocinétique / Contribution to nonparametric regression estimation with general autocovariance error process and application to pharmacokinetics

Benelmadani, Djihad 18 September 2019
In this thesis, we consider the fixed design regression model with repeated measurements, where the errors form a process with a general autocovariance function, i.e. a second-order process (stationary or nonstationary) whose covariance function is non-differentiable along the diagonal. We are interested, among other problems, in the nonparametric estimation of the regression function of this model. We first consider the well-known kernel regression estimator proposed by Gasser and Müller. We study its asymptotic performance when the number of experimental units and the number of observations tend to infinity. For a regular sequence of designs, we improve the higher-order rates of convergence of the bias and the variance. We also prove the asymptotic normality of this estimator in the case of correlated errors. Second, we propose a new kernel estimator of the regression function based on a projection property. This estimator is constructed through the autocovariance function of the errors and a specific function belonging to the Reproducing Kernel Hilbert Space (RKHS) associated with the autocovariance function. We study its asymptotic performance using the RKHS properties. These properties allow us to obtain the optimal convergence rate of the variance. We also prove its asymptotic normality. We show that this new estimator has a smaller asymptotic variance than that of Gasser and Müller. A simulation study is conducted to confirm this theoretical result. Third, we propose a new kernel estimator for the regression function. This estimator is constructed through the trapezoidal numerical approximation of the kernel regression estimator based on continuous observations. We study its asymptotic performance, and we prove its asymptotic normality. Moreover, this estimator makes it possible to obtain the asymptotically optimal sampling design for the estimation of the regression function. We run a simulation study to test the performance of the proposed estimator in a finite sample setting, where it performs well in terms of Integrated Mean Squared Error (IMSE). In addition, we show the reduction of the IMSE obtained by using the optimal sampling design instead of the uniform design in a finite sample setting. Finally, we consider an application of regression function estimation to pharmacokinetics problems. We propose to use nonparametric kernel methods for the estimation of the concentration-time curve, instead of the classical parametric ones. We verify their good performance via a simulation study and real data analysis. We also investigate the problem of estimating the Area Under the concentration Curve (AUC), for which we introduce a new kernel estimator, obtained by integrating the regression function estimator. We show, using a simulation study, that the proposed estimator outperforms the classical one in terms of Mean Squared Error. The crucial problem of finding the optimal sampling design for the AUC estimation is investigated using the Generalized Simulated Annealing algorithm.
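As a point of reference for the AUC problem, the area under a sampled concentration-time curve can be approximated with the trapezoidal rule, the same numerical device the abstract mentions for approximating a continuous-observation estimator. The sketch below is that basic rule applied to hypothetical sampling times and concentrations; it is not the kernel AUC estimator introduced in the thesis, and whether it matches the "classical" estimator the thesis compares against is an assumption.

```python
def auc_trapezoid(times, concentrations):
    """Trapezoidal approximation of the area under a concentration-time curve."""
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two matching time/concentration pairs")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("sampling times must be strictly increasing")
        area += 0.5 * dt * (concentrations[i] + concentrations[i - 1])
    return area
```

Because the rule accepts unequally spaced sampling times, it can be evaluated on any candidate sampling design, which is what makes the design question raised in the abstract meaningful.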
50

Non- and semiparametric models for conditional probabilities in two-way contingency tables / Modèles non-paramétriques et semiparamétriques pour les probabilités conditionnelles dans les tables de contingence à deux entrées

Geenens, Gery 04 July 2008
This thesis is mainly concerned with the estimation of conditional probabilities in two-way contingency tables, that is, probabilities of the type P(R=i,S=j|X=x), for (i,j) in {1, ..., r} × {1, ..., s}, where R and S are the two categorical variables forming the contingency table, with r and s levels respectively, and X is a vector of explanatory variables possibly associated with R, S, or both. Analyzing such a conditional distribution is often of interest, as it allows one to go further than the usual unconditional study of the behavior of the variables R and S. First, one can check for a possible effect of these covariates on the distribution of the individuals across the cells of the table, and second, one can carry out the usual analyses of contingency tables, such as independence tests, while taking this effect into account and, in some sense, removing it. This helps, for instance, to identify the external factors that could be responsible for a possible association between R and S. It also makes it possible to adjust for possible heterogeneity in the population of interest when analyzing the table.
