Global ETD Search

41	Une procédure de sélection automatique de la discrétisation optimale de la ligne du temps pour des méthodes longitudinales d’inférence causale / An automatic procedure for selecting the optimal discretization of the timeline for longitudinal causal inference methods Ferreira Guerra, Steve 07 1900 No description available. Administrative databases; TMLE; Coarsening; Semi-parametric estimation; Machine learning
42	Distribuição empírica dos autovalores associados à matriz de interação dos modelos AMMI pelo método bootstrap não-paramétrico / Empirical distribution of eigenvalues associated with the interaction matrix of the AMMI models for non-parametric bootstrap method Kuang Hongyu 25 January 2012 The genotype × environment (G × E) interaction was defined by Shelbourne (1972) as the variation among genotypes in response to different environmental conditions. Its magnitude in the phenotypic expression of a trait can reduce the correlation between phenotype and genotype, inflating the genetic variance and, in turn, the parameters that depend on it, such as heritability and genetic gain from selection. Studies of phenotypic adaptability and stability make it possible to isolate the effects of the G × E interaction at the level of genotype and environment, identifying the relative contribution of each to the total interaction. Several statistical methods have been proposed for interpreting the G × E interaction arising from a group of cultivars tested in several environments. Prominent among them are the AMMI models (Additive Main Effects and Multiplicative Interaction Model), which have been widely applied in recent years. The AMMI model is a uni-multivariate method that combines an analysis of variance for the main effects, namely the effects of genotypes (G) and environments (E), with a multiplicative model for the genotype × environment interaction fitted by singular value decomposition (SVD). This multivariate technique relies on the eigenvalues and eigenvectors of the G × E interaction matrix. Araujo and Dias (2005) found that eigenvalues estimated in the conventional way can be overestimated or underestimated. Efron (1979) proposed a numerical resampling technique called the bootstrap to assess such uncertainties. The bootstrap method is a resampling technique that approximates the distribution of a function of the observations from the empirical distribution of the data. With this method, the standard error of an estimate and confidence intervals can be obtained in order to make inferences about the parameters in question. The aim of this work is to study the effect of the G × E interaction, to evaluate the adaptability and stability of genotypes in different environments through the AMMI model, with analyses based on biplot graphs, and to find the empirical distribution of the eigenvalues and compute confidence intervals using the non-parametric bootstrap. The empirical distribution of the eigenvalues can then be used to validate the hypothesis tests proposed in the literature for identifying the number of IPCA (Incremental Principal Component Analysis) terms when selecting AMMI models, and to propose a test for model selection. Analysis of variance; Genotype-environment interaction; Genotypes; Mathematical models; Non-parametric estimation; Probability
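As a rough illustration of the resampling idea in entry 42, the sketch below bootstraps the leading eigenvalue of a double-centered genotype × environment matrix. It is not the thesis's procedure: the resampling scheme (replicates redrawn within each cell), the simulated data, and the helper names `double_center`, `leading_eigenvalue` and `bootstrap_interval` are all assumptions made for the example, and the eigenvalue is approximated by power iteration to keep the code free of external libraries.

```python
import random


def double_center(matrix):
    """Remove row, column and grand means, leaving the interaction residuals."""
    rows, cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / cols for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(rows)) / rows for j in range(cols)]
    grand = sum(row_means) / rows
    return [
        [matrix[i][j] - row_means[i] - col_means[j] + grand for j in range(cols)]
        for i in range(rows)
    ]


def leading_eigenvalue(z, iterations=200):
    """Approximate largest eigenvalue of Z Z^T (squared leading singular value of Z)."""
    rows, cols = len(z), len(z[0])
    s = [[sum(z[i][k] * z[j][k] for k in range(cols)) for j in range(rows)]
         for i in range(rows)]
    # Non-constant start vector: constant vectors lie in the null space after centering.
    v = [float(i + 1) for i in range(rows)]
    for _ in range(iterations):
        w = [sum(s[i][j] * v[j] for j in range(rows)) for i in range(rows)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    sv = [sum(s[i][j] * v[j] for j in range(rows)) for i in range(rows)]
    return sum(a * b for a, b in zip(v, sv))  # Rayleigh quotient


def bootstrap_interval(cells, n_boot=500, seed=1):
    """Percentile interval (roughly 95%) for the leading eigenvalue."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_boot):
        # Resample replicates with replacement inside every genotype-environment cell.
        means = [[sum(rng.choices(reps, k=len(reps))) / len(reps) for reps in row]
                 for row in cells]
        values.append(leading_eigenvalue(double_center(means)))
    values.sort()
    return values[int(0.025 * n_boot)], values[int(0.975 * n_boot) - 1]


# Hypothetical trial: 4 genotypes x 3 environments x 3 replicates (simulated).
rng = random.Random(0)
cells = [[[rng.gauss(g * e, 1.0) for _ in range(3)] for e in range(3)] for g in range(4)]
lower, upper = bootstrap_interval(cells)
```

The same loop could collect every eigenvalue rather than only the first, which is closer to the empirical distribution the abstract describes.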
43	"Métodos de estimação na teoria de resposta ao item" / Estimation methods in item response theory Caio Lucidius Naberezny Azevedo 27 February 2003 In this work we present the most important estimation methods for some classes of item response models (dichotomous and polytomous). We discuss some properties of these methods. To compare their performance, we conducted appropriate simulations. Bayesian methods; Latent variable models; Maximum likelihood; MCMC simulation; Parametric estimation
44	Diferencia de género y determinantes de la duración del desempleo formal / Gender difference and determinants of the duration of formal unemployment Achaica Rodriguez, Luis Guillermo 25 June 2021 This research analyzes the determinants of the duration of unemployment among employed and unemployed people in Metropolitan Lima over the study period 2014-2020. A survival analysis is carried out using non-parametric Kaplan-Meier estimates, which show that the risk of exit increases as the unemployment spell lengthens. The existence of a gender difference and the effect of its determinants are then analyzed using Weibull estimation. The results show that being a woman in Metropolitan Lima reduces the chances of leaving unemployment. Factors that increase the duration of unemployment, that is, reduce the chances of leaving it, include being a woman, having a higher level of education, and having more years of experience. Conversely, the parametric estimates reveal that factors that decrease the duration of unemployment, that is, increase the hazard of leaving it, include belonging to the mestizo ethnic group, having Spanish as a mother tongue, and having health insurance. These variables make it possible to identify the population groups most vulnerable to unemployment. / Research paper Unemployment; Survival analysis; Parametric estimation
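For readers unfamiliar with the Kaplan-Meier estimate mentioned in entry 44, a minimal sketch follows. The spell lengths and censoring flags are invented for illustration and have nothing to do with the thesis's data; `kaplan_meier` is a hypothetical helper name.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct time where an exit was observed.

    durations: spell lengths (for example, months unemployed)
    observed:  True if the exit was observed, False if the spell was censored
    """
    pairs = sorted(zip(durations, observed))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events = 0
        leaving = 0
        # Group all spells that end (by exit or censoring) at the same time t.
        while i < len(pairs) and pairs[i][0] == t:
            events += 1 if pairs[i][1] else 0
            leaving += 1
            i += 1
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Illustrative data only: seven spells, three of them censored.
durations = [2, 3, 3, 5, 8, 8, 12]
observed = [True, True, False, True, True, False, False]
km_curve = kaplan_meier(durations, observed)
```

Each point of `km_curve` is the estimated probability of still being unemployed just after that duration; censored spells shrink the risk set without counting as exits.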
45	Recherche de structure dans un graphe aléatoire : modèles à espace latent / Clustering in a random graph : models with latent space Channarond, Antoine 10 December 2013 This thesis addresses the clustering of the nodes of a graph, in the framework of random models with latent variables. Each node i is assigned an unobserved (latent) random variable Z_i, and the probability that nodes i and j are connected depends conditionally on Z_i and Z_j. Unlike in the Erdős-Rényi model, connections are not independent and identically distributed; the latent variables govern the connection distribution of the nodes. These models are thus heterogeneous, and their structure is fully described by the latent variables and their distribution. Hence we aim at inferring them from the graph, which is the only observed data. In both original works of this thesis, we propose consistent inference methods whose computational cost is at most linear in the number of nodes or edges, so that large graphs can be processed in a reasonable time. Both are based on a fine study of the distribution of the degrees, suitably normalized for the model. The first work deals with the Stochastic Blockmodel. We show the consistency of an unsupervised classification algorithm using concentration inequalities. From it we derive a parameter estimation method, a model selection method for the number of latent classes, and a clustering test (testing whether there is one latent class or more), all of which are proved to be consistent. In the second work, the latent variables are positions in the space ℝ^d with a density f, and the connection probability depends on the distance between the node positions. The clusters are defined as the connected components of the level set of f at a fixed level t > 0, and the goal is to estimate their number from the observed graph only. We estimate the density at the latent positions of the nodes from their degrees, which establishes a link between the clusters and the connected components of certain subgraphs of the observed graph, obtained by removing low-degree nodes. In particular, we derive an estimator of the number of clusters and show its consistency in a certain sense. Statistics; Random graphs; Stochastic Blockmodel; Hidden or latent variables models; Clustering; Unsupervised classification; Parametric estimation; Model selection; Linkage; Non-parametric estimation; Level sets
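To give a feel for degree-based classification in a Stochastic Blockmodel, as described in entry 45, the sketch below simulates a two-class graph and splits the nodes at the largest gap in the sorted degrees. This is an assumed simplification for illustration, not the algorithm proved consistent in the thesis; the class sizes, the connection probabilities and the function names are hypothetical. Degrees are left unnormalized because dividing by n - 1 does not move the largest gap.

```python
import random


def simulate_sbm(labels, probs, rng):
    """Draw an undirected graph and return the degree of every node."""
    n = len(labels)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < probs[labels[i]][labels[j]]:
                degree[i] += 1
                degree[j] += 1
    return degree


def split_by_largest_gap(degree):
    """Two-class split: nodes above the widest gap in sorted degrees get class 1."""
    n = len(degree)
    order = sorted(range(n), key=lambda i: degree[i])
    gaps = [degree[order[k + 1]] - degree[order[k]] for k in range(n - 1)]
    cut = max(range(n - 1), key=lambda k: gaps[k])
    classes = [0] * n
    for k in range(cut + 1, n):
        classes[order[k]] = 1
    return classes


# Hypothetical model: class 0 is sparsely connected, class 1 densely connected.
rng = random.Random(42)
labels = [0] * 150 + [1] * 150
probs = [[0.05, 0.30], [0.30, 0.70]]
degree = simulate_sbm(labels, probs, rng)
estimated = split_by_largest_gap(degree)
accuracy = sum(e == l for e, l in zip(estimated, labels)) / len(labels)
```

Only the degree sequence is used after simulation, so the classification step costs a sort over the nodes.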
46	Sur l’inférence statistique pour des processus spatiaux et spatio-temporels extrêmes / On statistical inference for spatial and spatio-temporal extreme processes Abu-Awwad, Abdul-Fattah 20 June 2019 Natural hazards such as heat waves, extreme wind speeds, and heavy rainfall arise from physical processes and are spatial or spatio-temporal in extent. The development of models and inference methods for these processes is a very active area of research. This thesis deals with the statistical inference of extreme and rare events in both spatial and spatio-temporal settings. Specifically, our contributions are dedicated to two classes of stochastic processes: spatial max-mixture processes and space-time max-stable processes. The proposed methodologies are illustrated by applications to rainfall data collected from the East of Australia and from a region in the State of Florida, USA. In the spatial part, we consider hypothesis testing for the mixture parameter a of a spatial max-mixture model using two classical statistics: the Z-test statistic Za and the pairwise likelihood ratio statistic LRa. We compare their performance through an extensive simulation study. The pairwise likelihood is employed for estimation. Overall, the performance of the two statistics is satisfactory. Nevertheless, hypothesis testing presents some difficulties when a lies on the boundary of the parameter space, i.e., a ∈ {0,1}, due to the presence of additional nuisance parameters that are not identified under the null hypotheses. We apply this testing framework in an analysis of exceedances over a large threshold of daily rainfall data from the East of Australia. We also propose a novel estimation procedure to fit spatial max-mixture processes with unknown extremal dependence class. The novelty of this procedure is that it allows inference without specifying the distribution family before fitting, thereby letting the data speak for themselves. In particular, the estimation procedure uses a nonlinear least squares fit based on a closed-form expression of the so-called Fλ-madogram of max-mixture models, which contains the parameters of interest. We establish the consistency of the estimator of the mixing parameter a. An indication of asymptotic normality is given numerically. A simulation study shows that the proposed procedure improves empirical coefficients for the class of max-mixture models. In an analysis of monthly maxima of Australian daily rainfall data, we implement the proposed estimation procedure for diagnostic and confirmatory purposes. In the spatio-temporal part, based on a closed-form expression of the spatio-temporal F-madogram, we suggest a semi-parametric estimation methodology for space-time max-stable processes. This part provides a bridge between geostatistics and extreme value theory. In particular, for observations on a regular grid, the spatio-temporal F-madogram is estimated non-parametrically by its empirical version and a moment-based procedure is applied to obtain parameter estimates. The performance of the method is investigated through an extensive simulation study. Afterward, we apply this method to quantify the extremal behavior of radar daily rainfall maxima data from a region in the State of Florida. This approach could serve as an alternative or a preliminary step to pairwise likelihood estimation. Indeed, the semi-parametric estimates could be used as starting values for the optimization algorithm used to maximize the pairwise log-likelihood function, in order to reduce the computational burden and also to improve the statistical efficiency. Asymptotic dependence/independence; Composite likelihood; Extreme event; Fλ-madogram; Max-stable process; Max-mixture process; Rainfall data; Semi-parametric estimation 510
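The empirical F-madogram that entry 46 relies on can be sketched for a single pair of sites as below. This is a simplified pairwise version under stated assumptions, not the spatio-temporal estimator of the thesis: margins are replaced by rank-based empirical distribution values, the data are simulated, and the function names are hypothetical. For a max-stable pair, the madogram ν relates to the extremal coefficient through θ = (1 + 2ν) / (1 − 2ν), which ranges from 1 (full dependence) to 2 (independence).

```python
import random


def empirical_cdf_values(series):
    """Rank-based estimate of F at each observation, scaled into (0, 1)."""
    n = len(series)
    order = sorted(range(n), key=lambda i: series[i])
    values = [0.0] * n
    for rank, index in enumerate(order, start=1):
        values[index] = rank / (n + 1)
    return values


def f_madogram(site_a, site_b):
    """Half the mean absolute difference between F-transformed paired observations."""
    fa = empirical_cdf_values(site_a)
    fb = empirical_cdf_values(site_b)
    return 0.5 * sum(abs(a - b) for a, b in zip(fa, fb)) / len(fa)


def extremal_coefficient(nu):
    """theta = (1 + 2 nu) / (1 - 2 nu)."""
    return (1.0 + 2.0 * nu) / (1.0 - 2.0 * nu)


# Simulated, independent series at two hypothetical sites.
rng = random.Random(7)
site_a = [rng.expovariate(1.0) for _ in range(5000)]
site_b = [rng.expovariate(1.0) for _ in range(5000)]
theta_independent = extremal_coefficient(f_madogram(site_a, site_b))
theta_identical = extremal_coefficient(f_madogram(site_a, site_a))
```

Repeating the computation over all pairs at a given space-time lag, then matching the result to a model's closed-form expression, is the kind of moment-based fit the abstract refers to.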
47	Regressão não-paramétrica com erros correlacionados via ondaletas. / Non-parametric regression with correlated errors using wavelets Porto, Rogério de Faria 03 October 2008 In this thesis, rates of convergence to zero are obtained for the estimation risk of non-parametric regression using wavelets when the errors are correlated. Four wavelet-based non-parametric regression methods with unequally spaced design are studied in the presence of correlated errors arising from stochastic processes. Conditions on the errors and adaptations to the procedures are presented so that the estimators achieve quasi-minimax rates of convergence. Whenever possible, rates of convergence are obtained for the estimators in the domain of the function, under mild conditions on the function to be estimated, on the design, and on the error correlation. Through simulation studies, the behavior of some of the proposed methods is evaluated on finite samples. In general, it is suggested to use one of the studied procedures, but with level-dependent thresholds. Since estimating the variance of the detail coefficients can be difficult in some cases, a general semi-parametric iterative procedure is also proposed for wavelet methods in the presence of time-series errors. autocorrelation; design-adapted wavelets; lifting; non-parametric regression; semi-parametric estimation; time-series errors; warped wavelets; wavelets
48	Choix optimal du paramètre de lissage dans l'estimation non paramétrique de la fonction de densité pour des processus stationnaires à temps continu / Optimal choice of smoothing parameter in non parametric density estimation for continuous time stationary processes El Heda, Khadijetou 25 October 2018 This thesis focuses on the choice of the smoothing parameter in non-parametric estimation of the density function for stationary ergodic continuous-time processes. The accuracy of the estimate depends greatly on the choice of this parameter. The main goal of this work is to build an automatic window selection procedure and to establish its asymptotic properties under a fairly general dependence framework that can easily be used in practice. The manuscript is divided into three parts. The first part reviews the state of the art on the subject and situates our contribution within the literature. In the second part, we design an automatic method for selecting the smoothing parameter when the density is estimated by the kernel method. This choice, derived from the cross-validation method, is asymptotically optimal. In the third part, we establish asymptotic properties of the window width obtained by cross-validation, in the form of almost sure convergence results with rates. Smoothing parameter; Non-parametric estimation; Kernel estimator; Consistency; Almost sure consistency; Ergodicity; Stationarity; Continuous time; Density; Asymptotic normality
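Cross-validated bandwidth selection for a kernel density estimate, the theme of entry 48, can be sketched as follows. The thesis treats continuous-time stationary processes; this example assumes a plain independent sample and a Gaussian kernel, so it only illustrates the least-squares cross-validation criterion itself. The sample, the bandwidth grid and the function names are assumptions.

```python
import math
import random


def lscv_score(sample, h):
    """Least-squares cross-validation criterion for a Gaussian-kernel estimate.

    score(h) = integral of fhat^2  -  (2 / n) * sum_i fhat_{-i}(X_i)
    """
    n = len(sample)
    conv_sum = 0.0  # Gaussian kernel convolved with itself: N(0, 2) density
    loo_sum = 0.0   # leave-one-out kernel evaluations
    for i, xi in enumerate(sample):
        for j, xj in enumerate(sample):
            u = (xi - xj) / h
            conv_sum += math.exp(-u * u / 4.0) / (2.0 * math.sqrt(math.pi))
            if i != j:
                loo_sum += math.exp(-u * u / 2.0) / math.sqrt(2.0 * math.pi)
    integral_f2 = conv_sum / (n * n * h)
    loo_term = 2.0 * loo_sum / (n * (n - 1) * h)
    return integral_f2 - loo_term


def select_bandwidth(sample, grid):
    """Pick the grid value that minimizes the cross-validation score."""
    return min(grid, key=lambda h: lscv_score(sample, h))


# Simulated standard normal sample and an arbitrary grid of candidate bandwidths.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]
grid = [0.05 * k for k in range(1, 41)]
h_cv = select_bandwidth(sample, grid)
```

A grid search keeps the example transparent; a numerical optimizer over h would serve the same purpose.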
49	Regressão não-paramétrica com erros correlacionados via ondaletas. / Non-parametric regression with correlated errors using wavelets Rogério de Faria Porto 03 October 2008 In this thesis, rates of convergence to zero are obtained for the estimation risk of non-parametric regression using wavelets when the errors are correlated. Four wavelet-based non-parametric regression methods with unequally spaced design are studied in the presence of correlated errors arising from stochastic processes. Conditions on the errors and adaptations to the procedures are presented so that the estimators achieve quasi-minimax rates of convergence. Whenever possible, rates of convergence are obtained for the estimators in the domain of the function, under mild conditions on the function to be estimated, on the design, and on the error correlation. Through simulation studies, the behavior of some of the proposed methods is evaluated on finite samples. In general, it is suggested to use one of the studied procedures, but with level-dependent thresholds. Since estimating the variance of the detail coefficients can be difficult in some cases, a general semi-parametric iterative procedure is also proposed for wavelet methods in the presence of time-series errors. autocorrelation; design-adapted wavelets; lifting; non-parametric regression; semi-parametric estimation; time-series errors; warped wavelets; wavelets
50	Estimation of the mincerian wage model addressing its specification and different econometric issues Bhatti, Sajjad Haider 03 December 2012 In this doctoral thesis, we estimate Mincer's (1974) semi-logarithmic wage function on French and Pakistani labour force data. This model is a standard tool for estimating the relationship between earnings or wages and their contributing factors. Despite its wide and extensive use, simple estimation of the Mincerian model is biased because of several econometric problems. The main sources of bias noted in the literature are endogeneity of schooling, measurement error, and sample selectivity. We tackle the endogeneity and measurement error biases with an instrumental variables two-stage least squares approach, for which we propose two new instrumental variables. The first is defined as "the average years of schooling in the family of the concerned individual" and the second as "the average years of schooling in the country, of particular age group, of particular gender, at the particular time when an individual had joined the labour force". Schooling is found to be endogenous for both countries. Comparing the two instruments, we select the second as more appropriate. We apply the Heckman (1979) two-step procedure to eliminate possible sample selection bias, which is found to be significantly positive for both countries; this means that, in both countries, people who decided not to participate in the labour force as wage workers would have earned less than participants had they decided to work as wage earners. We estimate a specification that tackles the endogeneity and sample selectivity problems together, since we found such studies to be relatively scarce in the literature worldwide and absent for France and Pakistan in particular. The differences in coefficients proved the worth of such a specification. We also estimate the model semi-parametrically but, contrary to the general norm in the context of the Mincerian model, our semi-parametric estimation contains a non-parametric component from the first-stage schooling equation instead of a non-parametric component from the selection equation. For both countries, we find the parametric model to be more appropriate. We find the errors to be heteroscedastic in the data from both countries and apply adaptive estimation to control the adverse effects of heteroscedasticity. Comparing simple and adaptive estimation, we prefer the adaptive specification of the parametric model for both countries. Finally, we apply quantile regression to the model selected from the mean regression. Quantile regression shows that the explanatory factors have different effects in different parts of the wage distribution of the two countries. For both Pakistan and France, this would be the first study to correct both sample selectivity and endogeneity in a single specification within a quantile regression framework. Sample selection bias; Adaptive estimation; Endogeneity; Semi-parametric estimation; Wage regression; Heteroscedasticity; Mincerian model; Quantile regression; Instrumental variables
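The two-stage least squares idea in entry 50 can be illustrated with one endogenous regressor and one instrument. Everything below is a simulated toy under stated assumptions, not the thesis's specification or data: unobserved ability drives both schooling and wages, the instrument stands in for something like family average schooling, and the true return to schooling is set to 0.10 so that the bias of ordinary least squares is visible.

```python
import random


def mean(values):
    return sum(values) / len(values)


def ols(x, y):
    """Intercept and slope of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(y, x, z):
    """Stage 1: regress x on the instrument z. Stage 2: regress y on the fitted x."""
    a, b = ols(z, x)
    x_hat = [a + b * zi for zi in z]
    return ols(x_hat, y)


# Simulated data; every coefficient here is an illustrative assumption.
rng = random.Random(123)
n = 5000
ability = [rng.gauss(0.0, 1.0) for _ in range(n)]      # unobserved confounder
instrument = [rng.gauss(10.0, 2.0) for _ in range(n)]  # affects wages only via schooling
schooling = [2.0 + 0.8 * z + u + rng.gauss(0.0, 1.0)
             for z, u in zip(instrument, ability)]
log_wage = [1.0 + 0.10 * s + 0.5 * u + rng.gauss(0.0, 0.3)
            for s, u in zip(schooling, ability)]

_, beta_ols = ols(schooling, log_wage)
_, beta_iv = two_stage_least_squares(log_wage, schooling, instrument)
```

In this setup `beta_ols` overstates the return to schooling because ability is omitted, while `beta_iv` should land near the simulated value of 0.10.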
