Global ETD Search

111	Pénalisation et réduction de la dimension des variables auxiliaires en théorie des sondages / Penalization and data reduction of auxiliary variables in survey sampling Shehzad, Muhammad Ahmed 12 October 2012
Survey sampling is used to estimate population characteristics such as the total or the mean. This thesis studies techniques for taking a large number of auxiliary variables into account when estimating a population total, in particular when the auxiliary data set is large and ill-conditioned. The first chapter recalls definitions and properties used in the rest of the manuscript: the Horvitz-Thompson estimator, presented as an estimator that does not use auxiliary information, and calibration techniques, which modify the sampling weights so that the sample reproduces exactly the population totals of the auxiliary variables, with the aim of improving the estimation of population totals for a fixed sample size. The second chapter, which is part of a review article accepted for publication, presents ridge regression as a possible remedy for multicollinearity among the auxiliary variables, and hence for ill-conditioning. It gives a detailed review of the model-based, design-based and model-assisted scenarios for ridge estimation. Ridge estimates give better results than ordinary least squares estimates in terms of mean squared error (MSE), and in the survey sampling setting the technique can also be interpreted as a penalized calibration. Simulations illustrate the improvement over the Horvitz-Thompson estimator. The third chapter treats collinearity in another way, through dimension reduction based on principal components. Principal component regression is defined and its use in survey sampling is explored. Some new principal component calibration techniques are proposed for estimating a population total: calibration on the second moments of the principal components, partial principal component calibration, and calibration on estimated principal components. An application to data from the company Médiamétrie supports the use of these dimension-reduction techniques for estimating a total in the presence of a large number of auxiliary variables.
Survey sampling; Multicollinearity; Ridge regression; Penalized calibration; Model-based estimator; Model-assisted estimator; Horvitz-Thompson estimator; Principal component calibration 519
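As a small illustration of the Horvitz-Thompson estimator named in this abstract, the sketch below uses the standard textbook form, in which each sampled value is weighted by the inverse of its inclusion probability. It is not taken from the thesis; the function name `horvitz_thompson_total` and the sample values are hypothetical.

```python
def horvitz_thompson_total(y, pi):
    """Estimate a population total as the sum of y_k / pi_k over the sample.

    y  : observed values for the sampled units
    pi : first-order inclusion probabilities of those units (each in (0, 1])
    No auxiliary variables are used, which is the point made in the abstract.
    """
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(p <= 0 or p > 1 for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(yk / pk for yk, pk in zip(y, pi))


# Hypothetical sample of three units with unequal inclusion probabilities.
sample_values = [10.0, 20.0, 30.0]
inclusion_probs = [0.5, 0.25, 0.5]
estimate = horvitz_thompson_total(sample_values, inclusion_probs)
```

Calibration and ridge-type (penalized) calibration, as described in the abstract, would adjust the weights 1 / pi_k using auxiliary totals; that step is not shown here.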
112	Medium term load forecasting in South Africa using Generalized Additive models with tensor product interactions Ravele, Thakhani 21 September 2018
MSc (Statistics) / Department of Statistics / Forecasting of electricity peak demand levels is important for decision makers in Eskom. The overall objective of this study was to develop medium term load forecasting models that will help decision makers in Eskom plan the operations of the utility company. A frequency table of hourly demand was compiled, and the results show that most daily peak loads over the period 2009 to 2013 occurred at hours 19:00 and 20:00. The study used generalized additive models with and without tensor product interactions to forecast electricity demand at 19:00 and 20:00, as well as daily peak electricity demand. The least absolute shrinkage and selection operator (Lasso) and Lasso via hierarchical interactions were used for variable selection, to increase model interpretability by eliminating irrelevant variables that are not associated with the response variable; this also reduces overfitting. The parameters of the developed models were estimated using restricted maximum likelihood and penalized regression. The best models were selected based on the smallest values of the Akaike information criterion (AIC), Bayesian information criterion (BIC) and generalized cross validation (GCV), along with the highest adjusted R². Forecasts from the best models with and without tensor product interactions were evaluated using mean absolute percentage error (MAPE), mean absolute error (MAE) and root mean square error (RMSE). Operational forecasting was proposed to forecast the demand at hour 19:00 with unknown predictor variables. Empirical results from this study show that modelling hours individually during the peak period results in more accurate peak forecasts than forecasting daily peak electricity demand. The performance of the proposed models for hour 19:00 was compared, and the generalized additive model with tensor product interactions was found to be the best-fitting model. / NRF
Generalized additive models; Lasso; Lasso via hierarchical interaction; Medium term load forecasting; Penalized regression; Restricted maximum likelihood; Tensor product interactions; Time series 333.79320968 Electric power-plants -- Load; Electric utilities -- South Africa; Electric power -- Rates -- South Africa; Electricity -- South Africa
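The three forecast accuracy measures named in this abstract (MAPE, MAE and RMSE) can be sketched as follows, using their usual definitions. This is an illustrative sketch only, not the study's code; the function name `forecast_errors` and the demand figures are hypothetical.

```python
import math


def forecast_errors(actual, forecast):
    """Return MAE, RMSE and MAPE (in percent) for paired actual/forecast values.

    MAPE divides by the actual values, so they must all be non-zero.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape}


# Hypothetical demand values, not data from the study.
metrics = forecast_errors([100.0, 200.0], [110.0, 180.0])
```

Lower values of all three measures indicate more accurate forecasts; MAPE is scale-free, while MAE and RMSE are in the units of the demand series.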
