261

用戶別售電量與電費收入之研究:台電公司實證案例 / A Study on Customer-by-Category Energy Sales and Power Sales Revenue Model: The Case of Taiwan Power Company

蔡佩容 Unknown Date (has links)
This study examines the rationality of Taiwan Power Company's (Taipower) current division of the year into summer and non-summer months under its seasonal pricing scheme, and identifies the economic factors that influence energy sales and power sales revenue by customer category. Cluster analyses from both a load perspective and a cost perspective are first conducted to test whether the seasonal pricing scheme is statistically justified. Two econometric models are then formulated — an ARIMA time-series model and a multiple regression model — with the total energy sales and total power sales revenue of each customer category (lighting and industrial customers) as the explained variables, using monthly data from January 1999 to December 2002. The economic implications of the estimated parameters are discussed, and conclusions and directions for future research are presented.

The cluster analysis of monthly data shows that the grouping of summer versus non-summer months coincides with the months defined in Taipower's current seasonal pricing scheme, confirming the rationality of that division. The ARIMA-based short-term demand forecasts indicate that energy sales to both lighting and industrial customers will keep rising: from January 2004 through December 2010, the average annual growth rate of energy sales is forecast to be 3.33% for lighting customers and 3.23% for industrial customers. The multiple regression results show the following. (1) Temperature is the main variable affecting energy sales. Because lighting customers' meters are read every two months while industrial customers' meters are read monthly, lighting customers' monthly energy sales respond to the previous month's temperature, whereas industrial customers' sales respond to the current month's temperature. (2) For each customer category, total power sales revenue is strongly correlated with energy sales. The estimated elasticity of monthly energy sales with respect to power sales revenue is about 0.5 for lighting customers and about 1 for industrial customers. Since total revenue is the product of total energy sales and the average electricity price, a 1% increase in lighting customers' revenue is accompanied by only a 0.5% increase in their energy sales, implying that part of the revenue growth comes from a higher average price; for lighting customers, percentage changes in revenue therefore reflect changes in both sales volume and the average price. For industrial customers, the elasticity close to 1 indicates that a 1% change in revenue corresponds to roughly a 1% change in energy sales, so changes in revenue are driven mainly by proportional changes in sales.

As for power sales revenue itself, revenue from both lighting and industrial customers increased each year from early 1999 to the end of 2002, although the annual growth for industrial customers slowed over time; in each year revenue was generally highest from July to October and lowest in February. Among the explanatory variables for revenue, energy sales is the most significant; its coefficient measures the effect of one additional kWh sold on revenue, estimated at NT$2.69 for lighting customers and NT$1.35 for industrial customers. The elasticity of revenue with respect to energy sales is about 1.2 for lighting customers and about 0.7 for industrial customers, meaning that a 1% increase in lighting customers' energy sales raises revenue by more than 1%, while the opposite holds for industrial customers. The lower elasticity for industrial customers is plausibly due to the different pricing structures: industrial customers face a two-part tariff consisting of a demand charge for capacity and an energy charge for kWh use, whereas lighting customers pay a single-part tariff based only on energy use. Finally, the results show that the system peak load and load factor also affect power sales revenue, though only modestly.
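As a minimal sketch of how a monthly seasonal ARIMA forecast of energy sales could be set up with statsmodels — the data, series length, and (1,1,1)x(1,0,1,12) order are all illustrative assumptions, not the thesis's fitted specification:

```python
# Illustrative seasonal ARIMA forecast of monthly energy sales (hypothetical data).
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
months = pd.date_range("1999-01", periods=48, freq="MS")
# Hypothetical sales series (GWh): upward trend plus summer seasonality.
trend = np.linspace(1000, 1150, 48)
season = 80 * np.sin(2 * np.pi * (months.month - 4) / 12)
sales = pd.Series(trend + season + rng.normal(0, 15, 48), index=months)

# The order shown is an assumption for illustration only.
model = ARIMA(sales, order=(1, 1, 1), seasonal_order=(1, 0, 1, 12))
fit = model.fit()
print(fit.forecast(steps=24).head())  # two years of point forecasts
```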
262

台灣有線電視系統業者經營效率之探討 / A Study of Efficiency of Cable System Operators in Taiwan

張美惠, Chang, Mei-Hui Unknown Date (has links)
Based on the 2003 broadcasting television white paper of the Government Information Office of the Executive Yuan and the financial statements of listed and OTC-listed cable system operators from Taiwan's Market Observation Post System, this study first uses data envelopment analysis (DEA) to assess the technical efficiency of individual cable system operators, and then applies a Tobit censored regression to investigate the determinants of that efficiency. The efficiency results show that technical inefficiency is mainly attributable to wasted resources rather than to an inappropriate production scale. The regression results show that operating revenue has a positive effect on technical efficiency, whereas the number of channels, advertising intensity, franchise area, conglomeration, and business concentration have negative effects.
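As a minimal sketch of the DEA step, an input-oriented CCR efficiency score can be computed as one linear program per operator; the operators, inputs, and outputs below are hypothetical, not the study's data:

```python
# Input-oriented CCR DEA: one linear program per decision-making unit (DMU).
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: 5 operators, 2 inputs (staff, capital), 1 output (revenue).
X = np.array([[20, 300], [35, 500], [15, 250], [40, 450], [25, 400]], float)
Y = np.array([[100], [140], [90], [120], [130]], float)
n, m = X.shape          # number of DMUs, number of inputs
s = Y.shape[1]          # number of outputs

def ccr_efficiency(o):
    # Variables z = (theta, lambda_1..lambda_n); minimize theta.
    c = np.r_[1.0, np.zeros(n)]
    # Inputs:  sum_j lam_j x_ij - theta * x_io <= 0
    A_in = np.c_[-X[o], X.T]
    # Outputs: -sum_j lam_j y_rj <= -y_ro
    A_out = np.c_[np.zeros(s), -Y.T]
    res = linprog(c,
                  A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.r_[np.zeros(m), -Y[o]],
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.x[0]

for o in range(n):
    print(f"operator {o}: efficiency = {ccr_efficiency(o):.3f}")
```

A score of 1 marks an operator on the efficient frontier; scores below 1 indicate the proportional input reduction that efficient peers would allow.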
263

多期邏輯斯迴歸模型應用在企業財務危機預測之研究 / Forecasting corporate financial distress: using multi-period logistic regression model

卜志豪, Pu, Chih-Hao Unknown Date (has links)
Following Shumway (2001), this thesis takes a survival-analysis perspective and uses a discrete-time hazard model — what Shumway calls a multi-period logistic regression model — to forecast corporate financial distress. The sample consists of 718 listed companies from 1986 to 2008, of which 110 experienced financial distress, for a total of 6,782 firm-year observations. In contrast to Shumway's log baseline hazard form, a quadratic baseline hazard form is proposed based on the empirical plot of event rates. Four groups of time-dependent covariates (accounting-based or market-based measures) are then used to build two sets of discrete-time hazard models (log and quadratic baselines), which are compared with a traditional single-period logistic regression model. The empirical results show that in the discrete-time hazard models the covariates bear the expected relationship to bankruptcy probability, whereas the single-period logistic model sometimes does not. The discrete-time hazard models also outperform the single-period logistic model in predictive ability in most cases; among all specifications, the discrete-time hazard model combining the quadratic baseline hazard with financial and market covariates has the best predictive power.
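A key point of the discrete-time hazard approach is that it reduces to a pooled logistic regression on firm-year data once baseline-hazard terms (here quadratic in firm age) enter as covariates. A minimal sketch on hypothetical data — all variable names and coefficients are illustrative assumptions:

```python
# Discrete-time hazard fitted as a pooled firm-year logistic regression.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5000  # firm-year observations
df = pd.DataFrame({
    "age": rng.integers(1, 20, n).astype(float),  # years since listing
    "roa": rng.normal(0.05, 0.1, n),              # accounting-based covariate
    "excess_ret": rng.normal(0.0, 0.3, n),        # market-based covariate
})
# Hypothetical data-generating process with a quadratic baseline hazard in age.
logit = -4 + 0.15 * df.age - 0.006 * df.age**2 - 5 * df.roa - 1.5 * df.excess_ret
df["distress"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Quadratic baseline: include age and age^2 alongside the covariates.
X = sm.add_constant(df[["age", "roa", "excess_ret"]].assign(age2=df.age**2))
fit = sm.Logit(df["distress"], X).fit(disp=0)
print(fit.params)
```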
264

Quantitative Retrieval of Organic Soil Properties from Visible Near-Infrared Shortwave Infrared (Vis-NIR-SWIR) Spectroscopy Using Fractal-Based Feature Extraction.

Liu, Lanfa, Buchroithner, Manfred, Ji, Min, Dong, Yunyun, Zhang, Rongchung 27 March 2017 (has links) (PDF)
Visible and near-infrared diffuse reflectance spectroscopy has been demonstrated to be a fast and cheap tool for estimating a large number of chemical and physical soil properties, and effective features extracted from spectra are crucial for correlating with these properties. We adopt a novel methodology for feature extraction from soil spectra based on fractal geometry. The spectrum is divided into multiple segments with different step–window pairs, and for each segmented spectral curve the fractal dimension is calculated using variation estimators with power indices 0.5, 1.0 and 2.0; the fractal feature is then generated by multiplying the fractal dimension by the spectral energy. To assess and compare the performance of the new features, we used organic soil samples from the large-scale European Land Use/Land Cover Area Frame Survey (LUCAS). Gradient-boosted regression models built with the XGBoost library on the soil spectral library were developed to estimate N, pH and soil organic carbon (SOC) contents. Features generated by the variogram estimator performed better than the two other estimators and principal component analysis (PCA). The estimation results were, for SOC: coefficient of determination (R2) = 0.85, root mean square error (RMSE) = 56.7 g/kg, ratio of percent deviation (RPD) = 2.59; for pH: R2 = 0.82, RMSE = 0.49, RPD = 2.31; and for N: R2 = 0.77, RMSE = 3.01 g/kg, RPD = 2.09. Even better results could be achieved when fractal features were combined with PCA components. Fractal features generated by the proposed method can improve estimation accuracies of soil properties while maintaining the original spectral curve shape.
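As a minimal sketch of the variogram (power-index 2.0) estimator on a single curve segment — the synthetic spectrum and lag range are assumptions, and the paper's step–window segmentation and spectral-energy weighting are omitted:

```python
# Fractal dimension of a 1-D curve via the variogram (variation) method:
# the variogram scales as lag^(2H), and for a curve D = 2 - H.
import numpy as np

def variogram_fractal_dim(y, max_lag=10):
    lags = np.arange(1, max_lag + 1)
    v = np.array([np.mean((y[h:] - y[:-h]) ** 2) for h in lags])
    slope, _ = np.polyfit(np.log(lags), np.log(v), 1)
    hurst = slope / 2.0
    return 2.0 - hurst

rng = np.random.default_rng(2)
# Hypothetical reflectance segment: smooth baseline plus measurement noise.
x = np.linspace(0, 1, 500)
spectrum = np.exp(-3 * x) + 0.01 * rng.normal(size=x.size)
print(f"fractal dimension ~ {variogram_fractal_dim(spectrum):.2f}")
```

Rougher segments yield dimensions closer to 2, smoother ones closer to 1, which is what makes the value usable as a shape-preserving spectral feature.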
265

Adaptation via des inéqualités d'oracle dans le modèle de regression avec design aléatoire / Adaptation via oracle inequality in regression model with random design

Nguyen, Ngoc Bien 21 May 2014 (has links)
From observations Z(n) = {(Xi, Yi), i = 1, ..., n} satisfying Yi = f(Xi) + ζi, we seek to reconstruct the function f. Estimation quality is measured by two criteria: the Ls-risk and the uniform risk. The assumptions imposed on the distribution of the noise ζi are, respectively, a bounded-moment condition and a sub-Gaussian condition. Starting from a proposed family of kernel estimators, we construct a procedure, initiated by Goldenshluger and Lepski, for selecting an estimator from this family without any assumption on f. We then prove that the selected estimator satisfies an oracle inequality, which yields minimax and adaptive minimax estimates over anisotropic Hölder classes.
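To convey the flavor of such selection rules, here is a heavily simplified Lepski-type bandwidth choice for a Nadaraya-Watson estimator — not the Goldenshluger-Lepski procedure of the thesis; the majorant form, the constant C, and the data are all illustrative assumptions:

```python
# Simplified Lepski-type bandwidth selection for kernel regression.
import numpy as np

rng = np.random.default_rng(3)
n = 400
X = rng.uniform(0, 1, n)
Y = np.sin(4 * np.pi * X) + 0.3 * rng.normal(size=n)
grid = np.linspace(0.05, 0.95, 60)  # points where estimators are compared

def nw(h):
    # Gaussian-kernel Nadaraya-Watson estimate evaluated on the grid.
    w = np.exp(-0.5 * ((grid[:, None] - X[None, :]) / h) ** 2)
    return (w @ Y) / np.maximum(w.sum(axis=1), 1e-12)

bandwidths = np.geomspace(0.01, 0.3, 15)[::-1]   # largest to smallest
C = 0.5                                          # tuning constant (assumed)
maj = lambda h: C * np.sqrt(np.log(n) / (n * h)) # stochastic-error majorant

est = {h: nw(h) for h in bandwidths}
chosen = bandwidths[-1]
for i, h in enumerate(bandwidths):
    # Keep the largest h not contradicted by any smaller bandwidth.
    ok = all(np.max(np.abs(est[hp] - est[h])) <= maj(hp) + maj(h)
             for hp in bandwidths[i + 1:])
    if ok:
        chosen = h
        break
print(f"selected bandwidth: {chosen:.3f}")
```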
266

A distribuição beta generalizada semi-normal / The beta generalized half-normal distribution

Pescim, Rodrigo Rossetto 29 January 2010 (has links)
A new family of distributions, called the beta generalized half-normal distribution, which includes some important distributions as special cases, such as the half-normal and generalized half-normal (Cooray and Ananda, 2008) distributions, is proposed in this work. For this new family we study the probability density function, the cumulative distribution function, and the failure rate (hazard) function, none of which involve complicated mathematical functions. We obtain formal expressions for the moments, the moment generating function, the density of the order statistics, the mean deviations, the entropy, the reliability, and the Bonferroni and Lorenz curves. We examine maximum likelihood estimation of the parameters and derive the expected information matrix. This work also proposes a regression model based on the beta generalized half-normal distribution. The usefulness of the new distribution is illustrated with two data sets, showing that it is more flexible for analyzing lifetime data than other distributions in the literature.
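As a minimal sketch of the beta-G construction underlying this family — taking the generalized half-normal of Cooray and Ananda (2008) as baseline; the parameterization and parameter values are assumptions for illustration:

```python
# Beta generalized half-normal density via the beta-G construction:
# f(x) = g(x) * G(x)^(a-1) * (1 - G(x))^(b-1) / B(a, b).
import numpy as np
from scipy.stats import norm
from scipy.special import beta as beta_fn

def ghn_pdf(x, alpha, theta):
    # Generalized half-normal pdf (Cooray & Ananda, 2008), x > 0.
    z = (x / theta) ** alpha
    return np.sqrt(2 / np.pi) * (alpha / x) * z * np.exp(-0.5 * z ** 2)

def ghn_cdf(x, alpha, theta):
    return 2 * norm.cdf((x / theta) ** alpha) - 1

def bghn_pdf(x, a, b, alpha, theta):
    G = ghn_cdf(x, alpha, theta)
    return ghn_pdf(x, alpha, theta) * G ** (a - 1) * (1 - G) ** (b - 1) / beta_fn(a, b)

x = np.linspace(0.01, 4, 400)
pdf = bghn_pdf(x, a=2.0, b=1.5, alpha=1.2, theta=1.0)
print(f"integrates to ~ {np.trapz(pdf, x):.3f}")  # sanity check: close to 1
```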
267

Modelos preditivos para LGD / Predictive models for LGD

Silva, João Flávio Andrade 04 May 2018 (has links)
Financial institutions that intend to use the advanced Internal Ratings-Based (IRB) approach need methods for estimating the LGD (Loss Given Default) risk component. Proposals for modeling PD (Probability of Default) have appeared since the 1950s; in contrast, LGD forecasting only received wider attention after publication of the Basel II Accord. The LGD literature is still small compared with that of PD, and there is no method as effective, in terms of accuracy and interpretability, as logistic regression is for PD. Regression models for LGD play a fundamental role in the risk management of financial institutions; given their importance, this work proposes a methodology for quantifying the LGD risk component. Considering the reported characteristics of the LGD distribution and the flexible shapes the beta distribution can assume, we propose estimating LGD with a zero-inflated bimodal beta regression model. We develop the zero-inflated bimodal beta distribution, present some of its properties, including moments, define maximum likelihood estimators, and build the corresponding regression model; we present asymptotic confidence intervals and hypothesis tests for this model, as well as model selection criteria, and carry out a simulation study to evaluate the performance of the maximum likelihood estimators of the distribution's parameters. For comparison, we select the beta regression and inflated beta regression models, which are more usual approaches, and the SVR algorithm, owing to the significant superiority reported in other studies.
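As a minimal sketch of the maximum likelihood machinery involved, here is a plain zero-inflated beta fit on hypothetical LGD data; the thesis's bimodal variant adds a bimodality parameter to the continuous component, which this sketch omits:

```python
# Maximum likelihood for a standard zero-inflated beta model (hypothetical LGD data).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(4)
n = 2000
zeros = rng.random(n) < 0.3                    # 30% full recoveries (LGD = 0)
lgd = np.where(zeros, 0.0, beta_dist.rvs(2.0, 5.0, size=n, random_state=rng))

def neg_loglik(params):
    logit_p, log_a, log_b = params
    p = 1 / (1 + np.exp(-logit_p))             # P(LGD = 0), kept in (0, 1)
    a, b = np.exp(log_a), np.exp(log_b)        # beta parameters, kept positive
    ll_zero = np.log(p) * (lgd == 0).sum()
    ll_cont = np.sum(np.log(1 - p) + beta_dist.logpdf(lgd[lgd > 0], a, b))
    return -(ll_zero + ll_cont)

fit = minimize(neg_loglik, x0=np.zeros(3), method="BFGS")
logit_p, log_a, log_b = fit.x
print(f"p0={1/(1+np.exp(-logit_p)):.3f}, "
      f"a={np.exp(log_a):.2f}, b={np.exp(log_b):.2f}")
```

The regression version would replace the constants logit_p, log_a, log_b with linear predictors in the loan covariates, estimated by the same likelihood maximization.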
268

Modelos de regressão beta inflacionados / Inflated beta regression models

Ospina Martinez, Raydonal 04 April 2008 (has links)
Recent years have seen new developments in the theory of beta regression models, which are useful for modeling random variables that take values in the standard unit interval, such as proportions, rates, and fractions. In many situations the dependent variable contains zeros and/or ones, and continuous distributions are not suitable for such data. In this thesis we propose mixed continuous-discrete distributions to model data observed on the intervals [0, 1], [0, 1), and (0, 1]. The proposed distributions are inflated beta distributions in the sense that the probability mass at 0 and/or 1 exceeds what the beta distribution allows. Properties of the inflated beta distributions are given; estimation based on maximum likelihood and on conditional moments is discussed and compared, and empirical applications to real data sets are provided. We further develop inflated beta regression models under the assumption that the response follows an inflated beta law. Estimation is performed by maximum likelihood, and we provide closed-form expressions for the score function, Fisher's information matrix, and its inverse. Interval estimation for different population quantities (such as regression parameters, the precision parameter, and the mean response) is discussed, and hypotheses on the regression parameters can be tested with asymptotic tests. We also derive the second-order biases of the maximum likelihood estimators and use them to define bias-adjusted estimators; numerical results show that this bias reduction can be effective in finite samples. Finally, we develop diagnostic techniques for identifying departures from the postulated model and influential observations, adopting the local influence approach based on the conformal normal curvature, and we illustrate the theory with empirical examples.
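As a minimal sketch of the mixed discrete-continuous law on [0, 1] described above — point masses p0 at zero and p1 at one, beta on the interior; parameter names and values are illustrative:

```python
# Zero-and-one inflated beta: 0 w.p. p0, 1 w.p. p1, Beta(a, b) otherwise.
import numpy as np
from scipy.stats import beta as beta_dist

def rvs_inflated_beta(p0, p1, a, b, size, rng):
    u = rng.random(size)
    y = beta_dist.rvs(a, b, size=size, random_state=rng)
    y[u < p0] = 0.0
    y[(u >= p0) & (u < p0 + p1)] = 1.0
    return y

def mean_inflated_beta(p0, p1, a, b):
    # E[Y] = p1 * 1 + (1 - p0 - p1) * a / (a + b); the mass at 0 contributes nothing.
    return p1 + (1 - p0 - p1) * a / (a + b)

rng = np.random.default_rng(5)
sample = rvs_inflated_beta(0.1, 0.05, 2.0, 3.0, 100_000, rng)
print(f"empirical mean {sample.mean():.4f} "
      f"vs theoretical {mean_inflated_beta(0.1, 0.05, 2.0, 3.0):.4f}")
```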
269

Regressão logística com erro de medida: comparação de métodos de estimação / Logistic regression model with measurement error: a comparison of estimation methods

Rodrigues, Agatha Sacramento 27 June 2013 (has links)
We study the logistic regression model when explanatory variables are measured with error. Four estimation methods are considered: maximum pseudo-likelihood, obtained through a Monte Carlo EM-type algorithm; regression calibration; SIMEX; and the naïve method, which ignores the measurement error. The methods are compared through simulation, both with respect to estimation, via bias and root mean squared error, and with respect to the prediction of new observations, via sensitivity, specificity, positive and negative predictive values, accuracy, and the Kolmogorov-Smirnov statistic. The simulation studies show that the maximum pseudo-likelihood method performs best for estimation, while there is no difference among the methods for predictive purposes. The results are illustrated on two real data sets from different areas: a medical application, whose goal is estimating an odds ratio, and a financial application, whose goal is predicting new observations.
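As a minimal sketch of the SIMEX idea for a logistic slope — simulate extra measurement error at levels λ, then extrapolate the naive estimate back to λ = -1; the known error variance, quadratic extrapolant, and data are all assumptions:

```python
# SIMEX for logistic regression with one mismeasured covariate (hypothetical data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n, beta0, beta1, sigma_u = 3000, -1.0, 1.0, 0.5
x = rng.normal(0, 1, n)                          # true covariate (unobserved)
w = x + rng.normal(0, sigma_u, n)                # observed surrogate
y = rng.binomial(1, 1 / (1 + np.exp(-(beta0 + beta1 * x))))

def naive_slope(cov):
    return sm.Logit(y, sm.add_constant(cov)).fit(disp=0).params[1]

# Simulation step: inflate the error variance to (1 + lam) * sigma_u^2.
lams = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
slopes = [np.mean([naive_slope(w + np.sqrt(lam) * sigma_u * rng.normal(size=n))
                   for _ in range(20)]) for lam in lams]

# Extrapolation step: quadratic fit in lambda, evaluated at lambda = -1.
coef = np.polyfit(lams, slopes, 2)
print(f"naive {slopes[0]:.3f} | SIMEX {np.polyval(coef, -1.0):.3f} | true {beta1}")
```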
270

Modelos de regressão beta com erro nas variáveis / Beta regression model with measurement error

Carrasco, Jalmar Manuel Farfan 25 May 2012 (has links)
In this thesis we propose a beta regression model with measurement error, a largely unexplored topic among nonlinear models with measurement error. We discuss estimation methods such as approximate maximum likelihood, approximate maximum pseudo-likelihood, and regression calibration. The maximum likelihood method estimates the parameters by directly maximizing the log-likelihood function. The maximum pseudo-likelihood method is used when inference in a given model involves only some, but not all, of the parameters; in that sense the model has parameters of interest as well as nuisance parameters. When the true (unobserved) covariate is replaced by an estimate of its conditional expectation given the observed variable, the method is known as regression calibration. We compare these estimation methods in a Monte Carlo simulation study, which shows that the approximate maximum likelihood and approximate maximum pseudo-likelihood methods perform better than regression calibration and the naïve approach. The programming language Ox (Doornik, 2011) is used as the computational tool. We derive the asymptotic distribution of the estimators in order to build confidence intervals and test hypotheses, as proposed by Carroll et al. (2006, Section A.6.6), Guolo (2011), and Gong and Samaniego (1981); the likelihood ratio and gradient statistics are used for hypothesis testing, and a simulation study evaluates the performance of these tests. We also develop diagnostic techniques for the beta regression model with measurement error: weighted standardized residuals, as defined by Espinheira (2008), to check the model assumptions and detect outliers; global influence measures such as the generalized Cook's distance and the likelihood displacement to detect influential points; and the conformal local influence approach under three perturbation schemes (case weighting, response variable perturbation, and perturbation of the covariate with and without measurement error). We apply the results to two real data sets to illustrate the theory, and close with conclusions and possible future work.
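As a minimal sketch of the regression-calibration step under an additive normal error model with known error variance — the downstream fit is shown as ordinary least squares for brevity, whereas the thesis plugs the calibrated covariate into a beta regression; all data are hypothetical:

```python
# Regression calibration: replace W with E[X | W] under additive normal error.
import numpy as np

rng = np.random.default_rng(7)
n, sigma_x, sigma_u = 5000, 1.0, 0.6
x = rng.normal(2.0, sigma_x, n)            # true covariate (unobserved)
w = x + rng.normal(0, sigma_u, n)          # observed surrogate

# With X ~ N(mu_x, sigma_x^2) and W = X + U, U ~ N(0, sigma_u^2):
# E[X | W] = mu_x + (sigma_x^2 / (sigma_x^2 + sigma_u^2)) * (W - mu_x).
mu_w = w.mean()
s2_x_hat = max(w.var(ddof=1) - sigma_u**2, 1e-8)   # sigma_u^2 assumed known
reliability = s2_x_hat / (s2_x_hat + sigma_u**2)
x_cal = mu_w + reliability * (w - mu_w)

# Demonstration: the naive slope is attenuated; calibration undoes the bias.
y = 0.5 * x + rng.normal(0, 0.2, n)        # hypothetical outcome, true slope 0.5
slope_naive = np.polyfit(w, y, 1)[0]
slope_cal = np.polyfit(x_cal, y, 1)[0]
print(f"true 0.5 | naive {slope_naive:.3f} | calibrated {slope_cal:.3f}")
```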
