This thesis examines the analysis of incomplete data with the logit model, a discrete choice model. It compares the parameter estimates obtained from complete case analysis, in which every observation with a missing value is removed, with those obtained after filling in the missing values by multiple imputation.
The multiple imputation method used is Multiple Imputation by Chained Equations (MICE), proposed by Buuren et al. (2007), and the estimates from the multiply imputed data sets are pooled with Rubin's (1987) method. Box plots of the parameter bias from the simulation show that the estimates after imputation differ little from the true parameter values and that the number of imputations has little effect on the estimates. When the missing percentage is large, the estimates from the imputed data are more stable than those from complete case analysis.
In the real data analysis, the standard deviations of the estimates from complete case analysis tend to be larger than those from the imputed data, which is consistent with the simulation results.
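The pooling step attributed to Rubin (1987) can be illustrated with a minimal sketch. It assumes that each of the m imputed data sets has already been fitted with a logit model, giving one coefficient vector and one vector of squared standard errors per data set. The function name `pool_rubin` and the numbers below are hypothetical and are not taken from the thesis.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates, variances: array-likes of shape (m, p), one row per
    imputed data set and one column per model parameter.
    Returns the pooled estimates and their pooled standard errors.
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = estimates.shape[0]
    if m < 2:
        raise ValueError("Rubin's rules need at least two imputations")

    pooled = estimates.mean(axis=0)           # average of the m estimates
    within = variances.mean(axis=0)           # average within-imputation variance
    between = estimates.var(axis=0, ddof=1)   # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return pooled, np.sqrt(total)

# Hypothetical coefficients and squared standard errors from m = 3 imputations.
est = [[0.9, -0.5], [1.1, -0.7], [1.0, -0.6]]
var = [[0.04, 0.01], [0.04, 0.01], [0.04, 0.01]]
pooled, pooled_se = pool_rubin(est, var)
```

The pooled standard error includes the between-imputation variance, so it reflects the extra uncertainty caused by the missing values rather than only the sampling variance of a single completed data set.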
Identifier | oai:union.ndltd.org:CHENGCHI/G1043540012 |
Creators | 簡廷翰, Jian, Ting Han |
Publisher | 國立政治大學 |
Source Sets | National Chengchi University Libraries |
Language | Chinese |
Detected Language | English |
Type | text |
Rights | Copyright © nccu library on behalf of the copyright holders |