11.
台灣省各地區普查資料之統計分析 / Statistical Analysis of Census Data for the Regions of Taiwan Province. 莊靖芬. Unknown Date.
The objective of this research is to study which factors may have affected the schooling rate of 15- to 17-year-olds in Taiwan Province in 1990. After identifying possible factors and collecting the relevant data, we split the data (by stratified random sampling) into two sets: one to construct the model, and the other to test it. The model-building process is as follows. First, we fit simple linear regression models to gauge how strongly each factor is related to the schooling rate. Guided by residual analysis, we then apply appropriate variable transformations to these factors. Finally, we use three common statistical regression techniques (stepwise regression, forward selection, and backward elimination) to develop a suitable multiple regression model. Judged against actual schooling conditions in Taiwan, the resulting model appears reasonable. We verify the model's assumptions with graphical methods and statistical tests, and we draw inferences about the regression parameters. Furthermore, we use analysis-of-variance results and predictions of new observations to evaluate the model's predictive ability. Finally, we use the most appropriate multiple regression model to offer suggestions for improving (or maintaining) the schooling rate of 15- to 17-year-olds.
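The pipeline the abstract describes (split the data, select predictors by a stepwise criterion, then check predictive ability on the held-out set) can be sketched on synthetic data. Everything here is an illustrative assumption: the four candidate predictors, the sample sizes, and the use of AIC-driven forward selection stand in for the thesis's actual census variables and its three selection methods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the census data: 4 candidate predictors, of which
# only columns 0 and 2 truly affect the response (illustrative only).
n = 200
X = rng.normal(size=(n, 4))
y = 2.0 + 1.5 * X[:, 0] - 0.8 * X[:, 2] + rng.normal(scale=0.5, size=n)

# Hold out part of the data to test the fitted model, as the study does.
train, test = np.arange(150), np.arange(150, n)

def aic(y, X_sub):
    """AIC of an OLS fit with intercept: n*log(RSS/n) + 2*(p+1)."""
    Z = np.column_stack([np.ones(len(y)), X_sub])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    rss = float(np.sum((y - Z @ beta) ** 2))
    return len(y) * np.log(rss / len(y)) + 2 * Z.shape[1]

def forward_select(y, X):
    """Greedy forward selection: repeatedly add the predictor that lowers AIC."""
    remaining, chosen, best = list(range(X.shape[1])), [], np.inf
    while remaining:
        scores = {j: aic(y, X[:, chosen + [j]]) for j in remaining}
        j = min(scores, key=scores.get)
        if scores[j] >= best:          # no candidate improves AIC: stop
            break
        best, chosen = scores[j], chosen + [j]
        remaining.remove(j)
    return chosen

chosen = forward_select(y[train], X[train])

# Evaluate predictive ability on the held-out set (mean squared prediction error).
Zt = np.column_stack([np.ones(len(train)), X[train][:, chosen]])
beta, *_ = np.linalg.lstsq(Zt, y[train], rcond=None)
Zh = np.column_stack([np.ones(len(test)), X[test][:, chosen]])
mspe = float(np.mean((y[test] - Zh @ beta) ** 2))
print(sorted(chosen), round(mspe, 3))
```

With this setup the truly active predictors are recovered and the holdout error stays near the noise variance, which is the kind of check the abstract's "predictions of new observations" step performs.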
12.
Lasso顯著性檢定與向前逐步迴歸變數選取方法之比較 / A Comparison between Lasso Significance Test and Forward Stepwise Selection Method. 鄒昀庭, Tsou, Yun Ting. Unknown Date.
Variable selection for regression models is an essential topic. In 1996, Tibshirani proposed the Lasso (Least Absolute Shrinkage and Selection Operator), whose main feature is that it performs variable selection automatically during estimation. However, the original Lasso provides no framework for statistical inference, so the significance test for the Lasso proposed by Lockhart et al. in 2014 was an important breakthrough. Because the test statistic of the Lasso significance test is constructed much like that of classical forward stepwise selection, this study extends the comparison of the two methods in Lockhart et al. (2014) and proposes an improved forward selection method based on the bootstrap. We then compare the variable-selection performance of five methods: the Lasso, the Lasso significance test, classical forward selection, forward selection with the variable set chosen by AIC, and the bootstrap-improved forward selection. We find that although the Lasso significance test rarely commits a Type I error, it is too conservative in admitting new variables. The bootstrap-improved forward selection keeps the Type I error probability equally small while being bolder in admitting new variables, so it can be regarded as a successful improvement of the method.
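The Lasso's defining property, that coefficients are driven exactly to zero during estimation, can be illustrated with a minimal coordinate-descent implementation on toy data. This sketches the plain Lasso only (not the Lockhart et al. significance test); the data-generating design, penalty level `lam`, and iteration count are assumptions for illustration, not the thesis's simulation settings.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: only the first two of five predictors matter (illustrative).
n, p = 200, 5
X = rng.normal(size=(n, p))
X = (X - X.mean(0)) / X.std(0)                 # standardize columns
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)
y = y - y.mean()                               # center so no intercept is needed

def soft_threshold(z, t):
    """Shrink z toward zero by t; this is what creates exact zeros."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_sweeps=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_sweeps):
        for j in range(p):
            # Partial residual: remove every predictor's fit except x_j's.
            r = y - X @ beta + X[:, j] * beta[j]
            z = X[:, j] @ r / n
            beta[j] = soft_threshold(z, lam) / (X[:, j] @ X[:, j] / n)
    return beta

beta = lasso_cd(X, y, lam=0.3)
print(np.flatnonzero(beta))  # indices of the predictors the Lasso keeps
```

The soft-thresholding step sets weakly correlated predictors' coefficients exactly to zero, which is why the Lasso "completes variable selection during estimation" as the abstract says, while the surviving coefficients are shrunk toward zero by roughly `lam`.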