Global ETD Search

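1	遞迴支持向量迴歸資料縮減法 / Recursive SVR data reduction 江政舉 Unknown Date (has links) In recent years, support vector machines (SVM) and support vector regression (SVR) have been widely applied to classification and prediction problems. In practice, however, datasets are often very large, which leads to long computation times and high computational cost. To address this, Zhang et al. (2006) and Chen, Wang and Cao (2008) developed two types of data reduction methods. The former is the recursive support vector machine (RSVM), which reduces the number of variables: it uses cross-validation and a so-called contribution factor to identify the important variables, and then classifies using only those variables. The latter, direct sparse kernel regression (DSKR), achieves data reduction in SVR by using only a subset of the support vectors for prediction. This study extends the RSVM approach to SVR; the resulting method is called recursive support vector regression (RSVR). It uses cross-validation and defines each variable's contribution factor from the decision function, selects the important variables, and retains them for subsequent analysis and prediction. We apply the method to two real chemical datasets, Triazines and Pyrim, and find that the data are reduced substantially: only one-sixth to one-fifth of the variables are retained. Prediction performance after reduction is close to that of SVR on the full original data, but worse than that of DSKR. Keywords: support vector machine, support vector regression, data reduction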
2	IEEE 802.16網路以支持向量機配置頻寬 / Bandwidth allocation using support vector machine in IEEE 802.16 networks 李俊毅, Li, Chun-Yi Unknown Date (has links) Wireless broadband access networks have risen in recent years, and WiMAX is expected to solve the last-mile problem. Although WiMAX includes a QoS design, the standard does not define call admission control, bandwidth allocation, or the scheduler, leaving vendors free to design them. This thesis proposes a machine learning approach that allocates bandwidth dynamically according to the network state so that allocations match actual demand. When the base station (BS) allocates bandwidth it has no information about the subscriber stations' (SS) queues, so it cannot allocate a suitable amount and performance suffers; the effect is most visible for rtPS packets, which have deadlines. Under high system load this easily leads to higher packet loss and lower throughput. We therefore use a support vector machine: a large amount of training data is collected and used to train a dynamic bandwidth allocation module that assigns suitable bandwidth to rtPS, lowering the rtPS packet loss rate under high load while keeping delay at an acceptable level. This is combined with an adaptive bandwidth allocation strategy: under low load, a small amount of bandwidth is reserved for non-real-time traffic; under high load, real-time traffic is served first. The simulation uses Network Simulator NS 2-2.29, the CGU-III WiMAX module (Chang Gung University and the Institute for Information Industry), and libSVM, the support vector machine library developed by Professor Chih-Jen Lin of National Taiwan University. Keywords: Bandwidth Allocation, WiMAX, SVM, IEEE 802.16
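3	使用Meta-Learning在蛋白質質譜資料特徵選取之探討 / Feature Selection via Meta-Learning on Proteomic Mass Spectrum Data 陳詩佳 Unknown Date (has links) Cancer ranks first among the ten leading causes of death nationally. Because patients who receive timely treatment at an early stage of cancer have a higher survival rate, "early detection, early diagnosis, early treatment" can lower mortality. This study analyzes preprocessed prostate cancer proteomic mass spectrum data collected with Surface-Enhanced Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (SELDI-TOF-MS). The goal is to combine classifiers through meta-learning and perform stepwise feature selection, so that the data can be classified with high accuracy using a small number of representative feature variables. We use classification accuracy to decide the order in which variables enter during stepwise feature selection, and further use Elastic Net and the coefficient of determination as ranking criteria for the feature variables in order to mitigate high collinearity among them. We also consider voting (majority voting and weighted voting) and cascading (cascading multiple classifiers and cascading a single classifier). We find that ordering feature entry by the coefficient of determination and cascading with a support vector machine (SVM) performs well in every classification task and is the preferable feature selection approach. Keywords: feature selection, cascading, proteomic mass spectrum, meta-learning, support vector machine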
4	透過利率期限結構建立總體經濟產出缺口之預測模型 ─ 以美國為例 / Construct the forecast models for economic output gap through the term structure of interest rates ─ evidences for the United States 張楷翊 Unknown Date (has links) The output gap of an economy has always been a focus of policymakers. An output gap means that resources are not allocated in equilibrium, and inflation or unemployment will follow. If a future output gap can be anticipated, policymakers can act earlier, and the literature indicates that the yield curve carries information about future economic conditions. Using public data from the U.S. Department of the Treasury and the Federal Reserve, this study predicts the output gap from the slope of the yield curve. We examine U.S. gross national product components and yield data from 1977 to 2016, with the goal of building a model that predicts whether a positive or negative gap will appear in the next quarter. Three prediction models are compared: linear regression, logistic regression, and the support vector machine (SVM). For the real GDP gap, all three models reach an accuracy above 65%, and the SVM reaches 80.85%. We draw four conclusions. First, the yield curve has a certain degree of explanatory power for the future macroeconomic output gap. Second, for high-dimensional prediction models, the SVM performs better than the commonly used regression models. Third, predictive power for real imports and exports is poor under all three models; the yield curve may not carry complete and effective information about trade, which may instead be explained by other economic indicators or financial market information. Fourth, predictive power exceeds 80% for private-sector activity such as real consumption and investment. Keywords: Yield curve, Economic forecast, SVM, Logistic regression
5	分類蛋白質質譜資料變數選取的探討 / On Variable Selection of Classifying Proteomic Spectra Data 林婷婷 Unknown Date (has links) This study uses prostate cancer proteomic spectra data provided by Eastern Virginia Medical School. The data come in a raw version and a preprocessed version; we use the preprocessed version for the empirical analysis. Data of this kind are usually high-dimensional, and strong correlation between variables is common, so selecting the important feature variables from a large pool in order to judge the degree of prostate pathology accurately is a common and important problem. The purpose of this study is to examine the variable selection results of penalized regression models for classifying proteomic spectra data. The order in which LARS, Stagewise, LASSO, Group LASSO, and Elastic Net select variables is taken as each method's variable ranking, and the resulting classification performance is compared with rankings by test statistic (t-test, ANOVA F-test, and Kruskal-Wallis test) and by SVM misclassification rate. The results show that, across the six pairwise classification tasks, the misclassification rate of Group LASSO follows a more stable trend than that of the other methods, without large swings, and its minimum misclassification rate is almost always better. In the four-class task, Group LASSO also attains the lowest misclassification rate, so when four classes must be distinguished simultaneously it gives the best classification accuracy among the methods compared. Keywords: LARS, Forward Stagewise, LASSO, Group LASSO, Elastic Net, SVM
6	運用支持向量機和決策樹預測台指期走勢 / Predicting Taiwan Stock Index Future Trend Using SVM and Decision Tree 吳永樂, Wu, Yong Le Unknown Date (has links) This study builds a model that forecasts the price direction of Taiwan Stock Index Futures (TXF) from 479 global indices. The model predicts whether TXF will rise or fall over the next K days. We use two classification algorithms (support vector machine and decision tree) and two sampling schemes (cross-validation and moving window). Both short-term horizons (for example, 1 day) and long-term horizons (for example, 100 days) are tested for comparison. Under cross-validation the decision tree shows the higher predictive power, with a best accuracy of 93.4%. Under the moving window the SVM performs better, reaching an accuracy of 79.97% when the last 60 days of historical data are used to predict the trend over the next 20 days. Under every setting, prediction improves as the forecast horizon lengthens. This suggests that the global market strongly influences the Taiwan market, but that the market needs time to react. The results may serve as a reference for investors. Future work could try improved decision tree algorithms or combine the classifiers with regression models to enhance performance. Keywords: SVM, Decision Tree, Global Indices, Taiwan Stock Market, TXF, prediction model
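The moving-window scheme in entry 6 can be illustrated with a short sketch. This is only one plausible reading of "use the last 60 days to predict the trend over the next 20 days"; the function names, the step size, and the labeling rule are assumptions for illustration and are not taken from the thesis.

```python
def moving_windows(n_days, train_len=60, horizon=20, step=1):
    """Yield (train_indices, target_index) pairs for a walk-forward evaluation.

    Assumption: each training window covers `train_len` consecutive days and
    the target is the day `horizon` days after the window's last day.
    """
    last_start = n_days - train_len - horizon
    for start in range(0, last_start + 1, step):
        train = range(start, start + train_len)
        target = start + train_len - 1 + horizon
        yield train, target


def direction(prices, today, horizon):
    """Return +1 if the price is higher `horizon` days after `today`, else -1."""
    return 1 if prices[today + horizon] > prices[today] else -1


# Hypothetical usage with 100 days of data: every window trains only on days
# that precede its target, which is what separates this scheme from shuffled
# cross-validation on time series.
windows = list(moving_windows(100))
```

Because each window is strictly earlier than its target, this scheme cannot see future data, whereas shuffled cross-validation on time series can; that difference is a common explanation for gaps like the one reported between the two schemes, though the abstract itself does not state a cause.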
7	多項分配之分類方法比較與實證研究 / An empirical study of classification on multinomial data 高靖翔, Kao, Ching Hsiang Unknown Date (has links) With the rapid development of computer technology, the World Wide Web (WWW) has made sharing and searching for information far easier, and search engines are a powerful tool for finding it; Google, the best-known example, rose on the strength of its search engine. Web search relies largely on the characteristics of web pages, and entropy is one of the most commonly used feature indices. Given the keywords a user selects, the engine finds the pages most similar to the user's request, that is, the pages with the highest value of a similarity index function. Classification by similarity index is also common in biology and ecology, but there the similarity between two communities is usually computed to decide whether they are alike, unlike the search-engine approach of computing an index on a single community. This study asks which gives better classification accuracy, a single-community index or a pairwise similarity index, when the data follow a multinomial distribution, in particular one that resembles a geometric distribution (an assumption many ecological communities satisfy). The methods considered are single-community entropy and the Simpson index, pairwise entropy and the similarity index of Yue and Clayton (2005), the support vector machine, and logistic regression; they are compared through computer simulation and cross-validation. In the simulations, single-community entropy classifies better than expected and is generally even better than the support vector machine applied to the raw data. Its results are unstable, however, which is the main drawback of the method and the point most in need of improvement. Keywords: Multinomial distribution, Entropy, Similarity index, Computer simulation, Support vector machine, Power Law, Zipf's Law
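The two single-community indices named in entry 7 can be computed directly from category counts. The sketch below is illustrative only: the counts are made up, the natural logarithm is an arbitrary choice of base, and the Simpson index is taken here as the sum of squared proportions (other conventions, such as one minus that sum, are also in use).

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of the empirical category proportions."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts):
    """Simpson index as the sum of squared proportions, i.e. the chance that
    two independent draws fall in the same category."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


# Hypothetical counts for two communities (not data from the thesis).
even_community = [25, 25, 25, 25]
skewed_community = [80, 10, 6, 4]  # roughly geometric decay across categories

# An even community has higher entropy and a lower Simpson index
# than a skewed one with the same number of categories.
even_entropy = shannon_entropy(even_community)
skewed_entropy = shannon_entropy(skewed_community)
```

A single-index classifier in this spirit would threshold such a value per community; how the thesis chooses thresholds is not stated in the abstract.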
8	應用共變異矩陣描述子及半監督式學習於行人偵測 / Semi-supervised learning for pedestrian detection with covariance matrix feature 黃靈威, Huang, Ling Wei Unknown Date (has links) Pedestrian detection is an important yet challenging problem in object detection: variable body pose and clothing, together with widely varying illumination, make recognition considerably harder. In this thesis, we employ the covariance matrix descriptor and propose an on-line learning classifier that combines a naïve Bayes classifier with a cascade support vector machine (SVM) to improve the precision and recall of pedestrian detection in still images. Experimental results show that the proposed on-line learning strategy improves precision and recall by about 14% on some datasets where recognition is otherwise poor. Furthermore, even under the same initial training conditions, our method outperforms HOG + AdaBoost in precision and recall on three datasets: the USC Pedestrian Detection Test Set, the INRIA Person dataset, and the Penn-Fudan Database for Pedestrian Detection and Segmentation. Keywords: Semi-supervised learning, Support vector machine, Naïve Bayes classifier, Covariance descriptor
9	基植於非負矩陣分解之華語流行音樂曲式分析 / Chinese popular music structure analysis based on non-negative matrix factorization 黃柏堯, Huang, Po Yao Unknown Date (has links) In recent years Chinese popular music has become increasingly diverse. What listeners receive are its two components, melody and lyrics, each of which shapes how the content and mood of a work are perceived. Composing and lyric writing are both specialized crafts: a lyricist usually begins with a rough analysis of a song's structure to identify its musical form, and then analyzes the parts that can take lyrics in more detail to place the words appropriately. Because melody and lyrics are so closely tied, understanding song structure lowers the barrier to lyric writing, clarifies the skeleton of a song, and also helps music education, music information retrieval, and the alignment of lyrics with music. This thesis investigates techniques for music structure analysis of Chinese popular music. Given a song as input, the system automatically analyzes its musical form in three steps: main melody finding, sentence discovery, and music form discovery. First, we extract the main melody from symbolic music with a support vector machine (SVM) trained on user-labeled samples. Second, we detect musical sentence boundaries by two-way classification with an SVM, segment the main melody into sentences, and construct a sentence-based Self-Similarity Matrix for each music feature. Finally, Non-negative Matrix Factorization (NMF) is applied to these matrices to extract new features and build a second-level Self-Similarity Matrix, on which checkerboard kernel correlation, a novelty measure, locates the form boundaries. Experiments on eighty Chinese popular songs show that, for main melody finding, the proposed learning-based approach outperforms existing methods; that segmenting songs into sentences beforehand works better than no segmentation; and that the basis features obtained by NMF distinguish song sections more effectively. The proposed approaches achieve an F-score of 0.82 for sentence discovery and 0.71 for music form discovery. Keywords: Music Form Analysis, Music Segmentation, Support Vector Machine, Non-Negative Matrix Factorization
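As a rough illustration of the sentence-based Self-Similarity Matrix mentioned in entry 9, the sketch below compares every pair of sentence feature vectors. The cosine measure, the function names, and the toy feature vectors are assumptions made for illustration; the abstract does not say which similarity measure or features the thesis uses.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def self_similarity_matrix(sentences):
    """Entry [i][j] is the similarity between sentence i and sentence j."""
    return [[cosine(a, b) for b in sentences] for a in sentences]


# Hypothetical per-sentence feature vectors: sentences 0 and 1 repeat the
# same material (e.g. a verse), sentence 2 is different (e.g. a chorus).
sentences = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
ssm = self_similarity_matrix(sentences)
```

Repeated sections appear as blocks of high similarity, and a change of section appears as a drop between adjacent blocks, which is the pattern a checkerboard kernel is designed to detect.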
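10	以財務比率、共同比分析和公司治理指標預測上市公司財務危機之基因演算法與支持向量機的計算模型 / Applying Genetic Algorithms and Support Vector Machines for Predicting Financial Distresses with Financial Ratios and Features for Common-Size Analysis and Corporate Governance 黃珮雯, Huang, Pei-Wen Unknown Date (has links) Many techniques have been applied to build models that predict financial distress, such as multivariate statistical analysis and classification techniques like neural networks. Most of these early models use financial ratios as variables. However, scandals such as Enron and WorldCom showed that financial ratios computed from reported figures have inherent limitations: they cannot give a timely warning when management deliberately inflates earnings. This thesis therefore makes a preliminary exploration of common-size analysis, corporate governance, and the traditional Altman financial ratios, attempting to overcome the limitations of financial ratios in distress prediction and to select groups of features that may improve it. We then apply genetic algorithms to screen qualitative and non-qualitative features, expecting that the crossover process, in which offspring inherit the best genes of their parents, will maximize the fitness of the offspring and find the best combination of features. A support vector machine prediction model is trained on the selected features to improve prediction and reduce losses to the public. The experimental results show that features related to common-size analysis and corporate governance do improve the performance of financial distress prediction models. Genetic algorithms should be used to try more combinations of qualitative and non-qualitative features, so that distressed companies can be flagged early and social costs reduced. Keywords: Financial Distress Prediction, Common-Size Analysis, Corporate Governance, Genetic Algorithms, Support Vector Machines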

