1 |
應用資料採礦技術建置中小企業傳統產業之信用評等系統 / Applications of data mining techniques in establishing credit scoring system for the traditional industry of the SMEs
羅浩禎 (Luo, Hao-Chen), Unknown Date
Small and medium-sized enterprises (SMEs) are the lifeblood of Taiwan's economic and trade development. The export-oriented economy built on SMEs was the main driving force behind Taiwan's economic miracle. With the formal implementation of Basel II at the end of 2006, financial institutions must incorporate an SME credit scoring process into their credit investigation and credit granting systems, so that they conform to the new accord and can assess credit risk quantitatively.
This research therefore applies data mining techniques to build a default risk model for SMEs. It addresses corporate exposures under the internal ratings-based (IRB) approach and follows the new accord and the guidelines of the Financial Supervisory Commission (FSC). Financial variables form the core of the model, supplemented extensively by non-financial variables such as basic enterprise characteristics and macroeconomic factors. The model estimates the probability of default, which is then used to build a credit rating system that financial institutions can use as a reference when managing the risk of new borrowers. The subjects are SMEs in the traditional manufacturing industries, and the observation period is 2003 to 2005.
Three methods, logistic regression, neural networks, and C&R Tree, were used to build models, and their predictive power was evaluated and compared using several test statistics. Under a 1:1 oversampling ratio, logistic regression performed best, and six variables were selected for the final default probability model. Validation showed that the model retains its stability and predictive power when applied to a different time period or to other real data. The model also conforms to the requirements of Basel II and the FSC, which indicates that the credit rating model can be applied in practical bank credit granting processes.
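The 1:1 sampling ratio mentioned above can be illustrated with a small sketch. The abstract does not describe the exact sampling procedure, so this example assumes one common reading: keep every firm in the rarer class (typically defaulters) and draw an equal number of firms at random from the larger class. The function name, the `default` field, and the toy records are hypothetical.

```python
import random

def balance_one_to_one(records, label_key="default", seed=0):
    """Return a 1:1 sample of defaulters and non-defaulters.

    Assumption: the rarer class is kept whole and the larger class is
    randomly down-sampled to the same size. The thesis may have used a
    different procedure.
    """
    rng = random.Random(seed)
    bad = [r for r in records if r[label_key] == 1]
    good = [r for r in records if r[label_key] == 0]
    minority, majority = (bad, good) if len(bad) <= len(good) else (good, bad)
    sampled = rng.sample(majority, len(minority))  # sampling without replacement
    balanced = minority + sampled
    rng.shuffle(balanced)
    return balanced

# Hypothetical toy data: 5 defaulting firms out of 100.
firms = [{"id": i, "default": 1 if i < 5 else 0} for i in range(100)]
balanced = balance_one_to_one(firms)
n_default = sum(r["default"] for r in balanced)
```

A model fitted on such a balanced sample sees defaults far more often than they occur in the full portfolio, so predicted probabilities usually need to be adjusted back to the portfolio default rate before they are used as PD estimates.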
|
2 |
應用大數據於信用評等之模型探討 / The Application of Big Data on Credit Scoring Model
林瑀甯, Unknown Date
Credit risk, or credit default, refers to the probability that a financial institution is not repaid for the services it provides to a customer. It is studied intensively in the field of bank lending decisions because it matters greatly to financial institutions and, for commercial banks in particular, is often hard to interpret or control. Thanks to modern technology, financial institutions can now develop robust and refined systems and models for predicting and managing credit risk at lower cost than in the past. Since the start of the big-data era, customer scoring has left assessable traces even for students. Building on the basic dimensions used in earlier work, such as a customer's personal information, financial status, and past and present borrowing records with the institution, and supplementing them with information from open data, this study proposes additional variables that may affect the default rate, such as records of the customer held under external regulations. It then tests whether this is a direction worth developing and, if so, how strong its influence is.
Many methods have been proposed for predicting credit default, and the choice depends on the institution's complexity, its size, and the type of loan. Discriminant analysis is the most common, but it imposes strict assumptions on the variables. Neural networks overcome this weakness and predict well, but they return only a prediction while the computation remains opaque, which does not help in understanding the relationships among variables. This study therefore uses logistic regression, which predicts a binary outcome and also shows the relationship between the dependent and independent variables through the model coefficients. Following earlier work on logistic regression in credit risk, the study gives a complete account of how the data were cleaned, how the variables were transformed, and how the models were built and finally selected. Although the procedure is involved, the results support the research hypothesis: the new dimension does affect credit risk, and the model coefficients show the size of that effect. To highlight the information that logistic regression provides about the variables, the results are presented at the end as visualizations that are easier to read than text.
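The point that logistic regression exposes the relationship between variables through its coefficients can be made concrete with a short sketch. In a logistic model the default probability is the logistic function of a linear score, and the exponential of each coefficient is the odds ratio for a one-unit increase in that variable. The variable names and coefficient values below are hypothetical and are not taken from the thesis.

```python
import math

def default_probability(intercept, coefficients, features):
    """Logistic model: P(default) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    score = intercept + sum(b * features[name] for name, b in coefficients.items())
    return 1.0 / (1.0 + math.exp(-score))

def odds_ratios(coefficients):
    """exp(b_i): multiplicative change in the odds of default per unit of x_i."""
    return {name: math.exp(b) for name, b in coefficients.items()}

# Hypothetical fitted coefficients, for illustration only.
coefs = {"debt_to_income": 0.8, "years_as_customer": -0.3}
customer = {"debt_to_income": 1.5, "years_as_customer": 4.0}

pd_estimate = default_probability(-1.0, coefs, customer)
ratios = odds_ratios(coefs)
```

A ratio above 1 (here `debt_to_income`) means the variable raises the odds of default, and a ratio below 1 (here `years_as_customer`) means it lowers them, which is the kind of reading that a neural network's weights do not offer directly.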
|
3 |
Predicting Subprime Customers' Probability of Default Using Transaction and Debt Data from NPLs / Predicering av högriskkunders sannolikhet för fallissemang baserat på transaktions- och lånedata på nödlidande lån
Wong, Lai-Yan, January 2021
This thesis aims to predict the probability of default (PD) of non-performing loan (NPL) customers using transaction and debt data, as part of developing a credit scoring model for Hoist Finance. Many NPL customers face financial exclusion because of their default and are therefore regarded as bad customers. Hoist Finance, a company that manages NPLs, believes that not all customers conventionally considered subprime are high-risk customers and wants to offer them financial inclusion through favourable loans. In this thesis, logistic regression was used to model the PD of NPL customers at Hoist Finance based on 12 months of data. Different feature selection (FS) methods were explored. The best model used l1-regularization (lasso) for FS and predicted with 85.71% accuracy that 6,277 out of 27,059 customers had a PD between 0% and 10%, which supports this belief. Analysis of the PD showed that it increased almost linearly with an increase in debt quantity, original total claim amount, or number of missed payments. The analysis also showed that payment behaviour in the last quarter had the most predictive power. At the same time, analysis of the type II error showed that the model was unable to capture some bad payment behaviour because it placed too much emphasis on the last quarter.
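As an illustration of how l1-regularization (lasso) can act as feature selection in logistic regression, the sketch below fits a penalized model with proximal gradient descent, where a soft-thresholding step sets weak coefficients exactly to zero. The solver, the penalty strength, the feature names, and the synthetic data are all assumptions made for this example; the thesis does not state which implementation or settings were used.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, threshold):
    # Proximal operator of the l1 penalty: shrink toward zero, clip at zero.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_l1_logistic(X, y, penalty=0.05, step=0.2, n_iter=2000):
    """Proximal gradient descent for l1-penalized logistic regression."""
    n, d = len(X), len(X[0])
    weights, intercept = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            score = intercept + sum(w * x for w, x in zip(weights, row))
            error = sigmoid(score) - label
            grad_b += error
            for j in range(d):
                grad_w[j] += error * row[j]
        intercept -= step * grad_b / n  # the intercept is not penalized
        weights = [soft_threshold(w - step * g / n, step * penalty)
                   for w, g in zip(weights, grad_w)]
    return weights, intercept

# Synthetic data (hypothetical): the default rate rises with missed payments,
# while the second feature is a balanced flag that carries no signal.
feature_names = ["missed_payments_centered", "noise_flag"]
X, y = [], []
for missed in range(5):
    n_default = 2 + 4 * missed  # 2, 6, 10, 14, 18 defaults out of 20
    for k in range(20):
        X.append([missed - 2.0, 1.0 if k % 2 == 0 else -1.0])
        y.append(1 if k < n_default else 0)

weights, intercept = fit_l1_logistic(X, y)
selected = [name for name, w in zip(feature_names, weights) if w != 0.0]
```

Features whose coefficient remains exactly zero are dropped from the model, which is what makes the l1 penalty usable as a selection method rather than only as shrinkage.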
|