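The Growing Hierarchical Self-Organizing Map (GHSOM) is an unsupervised neural network that extends the Self-Organizing Map (SOM). It is well suited to clustering samples, which helps in analyzing the features shared within each cluster, and a classifier can be built from hypothesized spatial relationships among clusters to identify anomalous data.
This study therefore proposes a novel dual approach (that is, a method for building a decision support system architecture) that clusters fraud and non-fraud samples separately. First, the groups of the two classes are paired, i.e., for each fraud group its non-fraud counterpart group is identified. Classification rules built on different spatial hypotheses are then applied to these paired groups to test whether some degree of spatial relationship exists between the fraud and non-fraud groups. In addition, a feature extraction mechanism is applied to the clustering results of the fraud samples. The classification rule with the best classification performance is used to detect whether an investigated sample is suspected of fraud, and the results of the extraction mechanism are used to label the fraud behavior features and related input variables of suspected samples as a subsequent decision aid.
More specifically, this study builds a non-fraud GHSOM tree and a fraud GHSOM tree from the non-fraud and fraud samples respectively, and establishes classification rules for each pair of GHSOM groups. The corresponding non-fraud-central and fraud-central rules adaptively optimize the rule boundary according to the decision maker's risk preference, and the rule that performs better overall is chosen as the classification rule. The non-fraud-central rule implies that the vast majority of fraud samples tend to be distributed around non-fraud samples, whereas the fraud-central rule implies that the vast majority of non-fraud samples tend to be distributed around fraud samples.
In addition, this study adds a feature extraction mechanism to discover the characteristics shared by the samples in each group of the fraud clustering results, covering both input-variable features and fraud behavior patterns. This information can help decision makers (such as capital providers) assess the integrity of investigated samples and carry out further analysis of the results to reach prudent credit decisions.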
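This study applies the proposed approach to fraudulent financial reporting (a sub-field of financial fraud detection) for empirical validation. The experimental results confirm that specific spatial relationships exist among the samples, and the approach achieves better classification performance than other methods such as SVM, SOM+LDA, and GHSOM+LDA. This shows that the proposed mechanism can help verify the reliability of financial data. Moreover, by the nature of the SOM, when an investigated sample is assigned to a particular cluster, the fraud behavior features of that cluster's training samples can be inferred to apply to the investigated sample. This principle can be used to help judge the reliability of investigated samples and to accumulate a fraud knowledge base over time, serving as a reference for further analysis and for making related credit decisions. The decision support system architecture based on the proposed dual approach can be applied to other financial fraud detection scenarios that use financial data as their source, as a basis for decision support. / The Growing Hierarchical Self-Organizing Map (GHSOM) is an extension of the Self-Organizing Map (SOM). The GHSOM's unsupervised learning properties, such as its adaptive group size and hierarchical structure, make it suitable for discovering statistically salient features in the clustered groups, and it can be used to set up a classifier that distinguishes abnormal data from regular data based on the spatial relationships between them.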
Therefore, this study exploits the advantages of the GHSOM and pioneers a novel dual approach (i.e., a proposal for a DSS architecture) with two GHSOMs, which starts by identifying the counterparts among the clustered groups. Classification rules are then formed based on a given spatial hypothesis, and a feature extraction mechanism is applied to extract features from the fraud clustered groups. The dominant classification rule is adopted to identify suspected samples, and the results of the feature extraction mechanism are used to pinpoint their relevant input variables and potential fraud activities as a further decision aid.
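The counterpart identification step can be illustrated with a minimal sketch. The abstract does not state how counterparts are chosen, so the nearest-centroid criterion below is an assumption, and the function name `pair_counterparts` and the leaf identifiers are hypothetical.

```python
import math


def pair_counterparts(fraud_centroids, nonfraud_centroids):
    """Map each fraud leaf id to the id of the closest non-fraud leaf.

    Assumption: "closest" means smallest Euclidean distance between
    leaf centroids; the thesis may use a different criterion.
    """
    pairs = {}
    for f_id, f_vec in fraud_centroids.items():
        pairs[f_id] = min(
            nonfraud_centroids,
            key=lambda n_id: math.dist(f_vec, nonfraud_centroids[n_id]),
        )
    return pairs


# Hypothetical centroids of leaf nodes from the two GHSOM trees.
fraud = {"F1": (0.0, 0.0), "F2": (5.0, 5.0)}
nonfraud = {"N1": (1.0, 0.0), "N2": (4.0, 6.0)}
pairs = pair_counterparts(fraud, nonfraud)
```

With these toy centroids, each fraud leaf is paired with the non-fraud leaf nearest to it.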
Specifically, for the financial fraud detection (FFD) domain, a non-fraud (fraud) GHSOM tree is constructed by clustering the non-fraud (fraud) samples, and a non-fraud-central (fraud-central) rule is then tuned by feeding in all the training samples to determine the optimal discrimination boundary within each leaf node of the non-fraud (fraud) GHSOM tree. The optimization yields an adjustable and effective rule for classifying fraud and non-fraud samples. With the DSS architecture implemented on the proposed dual approach, decision makers can set their own weightings of type I and type II errors. The classification rule that dominates the other is adopted for analyzing samples. Dominance of the non-fraud-central rule implies that most fraud samples cluster around their non-fraud counterpart, whereas dominance of the fraud-central rule implies that most non-fraud samples cluster around their fraud counterpart.
In addition, a feature extraction mechanism is developed to uncover the regularities of input variables and fraud categories from the training samples of each leaf node of the fraud GHSOM tree. The mechanism extracts variable features and fraud patterns to characterize the fraud samples within the same leaf node. This can help decision makers such as capital providers evaluate the integrity of the investigated samples, and facilitates further analysis to reach prudent credit decisions.
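As an illustration of per-leaf feature extraction, the sketch below summarizes each leaf node of a fraud tree by the mean of every input variable and the frequency of each fraud category. The choice of statistics is an assumption, since the abstract does not say which ones the thesis uses, and the variable names (`roa`, `leverage`) and category labels are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean


def summarize_leaves(records):
    """Summarise the fraud samples grouped in each leaf node.

    records: iterable of (leaf_id, feature_dict, fraud_category).
    Returns, per leaf, the mean of each input variable and the fraud
    categories ordered from most to least frequent.
    """
    by_leaf = defaultdict(list)
    for leaf_id, features, category in records:
        by_leaf[leaf_id].append((features, category))

    summary = {}
    for leaf_id, members in by_leaf.items():
        names = members[0][0].keys()
        summary[leaf_id] = {
            "variable_means": {n: mean(f[n] for f, _ in members) for n in names},
            "fraud_categories": Counter(c for _, c in members).most_common(),
        }
    return summary


# Hypothetical training samples already assigned to leaf nodes.
records = [
    ("A", {"roa": 0.1, "leverage": 0.8}, "revenue overstatement"),
    ("A", {"roa": 0.3, "leverage": 0.6}, "revenue overstatement"),
    ("B", {"roa": -0.2, "leverage": 0.9}, "asset misappropriation"),
]
summary = summarize_leaves(records)
```

An investigated sample that falls into leaf `A` could then be annotated with that leaf's summary as a pre-warning signal.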
The experimental results on detecting fraudulent financial reporting (FFR), a sub-field of FFD, confirm the spatial relationship between fraud and non-fraud samples. The DSS architecture implemented on the proposed dual approach achieves better classification performance than the SVM, SOM+LDA, GHSOM+LDA, SOM, BPNN and DT methods, which shows its applicability to evaluating the reliability of decisions based on financial numbers. Moreover, following SOM theory, the relevant input variables and fraud categories extracted from the GHSOM apply to all samples classified into the same leaf node. This principle means that the extracted pre-warning signals can be used to assess the reliability of the investigated samples and to form a knowledge base for further analysis leading to a prudent decision. The DSS architecture based on the proposed dual approach could be applied to other FFD scenarios that rely on financial numbers as a basis for decision making.
Identifier | oai:union.ndltd.org:CHENGCHI/G0096356511 |
Creators | 黃馨瑩, Huang, Shin Ying |
Publisher | 國立政治大學 |
Source Sets | National Chengchi University Libraries |
Language | English |
Detected Language | English |
Type | text |
Rights | Copyright © nccu library on behalf of the copyright holders |