This study draws on as many as 100,000 labor records from the United States Census Bureau. At this scale the curse of dimensionality rules out traditional data mining methods, and traditional descriptive statistics do not point to a clear direction for analysis. We therefore analyze the data with Split-and-combine Multidimensional Scaling (SC-MDS), proposed by Tzeng et al. (2008). Multidimensional scaling has two main purposes: first, to place the data in a spatial configuration in which the distance between each pair of points expresses how closely they are related; second, to reduce the dimension of the data and so avoid the curse of dimensionality. SC-MDS indicates that associations in this large data set are best examined in the order age, education, and sex. We combine this ordering with the Occupational Information Network database to analyze how the occupational characteristics of workers differ across the resulting groups. We find that education level affects the differences in occupational characteristics between the sexes, and that the number of these differences increases with age; that education level produces large differences in occupational characteristics in every age group; and that young and prime-age workers are more similar to each other in occupational characteristics than prime-age and middle-aged workers are. Finally, we offer explanations for these similarities and differences.
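The two purposes of multidimensional scaling named in the abstract (distances that express similarity, and a lower-dimensional representation) can be illustrated with a minimal sketch of plain classical MDS. This is not the SC-MDS procedure of Tzeng et al. (2008), which splits the data and combines the pieces; it is only the basic technique, written with NumPy. The function names `classical_mds` and `pairwise_distances` and the synthetic data are assumptions made for illustration and do not come from the thesis.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distance between every pair of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def classical_mds(dist, k=2):
    """Classical (Torgerson) MDS: embed n points in k dimensions
    from an n-by-n matrix of pairwise distances."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-centre the squared distances to get an inner-product matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the k largest.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:k]
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)


# Hypothetical data: 50 points that live on a 2-D plane inside a 10-D space.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
high_dim = latent @ rng.normal(size=(2, 10))

dist = pairwise_distances(high_dim)
embedding = classical_mds(dist, k=2)

# Largest change in any pairwise distance after reducing 10-D to 2-D.
max_error = np.abs(pairwise_distances(embedding) - dist).max()
print(embedding.shape, max_error)
```

Because the synthetic points lie on a two-dimensional plane, the two-dimensional embedding preserves their pairwise distances up to floating-point error. Real survey data would not have this property, and the distances would be preserved only approximately.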
Identifier | oai:union.ndltd.org:CHENGCHI/G0099258014 |
Creators | 陳烽威, Chen, Fong Wei |
Publisher | National Chengchi University (國立政治大學) |
Source Sets | National Chengchi University Libraries |
Language | Chinese |
Detected Language | English |
Type | text |
Rights | Copyright © nccu library on behalf of the copyright holders |