中文動詞自動分類研究 / Automatic Classification of Chinese Unknown Verbs
曾慧馨 (Tseng, Hui-Hsin), Unknown Date
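This thesis presents two methods, one rule-based and one similarity-based (instance-based), for automatically assigning Chinese unknown verbs to the verb categories defined by the CKIP group (詞庫小組) at Academia Sinica (1993). The rule-based method uses rules learned from a training corpus, supplemented with reduplication patterns of unknown verbs. It covers about 25% of unknown verbs, with an accuracy of 86.86% to 91.32%. Its strength is high accuracy; its weakness is that it can handle only a small share of unknown verbs. The similarity-based method guesses the likely category of an unknown verb from similar examples, computing similarity from word-internal information: the part of speech and semantic class of the morphological base (詞基), and the structure of the word. It can handle all unknown verbs, but it is easily misled by mislabeled examples in the training corpus and is sensitive to the size of that corpus. Its accuracy is 67.86% to 70.92%. Combining the two methods, the integrated classifier predicts the category of unknown verbs with an accuracy of about 72%.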
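The combination described in the abstract (try the rules first, fall back to similar examples) can be pictured with a toy sketch. Everything below is an assumption made for illustration and is not the thesis's implementation: the `Verb` fields, the single AABB reduplication rule and the category it returns, the equal-weight feature matching, the value of `k`, and the three training examples are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Verb:
    word: str
    base_pos: str               # part of speech of the base (hypothetical tag)
    base_sem: str               # semantic class of the base (hypothetical label)
    structure: str              # word structure, e.g. "VR" or "VO" (hypothetical)
    label: Optional[str] = None  # verb category; None for an unknown verb


def rule_classify(v: Verb) -> Optional[str]:
    """Toy rule: an AABB reduplication is assigned the category "VH".

    The rule and its target category are illustrative assumptions.
    Returns None when no rule applies.
    """
    w = v.word
    if len(w) == 4 and w[0] == w[1] and w[2] == w[3] and w[0] != w[2]:
        return "VH"
    return None


def similarity(a: Verb, b: Verb) -> int:
    """Count matching word-internal features (equal weights are an assumption)."""
    return (
        int(a.base_pos == b.base_pos)
        + int(a.base_sem == b.base_sem)
        + int(a.structure == b.structure)
    )


def instance_classify(v: Verb, training: List[Verb], k: int = 3) -> str:
    """Majority vote over the k training examples most similar to v."""
    ranked = sorted(training, key=lambda t: similarity(v, t), reverse=True)[:k]
    votes = Counter(t.label for t in ranked)
    return votes.most_common(1)[0][0]


def classify(v: Verb, training: List[Verb]) -> str:
    """Use the rule when one fires; otherwise fall back to similar examples."""
    return rule_classify(v) or instance_classify(v, training)


# Hypothetical labeled examples, invented for this sketch.
TRAINING = [
    Verb("打破", "V", "action", "VR", "VC"),
    Verb("推倒", "V", "action", "VR", "VC"),
    Verb("跑步", "V", "motion", "VO", "VA"),
]
```

In this sketch a verb such as 高高興興 is caught by the rule, while 踢破 matches no rule and is labeled by its nearest training examples. This mirrors the trade-off the abstract reports: the rules cover few verbs but are accurate, and the fallback covers the rest but depends on the quality and size of the training data.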