This study examines whether WordNet, the online lexical database developed at Princeton University, is suitable for representing semantic structure. We first discuss the architecture of WordNet, then review the literature on building semantic structures with WordNet to establish the state of prior research and to propose improvements for its shortcomings. Finally, we validate and revise the model, aiming to arrive at a more representative and complete WordNet semantic structure.
The study combines the semantic distance model of Jarmasz and Szpakowicz (2001) with the similarity model of Resnik (1995). These two models are used to compute the distance between two words, and that distance is then used to identify the semantic relation between them.
We use 117 securities exam questions to verify the correctness and completeness of the structure, and revise it where it falls short to obtain better results.
This research has four main limitations:
1. We cannot cover all the vocabulary of the securities industry and its relations at once.
2. The test questions cannot fully represent every possible query.
3. The final results come from running a similarity algorithm rather than from actually building on and modifying the WordNet system, so they may differ from results obtained by testing on a real system.
4. Not every relation in WordNet is defined; only a few of the more representative lexical relations are, which may affect the results in the details.
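
To make the two measures named in the abstract concrete, the following is a minimal sketch, not the thesis's implementation. It assumes a hypothetical toy is-a taxonomy and invented corpus counts (`PARENT`, `COUNTS`), neither of which comes from WordNet or from the thesis. The `path_distance` function is a generic edge count standing in for a semantic distance measure, since the abstract does not give the exact Jarmasz and Szpakowicz definition. The `resnik_similarity` function follows the usual statement of Resnik's measure: the information content, -log p(c), of the lowest common subsumer of the two concepts.

```python
import math

# Hypothetical toy is-a taxonomy (child -> parent); an assumption for
# illustration only, not taken from WordNet or from the thesis.
PARENT = {
    "security": "asset",
    "stock": "security",
    "bond": "security",
    "common_stock": "stock",
    "preferred_stock": "stock",
    "currency": "asset",
}

# Hypothetical corpus counts of each concept's own occurrences (assumed data).
COUNTS = {
    "asset": 10, "security": 20, "stock": 30, "bond": 15,
    "common_stock": 10, "preferred_stock": 5, "currency": 10,
}


def ancestors(concept):
    """Return the path from the concept up to the root, concept first."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def lowest_common_subsumer(a, b):
    """Return the deepest concept that subsumes both a and b."""
    b_ancestors = set(ancestors(b))
    for node in ancestors(a):
        if node in b_ancestors:
            return node
    raise ValueError("concepts share no common subsumer")


def path_distance(a, b):
    """Count the edges on the path from a to b through their common subsumer."""
    lcs = lowest_common_subsumer(a, b)
    return ancestors(a).index(lcs) + ancestors(b).index(lcs)


def information_content(concept):
    """Return -log p(c), where p(c) covers the concept and its descendants."""
    total = sum(COUNTS.values())
    subsumed = sum(n for c, n in COUNTS.items() if concept in ancestors(c))
    return -math.log(subsumed / total)


def resnik_similarity(a, b):
    """Return the information content of the lowest common subsumer."""
    return information_content(lowest_common_subsumer(a, b))


if __name__ == "__main__":
    for pair in [("common_stock", "preferred_stock"),
                 ("common_stock", "bond"),
                 ("common_stock", "currency")]:
        print(pair, path_distance(*pair), round(resnik_similarity(*pair), 4))
```

In this toy setup, sibling concepts under a specific parent score a short distance and a high Resnik similarity, while concepts that meet only at the root score a similarity of zero, because the root carries no information. How the thesis combines the two scores to decide a semantic relation is not stated in the abstract, so that step is left out here.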
Identifier | oai:union.ndltd.org:CHENGCHI/G0923560161 |
Creators | 游舒帆, Yu, Shu Fan |
Publisher | National Chengchi University (國立政治大學) |
Source Sets | National Chengchi University Libraries |
Language | Chinese |
Detected Language | English |
Type | text |
Rights | Copyright © nccu library on behalf of the copyright holders |