1 |
英文介系詞片語定位與英文介系詞推薦 / Attachment of English prepositional phrases and suggestions of English prepositions蔡家琦, Tsai, Chia Chi Unknown Date (has links)
英文介系詞在句子裡所扮演的角色通常是用來使介系詞片語更精確地補述上下文,英文的母語使用者可以很直覺地使用。然而電腦不瞭解語義,因此不容易判斷介系詞修飾對象;非英文母語使用者則不容易直覺地使用正確的介系詞。所以本研究將專注於介系詞片語定位與介系詞推薦的議題。
在本研究將這二個介系詞議題抽象化為一個決策問題,並提出一個一般化的解決方法。這二個問題共通的部分在於動詞片語,一個簡單的動詞片語含有最重要的四個中心詞(headword):動詞、名詞一、介系詞和名詞二。由這四個中心詞做為出發點,透過WordNet做階層式的選擇,在大量的案例中尋找語義上共通的部分,再利用機器學習的方法建構一般化的模型。此外,針對介系詞片語定的問題,我們挑選較具挑戰性介系詞做實驗。
藉由使用真實生活語料,我們的方法處理介系詞片語定位的問題,比同樣考慮四個中心詞的最大熵值法(Max Entropy)好;但與考慮上下文的Stanford剖析器差不多。而在介系詞推薦的問題裡,較難有全面比較的對象,但我們的方法精準度可達到53.14%。
本研究發現,高層次的語義可以使分類器有不錯的分類效果,而透過階層式的選擇語義能使分類效果更佳。這顯示我們確實可以透過語義歸納一套準則,用於這二個介系詞的議題。相信成果在未來會對機器翻譯與文本校對的相關研究有所價值。 / This thesis focuses on problems of attachment of prepositional phrases (PPs) and problems of prepositional suggestions. Determining the correct PP attachment is not easy for computers. Using correct prepositions is not easy for learners of English as a second language.
I transform the problems of PPs attachment and prepositional suggestion into an abstract model, and apply the same computational procedures to solve these two problems. The common model features four headwords, i.e., the verb, the first noun, the preposition, and the second noun in the prepositional phrases. My methods consider the semantic features of the headwords in WordNet to train classification models, and apply the learned models for tackling the attachment and suggestion problems. This exploration of PP attachment problems is special in that only those PPs that are almost equally possible to attach to the verb and the first noun were used in the study.
The proposed models consider only four headwords to achieve satisfactory performances. In experiments for PP attachment, my methods outperformed a Maximum Entropy classifier which also considered four headwords. The performances of my methods and of the Stanford parsers were similar, while the Stanford parsers had access to the complete sentences to judge the attachments. In experiments for prepositional suggestions, my methods found the correct prepositions 53.14% of the time, which is not as good as the best performing system today.
This study reconfirms that semantic information is instrument for both PP attachment and prepositional suggestions. High level semantic information helped to offer good performances, and hierarchical semantic synsets helped to improve the observed results. I believe that the reported results are valuable for future studies of PP attachment and prepositional suggestions, which are key components for machine translation and text proofreading.
|
Page generated in 0.0261 seconds