電腦輔助漢英與英漢翻譯例句搜尋服務 / A Computer Assisted Environment for Searching Related Translations between Chinese and English

This thesis presents an environment that helps students who are learning English-to-Chinese and Chinese-to-English translation search for example translations. The parallel corpus was built from documents available on the Internet, such as English teaching websites and learning sheets, from which aligned Chinese and English sentences were extracted manually. For each sentence pair, the tagged corpus records the Chinese sentence, the English sentence, the word segmentation of the Chinese sentence, the POS tags of the Chinese words, and the syntactic structure trees of the Chinese and English sentences.
A query may be a Chinese sentence, an English sentence, or a pattern that mixes Chinese and English. Depending on the search function the user selects, the system preprocesses the query (for example with word segmentation, POS extraction, structure tree analysis, lemmatization, and query term expansion) and matches the result against the tagged corpus. It then returns Chinese-English sentence pairs that are similar to the query, so that the user has more comparable sentences to consult while learning to translate.
The system is not easy to evaluate formally. To assess it, I ran a set of selected queries, recorded the number of similar sentences each search function returned under different thresholds, and scored the quality of the returned sentences with BLEU and NIST against standard translations. I also conducted a questionnaire study in which human subjects marked the returned sentences they found useful. Agreement among the subjects was not high, and they judged only about 1.6 of every 10 returned sentences to be useful.
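The query flow described in the abstract (preprocess the query, match it against the corpus, return sentence pairs above a threshold) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy corpus, the character-level handling of Chinese, the Jaccard overlap score, and the names `Pair`, `tokenize`, `jaccard`, and `search` are hypothetical. The abstract does not state which similarity measures or tools the thesis actually uses, and this sketch ignores POS tags and structure trees entirely.

```python
import re
from typing import List, NamedTuple, Set, Tuple


class Pair(NamedTuple):
    """One aligned Chinese-English sentence pair."""
    zh: str
    en: str


# Toy corpus invented for this sketch; not taken from the thesis.
CORPUS: List[Pair] = [
    Pair("我喜歡讀書", "I like reading books"),
    Pair("他喜歡游泳", "He likes swimming"),
    Pair("今天天氣很好", "The weather is nice today"),
]

# Assumption: Latin-script words are tokens, and each CJK character is
# its own token. A real system would use a Chinese word segmenter.
TOKEN_RE = re.compile(r"[A-Za-z]+|[\u4e00-\u9fff]")


def tokenize(text: str) -> Set[str]:
    """Return the set of lowercased tokens in a Chinese, English, or mixed string."""
    return {token.lower() for token in TOKEN_RE.findall(text)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two token sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def search(query: str, corpus: List[Pair] = CORPUS,
           threshold: float = 0.2, limit: int = 10) -> List[Tuple[float, Pair]]:
    """Return up to `limit` pairs scoring at least `threshold`, best first."""
    query_tokens = tokenize(query)
    scored = []
    for pair in corpus:
        score = jaccard(query_tokens, tokenize(pair.zh + " " + pair.en))
        if score >= threshold:
            scored.append((score, pair))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    # A mixed Chinese-English query, as the abstract allows.
    for score, pair in search("我喜歡 reading", threshold=0.1):
        print(f"{score:.2f}  {pair.zh}  |  {pair.en}")
```

Raising `threshold` trades the number of returned sentences for closeness to the query, which is the kind of count-versus-threshold behaviour the evaluation records.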

Identifier: oai:union.ndltd.org:CHENGCHI/G0095753023
Creators: 賴敏華, Lai, Min Hua
Publisher: 國立政治大學 (National Chengchi University)
Source Sets: National Chengchi University Libraries
Language: Chinese
Detected Language: English
Type: text
Rights: Copyright © nccu library on behalf of the copyright holders
