Global ETD Search

以共現資訊為基礎增進英漢翻譯對列改進方法 / Using Co-Occurrence Information for Alignment Improvement in English-Chinese Translation

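This thesis builds on the existing translation systems of Ming-Shin Lu (呂明欣) and Chih-Chieh Chang (張智傑). It focuses on improving the word alignment module in order to raise the precision and number of reordering example trees, build a high-quality database of reordering example trees, and improve overall translation quality.
　　We use three Chinese-English parallel corpora that differ in syntactic structure and vocabulary: junior high school texts, senior high school texts, and popular science magazines. The text is first preprocessed with a word segmentation system. A dictionary file is then used to look up the corresponding translations and align Chinese and English words, with lemmatization and synonym expansion used to supplement the original words. The words left unaligned after this step are recombined: taking one Chinese word as the basis, we pair it either with a single English word or with multiple English words, and use scoring formulas to select the new word pairs with higher confidence. These pairs are added to the original dictionary file so that the word alignment module performs better.
　　For evaluation, parallel corpora at different levels of English are used as training data, test items from the Trends in International Mathematics and Science Study (TIMSS) serve as the translation target, and NIST and BLEU are the evaluation metrics. The experimental results show that the proposed approach helps improve word alignment, produces more reordering example trees for the translation system to use in word reordering, and improves the translation quality of the computer-assisted translation system. / This research continues the translation systems designed by Ming-Shin Lu and Chih-Chieh Chang. We mainly improve the word alignment and create high-quality databases of reordering trees to improve translation quality.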
　　In this paper, we explore the possibility of finding alignments for words that are left unaligned by methods that rely only on word translations from English and Chinese dictionaries. With the proposed methods, we were able to align chunks of words between English and Chinese, and were not limited to word-to-word alignment.
　　For evaluation, parallel corpora at different levels of English are used as training data, and questions from the Trends in International Mathematics and Science Study are chosen as testing data. NIST and BLEU are used as the evaluation metrics. The experimental results show that the proposed method improves word alignment. It also generates more reordering trees for bilingual structured string-tree correspondence, and it improves the translation quality of the assisted translation system.
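The abstract does not give the scoring formulas used to filter new word pairs, so the following is only a minimal sketch of the general idea, assuming the Dice coefficient as the co-occurrence score. The function name `dice_candidates`, the `min_count` and `threshold` parameters, and the sample data are all hypothetical, and only the one-Chinese-word to one-English-word case is covered.

```python
from collections import Counter
from itertools import product


def dice_candidates(leftovers, min_count=2, threshold=0.5):
    """Score leftover Chinese/English word pairs by the Dice coefficient.

    leftovers: one (chinese_words, english_words) tuple per sentence pair,
    holding only the words the dictionary step could not align.
    The Dice score is an assumption; the thesis may use other formulas.
    """
    zh_count, en_count, pair_count = Counter(), Counter(), Counter()
    for zh_words, en_words in leftovers:
        zh_set, en_set = set(zh_words), set(en_words)
        zh_count.update(zh_set)                      # sentence pairs containing each Chinese word
        en_count.update(en_set)                      # sentence pairs containing each English word
        pair_count.update(product(zh_set, en_set))   # sentence pairs containing both words

    scored = {}
    for (zh, en), both in pair_count.items():
        if both < min_count:
            continue  # too rare to trust
        dice = 2 * both / (zh_count[zh] + en_count[en])
        if dice >= threshold:
            scored[(zh, en)] = dice
    return scored


# Hypothetical leftover words from three sentence pairs.
leftovers = [
    (["光合作用"], ["photosynthesis", "occurs"]),
    (["光合作用", "葉綠體"], ["photosynthesis", "chloroplast"]),
    (["葉綠體"], ["chloroplast", "occurs"]),
]
new_pairs = dice_candidates(leftovers)
```

Pairs that pass the filter could then be appended to the dictionary file before the alignment step is run again. The one-to-many case described in the abstract might be handled in a similar way by treating short English word sequences as single units, though that is also an assumption.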

http://thesis.lib.nccu.edu.tw/cgi-bin/cdrfb3/gsweb.cgi?o=dstdcdr&i=sid=%22G0097753007%22.

Identifier	oai:union.ndltd.org:CHENGCHI/G0097753007
Creators	黃昭憲, Huang,Chao Shainn
Publisher	國立政治大學
Source Sets	National Chengchi University Libraries
Language	Chinese
Detected Language	English
Type	text
Rights	Copyright © nccu library on behalf of the copyright holders
