1 |
以共現資訊為基礎增進英漢翻譯對列改進方法 / Using Co-Occurrence Information for Alignment Improvement in English-Chinese Translation
黃昭憲, Huang, Chao Shainn. Unknown Date
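This thesis builds on the existing translation systems of Ming-Shin Lu and Chih-Chieh Chang. It focuses on improving the word alignment module, which in turn raises the precision and the number of reordering example trees, in order to build a high-quality database of reordering example trees and improve overall translation quality.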
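We chose three Chinese-English parallel corpora that differ in syntactic structure and word choice: junior high school texts, senior high school texts, and popular science magazines. The text is first preprocessed with a word segmentation system. A dictionary file is then used to look up the corresponding translations so that Chinese and English words can be aligned, and the original words are reinforced by restoring them to their base forms and by synonym expansion. The words left unaligned are then recombined: taking one Chinese word as the basis, we pair it either with a single English word or with multiple English words, and an analysis formula selects the new word pairs with higher confidence. These pairs extend the original dictionary file so that the word alignment module performs better.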
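The re-pairing of leftover words described above can be sketched as follows. The abstract does not state which analysis formula the thesis uses, so this sketch substitutes the Dice coefficient over sentence-level co-occurrence counts as an assumed stand-in. The function name `score_leftover_pairs`, the `threshold` parameter, and the sample data are all hypothetical, and only the one-Chinese-word-to-one-English-word case is covered.

```python
from collections import Counter
from itertools import product


def score_leftover_pairs(sentence_pairs, threshold=0.5):
    """Score candidate (Chinese, English) word pairs among unaligned words.

    sentence_pairs: list of (leftover_zh_words, leftover_en_words), one
    entry per sentence pair. The Dice coefficient is an assumption here,
    not the formula from the thesis.
    """
    zh_count = Counter()    # number of sentence pairs containing each Chinese word
    en_count = Counter()    # number of sentence pairs containing each English word
    pair_count = Counter()  # number of sentence pairs containing both words

    for zh_words, en_words in sentence_pairs:
        zh_set, en_set = set(zh_words), set(en_words)
        zh_count.update(zh_set)
        en_count.update(en_set)
        pair_count.update(product(zh_set, en_set))

    scores = {}
    for (zh, en), both in pair_count.items():
        dice = 2 * both / (zh_count[zh] + en_count[en])
        if dice >= threshold:
            scores[(zh, en)] = dice
    return scores


# Hypothetical leftover words from three aligned sentence pairs.
leftovers = [
    (["光合作用"], ["photosynthesis", "plants"]),
    (["光合作用", "葉綠素"], ["photosynthesis", "chlorophyll"]),
    (["葉綠素"], ["chlorophyll"]),
]
new_pairs = score_leftover_pairs(leftovers, threshold=0.8)
print(new_pairs)
```

Pairs that pass the threshold would be the candidates for extending the dictionary file. Handling one Chinese word against several English words would require scoring English word sequences as units, which this sketch leaves out.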
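For evaluation, parallel corpora at different levels of English serve as training data, test items from the Trends in International Mathematics and Science Study (TIMSS) are the translation targets, and NIST and BLEU are the evaluation metrics. The experimental results show that the proposed ideas improve word alignment, produce more reordering example trees for the translation system to use when reordering words, and raise the translation quality of the assisted translation system. / This research continues the translation systems designed by Ming-Shin Lu and Chih-Chieh Chang. We mainly improve the word alignment and create high-quality databases of reordering trees to improve translation quality.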
In this paper, we explore the possibility of finding alignments for words that are left unaligned by methods relying only on word translations from English and Chinese dictionaries. With the proposed methods, we were able to align chunks of words between English and Chinese rather than being limited to word-to-word alignment.
For evaluation, parallel corpora at different levels of English are used as training data, and questions from the Trends in International Mathematics and Science Study are chosen as test data. NIST and BLEU are used as the evaluation metrics. The experimental results show that the proposed method improves word alignment. It also generates more reordering trees for bilingual structured string-tree correspondence, and it raises the translation quality of the assisted translation system.
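To make the BLEU metric mentioned above concrete, here is a minimal single-reference sketch of its two ingredients: clipped n-gram precision and the brevity penalty. It is a simplified illustration, not the scoring setup used in the thesis. The function names and the choice of `max_n=2` are assumptions, and real evaluations normally use corpus-level statistics with n-grams up to length 4.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Fraction of candidate n-grams found in the reference, with each
    n-gram credited at most as often as it occurs in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


def simple_bleu(candidate, reference, max_n=2):
    """Sentence-level BLEU against one reference (simplified sketch)."""
    precisions = [clipped_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing: any zero precision gives a zero score
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_avg)


# Hypothetical system output and reference translation.
candidate = "the plant needs light".split()
reference = "the plant needs sunlight".split()
score = simple_bleu(candidate, reference)
print(round(score, 4))
```

In this example three of four unigrams and two of three bigrams match, so the score is the geometric mean of 0.75 and 2/3 with no length penalty.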
|
2 |
電腦輔助漢英與英漢翻譯例句搜尋服務 / A Computer Assisted Environment for Searching Related Translations between Chinese and English
賴敏華, Lai, Min Hua. Unknown Date
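This thesis provides an environment in which students learning English-to-Chinese and Chinese-to-English translation can search for example translations. Our parallel corpus was built from documents available on the Internet, such as English teaching websites and worksheets, from which Chinese-English sentence pairs were extracted manually. The tagged corpus records the Chinese sentence, the English sentence, the segmented Chinese sentence, the POS tags of the Chinese sentence, and the parse trees of the Chinese and English sentences.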
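A query may be a Chinese sentence, an English sentence, or a sentence mixing Chinese and English. Depending on the search function, our system preprocesses the query, for example by word segmentation, POS extraction, parse-tree analysis, restoring words to their base forms, and expanding the query's vocabulary, and then matches it against the tagged corpus. Finally, it returns Chinese-English sentence pairs similar to the query, so that users have more similar sentences to consult while learning to translate.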
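Our system is not easy to evaluate formally. To assess its performance, we recorded the number of similar sentences each search function returned under different thresholds, and used NIST and BLEU to evaluate the quality of the similar sentences the system provides. We also ran a questionnaire survey in which subjects marked the similar sentences provided by the system. The survey results show that agreement among subjects on the similar sentences was not high; of 10 similar sentences provided by the system, only 1.6 were considered helpful by the subjects. / I present an environment for searching related translations between Chinese and English. A parallel and tagged corpus was constructed from text material obtained from the Internet, including English teaching websites and public learning sheets. The corpus contains both English and Chinese sentences, the information about how the Chinese strings were segmented, the POS tags of the Chinese words, and the syntactic structures of the English and Chinese sentences.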
The user can query the system by entering a Chinese sentence, an English sentence, or any pattern that mixes Chinese and English. The query sentence is preprocessed according to the search function the user selects, and the results of preprocessing are used to search the tagged corpus. The search results are the reference sentences related to the query sentence.
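The query flow above (preprocess the query, match it against the tagged corpus, return related sentence pairs) can be sketched roughly as below. The abstract does not say how similarity is computed, so this sketch assumes Jaccard overlap between token sets with a cutoff. The function names, the corpus entry fields (`en_tokens`, `en`, `zh`), and the default `threshold` and `limit` values are hypothetical, and only English queries with whitespace tokenization are handled.

```python
def jaccard(a, b):
    """Jaccard similarity between two token collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def search_related(query_tokens, corpus, threshold=0.3, limit=10):
    """Return corpus entries whose English tokens overlap the query.

    corpus: list of dicts with 'en_tokens', 'en', and 'zh' keys
    (a hypothetical layout, not the thesis's corpus format).
    """
    scored = []
    for entry in corpus:
        score = jaccard(query_tokens, entry["en_tokens"])
        if score >= threshold:
            scored.append((score, entry))
    # Highest-scoring sentence pairs first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [entry for _, entry in scored[:limit]]


# Hypothetical three-sentence tagged corpus.
corpus = [
    {"en": "I like to read books", "zh": "我喜歡讀書",
     "en_tokens": ["i", "like", "to", "read", "books"]},
    {"en": "She likes music", "zh": "她喜歡音樂",
     "en_tokens": ["she", "likes", "music"]},
    {"en": "I like tea", "zh": "我喜歡茶",
     "en_tokens": ["i", "like", "tea"]},
]
results = search_related("i like books".lower().split(), corpus)
for entry in results:
    print(entry["en"], "/", entry["zh"])
```

Raising `threshold` returns fewer but closer reference sentences. A search function based on POS tags or parse trees would replace the token sets with the corresponding tagged features.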
A formal evaluation of our system is not easy. I evaluated the system by entering a set of selected queries. For those tests, I recorded and compared the number of reference sentences the system returned, and evaluated the quality of the reference sentences by their BLEU and NIST scores against standard translations. In addition, I evaluated the system with the help of human subjects, who were asked to choose the useful sentences among the reference sentences returned by the system. Experimental results indicated that agreement between the human subjects was not high, and that the subjects found only about 1.6 of every 10 reference sentences useful.
|