A new system for Arabic Machine Translation (called MEANA MT) has been built. This system is capable of the analysis of English language as a source and can convert the given sentences into Arabic. The designed system contains three sets of grammar rules governing the PARSING, TRANSFORMATION AND GENERATION PHASES. In the system, word sense ambiguity and some pragmatic patterns were resolved. A new two-way (Analysis/Generation) computational lexicon system dealing with the morphological analysis of the Arabic language has been created. The designed lexicon contains a set of rules governing the morphological inflection and derivation of Arabic nouns, verbs, verb "to be", verb "not to be" and pronouns. The lexicon generates Arabic word forms and their inflectional affixes such as plural and gender morphemes as well as attached pronouns, each according to its rules. It can not parse or generate unacceptable word inflections. This computational system is capable of dealing with vowelized Arabic words by parsing the vowel marks which are attached to the letters. Semantic value pairs were developed to show ~he word sense and other issues in morphology; e.g. genders, numbers and tenses. The system can parse and generate some pragmatic sentences and phrases like proper names, titles, acknowledgements, dates, telephone numbers and addresses. A Lexical Functional Grammar (LFG) formalism is used to combine the syntactic, morphological and semantic features. The grammar rules of this system were implemented and compiled in COMMON. LISP based on Tomita's Generalised LR parsing algorithm, augmented by Pseudo and Full Unification packages. After parsing, sentence constituents of the English sentence are rep- _ resented as Feature Structures (F-Structures). These take part in the transfer and generation process which uses transformation' grammar rules to change the English F-Structure into Arabic F-Structure. These Arabic F-Structure features will be suitable for the Arabic generation grammar to build the required Arabic sentence. This system has been tested on three domains (sentences and phrases); the first is a selected children's story, the second semantic sentences and the third domain consists of pragmatic sentences. This research could be considered as a complete solution for a personal MT system for small messages and sublanguage domains.
Identifer | oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:341826 |
Date | January 2001 |
Creators | Alneami, Ahmed H. |
Publisher | University of Sheffield |
Source Sets | Ethos UK |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Source | http://etheses.whiterose.ac.uk/14819/ |
Page generated in 0.0017 seconds