A syntax analyser and case-marker generator for selected speech acts in Arabic

This thesis describes the design and implementation of a syntax analyser and case-marker generator for selected speech acts in Arabic, named Rameses II. It is intended as a contribution to the field of natural language processing (NLP). The original motivation for this research was the fact that in one form of the Arabic writing system there are no diacritics. Diacritics are small marks placed above or below the main line of characters. It is hypothesised that literate users of Arabic supply these diacritics when the input text lacks them. A particularly important subset of diacritics consists of those associated with the final character of a word, which are called case-markers. It is these, in association with other grammatical information, that indicate the grammatical category of case. Thus, these case-markers are used in Arabic to determine the semantic roles of words in a sentence. It is the purpose of the project described in this thesis to model computationally the process whereby these case-markers are assigned. The Rameses II system is implemented in Prolog. It parses a small but substantial portion of Arabic syntax, namely twelve of the nineteen classes of speech act. Arabic sentences are traditionally divided into declaratives (which are sentences that accept a true or false evaluation) and speech acts (which do not). Because there are already in existence substantial morphological analysers for Arabic, Rameses II assumes an input that has already been analysed morphologically. Thus its main roles are (1) to parse this input string and (2) to generate case-markers. Such a generator will be a necessary component in future holistic systems for understanding Arabic. Speech acts are a significant and well-defined area of Arabic grammar, and many aspects of the treatment suggested here could readily be extended to other parts of the language. As its underlying linguistic model of how Arabic grammar works, the system uses systemic grammar. This is a semantically motivated model of language which, as far as I can discover, has not so far been used for the description of Arabic. However, it has been widely used for English and many other languages, and has a rapidly growing use in NLP. This thesis therefore makes a contribution to the linguistic description of Arabic, as well as to the field of NLP. The main body of the thesis is concerned with (1) current work in the field of natural language processing, (2) the Arabic language, and (3) an in-depth discussion of the implemented system, including its architecture, operation, and development. It concludes with a brief evaluation and suggestions for applications. Thus, this research has successfully applied a new linguistic model to Arabic, resulting in the first automated system for the syntactic analysis of speech acts and the generation of case-markers for that language.

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:340905
Date: January 1991
Creators: Shata, Osama M. A. I.
Publisher: Cardiff University
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
