Machine Transcription Conversion Between Perso-Arabic and Romanized Writing Systems

Perso-Arabic script is the official writing system in Iran. Romanized transcriptions, based on the phonology of Persian, have been used extensively in electronic communication, especially on the Internet. Conversion between these two types of writing systems has been an interesting topic in Natural Language Processing. As in Machine Translation, these conversions can be applied at different grammatical levels, such as the sentence, phrase, or word level. In this thesis, choosing Dabire as a standard Romanized transcription, we introduce two approaches to such conversions at the word level. The lexicon-based approach uses Finite State Technology for bidirectional conversion between Perso-Arabic and Dabire. The second approach uses association analysis for statistical conversion from Perso-Arabic to Dabire.

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-61029
Date: January 2010
Creators: Yaesoubi, Maziar
Publisher: Linköpings universitet, Institutionen för datavetenskap
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/masterThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
