
Cross-lingual Information Retrieval On Turkish And English Texts

This thesis comparatively evaluates cross-lingual information retrieval (CLIR) approaches
for Turkish and English texts. As a complementary study, knowledge-based methods
for word sense disambiguation (WSD), one of the most important components of CLIR,
are compared for Turkish words.
Two CLIR approaches are used in this study: query translation and sense indexing. In the query
translation approach, queries are translated using automatic and manual word sense disambiguation
methods and the Google translation service. In the sense indexing approach,
documents are indexed according to the meanings of words instead of the words themselves, and
documents are retrieved according to the meanings of the query words as well. To identify
the intended meanings of query terms, manual and automatic word sense disambiguation
methods are used and compared with each other.
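
The sense indexing idea can be illustrated with a minimal sketch. It is not the thesis's implementation: the sense inventory `SENSE_OF`, the sense IDs, and the sample documents are hypothetical, and disambiguation is reduced to a fixed lookup that gives each word exactly one sense (roughly what a manual WSD step would yield).

```python
from collections import defaultdict

# Hypothetical toy sense inventory: each (language, word) pair maps to a single
# language-independent sense ID. A real system would run WSD to choose among
# several candidate senses per word.
SENSE_OF = {
    ("en", "bank"): "S_FIN_INST",
    ("en", "loan"): "S_LOAN",
    ("en", "river"): "S_RIVER",
    ("tr", "banka"): "S_FIN_INST",
    ("tr", "kredi"): "S_LOAN",
    ("tr", "nehir"): "S_RIVER",
}


def to_senses(text, lang):
    """Map each known word of the text to its sense ID; skip unknown words."""
    words = text.lower().split()
    return [SENSE_OF[(lang, w)] for w in words if (lang, w) in SENSE_OF]


def build_sense_index(docs, lang):
    """Build an inverted index keyed by sense ID instead of surface word."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for sense in to_senses(text, lang):
            index[sense].add(doc_id)
    return index


def search(index, query, lang):
    """Rank documents by the number of query senses they contain."""
    scores = defaultdict(int)
    for sense in to_senses(query, lang):
        for doc_id in index.get(sense, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))


english_docs = {
    "d1": "the bank approved the loan",
    "d2": "the river flooded",
}
index = build_sense_index(english_docs, "en")
results = search(index, "banka kredi", "tr")  # Turkish query, English documents
```

Because both the documents and the query are reduced to sense IDs, the Turkish query matches English documents without translating the query text itself.
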
Knowledge-based WSD methods that use different gloss enrichment techniques are compared
for Turkish words. Turkish WordNet is used as the primary knowledge base, and English
WordNet and Turkish Wikipedia are employed as enrichment resources. Meanings of
words are identified more clearly by using the semantic relations defined in the WordNets and Turkish
Wikipedia. In addition, when calculating the semantic relatedness of senses, the cosine similarity
metric is used as an alternative to word overlap count. The effects of using the cosine similarity
metric are observed for each of the WSD methods that use different knowledge bases.
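
As a rough illustration of the two relatedness measures named above, the following sketch scores a context against candidate sense glosses with both a word overlap count and a cosine similarity over word-count vectors. It is a simplified gloss-matching example under stated assumptions, not the method evaluated in the thesis: the glosses and context are invented English examples, tokenization is plain whitespace splitting, and no gloss enrichment is performed.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (an assumption; no stemming or stop words)."""
    return text.lower().split()


def overlap_count(text_a, text_b):
    """Number of distinct words shared by the two texts."""
    return len(set(tokenize(text_a)) & set(tokenize(text_b)))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of the two texts."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_sense(context, sense_glosses, score):
    """Return the sense whose gloss scores highest against the context."""
    return max(sense_glosses, key=lambda s: score(context, sense_glosses[s]))


# Hypothetical glosses for two senses of "bank".
glosses = {
    "bank_financial": "institution that keeps money and offers an account",
    "bank_river": "sloping land beside a river",
}
context = "she opened an account to save money"

by_overlap = pick_sense(context, glosses, overlap_count)
by_cosine = pick_sense(context, glosses, cosine_similarity)
```

Unlike the raw overlap count, the cosine score is normalized by the lengths of both texts, so a long enriched gloss does not win merely because it contains more words.
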

Identifier: oai:union.ndltd.org:METU/oai:etd.lib.metu.edu.tr:http://etd.lib.metu.edu.tr/upload/12611903/index.pdf
Date: 01 April 2010
Creators: Boynuegri, Akif
Contributors: Birturk, Aysenur
Publisher: METU
Source Sets: Middle East Technical Univ.
Language: English
Detected Language: English
Type: M.S. Thesis
Format: text/pdf
Rights: To liberate the content for public access
