1 |
Zulu-English cross-language information retrieval : an analysis of errors / Nel, Johannes Gerhardus / 04 September 2006
Please read the abstract in the 00front part of this document / Dissertation (MA (Information Science))--University of Pretoria, 2006. / Information Science / unrestricted
|
2 |
Named entity translation matching and learning with mining from multilingual news. / January 2004
Cheung Pik Shan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. / Includes bibliographical references (leaves 79-82). / Abstracts in English and Chinese. / Chapter 1 --- Introduction / Chapter 1.1 --- Named Entity Translation Matching / Chapter 1.2 --- Mining New Translations from News / Chapter 1.3 --- Thesis Organization / Chapter 2 --- Related Work / Chapter 3 --- Named Entity Matching Model / Chapter 3.1 --- Problem Nature / Chapter 3.2 --- Matching Model Investigation / Chapter 3.3 --- Tokenization / Chapter 3.4 --- Hybrid Semantic and Phonetic Matching Algorithm / Chapter 4 --- Phonetic Matching Model / Chapter 4.1 --- Generating Phonetic Representation for English / Chapter 4.1.1 --- Phoneme Generation / Chapter 4.1.2 --- Training the Tagging Lexicon and Transformation Rules / Chapter 4.2 --- Generating Phonetic Representation for Chinese / Chapter 4.3 --- Phonetic Matching Algorithm / Chapter 5 --- Learning Phonetic Similarity / Chapter 5.1 --- The Widrow-Hoff Algorithm / Chapter 5.2 --- The Exponentiated-Gradient Algorithm / Chapter 5.3 --- The Genetic Algorithm / Chapter 6 --- Experiments on Named Entity Matching Model / Chapter 6.1 --- Results for Learning Phonetic Similarity / Chapter 6.2 --- Results for Named Entity Matching / Chapter 7 --- Mining New Entity Translations from News / Chapter 7.1 --- Metadata Generation / Chapter 7.2 --- Discovering Comparable News Cluster / Chapter 7.2.1 --- News Preprocessing / Chapter 7.2.2 --- Gloss Translation / Chapter 7.2.3 --- Comparable News Cluster Discovery / Chapter 7.3 --- Named Entity Cognate Generation / Chapter 7.4 --- Entity Matching / Chapter 7.4.1 --- Matching Algorithm / Chapter 7.4.2 --- Matching Result Production / Chapter 8 --- Experiments on Mining New Translations / Chapter 9 --- Experiments on Context-based Gloss Translation / Chapter 9.1 --- Results on Chinese News Translation / Chapter 9.2 --- Results on Arabic News Translation / Chapter 10 --- Conclusions and Future Work / Bibliography / Appendices A-G
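The contents above name the Widrow-Hoff and Exponentiated-Gradient algorithms as ways of learning phonetic similarity. The record does not describe the thesis's features or training targets, so the following is only a minimal sketch of the two standard online update rules for a linear predictor. The feature vector `x`, target `y`, and learning rate `eta` are illustrative assumptions, and constant factors are absorbed into `eta`.

```python
import math

def widrow_hoff_step(w, x, y, eta=0.1):
    """One Widrow-Hoff (LMS) step: additive update w <- w - eta * (w.x - y) * x."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    err = pred - y
    return [wi - eta * err * xi for wi, xi in zip(w, x)]

def exponentiated_gradient_step(w, x, y, eta=0.1):
    """One Exponentiated-Gradient step: multiplicative update, then renormalise.

    Assumes w is a positive weight vector summing to 1 (hypothetical setup).
    """
    pred = sum(wi * xi for wi, xi in zip(w, x))
    err = pred - y
    unnorm = [wi * math.exp(-eta * err * xi) for wi, xi in zip(w, x)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical example: two similarity features, target similarity 1.0.
w0 = [0.5, 0.5]
x = [1.0, 0.0]
wh = widrow_hoff_step(w0, x, 1.0)
eg = exponentiated_gradient_step(w0, x, 1.0)
```

Both rules move weight toward the feature that would have reduced the error. The additive rule changes each weight in proportion to its feature value, while the multiplicative rule keeps the weights positive and normalised.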
|
3 |
Cross language information retrieval for languages with scarce resources / Loza, Christian E.; Mihalcea, Rada F. / January 2009
Thesis (M.S.)--University of North Texas, May, 2009. / Title from title page display. Includes bibliographical references.
|
4 |
Cross Language Information Retrieval for Languages with Scarce Resources / Loza, Christian / 05 1900
Our generation has experienced one of the most dramatic changes in how society communicates. Today, we have online information on almost any imaginable topic. However, most of this information is available in only a few dozen languages. In this thesis, I explore the use of parallel texts to enable cross-language information retrieval (CLIR) for languages with scarce resources. To build the parallel text I use the Bible. I evaluate different variables and their impact on the resulting CLIR system, specifically: (1) the CLIR results when using different amounts of parallel text; (2) the role of paraphrasing on the quality of the CLIR output; (3) the impact on accuracy when translating the query versus translating the collection of documents; and finally (4) how the results are affected by the use of different dialects. The results show that all these variables have a direct impact on the quality of the CLIR system.
|
5 |
Att maskinöversätta sökfrågor : En studie av Google Translate och Bing Translators förmåga att översätta svenska sammansättningar i ett CLIR-perspektiv / Machine translation of queries : A study of the ability of Google Translate and Bing Translator to translate Swedish compounds in a CLIR perspective / Qureshi, Karl / January 2016
The aim of this thesis is to examine how well Google Translate and Bing Translator perform when translating search queries with respect to Swedish compounds, and to try to determine whether there is any relationship between the outcome and the complexity of the compounds. The test environment is the European Parliament's public register of documents. The study is, however, limited to the documents of the European Council, which number 1,334 in Swedish and 1,368 in English. The data were analysed partly in terms of precision and recall, and partly through a contrastive analysis, in order to give a more unified picture of the phenomenon under study. The results show that the mean varies between 0.287 and 0.506 for precision and between 0.400 and 0.614 for recall, depending on word type and translation service. Furthermore, the results show that there does not appear to be any clear relationship between effectiveness and the complexity of the compounds. Instead, the lower values seem to be due to synonymy, often within the compound itself, and to hyponymy. In the latter case this is due partly to the translation services' inability to produce suitable translations, and partly to the tendency of the English language to form compounds with loose (separately written) noun modifiers.
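The abstract reports its results as precision and recall. As a reminder of how the two measures are computed for a single query, here is a minimal sketch. The document identifiers are made up for illustration and are not taken from the study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant retrieved documents / all retrieved documents
    recall    = relevant retrieved documents / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: four documents returned, three documents actually relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
```

Averaging these per-query values over all test queries gives mean figures of the kind quoted in the abstract.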
|
6 |
Multi-lingual text retrieval and mining.January 2003 (has links)
Law Yin Yee. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. / Includes bibliographical references (leaves 130-134). / Abstracts in English and Chinese. / Chapter 1 --- Introduction / Chapter 1.1 --- Cross-Lingual Information Retrieval (CLIR) / Chapter 1.2 --- Bilingual Term Association Mining / Chapter 1.3 --- Our Contributions / Chapter 1.3.1 --- CLIR / Chapter 1.3.2 --- Bilingual Term Association Mining / Chapter 1.4 --- Thesis Organization / Chapter 2 --- Related Work / Chapter 2.1 --- CLIR Techniques / Chapter 2.1.1 --- Existing Approaches / Chapter 2.1.2 --- Difference Between Our Model and Existing Approaches / Chapter 2.2 --- Bilingual Term Association Mining Techniques / Chapter 2.2.1 --- Existing Approaches / Chapter 2.2.2 --- Difference Between Our Model and Existing Approaches / Chapter 3 --- Cross-Lingual Information Retrieval (CLIR) / Chapter 3.1 --- Cross-Lingual Query Processing and Translation / Chapter 3.1.1 --- Query Context and Document Context Generation / Chapter 3.1.2 --- Context-Based Query Translation / Chapter 3.1.3 --- Query Term Weighting / Chapter 3.1.4 --- Final Weight Calculation / Chapter 3.2 --- Retrieval on Documents and Automated Summaries / Chapter 4 --- Experiments on Cross-Lingual Information Retrieval / Chapter 4.1 --- Experimental Setup / Chapter 4.2 --- Results of English-to-Chinese Retrieval / Chapter 4.2.1 --- Using Mono-Lingual Retrieval as the Gold Standard / Chapter 4.2.2 --- Using Human Relevance Judgments as the Gold Standard / Chapter 4.3 --- Results of Chinese-to-English Retrieval / Chapter 4.3.1 --- Using Mono-lingual Retrieval as the Gold Standard / Chapter 4.3.2 --- Using Human Relevance Judgments as the Gold Standard / Chapter 5 --- Discovering Comparable Multi-lingual Online News for Text Mining / Chapter 5.1 --- Story Representation / Chapter 5.2 --- Gloss Translation / Chapter 5.3 --- Comparable News Discovery / Chapter 6 --- Mining Bilingual Term Association Based on Co-occurrence / Chapter 6.1 --- Bilingual Term Cognate Generation / Chapter 6.2 --- Term Mining Algorithm / Chapter 7 --- Phonetic Matching / Chapter 7.1 --- Algorithm Design / Chapter 7.2 --- Discovering Associations of English Terms and Chinese Terms / Chapter 7.2.1 --- Converting English Terms into Phonetic Representation / Chapter 7.2.2 --- Discovering Associations of English Terms and Mandarin Chinese Terms / Chapter 7.2.3 --- Discovering Associations of English Terms and Cantonese Chinese Terms / Chapter 8 --- Experiments on Bilingual Term Association Mining / Chapter 8.1 --- Experimental Setup / Chapter 8.2 --- Result and Discussion of Bilingual Term Association Mining Based on Co-occurrence / Chapter 8.3 --- Result and Discussion of Phonetic Matching / Chapter 9 --- Conclusions and Future Work / Chapter 9.1 --- Conclusions / Chapter 9.1.1 --- CLIR / Chapter 9.1.2 --- Bilingual Term Association Mining / Chapter 9.2 --- Future Work / Bibliography / Chapter A --- Original English Queries / Chapter B --- Manual translated Chinese Queries / Chapter C --- Pronunciation symbols used by the PRONLEX Lexicon / Chapter D --- Initial Letter-to-Phoneme Tags / Chapter E --- English Sounds with their Chinese Equivalents
|
7 |
Exploiting common search interests across languages for web search. / 利用跨語言的共同搜索興趣幫助萬維網搜索 / CUHK electronic theses & dissertations collection / Li yong kua yu yan de gong tong sou suo xing qu bang zhu wan wei wang sou suo / January 2010
This work studies a new approach to Web search that caters for users' cross-lingual information needs by exploiting the common search interests found across different languages. We assume a generic scenario in which monolingual users want to find relevant information under three general settings: (1) finding relevant information in a foreign language, which requires machine translation of the search results into the user's own language; (2) finding relevant information in multiple languages including the source language, which also requires machine translation to back-translate the search results; (3) finding relevant information only in the user's language, where, because many queries are intrinsically cross-lingual, monolingual search can be assisted by cross-lingual information from another language. / We approach the problem by substantially extending two core mechanisms of information retrieval for Web search across languages, namely query formulation and relevance ranking. First, unlike traditional cross-lingual methods such as query translation and expansion, we propose a novel Cross-Lingual Query Suggestion model that leverages large-scale search engine query logs to learn to suggest closely related queries in the target language for a given source-language query. The rationale behind our approach is the ever-increasing common search interests among Web users of different languages. Second, we generalize the usefulness of common search interests to enhance the relevance ranking of documents by exploiting the correlation among the search results derived from bilingual queries, overcoming the weakness of traditional relevance estimation, which uses information from only a single language or from different languages separately. To this end, we attempt to learn a ranking function that incorporates various similarity measures among the retrieved documents in different languages. 
By modeling the commonality or similarity of search results, relevant documents in one language can help the relevance estimation of documents in a different language, and hence improve the overall relevance estimation. The same intuition applies to all three settings described above. / Gao, Wei. / Adviser: Kaw-Fai Wong. / Source: Dissertation Abstracts International, Volume: 72-04, Section: B, page: . / Thesis (Ph.D.)--Chinese University of Hong Kong, 2010. / Includes bibliographical references (leaves 114-122). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. Ann Arbor, MI : ProQuest Information and Learning Company, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese.
|
8 |
A corpus-based approach for cross-lingual information retrieval. / CUHK electronic theses & dissertations collection / Digital dissertation consortium / January 2004
Li Kar Wing. / "July 2004." / Thesis (Ph.D.)--Chinese University of Hong Kong, 2004. / Includes bibliographical references (p. 127-139). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. Ann Arbor, MI : ProQuest Information and Learning Company, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Mode of access: World Wide Web. / Abstracts in English and Chinese.
|
9 |
Improved Cross-language Information Retrieval via Disambiguation and Vocabulary Discovery / Zhang, Ying, ying.yzhang@gmail.com / January 2007
Cross-lingual information retrieval (CLIR) allows people to find documents irrespective of the language used in the query or document. This thesis is concerned with the development of techniques to improve the effectiveness of Chinese-English CLIR. In Chinese-English CLIR, the accuracy of dictionary-based query translation is limited by two major factors: translation ambiguity and the presence of out-of-vocabulary (OOV) terms. We explore alternative methods for translation disambiguation, and demonstrate new techniques based on a Markov model and the use of web documents as a corpus to provide context for disambiguation. This simple disambiguation technique has proved to be extremely robust and successful. Queries that seek topical information typically contain OOV terms that may not be found in a translation dictionary, leading to inappropriate translations and consequent poor retrieval performance. Our novel OOV term translation method is based on the Chinese authorial practice of including unfamiliar English terms in both languages. It automatically extracts correct translations from the web and can be applied to both Chinese-English and English-Chinese CLIR. Our OOV translation technique does not rely on prior segmentation and is thus free from segmentation error. It leads to a significant improvement in CLIR effectiveness and can also be used to improve Chinese segmentation accuracy. Good quality translation resources, especially bilingual dictionaries, are valuable resources for effective CLIR. We developed a system to facilitate construction of a large-scale translation lexicon of Chinese-English OOV terms using the web. Experimental results show that this method is reliable and of practical use in query translation. In addition, parallel corpora provide a rich source of translation information. 
We have also developed a system that uses multiple features to identify parallel texts via a k-nearest-neighbour classifier, to automatically collect high quality parallel Chinese-English corpora from the web. These two automatic web mining systems are highly reliable and easy to deploy. In this research, we provided new ways to acquire linguistic resources using multilingual content on the web. These linguistic resources not only improve the efficiency and effectiveness of Chinese-English cross-language web retrieval, but also have wider applications than CLIR.
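The abstract mentions a k-nearest-neighbour classifier for identifying parallel texts but does not list the features it uses. The sketch below shows only the general k-NN voting scheme. The two features (a length ratio and a shared-token score), the labels, and the training points are hypothetical and are not the author's.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    train: list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training pairs: (length ratio, shared-token score) -> label.
train = [
    ((1.00, 0.90), "parallel"),
    ((0.95, 0.80), "parallel"),
    ((0.90, 0.85), "parallel"),
    ((0.30, 0.10), "non-parallel"),
    ((0.40, 0.20), "non-parallel"),
]

label_a = knn_predict(train, (0.92, 0.88))
label_b = knn_predict(train, (0.35, 0.15))
```

In practice the choice of `k` and of the distance measure would be tuned on held-out document pairs. The values here are placeholders.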
|
10 |
Accessing and using multilanguage information by users searching in different information retrieval systems / Ha, Yoo Jin. / January 2008
Thesis (Ph. D.)--Rutgers University, 2008. / "Graduate Program in Communication, Information and Library Studies." Includes bibliographical references (p. 226-238).
|