
Cross-Lingual Text Categorization

With the emergence and proliferation of Internet services and e-commerce applications, a tremendous amount of information is accessible online, typically as textual documents. To facilitate subsequent access to and use of this information, the efficient and effective management (specifically, text categorization) of the ever-increasing volume of textual documents is essential to organizations and individuals. Existing text categorization techniques focus mainly on categorizing monolingual documents. However, with the globalization of business environments and advances in Internet technology, an organization or individual often retrieves and archives documents in different languages, thus creating the need for cross-lingual text categorization. Motivated by the significance of and need for such a technique, this thesis designs a cross-lingual text categorization technique with two different category assignment methods: individual-based and cluster-based. The empirical evaluation results show that the cross-lingual text categorization technique performs well and that the cluster-based method outperforms the individual-based method.

Identifier: oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0729104-235006
Date: 29 July 2004
Creators: Lin, Yen-Ting
Contributors: Chih-Ping Wei, Paul Jen-Hwa Hu, Hsing Kenny Cheng
Publisher: NSYSU
Source Sets: NSYSU Electronic Thesis and Dissertation Archive
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0729104-235006
Rights: campus_withheld, Copyright information available at source archive
