With the emergence and proliferation of Internet services and e-commerce applications, a tremendous amount of information is accessible online, typically as textual documents. To facilitate subsequent access to and use of this information, the efficient and effective management (specifically, text categorization) of the ever-increasing volume of textual documents is essential to organizations and individuals. Existing text categorization techniques focus mainly on categorizing monolingual documents. However, with the globalization of business environments and advances in Internet technology, an organization or individual often retrieves and archives documents in different languages, thus creating the need for cross-lingual text categorization. Motivated by the significance of and need for such a technique, this thesis designs a cross-lingual text categorization technique with two different category assignment methods: individual-based and cluster-based. The empirical evaluation results show that the cross-lingual text categorization technique performs well and that the cluster-based method outperforms the individual-based method.
Identifier | oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0729104-235006 |
Date | 29 July 2004 |
Creators | Lin, Yen-Ting |
Contributors | Chih-Ping Wei, Paul Jen-Hwa Hu, Hsing Kenny Cheng |
Publisher | NSYSU |
Source Sets | NSYSU Electronic Thesis and Dissertation Archive |
Language | English |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0729104-235006 |
Rights | campus_withheld, Copyright information available at source archive |