1

Cross-Lingual Category Integration Technique

Tzeng, Guo-han 30 August 2006
With the emergence of the Internet, many innovative applications have arisen in different countries, and e-commerce is becoming increasingly pervasive. As a result, a tremendous amount of information expressed in different languages is exchanged and shared by both organizations and individuals in the modern global environment. A large proportion of this information takes the form of textual documents that are managed using categories. Consequently, developing a practical and effective technique for cross-lingual category integration (CLCI) has become an important issue. Several category integration techniques have been proposed, but all of them deal with category integration involving only monolingual documents. In response, this study combines existing cross-lingual text categorization techniques with an existing monolingual category integration technique (specifically, Enhanced Naive Bayes) and proposes a CLCI solution to address cross-lingual category integration. Our empirical evaluation results demonstrate the feasibility and superior effectiveness of the proposed CLCI technique.
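
The abstract names Enhanced Naive Bayes as its monolingual building block but does not describe it. As a rough illustration only, the general idea behind such a classifier can be sketched as a two-pass naive Bayes: first predict a master category for every document in one source category, then re-weight the class priors by how many of those documents landed in each master category and classify again. Everything below is an assumption made for illustration and is not taken from the thesis: the function names, the weight parameter `w`, the add-one smoothing, and the tokenized-document input format are all hypothetical.

```python
import math
from collections import Counter


def train_nb(master):
    """Train a multinomial naive Bayes model.

    master: dict mapping category name -> list of documents,
            where each document is a list of tokens (assumed format).
    """
    vocab = {t for docs in master.values() for d in docs for t in d}
    total_docs = sum(len(docs) for docs in master.values())
    model = {}
    for cat, docs in master.items():
        counts = Counter(t for d in docs for t in d)
        n_tokens = sum(counts.values())
        model[cat] = {
            "size": len(docs),
            "log_prior": math.log(len(docs) / total_docs),
            # Laplace-smoothed log P(token | category)
            "log_lik": {
                t: math.log((counts[t] + 1) / (n_tokens + len(vocab)))
                for t in vocab
            },
        }
    return model, vocab


def classify(model, vocab, doc, log_priors=None):
    """Return the best-scoring category; out-of-vocabulary tokens are ignored."""
    def score(cat):
        prior = model[cat]["log_prior"] if log_priors is None else log_priors[cat]
        return prior + sum(model[cat]["log_lik"][t] for t in doc if t in vocab)
    return max(model, key=score)


def integrate_source_category(model, vocab, source_docs, w=2.0):
    """Classify all documents of ONE source category into master categories.

    Pass 1 uses plain naive Bayes. Pass 2 replaces each prior with
    size(C) * (docs predicted in C + 1) ** w, so master categories that
    attracted many of the source documents are favored. The +1 is a
    smoothing choice made here so that no prior becomes zero.
    With w = 0 this reduces to plain naive Bayes.
    """
    first_pass = Counter(classify(model, vocab, d) for d in source_docs)
    log_priors = {
        cat: math.log(model[cat]["size"]) + w * math.log(first_pass[cat] + 1)
        for cat in model
    }
    return [classify(model, vocab, d, log_priors) for d in source_docs]
```

In a cross-lingual setting, the source documents would first need to be mapped into the master catalog's language (or feature space) before a step like `integrate_source_category` could be applied; how the thesis does this is not stated in the abstract.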
2

A Clustering-based Approach to Document-Category Integration

Cheng, Tsang-Hsiang 04 September 2003
E-commerce applications generate and consume a tremendous amount of online information that is typically available as textual documents. Observations of textual document management practices by organizations and individuals suggest that categories (or category hierarchies) are a popular way to organize, archive, and access documents. At the same time, an organization (or individual) constantly acquires new documents from various Internet sources. Consequently, integrating relevant categorized documents into the existing categories of the organization (or individual) becomes an important issue in the e-commerce era. The existing categorization-based approach to document-category integration (specifically, the Enhanced Naïve Bayes classifier) has several limitations, including the assumption that the master and source catalogs use homogeneous categorization schemes and the requirement for large master categories as training data. In this study, we developed a Clustering-based Category Integration (CCI) technique for integrating two document catalogs, each of which is organized non-hierarchically (i.e., as a flat set). Using the Enhanced Naïve Bayes classifier as the benchmark, the empirical evaluation results showed that the proposed CCI technique appeared to improve the accuracy of document-category integration in different integration scenarios and seemed to be less sensitive to the size of the master categories than the categorization-based approach. Furthermore, to integrate document categories that are organized hierarchically, we proposed a Clustering-based category-Hierarchy Integration (CHI) technique that extends the CCI technique to category-hierarchy integration. The empirical evaluation results showed that the CHI technique appeared to be more effective for hierarchical document-category integration than CCI under homogeneous and comparable scenarios.
