Spelling suggestions: "subject:"2positive examples"" "subject:"2positive axamples""
1 |
An Ensemble Approach for Text Categorization with Positive and Unlabeled ExamplesChen, Hsueh-Ching 29 July 2005 (has links)
Text categorization is the process of assigning new documents to predefined document categories on the basis of a classification model(s) induced from a set of pre-categorized training documents. In a typical dichotomous classification scenario, the set of training documents includes both positive and negative examples; that is, each of the two categories is associated with training documents. However, in many real-world text categorization applications, positive and unlabeled documents are readily available, whereas the acquisition of samples of negative documents is extremely expensive or even impossible. In this study, we propose and develop an ensemble approach, referred to as E2, to address the limitations of existing algorithms for learning from positive and unlabeled training documents. Using the spam email filtering as the evaluation application, our empirical evaluation results suggest that the proposed E2 technique exhibits more stable and reliable performance than PNB and PEBL.
|
2 |
Optimization and Refinement of XML Schema Inference Approaches / Optimization and Refinement of XML Schema Inference ApproachesKlempa, Michal January 2011 (has links)
Although XML is a widely used technology, the majority of real-world XML documents does not conform to any particular schema. To fill the gap, the research area of automatic schema inference from XML documents has emerged. This work refines and extends recent approaches to the automatic schema inference mainly by exploiting an obsolete schema in the inference process, designing new MDL measures and heuristic excluding of excentric data inputs. The work delivers a ready-to-use and easy-to-extend implementation integrated into the jInfer framework (developed as a software project). Experimental results are a part of the work.
|
Page generated in 0.0634 seconds