Because manual knowledge entry by human knowledge engineers is time-consuming, it is desirable to find methods that automatically acquire knowledge about the world from online information. In this work I examine using the Cyc ontology to guide the creation of Naïve Bayes classifiers that provide knowledge about items described in Wikipedia articles. Given an initial set of Wikipedia articles, the system uses the ontology to create positive and negative training sets for the classifiers in each category. The order in which classifiers are generated and used to test articles is also guided by the ontology. The research shows that a system can be built that uses statistical text classification methods to extract information from an ad hoc information source like Wikipedia for use in a formal semantic ontology like Cyc. Benefits and limitations of the system are discussed, along with future work.
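As a rough illustration of the idea in the abstract, the sketch below builds positive and negative training sets from a toy ontology and trains a small Naïve Bayes classifier on them. It is not the thesis's implementation: the `ONTOLOGY` and `SEED` data, the function names, the choice of sibling categories as the source of negative examples, and the use of add-one smoothing are all assumptions made for this example. The ontology-guided ordering of classifiers is not covered.

```python
import math
from collections import Counter

# Hypothetical toy ontology (parent category -> child categories).
# A stand-in for illustration only, not the Cyc ontology.
ONTOLOGY = {"Animal": ["Dog", "Cat"]}

# Hypothetical seed "articles" already assigned to a category.
SEED = {
    "Dog": ["dog barks loyal pet fetch", "puppy dog breed barks"],
    "Cat": ["cat purrs whiskers pet", "kitten cat meows purrs"],
}


def training_sets(category):
    """Positives: seed articles of `category`.
    Negatives: seed articles of its sibling categories (an assumption)."""
    parent = next(p for p, kids in ONTOLOGY.items() if category in kids)
    siblings = [c for c in ONTOLOGY[parent] if c != category]
    positives = SEED[category]
    negatives = [text for s in siblings for text in SEED[s]]
    return positives, negatives


def train(positives, negatives):
    """Collect per-class word counts, the shared vocabulary, and log priors."""
    docs = {True: positives, False: negatives}
    counts = {
        label: Counter(word for text in texts for word in text.split())
        for label, texts in docs.items()
    }
    vocab = set(counts[True]) | set(counts[False])
    total_docs = len(positives) + len(negatives)
    priors = {label: math.log(len(texts) / total_docs) for label, texts in docs.items()}
    return counts, vocab, priors


def classify(model, text):
    """Return True if the text scores higher for the positive class."""
    counts, vocab, priors = model
    scores = {}
    for label in (True, False):
        total = sum(counts[label].values())
        score = priors[label]
        for word in text.split():
            if word in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing over the shared vocabulary.
                score += math.log((counts[label][word] + 1) / (total + len(vocab)))
        scores[label] = score
    return scores[True] > scores[False]


dog_model = train(*training_sets("Dog"))
print(classify(dog_model, "loyal dog barks"))
print(classify(dog_model, "kitten purrs"))
```

Each category gets its own binary classifier here, so a new article is tested against one category at a time.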
Identifier | oai:union.ndltd.org:unt.edu/info:ark/67531/metadc5470 |
Date | 12 1900 |
Creators | Coursey, Kino High |
Contributors | Mihalcea, Rada, 1974-, Tarau, Paul, Lefkowitz, Larry |
Publisher | University of North Texas |
Source Sets | University of North Texas |
Language | English |
Detected Language | English |
Type | Thesis or Dissertation |
Format | Text |
Rights | Public, Copyright, Coursey, Kino High, Copyright is held by the author, unless otherwise noted. All rights reserved. |