A lazy text-based approach to foundational knowledge acquisition.

Knowledge Acquisition (KA) from text requires that a large quantity of prior knowledge be made available to the Natural Language Processing (NLP) system. This prior knowledge is called foundational knowledge. The question of where foundational knowledge comes from in the first place is one of the biggest problems facing NLP. Conventionally, foundational knowledge has been hand-crafted on a task- and domain-specific basis. However, it is difficult to determine beforehand exactly what knowledge will be required. It has been shown within the TANKA project that a potential solution to this problem is to use surface NLP. Surface NLP relies solely on syntax and on the help of a user to elicit knowledge from text, thereby effectively eliminating the need for prior hand-crafting of foundational knowledge. However, the domain knowledge obtained in this manner from a text contains gaps. The work presented in this thesis consisted of finding a method better than prior hand-crafting for acquiring the knowledge needed to fill those gaps. The method presented, called Lazy KA, uses examples (short NL stories) and failures of an explanation mechanism such as Explanation-Based Learning (EBL) to find these gaps and to interactively and incrementally learn the required new knowledge. When the explanation of a particular example fails, the user is guided through a process that leads to the acquisition of the missing knowledge. Initially, the user is heavily involved, but as more examples are processed, the user becomes less and less involved. The convergence hypothesis, namely that user interventions decrease as more examples are processed, was verified experimentally with the prototype system FOKAS, which implements these ideas. (Abstract shortened by UMI.)

Identifier oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/10084
Date January 1995
Creators Massey, Louis.
Contributors Matwin, S.
Publisher University of Ottawa (Canada)
Source Sets Université d’Ottawa
Detected Language English
Type Thesis
Format 133 p.
