There is a need for an automated system that can read family history books and other historical texts and extract as many genealogical facts from them as possible. Embley and others have applied traditional information extraction techniques to this problem in a system called OntoES, with a reasonable amount of success. In parallel, much linguistic theory has been developed in the past decades, and Lonsdale and others have built computational embodiments of some of these theories using Soar. In this thesis we introduce a system called OntoSoar, which combines the Link Grammar Parser, using a grammar customized for family history texts, with an innovative semantic analyzer inspired by construction grammars. OntoSoar extracts genealogical facts from family history books and uses them to populate a conceptual model compatible with OntoES. The system produces good results on the texts tested so far and shows promise of doing even better with further development.
Identifier | oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-5132 |
Date | 24 June 2014 |
Creators | Lindes, Peter |
Publisher | BYU ScholarsArchive |
Source Sets | Brigham Young University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Theses and Dissertations |
Rights | http://lib.byu.edu/about/copyright/ |