Return to search

Using minimal recursion semantics in Japanese question answering

Question answering is a research field with the aim of providing answers to a user’s question, phrased in natural language. In this thesis I explore some techniques used in question answering, working towards the twin goals of using deep linguistic knowledge robustly as well as using language-independent methods wherever possible. While the ultimate aim is cross-language question answering, in this research experiments are conducted over Japanese data, concentrating on factoid questions. The two main focus areas, identified as the two tasks most likely to benefit from linguistic knowledge, are question classification and answer extraction. / In question classification, I investigate the issues involved in the two common methods used for this task—pattern matching and machine learning. I find that even with a small amount of training data (2000 questions), machine learning achieves better classification accuracy than pattern matching with much less effort. The other issue I explore in question classification is the classification accuracy possible with named entity taxonomies of different sizes and shapes. Results demonstrate that, although the accuracy decreases as the taxonomy size increases, the ability to use soft decision making techniques as well as high accuracies achieved in certain classes make larger, hierarchical taxonomies a viable option. / For answer extraction, I use Robust Minimal Recursion Semantics (RMRS) as a sentence representation to determine similarity between questions and answers, and then use this similarity score, along with other information discovered during comparison, to score and rank answer candidates. Results were slightly disappointing, but close examination showed that 40% of errors were due to answer candidate extraction, and the scoring algorithm worked very well. Interestingly, despite the lower accuracy achieved during question classification, the larger named entity taxonomies allowed much better accuracy in answer extraction than the smaller taxonomies.

Identiferoai:union.ndltd.org:ADTP/245604
CreatorsDridan, Rebecca
Source SetsAustraliasian Digital Theses Program
LanguageEnglish
Detected LanguageEnglish
RightsTerms and Conditions: Copyright in works deposited in the University of Melbourne Eprints Repository (UMER) is retained by the copyright owner. The work may not be altered without permission from the copyright owner. Readers may only, download, print, and save electronic copies of whole works for their own personal non-commercial use. Any use that exceeds these limits requires permission from the copyright owner. Attribution is essential when quoting or paraphrasing from these works., Open Access

Page generated in 0.0018 seconds