Automatic reading comprehension (RC) systems integrate various natural language processing (NLP) technologies to analyze a given passage and to generate or extract answers to questions about it. Previous work applied many NLP technologies, including shallow syntactic analyses (e.g. base noun phrases), semantic analyses (e.g. named entities) and discourse analyses (e.g. pronoun referents), within the bag-of-words (BOW) matching approach. This thesis proposes a novel RC approach that integrates a set of NLP technologies in a maximum entropy (ME) framework to estimate the probability that each candidate sentence is the answer. In contrast to previous RC approaches, which handle English only, the presented approach is the first to cover both English and Chinese, the two languages used by the most people in the world. To support the evaluation of bilingual RC systems, a parallel English and Chinese corpus is also designed and developed, and annotations deemed relevant to the RC task are included in it. In addition, useful NLP technologies are explored from a new perspective: drawing on pedagogical guidelines for human readers, reading skills are summarized and mapped to various NLP technologies. Practical NLP technologies, categorized as shallow syntactic analyses (i.e. part-of-speech tags, voices and tenses) and deep syntactic analyses (i.e. syntactic parse trees and dependency parse trees), are then selected for integration. The proposed approach is evaluated on an English corpus, Remedia, and on our bilingual corpus. The experimental results show that our approach significantly improves RC results on both the English and the Chinese corpora.
Xu, Kui. / Adviser: Helen Mei-Ling Meng. / Source: Dissertation Abstracts International, Volume: 70-06, Section: B, page: 3618. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (leaves 132-141). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
Identifier | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_344285 |
Date | January 2008 |
Contributors | Xu, Kui, Chinese University of Hong Kong Graduate School. Division of Systems Engineering and Engineering Management. |
Source Sets | The Chinese University of Hong Kong |
Language | English, Chinese |
Detected Language | English |
Type | Text, theses |
Format | electronic resource, microform, microfiche, 1 online resource (xvii, 141 leaves : ill.) |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |