Large-scale connectionist natural language parsing using lexical semantic and syntactic knowledge

Syntactic parsing plays a pivotal role in most automatic natural language processing systems. The research project presented in this dissertation has focused on two main characteristics of connectionist models for natural language processing: their adaptability to different tagging conventions, and their ability to use multiple linguistic constraints in parallel during sentence processing. In focusing on these key characteristics, an existing hybrid connectionist, shift-reduce corpus-based parsing model has been modified. This parser, which had earlier been trained to acquire linguistic knowledge from the Lancaster Parsed Corpus, has been adapted to learn linguistic knowledge from the Wall Street Journal Corpus. This adaptation is a novel demonstration that this connectionist parser, and by extension, other similar connectionist models, is able to adapt to more than one syntactic tagging convention; this implies their ability to adapt to the underlying linguistic theories used to annotate these corpora. The parser has also been adapted to integrate shallow lexical semantic information with syntactic information for full syntactic parsing. This approach was used to investigate the effect of shallow lexical semantic information on full syntactic parsing. In pursuing the aims of this project, a novel algorithm for semantic tagging of nouns in the Wall Street Journal Corpus has been developed. The lexical semantic information used in this semantic annotation algorithm was extracted from WordNet, an online lexical resource. Using only syntactic information in making parsing decisions, this parsing model was tested on test sets of sentences that were not used during training. The parser generalised to parse these test sentences with an F-measure of 72.5% and 59.5% on sentences from the Lancaster Parsed Corpus and Wall Street Journal Corpus, respectively. 
When shallow lexical semantic information was integrated with syntactic information in its input representation, the parser generalised to parse test sentences from the Wall Street Journal Corpus with an F-measure of 56.75%. Although this integration did not seem to improve the parser's overall training/generalisation performance in its present configuration, it did appear to improve the parser's decisions on prepositional phrase attachment.

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:510178
Date: January 2007
Creators: Nkantah, Dianabasi Edet
Publisher: Nottingham Trent University
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://irep.ntu.ac.uk/id/eprint/317/
