Directed graphs are an intuitive and versatile representation of natural language meaning because they can capture relationships between instances of events and entities, including cases where entities play multiple roles. Yet few approaches in natural language processing use graph manipulation techniques for semantic parsing. This dissertation studies graph-based representations of natural language meaning, discusses a formal-grammar-based approach to the semantic construction of graph representations, and develops methods for open-domain semantic parsing into such representations. To perform string-to-graph translation, I use synchronous hyperedge replacement grammars (SHRG). The thesis studies this grammar formalism from formal, linguistic, and algorithmic perspectives. It proposes a new lexicalized variant of this formalism (LSHRG), which is inspired by tree insertion grammar and provides a clean syntax/semantics interface. The thesis develops a new method for automatically extracting SHRG and LSHRG grammars from annotated “graph banks”, which uses existing syntactic derivations to structure the extracted grammar. It also discusses a new method for semantic parsing with large, automatically extracted grammars that translates syntactic derivations into derivations of the synchronous grammar, as well as initial work on parse reranking and selection using a graph model. I evaluate this work on the Abstract Meaning Representation (AMR) dataset. The results indicate that the grammar-based approach to semantic analysis is a promising technique for semantic parsing and that string-to-graph grammars can be induced efficiently. Taken together, the thesis lays the foundation for future work on graph methods in natural language semantics.
Identifier | oai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/D8JH3ZRR |
Date | January 2017 |
Creators | Bauer, Daniel |
Source Sets | Columbia University |
Language | English |
Detected Language | English |
Type | Theses |