Developing computational algorithms that capture the complex structure
of natural language is an open problem. In particular, learning the
abstract properties of language only from usage data remains a
challenge. In this dissertation, we present a probabilistic
usage-based model of verb argument structure acquisition that can
successfully learn abstract knowledge of language from instances of
verb usage, and use this knowledge in various language tasks. The
model demonstrates the feasibility of a usage-based account of
language learning and provides a concrete explanation for the
observed patterns in child language acquisition.
We propose a novel representation for the general constructions of
language as probabilistic associations between syntactic and semantic
features of a verb usage; these associations generalize over the
syntactic patterns and the fine-grained semantics of both the verb and
its arguments. The probabilistic nature of argument structure
constructions enables the model to capture both statistical effects
in language learning and adaptability in language use. The
acquisition of constructions is modeled as detecting similar usages
and grouping them together. We use a probabilistic measure of
similarity between verb usages and a Bayesian framework for
clustering them. Language use, in turn, is modeled as a prediction
problem: each language task is viewed as finding the best value for a
missing feature of a usage, given the available features of that
usage and the knowledge of language acquired so far. Prediction is
formulated within the same Bayesian framework used for learning,
taking into account both the general knowledge of language (i.e.,
constructions) and the specific behaviour of each verb. We show
through computational simulation that
the behaviour of the model mirrors that of young children in some
relevant respects. The model goes through the same learning stages as
children: an initial conservative phase in which each verb is limited
to its most frequent usages, followed by a phase in which general
patterns are grasped and applied productively, leading to occasional
overgeneralization errors. Such errors gradually cease as the model
processes more input.
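To make the learning-and-prediction loop concrete, the following Python sketch shows one way such an incremental Bayesian clusterer over feature-bundle usages could look. It is an illustration only: the feature names, the smoothing scheme, and the weight given to creating a new construction are assumptions of this sketch, not the dissertation's actual formulation.

```python
# A minimal sketch of incremental Bayesian construction learning.
# All feature names and numeric settings below are illustrative
# assumptions, not the dissertation's actual formulation.

FEATURES = ["syntactic_pattern", "verb_semantics", "arg_properties"]

class ConstructionLearner:
    def __init__(self, new_weight=1.0, smoothing=0.1):
        self.constructions = []      # each construction: a list of usages (dicts)
        self.new_weight = new_weight # pseudo-count for a new construction
        self.smoothing = smoothing   # add-lambda smoothing constant
        self.total = 0               # usages seen so far

    def _prior(self, k):
        # P(k): proportional to construction size; a brand-new
        # construction (k = None) competes with a small fixed weight.
        denom = self.total + self.new_weight
        return (self.new_weight if k is None
                else len(self.constructions[k])) / denom

    def _likelihood(self, usage, k):
        # P(usage | k): product over features of the smoothed relative
        # frequency of the usage's feature value within construction k.
        if k is None:
            return self.smoothing ** len(FEATURES)  # empty-construction baseline
        members = self.constructions[k]
        prob = 1.0
        for f in FEATURES:
            matches = sum(1 for u in members if u.get(f) == usage.get(f))
            prob *= (matches + self.smoothing) / (len(members) + 1.0)
        return prob

    def learn(self, usage):
        # Assign the usage to the best existing construction, or start
        # a new one (k = None) if nothing fits well enough.
        candidates = list(range(len(self.constructions))) + [None]
        best = max(candidates,
                   key=lambda k: self._prior(k) * self._likelihood(usage, k))
        if best is None:
            self.constructions.append([usage])
        else:
            self.constructions[best].append(usage)
        self.total += 1

    def predict(self, partial_usage, feature, candidates):
        # Language use as prediction: choose the value of the missing
        # feature that maximizes the sum over constructions of
        # P(k) * P(completed usage | k).
        def score(value):
            completed = dict(partial_usage, **{feature: value})
            return sum(self._prior(k) * self._likelihood(completed, k)
                       for k in range(len(self.constructions)))
        return max(candidates, key=score)
```

Learning and use share one machinery here: after feeding a stream of usage dictionaries to learn(), a call such as predict({"verb_semantics": "motion"}, "syntactic_pattern", ["arg verb", "arg verb arg"]) returns whichever pattern the learned constructions make more probable for a motion verb, blending verb-specific evidence with general constructional knowledge.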
We also investigate the learnability of verb semantic roles, a
critical aspect of linking the syntax and semantics of verbs. Contrary
to many existing linguistic theories and computational models, which
assume that semantic roles are innate and fixed, we show that
general conceptions of semantic roles can be learned from the semantic
properties of the verb arguments in the input usages. We represent
each role as a semantic profile for an argument position in a general
construction, where a profile is a probability distribution over a set
of semantic properties that verb arguments can take. We extend this
view to model the learning and use of verb selectional preferences, a
phenomenon usually viewed as separate from verb semantic roles. Our
experimental results show that the model learns intuitive profiles for
both semantic roles and selectional preferences. Moreover, the learned
profiles prove useful in language tasks for which experimental data on
human subjects have been reported, such as resolving ambiguity in
language comprehension and simulating human plausibility judgements.
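As an illustration of this representation, the short Python sketch below estimates a semantic profile as a smoothed probability distribution over observed argument properties and uses it to score a candidate filler. The property names are hypothetical, and the mean-probability scoring function is one simple choice among many, not the dissertation's measure.

```python
from collections import Counter

def semantic_profile(argument_property_sets, smoothing=0.1):
    # A profile: a smoothed probability distribution over the semantic
    # properties observed at one argument position of a construction.
    counts = Counter(p for props in argument_property_sets for p in props)
    denom = sum(counts.values()) + smoothing * len(counts)
    return {p: (c + smoothing) / denom for p, c in counts.items()}

def plausibility(profile, properties):
    # One simple scoring choice: the mean profile probability of the
    # candidate filler's properties (unseen properties score zero).
    return sum(profile.get(p, 0.0) for p in properties) / len(properties)

# Toy agent-like profile for the subject position of a transitive
# construction; the property names are hypothetical.
profile = semantic_profile([
    {"animate", "volitional", "human"},
    {"animate", "volitional"},
    {"animate", "human"},
])

print(plausibility(profile, {"animate", "volitional"}))  # high: agent-like
print(plausibility(profile, {"inanimate", "rigid"}))     # low: not agent-like
```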
Identifier | oai:union.ndltd.org:TORONTO/oai:tspace.library.utoronto.ca:1807/11180
Date | 30 July 2008
Creators | Alishahi, Afra
Contributors | Stevenson, Suzanne
Source Sets | University of Toronto
Language | en_ca
Detected Language | English
Type | Thesis
Format | 1013950 bytes, application/pdf