1. Class-based statistical models for lexical knowledge acquisition. Clark, Stephen. January 2001.
This thesis is about the automatic acquisition of a particular kind of lexical knowledge, namely the knowledge of which noun senses can fill the argument slots of predicates. The knowledge is represented using probabilities, which agrees with the intuition that there are no absolute constraints on the arguments of predicates, but that the constraints are satisfied to a certain degree; thus the problem of knowledge acquisition becomes the problem of probability estimation from corpus data. The problem with defining a probability model in terms of senses is that this involves a huge number of parameters, which results in a sparse data problem. The proposal here is to define a probability model over senses in a semantic hierarchy, and exploit the fact that senses can be grouped into classes consisting of semantically similar senses. A novel class-based estimation technique is developed, together with a procedure that determines a suitable class for a sense (given a predicate and argument position). The problem of determining a suitable class can be thought of as finding a suitable level of generalisation in the hierarchy. The generalisation procedure uses a statistical test to locate areas consisting of semantically similar senses, and, as well as being used for probability estimation, is also employed as part of a re-estimation algorithm for estimating sense frequencies from incomplete data. The rest of the thesis considers how the lexical knowledge can be used to resolve structural ambiguities, and provides empirical evaluations. The estimation techniques are first integrated into a parse selection system, using a probabilistic dependency model to rank the alternative parses for a sentence. Then, a PP-attachment task is used to provide an evaluation which is more focussed on the class-based estimation technique, and, finally, a pseudo disambiguation task is used to compare the estimation technique with alternative approaches.
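The core estimation idea can be pictured with a toy example. The Python sketch below is not drawn from the thesis: the hand-made hierarchy, the counts, the uniform spreading of class probability mass, and the heterogeneity cut-off standing in for the statistical test are all illustrative assumptions. It simply aggregates argument-slot counts up a small hierarchy and stops generalising when a class's members start to look dissimilar.

```python
# A minimal, self-contained sketch of the core idea (NOT the thesis's actual
# estimator or generalisation procedure): argument-slot counts for noun senses
# are aggregated up a toy semantic hierarchy, and generalisation stops when a
# class's member counts start to look heterogeneous. The hierarchy, the counts,
# and the heterogeneity cut-off are all illustrative assumptions.

# Toy hierarchy: each sense/class points to its parent (the root maps to None).
PARENT = {
    "apple": "fruit", "banana": "fruit", "fruit": "food",
    "bread": "food", "food": "entity",
    "hammer": "tool", "tool": "entity", "entity": None,
}

# Toy corpus counts: how often each sense filled the object slot of "eat".
OBJ_COUNTS = {"apple": 8, "banana": 5, "bread": 4, "hammer": 1}
TOTAL = sum(OBJ_COUNTS.values())

def has_ancestor(sense, cls):
    """True if cls dominates (or equals) sense in the toy hierarchy."""
    node = sense
    while node is not None:
        if node == cls:
            return True
        node = PARENT.get(node)
    return False

def class_members(cls):
    """All observed senses dominated by the class."""
    return [s for s in OBJ_COUNTS if has_ancestor(s, cls)]

def estimate(sense, cls):
    """Class-based estimate: spread the class's probability mass uniformly
    over its member senses (a crude stand-in for the thesis's estimator)."""
    members = class_members(cls)
    mass = sum(OBJ_COUNTS[s] for s in members) / TOTAL
    return mass / len(members) if members else 0.0

def generalise(sense):
    """Climb the hierarchy from the sense; stop before a class whose member
    counts look too heterogeneous (a stand-in for the statistical test)."""
    chosen, node = sense, PARENT[sense]
    while node is not None:
        counts = [OBJ_COUNTS[s] for s in class_members(node)]
        if max(counts) > 5 * max(min(counts), 1):   # assumed heterogeneity cut-off
            break
        chosen, node = node, PARENT[node]
    return chosen

if __name__ == "__main__":
    for sense in OBJ_COUNTS:
        cls = generalise(sense)
        print(f"P({sense} | eat, object) ~ {estimate(sense, cls):.3f} via class '{cls}'")
```

In this toy run, senses like "apple" generalise to the "food" class while "hammer" stops at "tool", because pooling them all under "entity" would mix very dissimilar fillers; the thesis's procedure makes the analogous decision with a proper statistical test over a real semantic hierarchy.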
2. Knowledge representation in natural language: the wordicle - a subconscious connection. Downey, Daniel J. G. January 1991.
No description available.
3. Computing presuppositions in an incremental natural language processing system. Bridge, Derek G. January 1991.
No description available.
4. Learning unification-based natural language grammars. Osborne, Miles. January 1994.
No description available.
5. The representation of natural language to enable neural networks to detect syntactic features. Lyon, Caroline. January 1994.
No description available.
6. Measuring text reuse. Clough, Paul D. January 2002.
No description available.
7. Automatic generation of spatial configurations in user interfaces. Fischer, Markus. January 1998.
No description available.
8. New models of natural language for consultative computing. Gwei, G. M. January 1987.
No description available.
9. Natural language generation in the LOLITA system: an engineering approach. Smith, Mark H. January 1995.
Natural Language Generation (NLG) is the automatic generation of Natural Language (NL) by computer in order to meet communicative goals. One aim of NL processing (NLP) is to allow more natural communication with a computer and, since communication is a two-way process, an NL system should be able to produce as well as interpret NL text. This research concerns the design and implementation of an NLG module for the LOLITA system. LOLITA (Large scale, Object-based, Linguistic Interactor, Translator and Analyser) is a general-purpose base NLP system which performs core NLP tasks and upon which prototype NL applications have been built. As part of this encompassing project, this research shares some of its properties and methodological assumptions: the LOLITA generator has been built following Natural Language Engineering principles, uses LOLITA's SemNet representation as input, and is implemented in the functional programming language Haskell. As in other generation systems, the adopted solution uses a two-component architecture. However, in order to avoid problems which occur at the interface between traditional planning and realisation modules (known as the generation gap), the distribution of tasks between the planner and plan-realiser is different: the plan-realiser, in the absence of detailed planning instructions, must perform some tasks (such as the selection and ordering of content) which are more traditionally performed by a planner. This work largely concerns the development of the plan-realiser and its interface with the planner. Another aspect of the solution is the use of Abstract Transformations, which act on the SemNet input before realisation, increasing the system's ability to create paraphrases. The research has led to a practical working solution which has greatly increased the power of the LOLITA system. The research also investigates how NLG systems can be evaluated, and the advantages and disadvantages of using a functional language for the generation task.
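The division of labour described above can be shown schematically. The Python fragment below is not LOLITA code (the real system is written in Haskell); the event representation and all names are invented for illustration. It shows a planner that issues only a coarse goal, a plan-realiser that chooses among realisations itself, and an abstract transformation of the semantic input that yields a paraphrase.

```python
# Schematic sketch of a two-component generator in which the plan-realiser,
# rather than the planner, selects and orders content, and an "abstract
# transformation" of the input produces a paraphrase. All data are assumptions.

# Toy stand-in for a SemNet fragment: an event node with role fillers.
EVENT = {"type": "give", "agent": "Mary", "theme": "a book", "recipient": "John"}

def planner(goal):
    """Planner: issues only a high-level instruction, leaving content
    selection and ordering to the plan-realiser."""
    return {"goal": goal, "style": "neutral"}

def active_realisation(ev):
    """Baseline realisation of the event as an active clause."""
    return f"{ev['agent']} gave {ev['theme']} to {ev['recipient']}."

def dative_shift(ev):
    """An 'abstract transformation' applied before realisation, producing a
    paraphrase of the same semantic content."""
    return f"{ev['agent']} gave {ev['recipient']} {ev['theme']}."

def plan_realiser(plan, ev):
    """Plan-realiser: generates the variants and, absent detailed planning
    instructions, chooses and orders the output itself."""
    variants = [active_realisation(ev), dative_shift(ev)]
    return variants[0] if plan["style"] == "neutral" else variants[-1]

if __name__ == "__main__":
    print(plan_realiser(planner("describe-event"), EVENT))
```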
10. Managing surface ambiguity in the generation of referring expressions. Khan, Imtiaz Hussain. January 2010.
Most algorithms for the Generation of Referring Expressions tend to generate distinguishing descriptions at the semantic level, disregarding the ways in which surface issues can affect their quality. This thesis explores the role of surface ambiguities in referring expressions and how the risk of such ambiguities should be taken into account by an algorithm that generates referring expressions. This was done by focussing on the type of surface ambiguity which arises when adjectives occur in coordinated structures (as in "the old men and women"). The central idea is to use statistical information about lexical co-occurrence to estimate which interpretation of a phrase is most likely for human readers, and to avoid generating phrases where misunderstandings are likely. We developed specific hypotheses and tested them by running experiments with human participants. We found that Word Sketches are a reliable source of information for predicting the likelihood of a reading. The avoidance of misunderstandings is not the only issue dealt with in this thesis. Since the avoidance of misunderstandings might be achieved at the cost of very lengthy (or perhaps very disfluent) expressions, it is important to select an optimal expression (i.e., the expression which is preferred by most readers) from the various alternatives available. Again, we developed specific hypotheses and recorded human preferences in a forced-choice manner. We found that participants preferred clear (i.e., not likely to be misunderstood) expressions to unclear ones, but if several of the expressions were clear then brief expressions were preferred over their longer counterparts. The results of these empirical studies motivated the design of a GRE algorithm. The implemented algorithm builds a plural distinguishing description for the intended referents (if one exists), using words; applies transformation rules to the distinguishing description to construct a set of logically equivalent distinguishing descriptions; realises each description in the set as a corresponding English noun phrase (NP) using appropriate realisation rules; and determines the most likely reading of each NP. One NP is then selected for output. A further experiment verifies that the kinds of expressions produced by the algorithm are optimal for readers: they are understood accurately and quickly.
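The selection step described above can be illustrated with a small sketch. The Python fragment below is not the thesis system: toy adjective-noun co-occurrence counts stand in for Word Sketch data, and the candidate phrases and risk threshold are assumptions. It estimates the risk that the adjective is read as modifying both nouns, keeps the realisations judged clear, and among those prefers the briefest, mirroring the "clear first, then brief" preference found in the experiments.

```python
# A minimal sketch of clarity-then-brevity selection for coordinated NPs
# (NOT the thesis system). Counts, candidates, and threshold are assumptions.

# Toy co-occurrence counts for adjective-noun pairs (pretend corpus data).
COOC = {("old", "men"): 120, ("old", "women"): 115, ("old", "dogs"): 3}

def misreading_risk(adj, noun2):
    """Risk that 'ADJ noun1 and noun2' is read with ADJ also modifying noun2,
    approximated by how strongly ADJ co-occurs with noun2."""
    total = sum(c for (a, _), c in COOC.items() if a == adj)
    return COOC.get((adj, noun2), 0) / total if total else 0.0

def candidates(adj, noun1, noun2):
    """Logically equivalent surface realisations of the same description."""
    return [
        f"the {adj} {noun1} and {noun2}",        # brief, but possibly ambiguous
        f"the {noun2} and the {adj} {noun1}",    # reordered, unambiguous
        f"the {adj} {noun1} and the {noun2}",    # repeated determiner
    ]

def select(adj, noun1, noun2, risk_threshold=0.3):
    """Prefer clear realisations; among the clear ones, prefer the shortest."""
    cands = candidates(adj, noun1, noun2)
    clear = [c for c in cands
             if f"{adj} {noun1} and {noun2}" not in c
             or misreading_risk(adj, noun2) < risk_threshold]
    return min(clear or cands, key=len)

if __name__ == "__main__":
    print(select("old", "men", "women"))  # risky coordination: avoid the brief form
    print(select("old", "men", "dogs"))   # low risk: the brief form is acceptable
```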