581 |
Semi-automatic grammar induction for bidirectional machine translation. January 2002
Wong, Chin Chung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. / Includes bibliographical references (leaves 137-143). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Objectives --- p.3 / Chapter 1.2 --- Thesis Outline --- p.5 / Chapter 2 --- Background in Natural Language Understanding --- p.6 / Chapter 2.1 --- Rule-based Approaches --- p.7 / Chapter 2.2 --- Corpus-based Approaches --- p.8 / Chapter 2.2.1 --- Stochastic Approaches --- p.8 / Chapter 2.2.2 --- Phrase-spotting Approaches --- p.9 / Chapter 2.3 --- The ATIS Domain --- p.10 / Chapter 2.3.1 --- Chinese Corpus Preparation --- p.11 / Chapter 3 --- Semi-automatic Grammar Induction - Baseline Approach --- p.13 / Chapter 3.1 --- Background in Grammar Induction --- p.13 / Chapter 3.1.1 --- Simulated Annealing --- p.14 / Chapter 3.1.2 --- Bayesian Grammar Induction --- p.14 / Chapter 3.1.3 --- Probabilistic Grammar Acquisition --- p.15 / Chapter 3.2 --- Semi-automatic Grammar Induction - Baseline Approach --- p.16 / Chapter 3.2.1 --- Spatial Clustering --- p.16 / Chapter 3.2.2 --- Temporal Clustering --- p.18 / Chapter 3.2.3 --- Post-processing --- p.19 / Chapter 3.2.4 --- Four Aspects for Enhancements --- p.20 / Chapter 3.3 --- Chapter Summary --- p.22 / Chapter 4 --- Semi-automatic Grammar Induction - Enhanced Approach --- p.23 / Chapter 4.1 --- Evaluating Induced Grammars --- p.24 / Chapter 4.2 --- Stopping Criterion --- p.26 / Chapter 4.2.1 --- Cross-checking with Recall Values --- p.29 / Chapter 4.3 --- Improvements on Temporal Clustering --- p.32 / Chapter 4.3.1 --- Evaluation --- p.39 / Chapter 4.4 --- Improvements on Spatial Clustering --- p.46 / Chapter 4.4.1 --- Distance Measures --- p.48 / Chapter 4.4.2 --- Evaluation --- p.57 / Chapter 4.5 --- Enhancements based on Intelligent Selection --- p.62 / Chapter 4.5.1 --- Informed Selection between Spatial Clustering and Temporal Clustering --- p.62 / Chapter 4.5.2 --- Selecting the Number of Clusters Per Iteration --- p.64 / Chapter 4.5.3 --- An Example for Intelligent Selection --- p.64 / Chapter 4.5.4 --- Evaluation --- p.68 / Chapter 4.6 --- Chapter Summary --- p.71 / Chapter 5 --- Bidirectional Machine Translation using Induced Grammars - Baseline Approach --- p.73 / Chapter 5.1 --- Background in Machine Translation --- p.75 / Chapter 5.1.1 --- Rule-based Machine Translation --- p.75 / Chapter 5.1.2 --- Statistical Machine Translation --- p.76 / Chapter 5.1.3 --- Knowledge-based Machine Translation --- p.77 / Chapter 5.1.4 --- Example-based Machine Translation --- p.78 / Chapter 5.1.5 --- Evaluation --- p.79 / Chapter 5.2 --- Baseline Configuration on Bidirectional Machine Translation System --- p.84 / Chapter 5.2.1 --- Bilingual Dictionary --- p.84 / Chapter 5.2.2 --- Concept Alignments --- p.85 / Chapter 5.2.3 --- Translation Process --- p.89 / Chapter 5.2.4 --- Two Aspects for Enhancements --- p.90 / Chapter 5.3 --- Chapter Summary --- p.91 / Chapter 6 --- Bidirectional Machine Translation - Enhanced Approach --- p.92 / Chapter 6.1 --- Concept Alignments --- p.93 / Chapter 6.1.1 --- Enhanced Alignment Scheme --- p.95 / Chapter 6.1.2 --- Experiment --- p.97 / Chapter 6.2 --- Grammar Checker --- p.100 / Chapter 6.2.1 --- Components for Grammar Checking --- p.101 / Chapter 6.3 --- Evaluation --- p.117 / Chapter 6.3.1 --- Bleu Score Performance --- p.118 / Chapter 6.3.2 --- Modified Bleu Score --- p.122 / Chapter 6.4 --- Chapter Summary --- p.130 / Chapter 7 --- Conclusions --- p.131 / Chapter 7.1 --- Summary --- p.131 / Chapter 7.2 --- Contributions --- p.134 / Chapter 7.3 --- Future work --- p.136 / Bibliography --- p.137 / Chapter A --- Original SQL Queries --- p.144 / Chapter B --- Seeded Categories --- p.146 / Chapter C --- 3 Alignment Categories --- p.147 / Chapter D --- Labels of Syntactic Structures in Grammar Checker --- p.148
|
582 |
Learning structural descriptions of grammar rules from examples. Berwick, Robert Cregar. January 1980
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1980. / MICROFICHE COPY AVAILABLE IN ARCHIVES AND ENGINEERING. / Bibliography: leaves 116-120. / by Robert Cregar Berwick. / M.S.
|
583 |
Parametric variation in clitic constructions. Borer, Hagit. January 1981
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Linguistics and Philosophy, 1981. / MICROFICHE COPY AVAILABLE IN ARCHIVES AND HUMANITIES. / Vita. / Bibliography: leaves 357-362. / by Hagit Borer. / Ph.D.
|
584 |
A theory of syntactic recognition for natural language. Marcus, Mitchell Philip. January 1978
Thesis. 1978. Ph.D.--Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. / MICROFICHE COPY AVAILABLE IN ARCHIVES AND ENGINEERING. / Bibliography: p. 331-335. / Ph.D.
|
585 |
Case-linking : a theory of case and verb diathesis applied to classical Sanskrit. Ostler, Nicholas David MacLachlan. January 1979
Thesis. 1979. Ph.D.--Massachusetts Institute of Technology. Dept. of Linguistics and Philosophy. / MICROFICHE COPY AVAILABLE IN ARCHIVES AND HUMANITIES. / Vita. / Bibliography: leaves 424-432. / Ph.D.
|
586 |
The formal nature of anaphoric relations. Aoun, Joseph. January 1982
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Linguistics and Philosophy, 1982. / MICROFICHE COPY AVAILABLE IN ARCHIVES AND HUMANITIES / Bibliography: leaves 412-419. / by Joseph Aoun. / Ph.D.
|
587 |
On some phonologically-null elements in syntax. Jaeggli, Osvaldo. January 1980
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Linguistics and Philosophy, 1980. / MICROFICHE COPY AVAILABLE IN ARCHIVES AND HUMANITIES. / Vita. / Bibliography: leaves 314-319. / by Osvaldo Adolfo Jaeggli. / Ph.D.
|
588 |
Pivot-based Statistical Machine Translation for Morphologically Rich Languages. Kholy, Ahmed El. January 2016
This thesis describes research on pivot-based statistical machine translation (SMT) for morphologically rich languages (MRL). We provide a framework for translating to and from morphologically rich languages, especially when there is little or no parallel data between the source and target languages. We address three main challenges. The first is the sparsity of data that results from morphological richness. The second is maximizing the precision and recall of the pivoting process itself. The third is making use of any parallel data between the source and target languages. To address data sparsity, we explored a space of tokenization schemes and normalization options. We also examined a set of six detokenization techniques to evaluate detokenized and orthographically corrected (enriched) output. We provide a recipe of the best settings for translating into one of the most challenging languages, namely Arabic. Our best model improves translation quality over the baseline by 1.3 BLEU points. We also investigated separating translation from morphology generation. We compared three methods of modeling morphological features. Features can be modeled as part of the core translation. Alternatively, these features can be generated using target monolingual context. Finally, the features can be predicted using both source and target information. In our experimental results, we outperform the vanilla factored translation model. Deciding which features to translate, generate or predict requires a detailed error analysis of the system output. To this end, we present AMEANA, an open-source tool for error analysis of natural language processing tasks, targeting morphologically rich languages. The second challenge we are concerned with is the pivoting process itself. We discuss several techniques to improve the precision and recall of the pivot matching.
One technique to improve recall works at the level of the word alignment, as an optimization process for pivoting driven by generating phrase pairs between the source and target languages. Although improving the recall of the pivot matching improves overall translation quality, we also need to increase the precision of the pivoting process. To achieve this, we introduce quality constraint scores to determine the quality of the pivot phrase pairs between the source and target languages. We show positive results for different language pairs, which demonstrates the consistency of our approaches. In one of our best models we reach an improvement of 1.2 BLEU points. The third challenge we are concerned with is how to make use of any parallel data between the source and target languages. We build on the approach of improving the precision of the pivoting process and on methods for combining the pivot system with the direct system built from the parallel data. In one of the approaches, we introduce morphology constraint scores, which are added to the log-linear space of features in order to determine the quality of the pivot phrase pairs. We compare two methods of generating the morphology constraints. One method is based on hand-crafted rules relying on our knowledge of the source and target languages; in the other, the morphology constraints are induced from available parallel data between the source and target languages, which we also use to build a direct translation model. We then combine the pivot and direct models to achieve better coverage and overall translation quality. Using induced morphology constraints outperformed the hand-crafted rules and improved over our best model from all previous approaches by 0.6 BLEU points (7.2/6.7 BLEU points from the direct and pivot baselines respectively). Finally, we introduce smart techniques for combining pivot and direct models.
We show that smart selective combination can lead to a large reduction in the size of the pivot model without affecting performance, and in some cases while improving it.
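As an illustration only, pivot matching of the kind this abstract discusses is often realized by phrase-table triangulation, in which source-pivot and pivot-target phrase pairs are joined on a shared pivot phrase. The sketch below is a minimal version under that assumption; it is not taken from the thesis, and the function name `triangulate`, the table layout, and the toy phrases are all hypothetical.

```python
from collections import defaultdict


def triangulate(src_pivot, pivot_tgt):
    """Join two phrase tables on their shared pivot phrases (illustrative sketch).

    src_pivot: {source_phrase: {pivot_phrase: p(pivot | source)}}
    pivot_tgt: {pivot_phrase: {target_phrase: p(target | pivot)}}
    Returns {source_phrase: {target_phrase: p(target | source)}}, where each
    probability is summed over all pivot phrases that link the pair.
    """
    src_tgt = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_pivot.items():
        for piv, p_piv_given_src in pivots.items():
            # Only pivot phrases present in both tables produce a match.
            for tgt, p_tgt_given_piv in pivot_tgt.get(piv, {}).items():
                src_tgt[src][tgt] += p_piv_given_src * p_tgt_given_piv
    # Convert nested defaultdicts to plain dicts for the caller.
    return {src: dict(tgts) for src, tgts in src_tgt.items()}


# Toy tables with placeholder phrases (hypothetical data).
toy_src_pivot = {"a": {"x": 0.5, "y": 0.5}, "b": {"z": 1.0}}
toy_pivot_tgt = {"x": {"T": 1.0}, "y": {"T": 0.5, "U": 0.5}}
toy_table = triangulate(toy_src_pivot, toy_pivot_tgt)
```

In this toy case the source phrase "b" gets no entry because its only pivot phrase never appears on the pivot-target side, which is the recall problem the abstract mentions; precision-oriented scores such as the quality and morphology constraints would be additional features on each resulting pair and are not modeled here.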
|
589 |
Quantifier expressions and information structure. Mankowitz, Poppy. January 2019
Linguists and philosophers of language have shown increasing interest in the expressions that refer to quantifiers: determiners like 'every' and 'many', in addition to determiner phrases like 'some king' and 'no cat'. This thesis addresses several puzzles where the way we understand quantifier expressions depends on features that go beyond standard truth conditional semantic meaning. One puzzle concerns the fact that it is often natural to understand 'Every king is in the yard' as being true if (say) all of the kings at the party are in the yard, even though the standard truth conditions predict it to be true if and only if every king in the universe is in the yard. Another puzzle emerges from the observation that 'Every American king is in the yard' sounds odd relative to contexts where there are no American kings, even though the standard truth conditions predict it to be trivially true. These puzzles have been widely discussed within linguistics and philosophy of language, and have implications for topics as diverse as the distinction between semantics and pragmatics and the ontological commitments of ordinary individuals. Yet few attempts have been made to incorporate discussions from the linguistics literature into the philosophical literature. This thesis argues that attending to the linguistics literature helps to address these puzzles. In particular, my solutions to these puzzles rely on notions from work on information structure, an often overlooked area of linguistics. I will use these notions to develop a new theory of the pragmatics of ordinary discourse, in the process of resolving the puzzles. In the first two chapters, I provide accessible overviews of key notions from the literature on quantifier expressions and information structure. In the third chapter, I discuss the problem of contextual domain restriction. In the fourth chapter, I consider the problems posed by empty restrictors. In the final chapter, I tackle the issue of category mistakes.
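The two puzzles in this abstract can be written out in standard generalized-quantifier notation. The lines below are a sketch using textbook truth conditions; the symbols K (the kings), Y (the things in the yard) and C (a contextually supplied domain) are labels chosen here, and intersecting the restrictor with C is only one common way of modeling domain restriction, not necessarily the account the thesis defends.

```latex
% Requires amsmath for \text. K = kings, Y = things in the yard,
% C = contextually salient domain (e.g. the individuals at the party).

% Standard truth conditions: quantification over every king in the universe.
\[
  [\![\text{every}]\!](K)(Y) = 1 \iff K \subseteq Y
\]

% One common treatment of contextual domain restriction.
\[
  [\![\text{every}]\!]_{C}(K)(Y) = 1 \iff K \cap C \subseteq Y
\]

% Empty restrictor: with no American kings, the condition holds vacuously.
\[
  K = \emptyset \implies K \subseteq Y
\]
```

The third line shows why the standard semantics predicts 'Every American king is in the yard' to be trivially true when there are no American kings, even though speakers find the sentence odd.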
|
590 |
A comparative study of Katzian semantics and atomic physics. January 1996
Kwok Wai Man. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1996. / Includes bibliographical references (leaves 74-78). / Preface --- p.3 / Chapter 1. --- Underlying Linguistic Reality --- p.6 / Chapter 1.1 --- Syntactics --- p.7 / Chapter 1.1.1 --- Superficial & Underlying Phrase Markers / Chapter 1.1.2 --- Rewriting & Transformational Rules / Chapter 1.2 --- Semantics --- p.21 / Chapter 1.2.1 --- Semantic Markers, Readings & Projection Rules / Chapter 1.2.2 --- Selection Restrictions / Chapter 1.2.3 --- Definition of Semantic Properties & Relations / Chapter 1.3 --- Status --- p.31 / Chapter 1.3.1 --- Appearance-Reality Distinction / Chapter 1.3.2 --- Linguistic Competence / Chapter 1.3.3 --- Idealization / Chapter 1.3.4 --- Linguistic Description / Chapter 1.3.5 --- Evidence / Chapter 2. --- Atomic Physics --- p.48 / Chapter 2.1 --- Line Spectra of Hydrogen Atom --- p.48 / Chapter 2.2 --- Bohr's Theory of Hydrogen Atom --- p.50 / Chapter 3. --- Criticisms of Katzian Semantics as Compared with Atomic Physics --- p.55 / Chapter 3.1 --- Distinction between Linguistic Theory & Linguistic Descriptions --- p.57 / Chapter 3.2 --- Theoretical Constructs in Katz' Theory --- p.58 / Chapter 3.3 --- Theoretical Concepts & Correspondence Rules --- p.62 / Chapter 3.4 --- Bohr vs Katz: the Weakness of the Latter --- p.64 / Notes --- p.67 / References --- p.74
|