Global ETD Search

1	Translation accuracy comparison between machine translation and context-free machine natural language grammar–based translation Wang, Long Qi January 2018 University of Macau / Faculty of Science and Technology. / Department of Computer and Information Science Computational linguistics
2	Building phrase based language model from large corpus / Tang, Haijiang. January 2002 Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2002. / Includes bibliographical references (leaves 74-79). Also available in electronic version. Access restricted to campus users. Computational linguistics
3	Modeling Relevance in Statistical Machine Translation: Scoring Alignment, Context, and Annotations of Translation Instances Phillips, Aaron B. 01 January 2012 Machine translation has advanced considerably in recent years, primarily due to the availability of larger datasets. However, one cannot rely on the availability of copious, high-quality bilingual training data. In this work, we improve upon the state-of-the-art in machine translation with an instance-based model that scores each instance of translation in the corpus. A translation instance reflects a source and target correspondence at one specific location in the corpus. The significance of this approach is that our model is able to capture that some instances of translation are more relevant than others. We have implemented this approach in Cunei, a new platform for machine translation that permits the scoring of instance-specific features. Leveraging per-instance alignment features, we demonstrate that Cunei can outperform Moses, a widely used machine translation system. We then expand on this baseline system in three principal directions, each of which shows further gains. First, we score the source context of a translation instance in order to favor those that are most similar to the input sentence. Second, we apply similar techniques to score the target context of a translation instance and favor those that are most similar to the target hypothesis. Third, we provide a mechanism to mark up the corpus with annotations (e.g. statistical word clustering, part-of-speech labels, and parse trees) and then exploit this information to create additional per-instance similarity features. Each of these techniques explicitly takes advantage of the fact that our approach scores each instance of translation on demand after the input sentence is provided and while the target hypothesis is being generated; similar extensions would be impossible or quite difficult in existing machine translation systems. 
Ultimately, this approach provides a more flexible framework for integration of novel features that adapts better to new data. In our experiments with German-English and Czech-English translation, the addition of instance-specific features consistently shows improvement. Computational Linguistics
4	Crosslingual implementation of linguistic taggers using parallel corpora Safadi, Hani. January 1900 Thesis (M.Sc.). / Written for the School of Computer Science. Title from title page of PDF (viewed 2008/12/09). Includes bibliographical references. Computational linguistics.
5	Inferring conceptual structures from pictorial input data Salveter, Sharon Caroline. January 1978 Thesis--Wisconsin. / Vita. Includes bibliographical references (leaves 118-122). Computational linguistics.
6	Word Alignment by Re-using Parallel Phrases Holmqvist, Maria January 2008 In this thesis we present the idea of using parallel phrases for word alignment. Each parallel phrase is extracted from a set of manual word alignments and contains a number of source and target words and their corresponding alignments. If a parallel phrase matches a new sentence pair, its word alignments can be applied to the new sentence. There are several advantages of using phrases for word alignment. First, longer text segments include more context and will be more likely to produce correct word alignments than shorter segments or single words. More importantly, the use of longer phrases makes it possible to generalize words in the phrase by replacing words with parts of speech or other grammatical information. In this way, the number of words covered by the extracted phrases can go beyond the words and phrases that were present in the original set of manually aligned sentences. We present experiments with phrase-based word alignment on three types of English–Swedish parallel corpora: a software manual, a novel and proceedings of the European Parliament. In order to find a balance between improved coverage and high alignment accuracy, we investigated different properties of generalised phrases to identify which types of phrases are likely to produce accurate alignments on new data. Finally, we have compared phrase-based word alignments to state-of-the-art statistical alignment with encouraging results. We show that phrase-based word alignments can be used to enhance statistical word alignment. To evaluate word alignments, an English–Swedish reference set for the Europarl corpus was constructed. The guidelines for producing this reference alignment are presented in the thesis. Computational linguistics
7	Class-based statistical models for lexical knowledge acquisition Clark, Stephen January 2001 This thesis is about the automatic acquisition of a particular kind of lexical knowledge, namely the knowledge of which noun senses can fill the argument slots of predicates. The knowledge is represented using probabilities, which agrees with the intuition that there are no absolute constraints on the arguments of predicates, but that the constraints are satisfied to a certain degree; thus the problem of knowledge acquisition becomes the problem of probability estimation from corpus data. The problem with defining a probability model in terms of senses is that this involves a huge number of parameters, which results in a sparse data problem. The proposal here is to define a probability model over senses in a semantic hierarchy, and exploit the fact that senses can be grouped into classes consisting of semantically similar senses. A novel class-based estimation technique is developed, together with a procedure that determines a suitable class for a sense (given a predicate and argument position). The problem of determining a suitable class can be thought of as finding a suitable level of generalisation in the hierarchy. The generalisation procedure uses a statistical test to locate areas consisting of semantically similar senses, and, as well as being used for probability estimation, is also employed as part of a re-estimation algorithm for estimating sense frequencies from incomplete data. The rest of the thesis considers how the lexical knowledge can be used to resolve structural ambiguities, and provides empirical evaluations. The estimation techniques are first integrated into a parse selection system, using a probabilistic dependency model to rank the alternative parses for a sentence. 
Then, a PP-attachment task is used to provide an evaluation which is more focussed on the class-based estimation technique, and, finally, a pseudo-disambiguation task is used to compare the estimation technique with alternative approaches. 410 Computational linguistics
8	Knowledge representation in natural language : the wordicle - a subconscious connection Downey, Daniel J. G. January 1991 No description available. 003.5 Computational linguistics
9	Computing presuppositions in an incremental natural language processing system Bridge, Derek G. January 1991 No description available. 410 Computational linguistics
10	Learning unification-based natural language grammars Osborne, Miles January 1994 No description available. 005 Computational linguistics
