
Advances in Fully-Automatic and Interactive Phrase-Based Statistical Machine Translation

This thesis presents different contributions in the fields of fully-automatic statistical machine
translation and interactive statistical machine translation.
Statistical machine translation must address three problems, namely, the modelling problem,
the training problem and the search problem. In this thesis we present contributions
regarding each of these three problems.
Regarding the modelling problem, an alternative derivation of phrase-based statistical
translation models is proposed. This derivation introduces a set of statistical submodels governing
different aspects of the translation process. In addition, the resulting submodels
can be introduced as components of a log-linear model.
Regarding the training problem, we propose an alternative estimation technique for phrase-based
models that tries to reduce the strong heuristic component of the standard estimation
technique. The proposed estimation technique considers the phrase pairs that compose
the phrase model as part of complete bisegmentations of the source and target sentences.
We demonstrate theoretically and empirically that the proposed estimation technique can be
executed efficiently. Experimental results obtained with the open-source THOT toolkit, which is
also presented in this thesis, show that the alternative estimation technique obtains phrase models
with lower perplexity than those obtained by means of the standard estimation technique.
However, the reduction in model perplexity did not lead to improvements
in translation quality.
To deal with the search problem, we propose a search algorithm which is based on the
branch-and-bound search paradigm. The proposed algorithm generalises different search
strategies that can be accessed by modifying the input parameters. We carried out experiments
to evaluate the performance of the proposed search algorithm.

Ortiz Martínez, D. (2011). Advances in Fully-Automatic and Interactive Phrase-Based Statistical Machine Translation [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/12127

Identifier: oai:union.ndltd.org:upv.es/oai:riunet.upv.es:10251/12127
Date: 14 October 2011
Creators: Ortiz Martínez, Daniel
Contributors: Casacuberta Nolla, Francisco; García Varea, Ismael; Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació
Publisher: Universitat Politècnica de València
Source Sets: Universitat Politècnica de València
Language: English
Detected Language: English
Type: info:eu-repo/semantics/doctoralThesis, info:eu-repo/semantics/acceptedVersion
Source: Riunet
Rights: http://rightsstatements.org/vocab/InC/1.0/, info:eu-repo/semantics/openAccess
