
Asymmetric term alignment with selective contiguity constraints by multi-tape automata

This article describes an HMM-based word-alignment method that can selectively enforce a contiguity constraint. The method applies directly to the extraction of a bilingual terminological lexicon from a parallel corpus, and can also serve as a preliminary step for extracting phrase pairs in a Phrase-Based Statistical Machine Translation system. Contiguous source words that compose a term are aligned to contiguous target-language words. The HMM is transformed into a Weighted Finite State Transducer (WFST), and the contiguity constraints are enforced by dedicated multi-tape WFSTs. The proposed method is especially well suited to settings where basic linguistic resources (morphological analyzers, part-of-speech taggers and term extractors) are available for the source language only.
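The contiguity constraint described in the abstract can be illustrated with a small check. This is only a sketch under stated assumptions, not the multi-tape WFST construction of the article: it assumes an HMM-style alignment in which each target word is linked to at most one source word, and it verifies the constraint after the fact rather than enforcing it during decoding. The function names and the list-based alignment representation are hypothetical.

```python
from typing import List, Sequence, Tuple


def aligned_targets(alignment: Sequence[int], term_span: Tuple[int, int]) -> List[int]:
    """Return the target positions aligned to any source word inside term_span.

    alignment[j] is the source index aligned to target word j (assumed
    representation; -1 can stand for an unaligned target word).
    term_span is an inclusive (start, end) range of source indices.
    """
    start, end = term_span
    # enumerate() yields j in increasing order, so the result is sorted.
    return [j for j, i in enumerate(alignment) if start <= i <= end]


def is_contiguous(positions: Sequence[int]) -> bool:
    """True if the sorted, distinct positions form an unbroken run."""
    if not positions:
        return True
    return positions[-1] - positions[0] + 1 == len(positions)


def satisfies_contiguity(alignment: Sequence[int],
                         term_spans: Sequence[Tuple[int, int]]) -> bool:
    """Check that every source term is aligned to a contiguous target span."""
    return all(is_contiguous(aligned_targets(alignment, span))
               for span in term_spans)


# Source term covers source words 1..2 in both hypothetical examples.
ok = satisfies_contiguity([0, 1, 2, 1, 3], [(1, 2)])   # targets 1, 2, 3
bad = satisfies_contiguity([1, 0, 3, 2], [(1, 2)])     # targets 0 and 3
```

In the first alignment the term maps to target positions 1 to 3, an unbroken span; in the second it maps to positions 0 and 3, which violates the constraint. The check is applied only to the listed term spans, which mirrors the selective nature of the constraint: words outside a term remain free to align non-contiguously.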

Identifier oai:union.ndltd.org:Potsdam/oai:kobv.de-opus-ubp:2711
Date January 2008
Creators Barbaiani, Mădălina; Cancedda, Nicola; Dance, Chris; Fazekas, Szilárd; Gaál, Tamás; Gaussier, Éric
Publisher Universität Potsdam, Extern. Extern
Source Sets Potsdam University
Language English
Detected Language English
Type InProceedings
Format application/pdf
Rights http://opus.kobv.de/ubp/doku/urheberrecht.php
