Within the field of neural machine translation (NMT), transfer learning and domain adaptation techniques have emerged as central approaches to overcoming the data scarcity faced by low-resource languages and specialized domains. This thesis explores the potential of zero-shot cross-lingual domain adaptation, which combines cross-lingual transfer learning with domain adaptation. A multilingual pre-trained NMT model is fine-tuned on domain-specific data from one language pair, with the aim of capturing domain-specific knowledge and transferring it to target languages within the same domain, enabling effective zero-shot cross-lingual domain transfer. A series of experiments across both specialized and mixed domains investigates the feasibility of zero-shot cross-lingual domain adaptation and the factors that influence it. The results indicate that fine-tuned models generally outperform the pre-trained baseline in specialized domains and for most target languages. However, the extent of improvement depends on the linguistic complexity of the domain, as well as on the transferability potential driven by the linguistic similarity between the pivot and target languages. Additionally, the study examines zero-shot cross-lingual cross-domain transfer, where models fine-tuned on mixed domains are evaluated on specialized domains. The results reveal that while cross-domain transfer is feasible, its effectiveness depends on the characteristics of the pivot and target domains: domains with more consistent language respond better to cross-domain transfer. By examining the interplay between language-specific and domain-specific factors, the research sheds light on the dynamics of zero-shot cross-lingual domain adaptation and highlights the significant role of both linguistic relatedness and domain characteristics in determining transferability potential.
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-530815 |
Date | January 2024 |
Creators | Shahnazaryan, Lia |
Publisher | Uppsala universitet, Institutionen för lingvistik och filologi |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |