
Natural Language Inference Transfer Learning in a Multi-Task Contract Dataset : In the Case of ContractNLI: a Document Information Extraction System

This thesis investigates whether supervised fine-tuning on general-domain Natural Language Inference (NLI) data improves NLI classification for legal contracts, using ContractNLI and Span NLI BERT (Koreeda and Manning, 2021), a multi-task document information extraction dataset and framework, as the test case. Annotated datasets for specific professional domains are scarce because they are costly in time and labour to create. Since NLI is a simple yet effective task for inducing and evaluating natural language understanding (NLU) abilities in language models, abundant general-domain NLI datasets could be leveraged to aid information extraction and classification for legal contracts. This work evaluates the impact of transfer learning from the general-domain Adversarial NLI dataset (Nie et al., 2020) to ContractNLI, via sequential and mixed-batch fine-tuning. The study also investigates how the model’s evidence identification component affects NLI, by fine-tuning on the Contract Understanding Atticus Dataset (Hendrycks et al., 2021). The results highlight the benefits of fine-tuning with general-domain NLI data, particularly for hypotheses in the target task with balanced entailment and contradiction training examples. In addition, the study demonstrates the reciprocal relationship between evidence identification and NLI classification, where improvements in the former enhance the accuracy of the latter. With NLI increasingly applied to information extraction in specialised domains, this work sheds light on how existing general-domain NLI resources can improve classification performance in specific domains.

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-512897
Date: January 2023
Creators: Tang, Yiu Kei
Publisher: Uppsala universitet, Institutionen för lingvistik och filologi
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
