Return to search

Transforming Legal Entity Recognition

Transformer-based architectures have in recent years advanced state-of-the-art performance in Natural Language Processing. Researchers have successfully adapted such models to downstream tasks within NLP in a domain-specific setting. This thesis examines the application of these models to the legal domain by doing Named Entity Recognition (NER) in a setting of scarce training data. Three different pre-trained BERT models are fine-tuned on a set of 101 court case documents, whereof one model is pre-trained on legal corpora and the other two on general corpora. Experiments are run to evaluate the models’ predictive performance given smaller or larger quantities of data to fine-tune on. Results show that BERT models work reasonably well for NER with legal data. Unlike many other domain-specific BERT models, the BERT model trained on legal corpora does not outperform the base models. Modest amounts of annotated data seem sufficient for reasonably good performance.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-447240
Date January 2021
CreatorsAndersson-Säll, Tim
PublisherUppsala universitet, Statistiska institutionen
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0029 seconds