
Joint Biomedical Event Extraction and Entity Linking via Iterative Collaborative Training

Biomedical entity linking and event extraction are two crucial tasks for text understanding and retrieval in the biomedical domain. The two tasks intrinsically benefit each other. Entity linking disambiguates biomedical concepts by referring to external knowledge bases, and this domain knowledge provides additional clues for understanding and extracting biological processes. Event extraction identifies a key trigger and the entities involved in each biological process, and the structural context it captures helps disambiguate the biomedical entities. However, previous research typically solves these two tasks separately or in a pipeline, which leads to error propagation. Solving them together is even more challenging because no existing dataset contains annotations for both tasks. To address these challenges, we propose joint biomedical entity linking and event extraction that regards the event structures and the entity references in knowledge bases as latent variables and updates the two task-specific models in an iterative training framework: (1) predicting the missing variables for each partially annotated dataset with the current two task-specific models, and (2) updating the parameters of each model on the corresponding pseudo-completed dataset. Experimental results on two benchmark datasets, Genia 2011 for event extraction and BC4GO for entity linking, show that our joint framework significantly improves the model for each individual task and outperforms strong baselines for both tasks. We will make the code and model checkpoints publicly available once the paper is accepted.

M.S.

Biomedical entity linking and event extraction are essential tasks in understanding and retrieving information from biomedical texts. These tasks mutually benefit each other: entity linking helps disambiguate biomedical concepts by leveraging external knowledge bases, and the resulting domain knowledge provides valuable insights for understanding and extracting biological processes. Event extraction, in turn, identifies the triggers and entities involved in describing biological processes, capturing their contextual relationships for improved entity disambiguation. However, existing approaches often address these tasks separately or sequentially, which leads to error propagation. A joint solution is made even more challenging by the lack of datasets with annotations for both tasks.

To overcome these challenges, we propose a novel approach for jointly performing biomedical entity linking and event extraction. Our method treats the event structures and entity references in knowledge bases as latent variables and employs an iterative training framework. This framework involves predicting missing variables in partially annotated datasets based on the current task-specific models and updating the model parameters using the completed datasets. Experimental results on benchmark datasets, namely Genia 2011 for event extraction and BC4GO for entity linking, demonstrate the effectiveness of our joint framework. It significantly improves the performance of each individual task and outperforms strong baselines for both tasks.
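
The two-step loop described above can be illustrated with a minimal sketch. It assumes toy count-based classifiers in place of the thesis's actual task-specific models, which the abstract does not specify. The names `CountModel` and `collaborative_training`, the `rounds` parameter, and the sample data are hypothetical and serve only to show the control flow: each dataset carries gold labels for one task, the other task's model fills in the missing variable, and each model is then refit on its pseudo-completed dataset.

```python
from collections import Counter, defaultdict


class CountModel:
    """Toy stand-in for a task-specific model (an assumption, not the thesis's model).

    Predicts the most frequent label seen for (text, other-task label),
    backing off to the text alone, then to the overall majority label.
    """

    def __init__(self):
        self.full = defaultdict(Counter)     # (x, other) -> label counts
        self.backoff = defaultdict(Counter)  # x -> label counts
        self.prior = Counter()               # label counts

    def fit(self, rows):
        # Refit from scratch on rows of (x, other-task label, target label).
        self.full.clear()
        self.backoff.clear()
        self.prior.clear()
        for x, other, y in rows:
            self.full[(x, other)][y] += 1
            self.backoff[x][y] += 1
            self.prior[y] += 1

    def predict(self, x, other):
        for table, key in ((self.full, (x, other)), (self.backoff, x)):
            if key in table:
                return table[key].most_common(1)[0][0]
        return self.prior.most_common(1)[0][0] if self.prior else None


def collaborative_training(event_data, link_data, rounds=3):
    """event_data has gold 'event' labels only; link_data has gold 'entity' labels only."""
    event_data = [dict(d) for d in event_data]  # copy so the inputs are not mutated
    link_data = [dict(d) for d in link_data]
    event_model, link_model = CountModel(), CountModel()

    # Warm start: train each model on its own gold labels, other variable unknown.
    event_model.fit([(d["x"], None, d["event"]) for d in event_data])
    link_model.fit([(d["x"], None, d["entity"]) for d in link_data])

    for _ in range(rounds):
        # Step 1: predict the missing variable in each partially annotated dataset.
        for d in event_data:
            d["entity"] = link_model.predict(d["x"], d["event"])
        for d in link_data:
            d["event"] = event_model.predict(d["x"], d["entity"])
        # Step 2: update each model on its pseudo-completed dataset.
        event_model.fit([(d["x"], d["entity"], d["event"]) for d in event_data])
        link_model.fit([(d["x"], d["event"], d["entity"]) for d in link_data])
    return event_model, link_model


# Illustrative toy data only; not drawn from Genia 2011 or BC4GO.
events = [{"x": "expression", "event": "Gene_expression"},
          {"x": "binds", "event": "Binding"}]
links = [{"x": "expression", "entity": "GO:0010467"},
         {"x": "binds", "entity": "GO:0005488"}]
event_model, link_model = collaborative_training(events, links)
```

After training, `event_model.predict("binds", "GO:0005488")` conditions the event prediction on a linked entity, which is the kind of cross-task signal the framework is meant to exploit. The count-based model is chosen only to keep the sketch self-contained.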

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/115831
Date: 05 1900
Creators: Li, Xiaochu
Contributors: Computer Sciences, Huang, Lifu, Reddy, Chandan, Zhang, Liqing
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Language: English
Detected Language: English
Type: Thesis
Format: ETD, application/pdf, application/pdf
