In this work, we leverage the recent developments in computer science to address several of the challenges in current literature-based discovery (LBD) solutions. First, LBD solutions cannot use semantics or are too computational complex. To solve the problems we propose a generative model OverlapLDA based on topic modeling, which has been shown both effective and efficient in extracting semantics from a corpus. We also introduce an inference method of OverlapLDA. We conduct extensive experiments to show the effectiveness and efficiency of OverlapLDA in LBD. Second, we expand LBD to a more complex and realistic setting. The settings are that there can be more than one concept connecting the input concepts, and the connectivity pattern between concepts can also be more complex than a chain. Current LBD solutions can hardly complete the LBD task in the new setting. We simplify the hypotheses as concept sets and propose LBDSetNet based on graph neural networks to solve this problem. We also introduce different training schemes based on self-supervised learning to train LBDSetNet without relying on comprehensive labeled hypotheses that are extremely costly to get. Our comprehensive experiments show that LBDSetNet outperforms strong baselines on simple hypotheses and addresses complex hypotheses.
Identifer | oai:union.ndltd.org:unt.edu/info:ark/67531/metadc1944357 |
Date | 05 1900 |
Creators | Ding, Juncheng |
Contributors | Bryant, Barrett, Chen, Jiangping, Ding, Junhua, Guo, Xuan, Nielsen, Rodney |
Publisher | University of North Texas |
Source Sets | University of North Texas |
Language | English |
Detected Language | English |
Type | Thesis or Dissertation |
Format | Text |
Rights | Public, Ding, Juncheng, Copyright, Copyright is held by the author, unless otherwise noted. All rights Reserved. |
Page generated in 0.0012 seconds