Text Mining Methods for Biomedical Data Analysis / Text Mining Metoder för Biomedicinsk Data Analys

Topic modeling of biological data has become a prevalent research area in recent years. However, analysing countless research papers and reaching a consensus view of biomedicine is a near-impossible task for any researcher because of the complexity and quantity of the published material. This thesis focuses on two objectives that can help researchers in this domain, using data related to five major DNA repair pathways. The first objective is to propose an unsupervised approach to examine hidden structures and analyse research trends in temporal biomedical text data. The second objective is to find DNA repair markers involved in immune defense and to retrieve potential PPIs, GIs, and disease-gene associations reported in the literature. We used Latent Dirichlet Allocation (LDA) to discover hidden themes and semantically coherent topics in the text. We clustered the documents based on the LDA topic models to analyse research trends and used the Mann-Kendall test to assess the trends of the topics. A hybrid of text mining methods with a classical co-occurrence statistical approach and association rule mining was used to discover potential PPIs, GIs, and disease-gene associations in the text. The results for PPIs and GIs were then evaluated against an external biological database of PPIs.
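
As an illustration of the trend-testing step named in the abstract, the following is a minimal sketch of the standard Mann-Kendall test in plain Python. It is not taken from the thesis: the function name `mann_kendall`, the normal approximation with tie correction, and the yearly topic-share values are all assumptions made for this example.

```python
import math
from collections import Counter


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for a monotonic-trend test.

    Hypothetical helper for illustration; uses the normal approximation
    with the usual correction for tied values.
    """
    n = len(series)
    if n < 3:
        raise ValueError("need at least 3 observations")

    # S counts later-minus-earlier pairs: +1 for increases, -1 for decreases.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S under the no-trend hypothesis, corrected for ties.
    tie_term = sum(t * (t - 1) * (2 * t + 5)
                   for t in Counter(series).values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0

    # Continuity-corrected standard score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


# Made-up yearly share of documents assigned to one topic.
topic_share = [0.02, 0.03, 0.03, 0.05, 0.06, 0.08, 0.09, 0.11]
s, z, p = mann_kendall(topic_share)
print(f"S={s}, Z={z:.3f}, p={p:.4f}")
```

A positive S with a small p-value would suggest that the topic is gaining attention over time, and a negative S that it is declining. The significance threshold is left to the reader.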

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-176615
Date: January 2021
Creators: Jabeen, Rakhshanda
Publisher: Linköpings universitet, Statistik och maskininlärning, rakhshanda.jbn@gmail.com
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
