Return to search

CASSANDRA: drug gene association prediction via text mining and ontologies

The amount of biomedical literature has been increasing rapidly during the last decade. Text mining techniques can harness this large-scale data, shed light onto complex drug mechanisms, and extract relation information that can support computational polypharmacology. In this work, we introduce CASSANDRA, a fully corpus-based and unsupervised algorithm which uses the MEDLINE indexed titles and abstracts to infer drug gene associations and assist drug repositioning. CASSANDRA measures the Pointwise Mutual Information (PMI) between biomedical terms derived from Gene Ontology (GO) and Medical Subject Headings (MeSH). Based on the PMI scores, drug and gene profiles are generated and candidate drug gene associations are inferred when computing the relatedness of their profiles.

Results show that an Area Under the Curve (AUC) of up to 0.88 can be achieved. The algorithm can successfully identify direct drug gene associations with high precision and prioritize them over indirect drug gene associations. Validation shows that the statistically derived profiles from literature perform as good as (and at times better than) the manually curated profiles.

In addition, we examine CASSANDRA’s potential towards drug repositioning. For all FDA-approved drugs repositioned over the last 5 years, we generate profiles from publications before 2009 and show that the new indications rank high in these profiles. In summary, co-occurrence based profiles derived from the biomedical literature can accurately predict drug gene associations and provide insights onto potential repositioning cases.

Identiferoai:union.ndltd.org:DRESDEN/oai:qucosa.de:bsz:14-qucosa-159884
Date28 January 2015
CreatorsKissa, Maria
ContributorsTechnische Universität Dresden, Fakultät Informatik, Prof. Dr. Michael Schroeder, PhD Nigam H. Shah
PublisherSaechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden
Source SetsHochschulschriftenserver (HSSS) der SLUB Dresden
LanguageEnglish
Detected LanguageEnglish
Typedoc-type:doctoralThesis
Formatapplication/pdf

Page generated in 0.0022 seconds