Return to search

Context specific text mining for annotating protein interactions with experimental evidence

Indiana University-Purdue University Indianapolis (IUPUI) / Proteins are the building blocks in a biological system. They interact with other proteins to make unique biological phenomenon. Protein-protein interactions play a valuable role in understanding the molecular mechanisms occurring in any biological system. Protein interaction databases are a rich source on protein interaction related information. They gather large amounts of information from published literature to enrich their data. Expert curators put in most of these efforts manually. The amount of accessible and publicly available literature is growing very rapidly. Manual annotation is a time consuming process. And with the rate at which available information is growing, it cannot be dealt with only manual curation. There need to be tools to process this huge amounts of data to bring out valuable gist than can help curators proceed faster. In case of extracting protein-protein interaction evidences from literature, just a mere mention of a certain protein by look-up approaches cannot help validate the interaction. Supporting protein interaction information with experimental evidence can help this cause. In this study, we are applying machine learning based classification techniques to classify and given protein interaction related document into an interaction detection method. We use biological attributes and experimental factors, different combination of which define any particular interaction detection method. Then using predicted detection methods, proteins identified using named entity recognition techniques and decomposing the parts-of-speech composition we search for sentences with experimental evidence for a protein-protein interaction. We report an accuracy of 75.1% with a F-score of 47.6% on a dataset containing 2035 training documents and 300 test documents.

Identiferoai:union.ndltd.org:IUPUI/oai:scholarworks.iupui.edu:1805/3809
Date03 January 2014
CreatorsPandit, Yogesh
ContributorsPalakal, Mathew J., Liu, Yunlong, Liu, Xiaowen
Source SetsIndiana University-Purdue University Indianapolis
Languageen_US
Detected LanguageEnglish
TypeThesis

Page generated in 0.0015 seconds