Return to search

Corpus construction based on Ontological domain knowledge

The purpose of this thesis is to contribute a corpus for sentence level interpretation of biomedical language. The available corpora for the biomedical domain are small in terms of amount of text and predicates. Besides that these corpora are developed rather intuitively. In this effort which we call BioOntoFN, we created a corpus from the domain knowledge provided by an ontology. By doing this we believe that we can provide a rough set of rules to create corpora from ontologies. Besides that we also designed an annotation tool specifically for building our corpus. We built a corpus for biological transport events. The ontology we used is the piece of Gene Ontology pertaining to transport, the term transport GO: 0006810 and all of its child concepts, which could be called a sub-ontology. The annotation of the corpus follows the rules of FrameNet and the output is annotated text that is in an XML format similar to that of FrameNet. The text for the corpus is taken from abstracts of MEDLINE articles. The annotation tool is a GUI created using Java.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-71851
Date January 2011
CreatorsBenis, Nirupama, Kaliyaperumal, Rajaram
PublisherLinköpings universitet, Institutionen för datavetenskap, Linköpings universitet, Institutionen för datavetenskap
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0019 seconds