<p>As Natural Language Processing systems converge on a high percentage of successfully deep-parsed text, parse success alone is an incomplete measure of the "intelligence" exhibited by a system. Because systems apply different grammars, dictionaries and programming languages, the internal representation of parsed text often differs from system to system, making it difficult to compare performance and to exchange useful data such as tagged corpora or semantic interpretations. This report describes how semantically annotated corpora can be used to measure the quality of Natural Language Processing systems. A selected corpus produced by the GENIA project (event-annotated abstracts from MEDLINE) was used as the "gold standard". Because this corpus was small (19 abstracts), manual methods were used to produce a mapping from the native GeneTUC knowledge format (TQL) to GENIA events. This mapping was then used to evaluate events in GeneTUC. We attained a recall of 67% and an average precision of 33% on the training data. On test data, the recall was 28% and the average precision 21%. These results suggest that the mapping is inadequate. Because events are a new "feature" in NLP applications, there are no large corpora that can be used for automated rule learning. We conclude that at least a partial mapping from TQL to GENIA events exists, and that larger corpora and AI methods should be applied to refine the mapping rules. In addition, we discovered that this mapping can be of use for extracting protein-protein interactions.</p>
Identifier | oai:union.ndltd.org:UPSALLA/oai:DiVA.org:ntnu-10061 |
Date | January 2006 |
Creators | Søvik, Harald |
Publisher | Norwegian University of Science and Technology, Department of Computer and Information Science, Institutt for datateknikk og informasjonsvitenskap |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, text |