Global ETD Search

Improving Performance of Biomedical Information Retrieval using Document-Level Field Boosting and BM25F Weighting

Corpora of biomedical information typically contain large amounts of ambiguous data, because proteins and genes can be referred to by several different terms, which makes information retrieval difficult. This thesis investigates several methods for increasing the precision and recall of searches within the biomedical domain, including scoring documents with the BM25F model and using Named Entity Recognition (NER) to identify biomedical entities in the text. We implemented a prototype to test these approaches and found that combining several methods, including using three different NER models at once, yields a significant increase (up to 11.5%) in mean average precision (MAP) over our baseline result.
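For readers unfamiliar with BM25F, the following is a minimal sketch of field-weighted scoring in its commonly described simple form, not the thesis's actual implementation. The field names (`title`, `body`), the boost values, the parameters `k1` and `b`, the smoothed IDF variant, and the helper `collection_stats` are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter


def collection_stats(docs, fields):
    """Return (average length per field, document frequency per term)."""
    avg_len = {f: sum(len(d.get(f, [])) for d in docs) / len(docs) for f in fields}
    doc_freq = Counter()
    for d in docs:
        seen = set()
        for f in fields:
            seen.update(d.get(f, []))
        doc_freq.update(seen)  # each term counted once per document
    return avg_len, doc_freq


def bm25f_score(query_terms, doc, field_weights, avg_len, doc_freq, n_docs,
                k1=1.2, b=0.75):
    """Score one document; `doc` maps a field name to a list of tokens."""
    score = 0.0
    for term in query_terms:
        # Combine per-field term frequencies into one weighted pseudo-frequency,
        # normalising each field by its length relative to the field average.
        pseudo_tf = 0.0
        for field, weight in field_weights.items():
            tokens = doc.get(field, [])
            tf = tokens.count(term)
            if tf == 0:
                continue
            norm = 1.0 - b + b * len(tokens) / avg_len[field]
            pseudo_tf += weight * tf / norm
        if pseudo_tf == 0.0:
            continue
        df = doc_freq.get(term, 0)
        # Smoothed IDF that stays non-negative (an assumed variant).
        idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
        # Saturation is applied once, after the fields have been combined.
        score += idf * pseudo_tf / (k1 + pseudo_tf)
    return score


# Hypothetical two-document collection: the query term appears in the
# title of the first document and only in the body of the second.
docs = [
    {"title": ["brca1", "mutation"], "body": ["breast", "cancer", "risk", "study"]},
    {"title": ["cancer", "risk", "study"], "body": ["brca1", "was", "mentioned", "once"]},
]
weights = {"title": 3.0, "body": 1.0}  # assumed field boosts
avg_len, doc_freq = collection_stats(docs, weights.keys())
scores = [bm25f_score(["brca1"], d, weights, avg_len, doc_freq, len(docs)) for d in docs]
```

With these assumed boosts, the document that matches in its title scores higher than the one that matches only in its body, which is the effect field boosting is meant to produce.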

ntnudaim:4443

MIT informatikk

Informasjonsforvaltning

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:ntnu-11979
Date	January 2010
Creators	Jervidalo, Jørgen
Publisher	Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
