Return to search

A Framework For Ranking And Categorizing Medical Documents

In this dissertation, we present a framework to enhance the retrieval, ranking, and categorization of text documents in medical domain. The contributions of this study are the introduction of a similarity model to retrieve and rank medical textdocuments and the introduction of rule-based categorization method based on lexical syntactic patterns features. We formulate the similarity model by combining three features to model the relationship among document and construct a document network. We aim to rank retrieved documents according to their topics / making highly relevant document on the top of the hit-list. We have applied this model on OHSUMED collection (TREC-9) in order to demonstrate the performance effectiveness in terms of topical ranking, recall, and precision metrics.

In addition, we introduce ROLEX-SP (Rules Of LEXical Syntactic Patterns) / a method for the automatic induction of rule-based text-classifiers relies on lexical syntactic patterns as a set of features to categorize text-documents. The proposed method is dedicated to solve the problem of multi-class classification and feature imbalance problems in domain specific text documents. Furthermore, our proposed
method is able to categorize documents according to a predefined set of characteristics such as: user-specific, domain-specific, and query-based categorization which facilitates browsing documents in search-engines and increase
users ability to choose among relevant documents. To demonstrate the applicability of ROLEX-SP, we have performed experiments on OHSUMED (categorization
collection). The results indicate that ROLEX-SP outperforms state-of-the-art methods in categorizing short-text medical documents.

Identiferoai:union.ndltd.org:METU/oai:etd.lib.metu.edu.tr:http://etd.lib.metu.edu.tr/upload/2/12611996/index.pdf
Date01 June 2010
CreatorsAl Zamil, Mohammed Gh. I.
ContributorsBetin Can, Aysu
PublisherMETU
Source SetsMiddle East Technical Univ.
LanguageEnglish
Detected LanguageEnglish
TypePh.D. Thesis
Formattext/pdf
RightsTo liberate the content for METU campus

Page generated in 0.0095 seconds