Molecular medicine encompasses the application of molecular biology techniques and knowledge to the prevention, diagnosis and treatment of diseases and disorders. Statistical and computational models can predict clinical outcomes, such as prognosis or response to treatment, based on the results of molecular assays. For advances in molecular medicine to translate into clinical results, clinicians and translational researchers need to have up-to-date access to high-quality predictive models. The large number of such models reported in the literature is growing at a pace that overwhelms the human ability to manually assimilate this information. Therefore the important problem of retrieving and organizing the vast amount of published information within this domain needs to be addressed. The inherent complexity of this domain and the fast pace of scientific discovery make this problem particularly challenging.
This dissertation describes a framework for the retrieval and organization of clinical bioinformatics predictive models. A semantic analysis of this domain informed the design of the framework. Specifically, it allowed the development of a specialized annotation scheme for published articles that supports meaningful organization, indexing, and efficient retrieval. This annotation scheme was codified in an annotation form and an accompanying guidelines document, which multiple human experts used to annotate over 1000 articles. The resulting datasets were then used to train and test support vector machine (SVM) classifiers. The classifiers were designed to provide a scalable mechanism for replicating human experts' ability (1) to retrieve relevant MEDLINE articles and (2) to annotate these articles using the specialized annotation scheme. The classifiers showed very good predictive ability, which generalized to different disease domains and to datasets annotated by independent experts. The experiments highlighted the need for unambiguous operational definitions of the complex concepts used for semantic annotation. The impact of the semantic definitions on the quality of manual annotations and on the performance of the machine learning classifiers is discussed.
Identifier | oai:union.ndltd.org:VANDERBILT/oai:VANDERBILTETD:etd-03282011-223440 |
Date | 16 April 2011 |
Creators | Wehbe, Firas Hazem |
Contributors | Cynthia S. Gadd, Constantin Aliferis, Steven H. Brown, Daniel R. Masys, Pierre Massion, Hua Xu |
Publisher | VANDERBILT |
Source Sets | Vanderbilt University Theses |
Language | English |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://etd.library.vanderbilt.edu/available/etd-03282011-223440/ |
Rights | restricted, I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to Vanderbilt University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. |