Emerging topic detection algorithms have the potential to assist researchers in maintaining awareness of current trends in biomedical fields--a feat not easily achieved with existing methods. Though topic detection algorithms for news-cycles exist, several aspects of this particular area make applying them directly to scientific literature problematic.
This dissertation offers a framework for emerging topic detection in biomedicine. The framework includes a novel set of weightings based on the historical importance of each topic identified. Features such as journal impact factor and funding data are used to develop a fitness score to identify which topics are likely to burst in the future. Characterization of bursts over an extended planning horizon by discipline was performed to understand what a typical burst trend looks like in this space to better understand how to identify important or emerging trends. Cluster analysis was used to create an overlapping hierarchical structure of scientific literature at the discipline level. This allows for granularity adjustment (e.g. discipline level or research area level) in emerging topic detection for different users. Using cluster analysis allows for the identification of terms that may not be included in annotated taxonomies, as they are new or not considered as relevant at the time the taxonomy was last updated. Weighting topics by historical frequency allows for better identification of bursts that are associated with less well-known areas, and therefore more surprising. The fitness score allows for the early identification of bursty terms. This framework will benefit policy makers, clinicians and researchers.
Identifer | oai:union.ndltd.org:uiowa.edu/oai:ir.uiowa.edu:etd-5559 |
Date | 01 December 2014 |
Creators | Madlock-Brown, Charisse Renee |
Contributors | Eichmann, David |
Publisher | University of Iowa |
Source Sets | University of Iowa |
Language | English |
Detected Language | English |
Type | dissertation |
Format | application/pdf |
Source | Theses and Dissertations |
Rights | Copyright 2014 Charisse Rene Madlock-Brown |
Page generated in 0.002 seconds