1

Automated spatiotemporal and semantic information extraction for hazards

Wang, Wei 01 July 2014
This dissertation explores three research topics related to automated spatiotemporal and semantic information extraction about hazard events from Web news reports and other social media. The dissertation makes a unique contribution by bridging geographic information science, geographic information retrieval, and natural language processing. Geographic information retrieval and natural language processing techniques are applied to extract spatiotemporal and semantic information automatically from Web documents, in order to retrieve information about patterns of hazard events that are not explicitly described in the texts. Chapters 2, 3, and 4 can be regarded as three standalone journal papers. The research topics covered by the three chapters are related to each other and are presented sequentially. Chapter 2 begins with an investigation of methods for automatically extracting spatial and temporal information about hazards from Web news reports. A set of rules is developed to combine the spatial and temporal information contained in the reports, based on how this information is presented in text, in order to capture the dynamics of hazard events (e.g., changes in event locations, new events occurring) as they occur over space and time. Chapter 3 presents an approach for retrieving semantic information about hazard events using ontologies and semantic gazetteers. With this work, information on the different kinds of events (e.g., impact, response, or recovery events) can be extracted, as well as information about hazard events at different levels of detail. Using the methods presented in Chapters 2 and 3, an approach for automatically extracting spatial, temporal, and semantic information from tweets is discussed in Chapter 4. Four different elements of tweets are used for assigning appropriate spatial and temporal information to hazard events in tweets.
Because tweets carry shorter but more current information about hazards and how they are affecting a local area, key information about hazards can be retrieved from the spatiotemporal and semantic information extracted from them.
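As a rough illustration of the kind of rule described for Chapter 2 (combining spatial and temporal mentions based on how they appear in text), the sketch below pairs each place name with a date found in the same sentence and otherwise falls back to the most recent date mentioned earlier. It is not the dissertation's rule set: the toy gazetteer, its approximate coordinates, the ISO date format, and the function name `extract_events` are all assumptions made for this example.

```python
import re
from datetime import date

# Hypothetical toy gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "Iowa City": (41.66, -91.53),
    "Cedar Rapids": (41.98, -91.67),
}

# Assumption: dates appear in ISO form (YYYY-MM-DD); real reports need a richer parser.
DATE_RE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def extract_events(text):
    """Pair each place name with a date from the same sentence.

    If a sentence has no date, the most recent date seen in an
    earlier sentence is reused (or None if there is none yet).
    """
    events = []
    last_date = None
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        match = DATE_RE.search(sentence)
        if match:
            year, month, day = (int(part) for part in match.groups())
            last_date = date(year, month, day)
        for name, coords in GAZETTEER.items():
            if name in sentence:
                events.append((name, coords, last_date))
    return events


report = "Flooding hit Iowa City on 2008-06-13. Water later reached Cedar Rapids."
for place, coords, when in extract_events(report):
    print(place, coords, when)
```

The carry-forward rule is one simple way to track an event as its location changes across sentences; rules that depend on tense, temporal expressions such as "yesterday", or document structure would need considerably more machinery.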
2

Knowledge-Driven Methods for Geographic Information Extraction in the Biomedical Domain

January 2019
Accounting for over a third of all emerging and re-emerging infections, viruses represent a major public health threat, which researchers and epidemiologists across the world have been attempting to contain for decades. Recently, genomics-based surveillance of viruses through methods such as virus phylogeography has grown into a popular tool for infectious disease monitoring. When conducting such surveillance studies, researchers need to manually retrieve geographic metadata denoting the location of infected host (LOIH) of viruses from public sequence databases such as GenBank and from any publications related to their study. The large volume of semi-structured and unstructured information that must be reviewed, along with the ambiguity of geographic locations, makes this task especially challenging. Prior work has demonstrated that the majority of GenBank records lack sufficient geographic granularity concerning the LOIH of viruses. As a result, reviewing full-text publications is often necessary for in-depth analysis of virus migration, which can be a very time-consuming process. Moreover, integrating geographic metadata pertaining to the LOIH of viruses from different sources, including different fields in GenBank records as well as full-text publications, and normalizing the integrated metadata to unique identifiers for subsequent analysis are also challenging tasks that often require expert domain knowledge. Automated information extraction (IE) methods could therefore significantly accelerate this process, positively impacting public health research. However, very few research studies have attempted to use IE methods in this domain.
This work explores the use of novel knowledge-driven geographic IE heuristics for extracting, integrating, and normalizing the LOIH of viruses based on information available in GenBank and related publications; when evaluated on manually annotated test sets, the methods were found to have high accuracy and were shown to be adequate for addressing this challenging problem. It also presents GeoBoost, a pioneering software system for georeferencing GenBank records, as well as a large-scale database containing over two million virus GenBank records georeferenced using the algorithms introduced here. The methods, database, and software developed here could help support diverse public health domains focusing on sequence-informed virus surveillance, thereby enhancing existing platforms for controlling and containing disease outbreaks.
Dissertation/Thesis / Doctoral Dissertation Biomedical Informatics 2019
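To make the integration and normalization step concrete, the following minimal sketch takes candidate location strings gathered from several sources and returns the identifier of the most specific one that a gazetteer recognizes. It is not the GeoBoost algorithm: the gazetteer entries, the made-up identifiers (`loc:1` and so on), the numeric granularity levels, and the function name `normalize` are assumptions for illustration only.

```python
# Hypothetical toy gazetteer: lowercase name -> (identifier, granularity level).
# Level 0 = country, 1 = province or state, 2 = city. Identifiers are made up.
GAZETTEER = {
    "china": ("loc:1", 0),
    "guangdong": ("loc:2", 1),
    "shenzhen": ("loc:3", 2),
}


def normalize(candidates):
    """Return the identifier of the most specific known location.

    `candidates` holds raw location strings pooled from different
    sources (for example, record fields and publication text).
    Unknown strings are skipped; None is returned if nothing matches.
    """
    best = None
    for raw in candidates:
        entry = GAZETTEER.get(raw.strip().lower())
        if entry is not None and (best is None or entry[1] > best[1]):
            best = entry
    return best[0] if best else None


# A record field that only names the country, plus a city found in the paper.
print(normalize(["China", " Shenzhen ", "unknown host"]))
```

Preferring the finest granularity reflects the observation that database records often name only a country while the associated publication is more specific; a realistic system would also have to resolve ambiguous names and check that the candidates are mutually consistent.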
