
Building a Trustworthy Question Answering System for Covid-19 Tracking

During the unprecedented global Covid-19 pandemic, the general public is suffering from inaccurate Covid-19 information, including outdated information and fake news. The most widely used media (TV, social media, newspapers, and radio) fall short in providing the certainty and rapid updates that people are seeking. To cope with this challenge, several public data resources dedicated to providing Covid-19 information have emerged. They have rallied experts from different fields to provide authoritative and up-to-date pandemic updates. However, the general public still cannot make full use of such resources because the learning curve is too steep, especially for elderly and underage users.

To address this problem, in this thesis we propose a question answering system that users can interact with through simple natural-language sentences. While building this system, we investigate qualified public data resources and, from the data content they provide, collect a set of frequently asked questions for Covid-19 tracking. We further build a dedicated dataset named CovidQA for evaluating the performance of the question answering system with different models. Based on the new dataset, we assess multiple machine learning-based models that are built for retrieving relevant information from databases, and then propose two empirical models that use pre-defined templates to generate SQL queries. In our experiments, we present both quantitative and qualitative results and provide a comprehensive comparison between different types of methods. The results show that the proposed template-based methods are simple but effective in building question answering systems for domain-specific problems.

Master of Science

During the unprecedented global Covid-19 pandemic, the general public is suffering from inaccurate Covid-19 information, including outdated information and fake news. The most widely used media (TV, social media, newspapers, and radio) fall short in providing the certainty and rapid updates that people are seeking. To cope with this challenge, several public data resources dedicated to providing Covid-19 information have emerged. They have rallied experts from different fields to provide authoritative and up-to-date pandemic updates. However, there is room for improvement in terms of user experience.

To address this problem, in this thesis we propose a system that users can interact with by asking questions in natural language. While building this system, we evaluate and choose six qualified public data providers as the data sources. We further build a testing dataset for evaluating the performance of the system. We assess two Artificial Intelligence-powered models for the system, and then propose two rule-based models for the problem. In our experiments, we provide a comprehensive comparison between different types of methods. The results show that the proposed rule-based methods are simple but effective in building such systems.

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/113956
Date: 02 September 2021
Creators: Liu, Yiqing
Contributors: Computer Science, Reddy, Chandan K., Shaffer, Clifford A., Lu, Chang Tien
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Detected Language: English
Type: Thesis
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
