M3D: Multimodal MultiDocument Fine-Grained Inconsistency Detection

Validating claims from misinformation is a highly challenging task that involves understanding how each factual assertion within the claim relates to a set of trusted source materials. Existing approaches often make coarse-grained predictions but fail to identify the specific aspects of the claim that are troublesome and the specific evidence relied upon. In this paper, we introduce a method and new benchmark for this challenging task. Our method predicts the fine-grained logical relationship of each aspect of the claim from a set of multimodal documents, which include text, image(s), video(s), and audio(s). We also introduce a new benchmark (M^3DC) of claims requiring multimodal multidocument reasoning, which we construct using a novel claim synthesis technique. Experiments show that our approach significantly outperforms state-of-the-art baselines on this challenging task on two benchmarks while providing finer-grained predictions, explanations, and evidence.

Master of Science

In today's world, we are constantly bombarded with information from various sources, making it difficult to distinguish between what is true and what is false. Validating claims and determining their truthfulness is an essential task that helps us separate facts from fiction, but it can be a time-consuming and challenging process. Current methods often fail to pinpoint the specific parts of a claim that are problematic and the evidence used to support or refute them.

In this study, we present a new method and benchmark for fact-checking claims using multiple types of information sources, including text, images, videos, and audio. Our approach analyzes each aspect of a claim and predicts how it logically relates to the available evidence from these diverse sources. This allows us to provide more detailed and accurate assessments of the claim's validity. We also introduce a new benchmark dataset called M^3DC, which consists of claims that require reasoning across multiple sources and types of information. To create this dataset, we developed a novel technique for synthesizing claims that mimic real-world scenarios. Our experiments show that our method significantly outperforms existing state-of-the-art approaches on two benchmarks while providing more fine-grained predictions, explanations, and evidence. This research contributes to the ongoing effort to combat misinformation and fake news by providing a more comprehensive and effective approach to fact-checking claims.

multi-modality reasoning

fine-grained reasoning

multi-document understanding

Identifier	oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/119382
Date	10 June 2024
Creators	Tang, Chia-Wei
Contributors	Computer Science & Applications, Thomas, Christopher Lee, Lourentzou, Ismini, Huang, Lifu
Publisher	Virginia Tech
Source Sets	Virginia Tech Theses and Dissertation
Language	English
Detected Language	English
Type	Thesis
Format	ETD, application/pdf
Rights	In Copyright, http://rightsstatements.org/vocab/InC/1.0/
