
Leveraging Linguistic Insights for Uncertainty Calibration of ChatGPT and Evaluating Crowdsourced Annotations

Venkata Divya Sree Pulipati (18469230), 09 July 2024
The quality of crowdsourced annotations has always been a challenge because of variability in annotators' backgrounds, task complexity, the subjective nature of many labeling tasks, and other factors. It is therefore crucial to evaluate these annotations to ensure their reliability. Traditionally, human experts evaluate the quality of crowdsourced annotations, but this approach has its own challenges. This paper proposes leveraging large language models such as ChatGPT-4 to evaluate an existing crowdsourced dataset, MAVEN, and explores their potential as an alternative solution. However, due to the stochastic nature of LLMs, it is important to discern when to trust and when to question LLM responses. To address this, we introduce a novel approach that applies Rubin's framework to identify linguistic cues within LLM responses and use them as indicators of the LLM's certainty level. Our findings reveal that ChatGPT-4 successfully identified 63% of the incorrect labels, highlighting the potential for improving label quality through human-AI collaboration on these identified inaccuracies. This study underscores the promising role of LLMs in evaluating crowdsourced data annotations, offering a way to enhance the accuracy and fairness of crowdsourced annotations while saving time and cost.
