We present a study on the automatic classification
of speech acts in the domain of political communication, based
on J. R. Searle's classification of illocutionary acts. Our research
involves creating a dataset using the US State of the Union corpus
and the UN General Debate corpus (UNGD) as data sources.
To overcome limited labelled data, we employ a combination
of weak supervision and active learning techniques for dataset
creation and model training. Through various experiments, we
investigate the influence of external and internal factors on speech
act classification. In addition, we discuss the potential for further
analysis of speech act usage, using the trained model on the
UNGD corpus. The findings demonstrate the effectiveness of
Transformer-based models for automatic speech act classification,
highlight the benefits of weak supervision and active learning
for dataset creation and model training, and underscore the
potential for large-scale statistical analysis of speech act usage in
the domain of political communication.
Identifier | oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:92378 |
Date | 28 June 2024 |
Creators | Burghardt, Manuel; Niekler, Andreas; Kantner, Cathleen; Schmidt, Klaus |
Publisher | Polish Information Processing Society |
Source Sets | Hochschulschriftenserver (HSSS) der SLUB Dresden |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/publishedVersion, doc-type:conferenceObject, info:eu-repo/semantics/conferenceObject, doc-type:Text |
Rights | info:eu-repo/semantics/openAccess |
Relation | http://doi.org/10.15439/2023F3485 |