JRC Publications Repository

Holistic Inter-Annotator Agreement and Corpus Coherence Estimation in a Large-scale Multilingual Annotation Campaign

In this paper, we report on the complexity of persuasion technique annotation in the context of a large annotation campaign involving 9 languages and approximately 40 annotators. We highlight the techniques that appear to be difficult for humans to annotate. We introduce HolisticIAA, a new word-embedding-based annotator agreement metric, and report on various experiments using this metric and its correlation with traditional Inter-Annotator Agreement (IAA) metrics. However, because interaction between annotators is limited and loose (only a few annotators annotate the same document subsets), we try to devise a way to assess the coherence of the entire dataset and strive to find a good proxy for IAA between annotators tasked with annotating different documents in different languages. We present our preliminary results on this research problem.
2024-10-30
Association for Computational Linguistics
JRC134232
https://aclanthology.org/2023.emnlp-main.6.pdf
https://publications.jrc.ec.europa.eu/repository/handle/JRC134232
Items published in the JRC Publications Repository are protected by copyright, with all rights reserved, unless otherwise indicated. Additional information: https://ec.europa.eu/info/legal-notice_en#copyright-notice