Entity framing and role portrayal in the news

  • Tarek Mahmoud
  • Zhuohan Xie
  • Dimitar Dimitrov
  • Nikolaos Nikolaidis
  • Purificação Silvano
  • Roman Yangarber
  • Shivam Sharma
  • Elisa Sartori
  • Nicolas Stefanovitch
  • Giovanni Da San Martino
  • Jakub Piskorski
  • Preslav Nakov

Research output: Chapter in book/report/conference proceedings, conference contribution, scientific, peer-reviewed

Abstract

We introduce a novel multilingual and hierarchical corpus annotated for entity framing and role portrayal in news articles. The dataset uses a unique taxonomy inspired by storytelling elements, comprising 22 fine-grained roles, or archetypes, nested within three main categories: protagonist, antagonist, and innocent. Each archetype is carefully defined, capturing nuanced portrayals of entities such as guardian, martyr, and underdog for protagonists; tyrant, deceiver, and bigot for antagonists; and victim, scapegoat, and exploited for innocents. The dataset includes 1,378 recent news articles in five languages (Bulgarian, English, Hindi, European Portuguese, and Russian) focusing on two critical domains of global significance: the Ukraine-Russia War and Climate Change. Over 5,800 entity mentions have been annotated with role labels. This dataset serves as a valuable resource for research into role portrayal and has broader implications for news analysis. We describe the characteristics of the dataset and the annotation process, and we report evaluation results on fine-tuned state-of-the-art multilingual transformers and hierarchical zero-shot learning using LLMs at the level of a document, a paragraph, and a sentence.
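
The two-level taxonomy described above can be sketched as a simple mapping. This is an illustrative sketch only, not the authors' data format: it contains just the nine archetypes named in the abstract (the remaining fine-grained roles of the 22 are not listed here), and the names `TAXONOMY`, `ROLE_TO_CATEGORY`, and `main_category` are hypothetical.

```python
# Partial taxonomy: only the nine archetypes named in the abstract.
# The full taxonomy has 22 fine-grained roles; the others are not listed
# in this record, so they are deliberately left out rather than guessed.
TAXONOMY = {
    "protagonist": ["guardian", "martyr", "underdog"],
    "antagonist": ["tyrant", "deceiver", "bigot"],
    "innocent": ["victim", "scapegoat", "exploited"],
}

# Reverse index from a fine-grained role to its main category.
ROLE_TO_CATEGORY = {
    role: category for category, roles in TAXONOMY.items() for role in roles
}


def main_category(role: str) -> str:
    """Return the main category for a fine-grained role (case-insensitive)."""
    key = role.strip().lower()
    if key not in ROLE_TO_CATEGORY:
        raise KeyError(f"role not in this partial taxonomy: {role!r}")
    return ROLE_TO_CATEGORY[key]
```

A lookup of this kind is one plausible way to derive a main-category label from a fine-grained one when working hierarchically; the actual evaluation setup is described in the paper itself.
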
Original language: English
Title of host publication: Findings of the Association for Computational Linguistics: ACL 2025
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Number of pages: 25
Place of publication: Kerrville
Publisher: Association for Computational Linguistics (ACL)
Publication date: July 2025
Pages: 302-326
ISBN (electronic): 979-8-89176-256-5
DOI
Status: Published - July 2025
MoE publication type: A4 Article in conference proceedings
Event: Annual Meeting of the Association for Computational Linguistics - Vienna, Austria
Duration: 27 July 2025 to 1 Aug 2025
Conference number: 63

Fields of science

  • 6121 Languages
  • 113 Computer and information sciences
