
SemEval-2024 Task 6: SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes

Research output: Chapter in book/report/conference proceeding › Conference contribution › Scientific › Peer-reviewed

Abstract

This paper presents the results of SHROOM, a shared task focused on detecting hallucinations: outputs from natural language generation (NLG) systems that are fluent, yet inaccurate. Such cases of overgeneration put in jeopardy many NLG applications, where correctness is often mission-critical. The shared task was conducted with a newly constructed dataset of 4000 model outputs labeled by 5 annotators each, spanning 3 NLP tasks: machine translation, paraphrase generation and definition modeling. The shared task was tackled by a total of 58 different users grouped in 42 teams, out of which 26 elected to write a system description paper; collectively, they submitted over 300 prediction sets on both tracks of the shared task. We observe a number of key trends in how this task was tackled: many participants rely on a handful of models, and often rely either on synthetic data for fine-tuning or on zero-shot prompting strategies. While a majority of the teams did outperform our proposed baseline system, the performances of top-scoring systems are still consistent with a random handling of the more challenging items.
Original language: English
Title of host publication: Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Editors: Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Number of pages: 15
Place of publication: Stroudsburg
Publisher: The Association for Computational Linguistics
Publication date: 1 June 2024
Pages: 1979-1993
ISBN (electronic): 979-8-89176-107-0
DOI
Status: Published - 1 June 2024
MoE publication type: A4 Article in conference proceedings
Event: International Workshop on Semantic Evaluation - Mexico City, Mexico
Duration: 20 June 2024 to 21 June 2024
Conference number: 18

Fields of science

  • 6121 Languages
  • 113 Computer and information sciences
