ELOQUENT CLEF Shared Tasks for Evaluation of Generative Language Model Quality

Jussi Jerker Karlgren, Luise Dürlich, Evangelia Gogoulou, Liane Guillou, Joakim Nivre, Magnus Sahlgren, Aarne Talman

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Scientific; peer-review

Abstract

ELOQUENT is a set of shared tasks for evaluating the quality and usefulness of generative language models. ELOQUENT aims to bring together some high-level quality criteria, grounded in experiences from deploying models in real-life tasks, and to formulate tests for those criteria, preferably implemented to require minimal human assessment effort and in a multilingual setting. The selected tasks for this first year of ELOQUENT are (1) probing a language model for topical competence; (2) assessing the ability of models to generate and detect hallucinations; (3) assessing the robustness of a model output given variation in the input prompts; and (4) establishing the possibility to distinguish human-generated text from machine-generated text.
Original language: English
Title of host publication: Advances in Information Retrieval. ECIR 2024
Editors: N. Goharian, et al.
Place of Publication: Cham
Publisher: Springer
Publication date: 23 Mar 2024
Pages: 459–465
ISBN (Print): 978-3-031-56068-2
ISBN (Electronic): 978-3-031-56069-9
DOIs
Publication status: Published - 23 Mar 2024
Externally published: Yes
MoE publication type: A4 Article in conference proceedings
Event: European Conference on Information Retrieval: ECIR - Glasgow, United Kingdom
Duration: 24 Mar 2024 to 28 Mar 2024
Conference number: 46

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 14612
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Fields of Science

  • 6121 Languages
  • 113 Computer and information sciences
