On the Consistency, Discriminative Power and Robustness of Sampled Metrics in Offline Top-N Recommender System Evaluation

Research output: Conference contribution, Abstract, Peer-reviewed

Abstract

Negative item sampling in offline top-N recommendation evaluation has become increasingly widespread, but remains controversial. While several studies have warned against sampled evaluation metrics on the grounds that they poorly approximate the full ranking (i.e., the ranking over all negative items), others have highlighted their improved discriminative power and their potential to make evaluation more robust. Unfortunately, empirical studies on negative item sampling are based on relatively few methods (between 3 and 12) and therefore lack the statistical power to assess the impact of negative item sampling in practice. In this article, we present preliminary findings from a comprehensive benchmarking study of negative item sampling based on 52 recommendation algorithms and 3 benchmark data sets. We show how the number of sampled negative items and different sampling strategies affect the consistency and discriminative power of sampled evaluation metrics. Furthermore, we investigate the impact of sparsity bias and popularity bias on the robustness of these metrics. In brief, we show that the optimal parameterizations for negative item sampling depend on data set characteristics and the goals of the investigator, suggesting a need for greater transparency in related experimental design decisions.
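To make the distinction between full and sampled evaluation concrete, the following is a minimal sketch, not the procedure used in the paper. It ranks one held-out item first against all negative items and then against a small sample of them. The function names, the toy scores, and the `popularity` weights are all assumptions introduced for illustration; the popularity-weighted branch uses Efraimidis-Spirakis keys as one possible way to sample without replacement.

```python
import random


def rank_of_positive(scores, positive, negatives):
    """Return the 1-based rank of `positive` among itself and `negatives`."""
    pos_score = scores[positive]
    # Every negative scored strictly higher pushes the positive one place down.
    return 1 + sum(1 for item in negatives if scores[item] > pos_score)


def sample_negatives(candidates, n, rng, popularity=None):
    """Draw n distinct negatives, uniformly or proportionally to popularity.

    `popularity` is a hypothetical dict of strictly positive item weights.
    """
    if popularity is None:
        return rng.sample(candidates, n)
    # Weighted sampling without replacement: give each item the key
    # u ** (1 / weight) and keep the n largest keys (Efraimidis-Spirakis).
    keyed = [(rng.random() ** (1.0 / popularity[item]), item) for item in candidates]
    keyed.sort(reverse=True)
    return [item for _, item in keyed[:n]]


def hit_at_k(rank, k):
    """HitRate@k for a single held-out item: 1 if it is ranked in the top k."""
    return 1 if rank <= k else 0


# Toy data (illustrative only): model scores for ten items, item 3 held out.
scores = {item: 1.0 - 0.1 * item for item in range(10)}
positive = 3
all_negatives = [item for item in scores if item != positive]

rng = random.Random(0)
sampled = sample_negatives(all_negatives, 2, rng)

full_rank = rank_of_positive(scores, positive, all_negatives)
sampled_rank = rank_of_positive(scores, positive, sampled)
```

In this toy setup the full rank of the held-out item is 4, so Hit@3 is 0 under full evaluation. With only two sampled negatives the rank cannot exceed 3, so Hit@3 is always 1. This illustrates how the number of sampled negatives alone can change what a sampled metric reports.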

Original language: English
Pages: 1152-1157
Number of pages: 6
DOI
Status: Published - 14 Sep 2023
MoE publication type: Not eligible
Event: ACM Conference on Recommender Systems - Singapore, Singapore
Duration: 18 Sep 2023 to 22 Sep 2023
Conference number: 17

Conference

Conference: ACM Conference on Recommender Systems
Abbreviated title: RecSys
Country/Territory: Singapore
City: Singapore
Period: 18/09/2023 to 22/09/2023

Bibliographical note

Publisher Copyright:
© 2023 Owner/Author.

Fields of science

  • 113 Computer and information sciences
