The Impact of Negative Relevance Judgments on NDCG
L. Gienapp, M. Fröbe, M. Hagen, and M. Potthast. Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pages 2037–2040. New York, NY, USA, Association for Computing Machinery, (2020)
DOI: 10.1145/3340531.3412123
Abstract
NDCG is one of the most commonly used measures to quantify system performance in retrieval experiments. Though originally not considered, graded relevance judgments nowadays frequently include negative labels. Negative relevance labels cause NDCG to be unbounded. This is probably why widely used implementations of NDCG map negative relevance labels to zero, thus ensuring the resulting scores to originate from the [0,1] range. But zeroing negative labels discards valuable relevance information, e.g., by treating spam documents the same as unjudged ones, which are assigned the relevance label of zero by default. We show that, instead of zeroing negative labels, a min-max-normalization of NDCG retains its statistical power while improving its reliability and stability.
%0 Conference Paper
%1 10.1145/3340531.3412123
%A Gienapp, Lukas
%A Fröbe, Maik
%A Hagen, Matthias
%A Potthast, Martin
%B Proceedings of the 29th ACM International Conference on Information & Knowledge Management
%C New York, NY, USA
%D 2020
%I Association for Computing Machinery
%K evaluation, information retrieval, normalized discounted cumulated gain, relevance judgements, reliability, stability
%P 2037–2040
%R 10.1145/3340531.3412123
%T The Impact of Negative Relevance Judgments on NDCG
%U https://doi.org/10.1145/3340531.3412123
%X NDCG is one of the most commonly used measures to quantify system performance in retrieval experiments. Though originally not considered, graded relevance judgments nowadays frequently include negative labels. Negative relevance labels cause NDCG to be unbounded. This is probably why widely used implementations of NDCG map negative relevance labels to zero, thus ensuring the resulting scores to originate from the [0,1] range. But zeroing negative labels discards valuable relevance information, e.g., by treating spam documents the same as unjudged ones, which are assigned the relevance label of zero by default. We show that, instead of zeroing negative labels, a min-max-normalization of NDCG retains its statistical power while improving its reliability and stability.
%@ 9781450368599
@inproceedings{10.1145/3340531.3412123,
abstract = {NDCG is one of the most commonly used measures to quantify system performance in retrieval experiments. Though originally not considered, graded relevance judgments nowadays frequently include negative labels. Negative relevance labels cause NDCG to be unbounded. This is probably why widely used implementations of NDCG map negative relevance labels to zero, thus ensuring the resulting scores to originate from the [0,1] range. But zeroing negative labels discards valuable relevance information, e.g., by treating spam documents the same as unjudged ones, which are assigned the relevance label of zero by default. We show that, instead of zeroing negative labels, a min-max-normalization of NDCG retains its statistical power while improving its reliability and stability.},
added-at = {2024-10-02T10:38:17.000+0200},
address = {New York, NY, USA},
author = {Gienapp, Lukas and Fr\"{o}be, Maik and Hagen, Matthias and Potthast, Martin},
biburl = {https://puma.scadsai.uni-leipzig.de/bibtex/20199466f58a67187588c2a1cfca7bf6d/scadsfct},
booktitle = {Proceedings of the 29th ACM International Conference on Information \& Knowledge Management},
doi = {10.1145/3340531.3412123},
interhash = {5df1c60e95e9ca3f564d8057062020a8},
intrahash = {0199466f58a67187588c2a1cfca7bf6d},
isbn = {9781450368599},
keywords = {evaluation, information retrieval, normalized discounted cumulated gain, relevance judgements, reliability, stability},
location = {Virtual Event, Ireland},
numpages = {4},
pages = {2037--2040},
publisher = {Association for Computing Machinery},
series = {CIKM '20},
timestamp = {2024-10-02T10:38:17.000+0200},
title = {The Impact of Negative Relevance Judgments on NDCG},
url = {https://doi.org/10.1145/3340531.3412123},
year = 2020
}