Abstract
Information retrieval evaluation has to account for the varying "difficulty" of topics. Topic difficulty is often defined in terms of the aggregated effectiveness of a set of retrieval systems at satisfying the respective information need. Current approaches to estimating topic difficulty come with drawbacks, such as being incomparable across different experimental settings. We introduce a new approach to estimating topic difficulty, based on the ratio of systems that achieve an NDCG score better than a baseline formed by a random ranking of the pool of judged documents. We modify the NDCG measure to explicitly reflect a system's divergence from this hypothetical random ranker. In this way we achieve relative comparability of topic difficulty scores across experimental settings as well as stability against outlier systems, features lacking in previous difficulty estimations. We reevaluate the TREC 2012 Web Track's ad hoc task to demonstrate the feasibility of our approach in practice.
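The abstract does not spell out the estimator, but a minimal sketch of the idea it describes might look as follows. The function names, the Monte-Carlo estimate of the random-ranker baseline, and the use of graded relevance gains are assumptions for illustration, not the paper's actual formulation.

```python
import math
import random

def dcg(gains):
    """Discounted cumulative gain for a ranked list of relevance grades."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))

def ndcg(ranked_gains, ideal_gains):
    """NDCG: DCG of the ranking normalised by the DCG of the ideal ordering."""
    ideal = dcg(sorted(ideal_gains, reverse=True))
    return dcg(ranked_gains) / ideal if ideal > 0 else 0.0

def random_ranker_ndcg(judged_pool_gains, trials=1000, seed=0):
    """Monte-Carlo estimate of the expected NDCG of a random ranking
    of the pool of judged documents (the hypothetical baseline)."""
    rng = random.Random(seed)
    pool = list(judged_pool_gains)
    total = 0.0
    for _ in range(trials):
        rng.shuffle(pool)
        total += ndcg(pool, judged_pool_gains)
    return total / trials

def share_above_random(system_run_gains, judged_pool_gains):
    """Fraction of systems whose NDCG exceeds the random-ranker baseline
    for one topic; a difficulty score would then be derived from this ratio."""
    baseline = random_ranker_ndcg(judged_pool_gains)
    better = sum(
        1 for gains in system_run_gains
        if ndcg(gains, judged_pool_gains) > baseline
    )
    return better / len(system_run_gains)
```

Under these assumptions, a topic on which only a few systems beat the random-ranking baseline would receive a low ratio and hence be considered hard.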