Inproceedings,

Estimating Topic Difficulty Using Normalized Discounted Cumulated Gain

, , , and .
Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pages 2033–2036. New York, NY, USA, Association for Computing Machinery (2020)
DOI: 10.1145/3340531.3412109

Abstract

Information retrieval evaluation has to consider the varying "difficulty" between topics. Topic difficulty is often defined in terms of the aggregated effectiveness of a set of retrieval systems to satisfy a respective information need. Current approaches to estimate topic difficulty come with drawbacks such as being incomparable across different experimental settings. We introduce a new approach to estimate topic difficulty, which is based on the ratio of systems that achieve an NDCG score that is better than a baseline formed as random ranking of the pool of judged documents. We modify the NDCG measure to explicitly reflect a system's divergence from this hypothetical random ranker. In this way, we achieve relative comparability of topic difficulty scores across experimental settings as well as stability to outlier systems, features lacking in previous difficulty estimations. We reevaluate the TREC 2012 Web Track's ad hoc task to demonstrate the feasibility of our approach in practice.
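
The core idea in the abstract, counting how many systems beat a random ranking of the judged pool, can be illustrated with a rough sketch. This is not the authors' implementation: the function names, the linear gain, the log2 discount, the cutoff k, and the choice to report difficulty as one minus the ratio of systems above the baseline are all assumptions made for illustration. The paper's modified NDCG measure is not reproduced here.

```python
import math


def dcg(gains):
    """Discounted cumulated gain with a log2 rank discount (assumed variant)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranked_gains, pool_gains, k):
    """NDCG@k of one system; the ideal ranking is built from the judged pool."""
    ideal = dcg(sorted(pool_gains, reverse=True)[:k])
    return dcg(ranked_gains[:k]) / ideal if ideal > 0 else 0.0


def random_baseline_ndcg(pool_gains, k):
    """Expected NDCG@k of a uniformly random ranking of the judged pool.

    Under a random permutation, the expected gain at every rank equals the
    mean gain of the pool, so the expectation has a closed form.
    """
    k = min(k, len(pool_gains))
    mean_gain = sum(pool_gains) / len(pool_gains)
    expected_dcg = mean_gain * sum(1 / math.log2(r + 1) for r in range(1, k + 1))
    ideal = dcg(sorted(pool_gains, reverse=True)[:k])
    return expected_dcg / ideal if ideal > 0 else 0.0


def topic_difficulty(system_rankings, pool_gains, k=10):
    """Hypothetical difficulty score: share of systems NOT beating the baseline."""
    baseline = random_baseline_ndcg(pool_gains, k)
    better = sum(1 for gains in system_rankings if ndcg(gains, pool_gains, k) > baseline)
    return 1.0 - better / len(system_rankings)


# Toy topic: relevance grades of the judged pool and four systems' top-2 results.
pool = [2, 1, 0, 0]
systems = [[2, 1], [0, 0], [0, 1], [1, 2]]
difficulty = topic_difficulty(systems, pool, k=2)
```

Because the score is a ratio of systems rather than an average of their effectiveness, a single unusually strong or weak system shifts it by at most one system's share, which is one way to read the abstract's claim of stability to outlier systems.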
