Abstract
Validation metrics are key for tracking scientific progress and
bridging the current chasm between artificial intelligence
research and its translation into practice. However, increasing
evidence shows that, particularly in image analysis, metrics are
often chosen inadequately. Although taking into account the
individual strengths, weaknesses and limitations of validation
metrics is a critical prerequisite to making educated choices,
the relevant knowledge is currently scattered and poorly
accessible to individual researchers. Based on a multistage
Delphi process conducted by a multidisciplinary expert
consortium as well as extensive community feedback, the present
work provides a reliable and comprehensive common point of
access to information on pitfalls related to validation metrics
in image analysis. Although focused on biomedical image
analysis, the addressed pitfalls generalize across application
domains and are categorized according to a newly created,
domain-agnostic taxonomy. The work serves to enhance global
comprehension of a key topic in image analysis validation.