This post is part of the ongoing series of reading reflections on HCI papers and articles. You can find the other posts here.
How Good is 85%? A Survey Tool to Connect Classifier Evaluation to Acceptability of Accuracy - Kay, Patel, and Kientz, CHI 2015
Digital Object Identifier (DOI): https://doi.org/10.1145/2702123.2702603
This article discusses a new metric, which the authors term “acceptability of accuracy,” that measures how acceptable a classifier’s accuracy is to its users. The authors examine how much accuracy error users tolerate in different applications (e.g., weather prediction, home alarms) and how tolerant they are of different error types (e.g., false positives vs. false negatives). They also discuss how the metric can be used to set accuracy targets for future classifiers.
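To make the error types concrete, here is a minimal sketch in Python (with hypothetical counts invented purely for illustration, not taken from the paper) of how false positives and false negatives are derived from a confusion matrix, and why an application like a home alarm might care about one far more than the other:

```python
# Minimal sketch: deriving error rates from a confusion matrix.
# All counts below are hypothetical, purely for illustration.

tp, fp = 85, 5    # true positives, false positives
fn, tn = 10, 900  # false negatives, true negatives

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)            # of the alarms raised, how many were real?
recall = tp / (tp + fn)               # of the real events, how many were caught?
false_positive_rate = fp / (fp + tn)  # how often does it cry wolf?

print(f"accuracy={accuracy:.3f}, precision={precision:.3f}, "
      f"recall={recall:.3f}, FPR={false_positive_rate:.4f}")

# A home-alarm user may tolerate a few false positives but very few
# missed break-ins (false negatives); a weather-app user may feel the
# opposite. Overall accuracy alone hides this distinction.
```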
Prompts
Why is this article important?
This article introduces a novel, human-centric metric for evaluating classifiers. The authors also walk through how the metric can be used in the real world via user surveys: how such surveys should be structured, and they even give an example survey along with guidance on interpreting the results and improving the instrument. I believe this article is important because it offers a more human-centric approach to evaluation instead of relying on the traditional F-measure, which collapses precision and recall into a single score and obscures how differently users may weigh the two kinds of error.
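As a quick illustration of why the F-measure can hide this distinction, the toy example below (my own, not from the paper) computes F1 for two classifiers with swapped precision and recall; both receive an identical score, even though users might find their error profiles very differently acceptable:

```python
# Toy example: F1 is symmetric in precision and recall, so two
# classifiers with opposite error profiles score identically.

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Classifier A: precise but misses many events (low recall).
# Classifier B: catches nearly everything but raises many false alarms.
a = f1(precision=0.95, recall=0.60)
b = f1(precision=0.60, recall=0.95)

print(f"A: {a:.3f}, B: {b:.3f}")  # both print 0.735

# The acceptability-of-accuracy approach instead asks users which
# error profile they would actually accept for a given application.
```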
What do the authors seem to be assuming about the future of AI and human interaction?
The authors assume that users value accuracy differently across applications and error types, and that acceptability of accuracy can capture those differences. They also seem to assume the metric can be used to set accuracy targets for future classifiers: it gives researchers and developers a concrete goal to reach before bringing a product to market, so the product meets users’ accuracy expectations. Finally, they believe the metric can help researchers and developers weigh accuracy against other factors, such as user experience, time to market, and decisions about when to train and when to predict, and make informed tradeoffs among them.
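A hedged sketch of what “targeting a desired accuracy” could look like in practice: assuming hypothetical survey data in which respondents rate the acceptability of a classifier at several stated accuracy levels (the ratings and the “acceptable” threshold below are inventions for illustration, not the paper’s data or exact procedure), one could pick the lowest accuracy level that users on average already accept as the development target:

```python
# Hypothetical survey results: for each stated accuracy level, the
# mean acceptability rating on a 1-7 scale. All numbers are invented
# for illustration; this is not the paper's data or exact procedure.
mean_acceptability = {
    0.70: 2.1,
    0.80: 3.4,
    0.85: 4.6,
    0.90: 5.8,
    0.95: 6.5,
}

ACCEPTABLE = 4.0  # assumed threshold for "acceptable" on the 1-7 scale

# Target the lowest accuracy that users, on average, already accept:
# pushing far beyond it may cost more than the perceived benefit.
target = min(acc for acc, rating in mean_acceptability.items()
             if rating >= ACCEPTABLE)
print(f"Target accuracy: {target:.0%}")  # -> Target accuracy: 85%
```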
How might you integrate the reading into your academic/professional work?
I would factor users’ accuracy expectations, along with the other tradeoffs the article mentions, into the design and ideation phase of my projects. I agree with the paper that sometimes the best way to gauge users’ perceptions is to ask them directly, via surveys or interviews that crowdsource a range of opinions, and to do so early in the design and development process so the findings can inform decisions about the product.