Many AI systems fail to account for human error and uncertainty, particularly systems in which a human provides feedback to a machine-learning model. These systems are often built on the assumption that people are always certain and correct, when in reality human decision-making involves uncertainty and occasional mistakes.
To better account for uncertainty in AI applications where humans and machines collaborate, researchers from the University of Cambridge have worked with The Alan Turing Institute, Princeton, and Google DeepMind to close the gap between human behavior and machine learning. In settings where safety is crucial, such as the identification of medical conditions, this could help reduce risk and improve the confidence and reliability of these applications.
The team modified a well-known image classification dataset so that individuals could provide feedback and express their level of uncertainty when labeling a particular image. They found that training with uncertain labels can improve these systems' ability to handle ambiguous feedback, although human uncertainty also reduces the overall performance of these hybrid systems.
First author, Katherine Collins from Cambridge’s Department of Engineering, said, “Uncertainty is central in how humans reason about the world, but many AI models fail to consider this. Many developers are working to address model uncertainty, but less work has been done on addressing uncertainty from the person’s point of view.”
“Many human-AI systems assume that humans are always certain of their decisions, which isn’t how humans work – we all make mistakes. We wanted to look at what happens when people express uncertainty, which is especially important in safety-critical settings, like a clinician working with a medical AI system.”
Co-author Matthew Barker, who recently completed his MEng degree at Gonville & Caius College, Cambridge, said, “We need better tools to recalibrate these models so that the people working with them are empowered to say when they’re uncertain. Although machines can be trained with complete confidence, humans often can’t provide this, and machine learning models struggle with that uncertainty.”
The researchers employed three benchmark machine learning datasets for their study: one for categorizing digits, one for classifying chest X-rays, and one for classifying bird photos.
For the first two datasets, the researchers simulated uncertainty; however, for the dataset on birds, they asked participants to rate their level of certainty about the images they were viewing, such as whether a bird was red or orange.
The participants’ annotated “soft labels” allowed the researchers to determine how the final output changed. However, they found that performance declined quickly when machines were replaced with humans.
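To illustrate the idea of a “soft label,” here is a minimal sketch (not the authors’ actual code) of how an annotator’s graded confidence could be turned into a probability distribution over classes and scored against a model’s prediction with a soft-target cross-entropy. The class names and confidence values are hypothetical examples.

```python
import numpy as np

def soft_label(confidences):
    """Normalize per-class confidence scores into a probability
    distribution (a "soft label") instead of a one-hot label."""
    c = np.asarray(confidences, dtype=float)
    return c / c.sum()

def soft_cross_entropy(pred_probs, soft_target, eps=1e-12):
    """Cross-entropy between a model's predicted distribution and a
    soft human label; reduces to ordinary cross-entropy when the
    target is one-hot."""
    p = np.clip(np.asarray(pred_probs, dtype=float), eps, 1.0)
    return float(-(np.asarray(soft_target) * np.log(p)).sum())

# A hypothetical annotator is 70% sure a bird is "red", 30% "orange":
target = soft_label([0.7, 0.3, 0.0])   # classes: red, orange, other

# An overconfident prediction is penalized more than one that
# mirrors the annotator's own uncertainty:
confident_pred = [0.90, 0.05, 0.05]
hedged_pred = [0.60, 0.35, 0.05]
```

Under this loss, a model that spreads probability the way the annotator did incurs a lower penalty than one that commits fully to a single class, which is one way uncertain feedback can be folded into training.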
The researchers said their results identified several open challenges around incorporating humans into machine learning models. They are releasing their datasets so that further research can be carried out and uncertainty can be built into machine learning systems.
Barker said, “In some ways, this work raised more questions than it answered. But even though humans may be miscalibrated in their uncertainty, we can improve the trustworthiness and reliability of these human-in-the-loop systems by accounting for human behavior.”
- Katherine M. Collins et al., ‘Human Uncertainty in Concept-Based AI Systems.’ Paper presented at the Sixth AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES 2023), August 8–10, 2023, Montréal, QC, Canada.