Evaluating the interpretability of AI technologies

Shining a light into the “black box” of AI.



Time series data are ubiquitous. The availability of this type of data is increasing, and so is the need for automated analysis tools capable of extracting interpretable and actionable knowledge from them. 

Such data can be modeled by AI technologies to build diagnostic or predictive tools, although established, more interpretable time-series approaches remain competitive for many tasks. Adopting AI technologies as black-box tools, however, is problematic in several applied contexts.

A brand-new technique for assessing the interpretability of artificial intelligence (AI) technologies has been developed by researchers from the University of Geneva (UNIGE), the Geneva University Hospitals (HUG), and the National University of Singapore (NUS), opening the door to more transparency and confidence in AI-driven diagnostic and predictive tools.

The novel method clarifies the mysterious inner workings of so-called “black box” AI algorithms, aiding users in understanding how AI outcomes are influenced and whether those results can be relied upon. This is crucial in circumstances where human health and life are significantly affected, such as when AI is used in healthcare. 

Professor Christian Lovis, Director of the Department of Radiology and Medical Informatics at the UNIGE Faculty of Medicine and Head of the Division of Medical Information Science at the HUG, who co-directed this work, said, “The way these algorithms work is opaque, to say the least. Of course, the stakes, particularly financial, are extremely high. But how can we trust a machine without understanding the basis of its reasoning? These questions are essential, especially in sectors such as medicine, where AI-powered decisions can influence the health and even the lives of people; and finance, where they can lead to enormous loss of capital.”

Assistant Professor Gianmarco Mengaldo, Director of the MathEXLab at the National University of Singapore’s College of Design and Engineering, who co-directed the work, said, “Interpretability methods aim to answer these questions by deciphering why and how an AI reached a given decision and the reasons behind it. Knowing what elements tipped the scales in favor of or against a solution in a specific situation, thus allowing some transparency, increases the trust that can be placed in them.”

“However, the current interpretability methods widely used in practical applications and industrial workflows provide tangibly different results when applied to the same task. This raises the important question: what interpretability method is correct, given that there should be a unique, correct answer? Hence, evaluating interpretability methods becomes as important as interpretability per se.”

Hugues Turbé, a doctoral student in Prof Lovis’ laboratory and first author of the study, explains, “Discriminating data is critical in developing interpretable AI technologies. For example, when an AI analyses images, it focuses on a few characteristic attributes. AI can, for example, differentiate between an image of a dog and an image of a cat. The same principle applies to analyzing time sequences: the machine needs to be able to select elements – peaks that are more pronounced than others, for example – to base its reasoning on. ECG signals mean reconciling signals from the different electrodes to evaluate possible dissonances that would indicate a particular cardiac disease.”

Selecting an interpretability approach from the many available for a given goal can be difficult. Even when applied to the same dataset and task, different AI interpretability methods frequently produce substantially different results. To address this challenge, the researchers created two novel evaluation methods that help show how the AI makes decisions: one identifies the most relevant parts of a signal, and the other assesses their relative importance to the final prediction.

To assess interpretability, they concealed a portion of the data to see whether it was necessary for the AI’s decision-making. This approach, however, occasionally led to inaccurate results. To account for this and to maintain the accuracy and balance of the data, they trained the AI on an augmented dataset that includes hidden data. The team then developed two metrics to assess the performance of the interpretability methods, showing whether the AI was using the appropriate data to inform decisions and whether all available data was being treated equally.
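The idea of hiding part of a signal to test whether the model needed it can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in for a trained classifier, the two relevance maps are hand-made rather than produced by a real interpretability method, and `score_drop` is a simple score difference, not one of the metrics defined in the study. The sketch also leaves out the step of training on a dataset that includes hidden data.

```python
import numpy as np

def toy_model(x):
    # Hypothetical stand-in for a trained classifier:
    # its score depends only on time steps 20..29.
    return float(x[20:30].mean())

def mask_top_k(x, relevance, k, fill=0.0):
    # Hide the k time steps that the relevance map ranks highest.
    top = np.argsort(relevance)[::-1][:k]
    masked = x.copy()
    masked[top] = fill
    return masked

def score_drop(model, x, relevance, k):
    # How much the model's score falls once the "relevant" steps are hidden.
    return model(x) - model(mask_top_k(x, relevance, k))

# A signal with a single informative bump.
x = np.zeros(50)
x[20:30] = 1.0

# Two illustrative relevance maps: one highlights the bump, one does not.
good_map = np.zeros(50)
good_map[20:30] = 1.0
poor_map = np.zeros(50)
poor_map[:10] = 1.0

drop_good = score_drop(toy_model, x, good_map, k=10)
drop_poor = score_drop(toy_model, x, poor_map, k=10)
print(drop_good, drop_poor)
```

In this toy setting, hiding the steps flagged by the map that matches the model's actual behaviour removes the whole score, while hiding the steps flagged by the other map changes nothing. A larger drop therefore suggests that the relevance map points at data the model really relies on.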

Hugues Turbé said, “Overall, our method aims to evaluate the model that will be used within its operational domain, thus ensuring its reliability.”

Journal Reference:

  1. Turbé, H., Bjelogrlic, M., Lovis, C. et al. Evaluation of post-hoc interpretability methods in time-series classification. Nat Mach Intell 5, 250–260 (2023). DOI: 10.1038/s42256-023-00620-w