Challenges in diagnosing skin conditions on darker skin

Deep learning assists in diagnosing skin diseases across different skin tones.

A recent study by MIT researchers reveals that when diagnosing skin diseases using images of a patient’s skin, doctors perform less accurately with darker skin tones. The study involved over 1,000 dermatologists and general practitioners, showing that dermatologists accurately characterized about 38% of the images overall but only 34% of those depicting darker skin. 

General practitioners, who were less accurate overall, also showed decreased accuracy with darker skin. Additionally, the study found that artificial intelligence algorithms could assist doctors and improve accuracy. However, the improvements were more significant for patients with lighter skin tones.

This study is the first to show differences in how doctors diagnose based on skin tone. Previous research suggests that dermatology textbooks and training materials mostly show lighter skin tones, possibly contributing to this gap. Some doctors may also have less experience treating patients with darker skin, which could be another factor.

Matt Groh, PhD ’23, an assistant professor at the Northwestern University Kellogg School of Management, said, “Probably no doctor intends to do worse on any person, but it might be the fact that you don’t have all the knowledge and the experience, and therefore on certain groups of people, you might do worse. This is one of those situations where you need empirical evidence to help people figure out how you might want to change policies around dermatology education.”

Lead author Groh and senior author Rosalind Picard, an MIT professor, collaborated on a study published in Nature Medicine. Inspired by a previous MIT study by Joy Buolamwini, which highlighted errors in facial-analysis programs for darker-skinned individuals, Groh investigated whether AI models and doctors might struggle with diagnosing skin diseases on darker skin tones. 

The goal was to address potential social problems and improve diagnostic accuracy using AI assistance. Groh emphasized applying machine learning to real-world issues like medicine to enhance decision-making and improve patient outcomes.

Researchers gathered 364 images from various sources to evaluate doctors’ diagnostic accuracy for 46 skin diseases. The images covered a range of skin tones. They primarily depicted inflammatory skin conditions like atopic dermatitis, Lyme disease, and secondary syphilis, along with cutaneous T-cell lymphoma (CTCL), a rare cancer. Since diseases can manifest differently on dark and light skin, the study aimed to assess potential disparities. 

Participants, who included dermatologists, residents, and general practitioners recruited from a social networking site for doctors, were each shown ten images. They provided their top three disease predictions for each image. They also indicated whether they would recommend a biopsy or refer the patient to a dermatologist.
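
One common way to score ranked guesses like these is top-k accuracy: a response counts as correct if the true diagnosis appears anywhere among the first k predictions. The sketch below illustrates the idea only. The function name `top_k_accuracy` and the sample responses are hypothetical, and whether the study scored responses exactly this way is an assumption.

```python
def top_k_accuracy(ranked_predictions, true_labels, k=3):
    """Return the fraction of cases whose true label is among the first k guesses."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("each case needs exactly one true label")
    if not true_labels:
        return 0.0
    hits = sum(
        truth in guesses[:k]
        for guesses, truth in zip(ranked_predictions, true_labels)
    )
    return hits / len(true_labels)


# Hypothetical responses for four images (illustrative, not data from the study).
guesses = [
    ["atopic dermatitis", "psoriasis", "lichen planus"],
    ["psoriasis", "Lyme disease", "secondary syphilis"],
    ["lichen planus", "psoriasis", "atopic dermatitis"],
    ["secondary syphilis", "psoriasis", "lichen planus"],
]
truths = [
    "atopic dermatitis",
    "Lyme disease",
    "cutaneous T-cell lymphoma",
    "secondary syphilis",
]

top1 = top_k_accuracy(guesses, truths, k=1)
top3 = top_k_accuracy(guesses, truths, k=3)
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

Allowing three guesses gives a more lenient score than requiring the first guess to be right, so any accuracy figure is easier to interpret when it is clear which of the two was used.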

Picard notes that while online triage with skin images is less thorough than an in-person exam, it is more scalable, and images are convenient to feed into machine-learning algorithms. Dermatologists were more accurate than general practitioners, correctly identifying 38% of images compared with 19%.

Both groups were less accurate on images of darker skin, losing about four percentage points. Dermatologists were also less inclined to recommend biopsies for CTCL images on darker skin but more likely to suggest biopsies for noncancerous conditions on darker skin.
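
A gap measured in percentage points is not the same as a relative drop, and the same point gap weighs more heavily on a lower baseline. The short sketch below works through the arithmetic using the figures reported in this article. The helper name `accuracy_gap` is hypothetical, and the 15% figure for general practitioners is an assumption derived from "about four percentage points" rather than a number taken from the study.

```python
def accuracy_gap(baseline_pct, subgroup_pct):
    """Return (gap in percentage points, drop relative to the baseline in percent)."""
    points = baseline_pct - subgroup_pct
    relative = points / baseline_pct * 100
    return points, relative


# Dermatologists: about 38% overall versus 34% on darker skin (figures from the article).
derm_points, derm_relative = accuracy_gap(38.0, 34.0)

# General practitioners: 19% overall. The 15% value is assumed from the
# article's "about four percentage points" and is not a reported figure.
gp_points, gp_relative = accuracy_gap(19.0, 15.0)

print(f"Dermatologists: {derm_points:.0f} points, {derm_relative:.1f}% relative drop")
print(f"General practitioners: {gp_points:.0f} points, {gp_relative:.1f}% relative drop")
```

Under these assumptions, the same four-point gap is roughly a tenth of the dermatologists' baseline accuracy but about a fifth of the general practitioners'.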

Jenna Lester, an associate professor of dermatology at the University of California, San Francisco, notes that the study highlights a disparity in diagnosing skin conditions on darker skin that had not been extensively demonstrated in previous literature. She suggests further research to determine the causes of this disparity and potential solutions to it.

The researchers also tested an AI algorithm they developed, trained on 30,000 images, to assist doctors in diagnosing skin conditions. The algorithm achieved an accuracy rate of about 47%, and they also tested a version with an artificially inflated success rate of 84% to assess its impact on doctors’ decisions. This allowed them to evaluate current AI assistance and potential future improvements.

Both AI algorithms showed equal accuracy on light and dark skin tones, improving diagnostic accuracy for dermatologists (up to 60%) and general practitioners (up to 47%). Doctors tended to follow suggestions from the higher-accuracy algorithm after it provided correct answers but ignored incorrect recommendations, indicating their skill in ruling out diseases. 

Dermatologists using AI showed similar accuracy increases regardless of skin tone. In contrast, general practitioners improved more on images of lighter skin. The study suggests that medical schools and textbooks should include more training on patients with darker skin. The findings could also guide the development of AI assistance programs in dermatology.

The study highlights the challenges doctors face in accurately diagnosing diseases when presented with images of darker skin tones. By leveraging AI assistance, doctors can improve diagnostic accuracy, ultimately improving patient outcomes. The findings emphasize the importance of addressing diagnostic disparities and integrating diversity-focused training in medical education to ensure equitable healthcare for all patient populations.

Journal reference:

  1. Groh, M., Badri, O., Daneshjou, R. et al. Deep learning-aided decision support for diagnosis of skin disease across skin tones. Nature Medicine. DOI:10.1038/s41591-023-02728-3.
