IBM’s AI model predicts Breast Cancer using mammography data and health records


Breast cancer is the most commonly diagnosed cancer among women worldwide and a leading cause of cancer-related deaths in women. Detecting the disease while it is still in its early stages can make a big difference, giving patients more options for successful treatment.

A team of IBM researchers has come up with a solution that detects breast cancer with an impressive level of accuracy. The team developed an AI model capable of predicting the development of malignant breast cancer in patients within a year, with an accuracy rate comparable to that of human radiologists.

The AI algorithm, built using both machine learning and deep learning, analyzes mammograms and patients' clinical information in electronic health records (EHR) to diagnose breast cancer, even in cases where the disease is missed by radiologists. The results show that the model correctly predicted the development of breast cancer in 87% of the cases it analyzed and correctly identified 77% of non-cancerous cases.

Image: Radiology

The team collected a dataset of 52,936 images from 13,234 women who underwent at least one mammogram between 2013 and 2017 and who had health records for at least one year prior to the mammogram. The new AI algorithm was then trained on more than 9,000 matching sets of mammograms and health records, using data from both to predict biopsy malignancy and to differentiate normal from abnormal examinations.

The resulting algorithm was validated with data from 1,055 women and tested in 2,548 women (mean age, 55 years ± 10 [standard deviation]). In the test set, the algorithm identified 48% of false-negative findings on mammograms. For predicting biopsy malignancy, it achieved an area under the ROC curve (AUC) of 0.91, a specificity of 77.3%, and a sensitivity of 87%.
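For readers unfamiliar with these metrics, the following is a minimal Python sketch of how sensitivity, specificity, and AUC are generally computed. It is not the researchers' code: the function names, the 0.5 threshold, and the eight toy labels and scores are all made up for illustration and have no connection to the study's data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 = malignant."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # cancers caught
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # cancers missed
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # healthy cleared
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive case scores higher than a
    random negative case (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 malignant and 4 benign cases with model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

threshold = 0.5  # assumed cut-off; changing it trades sensitivity for specificity
y_pred = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc(y_true, scores):.4f}")
```

Sensitivity and specificity depend on the chosen threshold, while AUC summarizes performance across all thresholds, which is why studies such as this one usually report all three.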

“Our model could one day help radiologists to confirm or deny positive breast cancer cases,” wrote Michal Chorev, IBM Research, in a blog post. “While false positives can cause an enormous amount of undue stress and anxiety, false negatives can often hamper how early cancer is detected and subsequently treated.”

The algorithm was also compared with an existing breast cancer risk assessment tool, the Gail model. In that comparison, the team found that the new algorithm had a much higher AUC than the Gail model (0.78 compared with 0.54).

Although the team’s model did not necessarily outperform radiologists, its performance fell within the acceptable range for radiologists in breast cancer screening. According to the researchers, this shows significant potential for helping healthcare providers.

“The model did not perform better than radiologists; it performed differently. In a scenario where double reading at screening mammography is not available, as is the case at Assuta Medical Centers, we believe that the use of this model as a second reader could be beneficial,” wrote the author.

The IBM system could be particularly important in countries where, due to a lack of staff, it is impractical for a second radiologist to weigh in, or in any setting where there isn’t much time for human checks.

That said, this is not necessarily the most advanced form of breast cancer prediction. MIT researchers recently developed a method that works up to five years in advance, using just images.

Akselrod-Ballin, Ph.D., from IBM Research, noted that the algorithm could be improved in several ways. For example, training it to compare findings with previous mammograms and to use ultrasound images could further enhance the ML-DL model’s performance.

The paper was published this month in the journal Radiology.

