Artificial intelligence model “learns” from patient data to make cancer treatment less toxic

The machine-learning system determines the fewest, smallest doses that could still shrink brain tumors.


Glioblastoma is a malignant tumor that develops in the brain or spinal cord, and the prognosis for adults is no more than about five years. Patients must endure a combination of radiation therapy and multiple drugs taken every month.

Medical professionals generally administer the maximum safe drug doses to shrink the tumor as much as possible. But these strong pharmaceuticals still cause debilitating side effects in patients.

To address this issue, MIT researchers have developed a machine-learning model that could improve patients’ quality of life by reducing toxic chemotherapy and radiotherapy dosing for glioblastoma. The model looks at treatment regimens currently in use and iteratively adjusts the doses. Eventually, it finds an optimal treatment plan, with the lowest possible potency and frequency of doses that should still reduce tumor sizes to a degree comparable to that of traditional regimens.

In simulated trials of 50 patients, the researchers found that the model designed treatment cycles that reduced the potency of nearly all the doses to a quarter or half while maintaining the same tumor-shrinking potential.

Pratik Shah, a principal investigator at the Media Lab who supervised this research, said, “We kept the goal, where we have to help patients by reducing tumor sizes but, at the same time, we want to make sure the quality of life, the dosing toxicity, doesn’t lead to overwhelming sickness and harmful side effects.”

The model uses a technique called reinforcement learning (RL), which comprises artificially intelligent “agents” that complete “actions” in an unpredictable, complex environment to reach a desired outcome.

Whenever it completes an action, the agent receives a “reward” or “penalty,” depending on whether the action works toward the outcome. The agent then adjusts its actions accordingly to achieve that outcome.

Rewards and penalties are essentially positive and negative numbers, say +1 or -1. Their values vary by the action taken and are calculated from the probability of succeeding or failing at the outcome, among other factors. The agent is essentially trying to numerically optimize all actions, based on reward and penalty values, to reach a maximum outcome score for a given task.
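To make the reward-and-penalty idea concrete, here is a minimal sketch of an agent nudging its estimate of each action's value toward the reward it receives. This is a generic textbook-style update, not the researchers' model; the two actions, the `step_size` parameter, and the `update_value` helper are all hypothetical names chosen for illustration.

```python
def update_value(old_value, reward, step_size=0.5):
    """Move an action's estimated value part of the way toward the latest reward."""
    return old_value + step_size * (reward - old_value)


# Hypothetical task: action "a" moves toward the outcome (+1), "b" away from it (-1).
rewards = {"a": +1.0, "b": -1.0}
values = {"a": 0.0, "b": 0.0}  # the agent starts with no preference

for _ in range(3):  # three rounds of trying each action once
    for action, reward in rewards.items():
        values[action] = update_value(values[action], reward)

# The agent now prefers whichever action has the higher estimated value.
best_action = max(values, key=values.get)
print(values, best_action)
```

After three rounds the estimate for the rewarded action has climbed toward +1 and the penalized one has sunk toward -1, so the agent favors the former.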

The model’s agent combs through traditionally administered regimens. These regimens follow protocols that have been used clinically for decades and are based on animal testing and various clinical trials. Oncologists use these established protocols to decide how large a dose to give patients based on weight.

As the model explores the regimen, at each planned dosing interval — say, once a month — it decides on one of several actions. It can, first, either initiate or withhold a dose. If it does administer, it then decides if the entire dose, or only a portion, is necessary.
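The choices described above can be pictured as a small set of actions available at every dosing interval. The sketch below is only an illustration: the article does not list the exact fractions the model may choose, so `DOSE_FRACTIONS`, the `DoseAction` class, and the 100 mg full dose are assumptions.

```python
from dataclasses import dataclass

# Assumed, illustrative action set: 0.0 means "withhold the dose".
DOSE_FRACTIONS = (0.0, 0.25, 0.5, 1.0)


@dataclass(frozen=True)
class DoseAction:
    """One decision at a planned dosing interval."""
    fraction: float  # share of the protocol's full dose to give

    def __post_init__(self):
        if self.fraction not in DOSE_FRACTIONS:
            raise ValueError(f"unsupported dose fraction: {self.fraction}")

    @property
    def administers(self) -> bool:
        """True if any drug is given at this interval."""
        return self.fraction > 0.0


def doses_for_plan(plan, full_dose_mg):
    """Translate a sequence of actions into concrete doses in milligrams."""
    return [action.fraction * full_dose_mg for action in plan]


# Hypothetical three-interval plan: full dose, skip, half dose.
plan = [DoseAction(1.0), DoseAction(0.0), DoseAction(0.5)]
doses = doses_for_plan(plan, full_dose_mg=100.0)
print(doses)
```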

At each action, it pings another clinical model — often used to predict a tumor’s change in size in response to treatments — to see if the action shrinks the mean tumor diameter. If it does, the model receives a reward.

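The researchers also had to make sure the model does not simply administer the maximum number and potency of doses. Whenever the model chooses to administer all full doses, therefore, it gets penalized, so it instead chooses fewer, smaller doses.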
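A rough sketch of how a shrinkage reward and a dosing penalty might combine is shown below. It is not the researchers' implementation: `toy_tumor_model` is a made-up stand-in for the clinical tumor-response model, and the growth rate, drug effect, and `toxicity_weight` are arbitrary numbers chosen so the example is easy to follow.

```python
def toy_tumor_model(diameter_mm, dose_fraction):
    """Hypothetical stand-in for the clinical model that predicts tumor size."""
    growth = 1.02                 # assumed 2% growth per interval
    kill = 0.10 * dose_fraction   # assumed up to 10% shrinkage at a full dose
    return diameter_mm * growth * (1.0 - kill)


def step_reward(old_mm, new_mm, dose_fraction, toxicity_weight=0.5):
    """Reward shrinkage (+1) or penalize growth (-1), minus a dose-size penalty."""
    shrink_reward = 1.0 if new_mm < old_mm else -1.0
    return shrink_reward - toxicity_weight * dose_fraction


diameter = 40.0  # assumed starting mean tumor diameter in millimeters
rewards = {}
for fraction in (0.0, 0.25, 0.5, 1.0):
    predicted = toy_tumor_model(diameter, fraction)
    rewards[fraction] = step_reward(diameter, predicted, fraction)

best_fraction = max(rewards, key=rewards.get)
print(rewards, best_fraction)
```

With these invented numbers, withholding the dose lets the tumor grow and is penalized, while the full dose shrinks the tumor but loses part of its reward to the toxicity penalty, so the quarter dose scores highest.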

Shah said, “If all we want to do is reduce the mean tumor diameter, and let it take whatever actions it wants, it will administer drugs irresponsibly. Instead, we need to reduce the harmful actions it takes to get to that outcome.”

This represents an “unorthodox RL model, described in the paper for the first time,” Shah says, that weighs potential negative consequences of actions (doses) against an outcome (tumor reduction). Traditional RL models work toward a single outcome, such as winning a game, and take any and all actions that maximize that outcome.

The researchers’ model, by contrast, has the flexibility at each action to find a dose that doesn’t necessarily solely maximize tumor reduction, but that strikes a perfect balance between maximum tumor reduction and low toxicity. This technique, Shah adds, has various medical and clinical trial applications, where actions for treating patients must be regulated to prevent harmful side effects.
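The contrast between a single-outcome objective and a balanced one can be sketched in a few lines. Everything numeric here is invented for illustration: `ASSUMED_SHRINKAGE_MM` is a hypothetical table of tumor shrinkage per dose fraction with diminishing returns, and `toxicity_weight` is an arbitrary trade-off setting, not a value from the paper.

```python
# Hypothetical shrinkage (mm) for each dose fraction, with diminishing returns.
ASSUMED_SHRINKAGE_MM = {0.0: 0.0, 0.25: 2.0, 0.5: 3.0, 1.0: 3.5}


def outcome_only_score(fraction):
    """Single-outcome objective: only tumor reduction counts."""
    return ASSUMED_SHRINKAGE_MM[fraction]


def balanced_score(fraction, toxicity_weight=3.0):
    """Tumor reduction weighed against a penalty that grows with dose size."""
    return ASSUMED_SHRINKAGE_MM[fraction] - toxicity_weight * fraction


fractions = list(ASSUMED_SHRINKAGE_MM)
outcome_only_choice = max(fractions, key=outcome_only_score)
balanced_choice = max(fractions, key=balanced_score)
print(outcome_only_choice, balanced_choice)
```

Under these assumptions the outcome-only objective always picks the full dose, while the balanced objective settles on the half dose, giving up a little shrinkage in exchange for a much smaller penalty.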

The researchers also designed the model to treat each patient individually, as well as in a single cohort, and achieved similar results (medical data for each patient was available to the researchers). Traditionally, the same dosing regimen is applied to groups of patients, but differences in tumor size, medical histories, genetic profiles, and biomarkers can all change how a patient is treated.

The paper is being presented next week at the 2018 Machine Learning for Healthcare conference at Stanford University.
