Using machine learning to organize chemical diversity

Machine learning helps sort out massive materials databases.


Metal-organic frameworks (MOFs) have become so popular that scientists are constantly developing, synthesizing, studying, and cataloging new ones. However, the sheer number of MOFs is creating a problem.

Even after synthesizing a new MOF, it is quite challenging to know whether it is truly new and not some minor variation of a structure that has already been synthesized.

To address this problem, EPFL scientists, in collaboration with MIT, have used machine learning to organize the chemical diversity found in the ever-growing databases of the popular metal-organic framework materials. Using machine learning, the scientists developed a “language” to compare two materials and quantify their differences.
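To make the idea of quantifying the difference between two materials concrete, here is a minimal sketch in Python. It assumes each material has already been reduced to a fixed-length list of numerical descriptors; the descriptor values and the name `descriptor_distance` are hypothetical and chosen for illustration only, and the plain Euclidean distance used here is not necessarily the measure used in the study.

```python
import math


def descriptor_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up descriptor vectors (assumption: three numbers per material).
mof_a = [0.0, 3.0, 1.0]
mof_b = [4.0, 0.0, 1.0]  # differs strongly from mof_a
mof_c = [0.1, 3.0, 1.0]  # a minor variation of mof_a

print(descriptor_distance(mof_a, mof_b))
print(descriptor_distance(mof_a, mof_c))
```

A small distance suggests a minor variation of a known structure, while a large distance suggests a genuinely different material.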

Using this new language, the scientists set out to determine the chemical diversity in MOF databases.

Professor Berend Smit at EPFL said, “Before, the focus was on the number of structures. But now, we discovered that the major databases have all kinds of bias towards particular structures. There is no point in carrying out expensive screening studies on similar structures. One is better off in carefully selecting a set of very diverse structures, which will give much better results with far fewer structures.”
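One common way to pick a small but diverse set from a large, biased collection is greedy max-min (farthest-point) selection. The sketch below is an illustration under stated assumptions: each structure is represented by a hypothetical descriptor vector, the function name `select_diverse` is invented here, and the article does not say that this is the procedure the researchers used.

```python
import math


def select_diverse(vectors, k, start=0):
    """Greedy max-min selection: return indices of k mutually distant vectors."""
    if not 0 < k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    chosen = [start]
    # min_dist[i] is the distance from vector i to its closest chosen vector.
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    while len(chosen) < k:
        # Pick the unchosen vector that is farthest from everything chosen so far.
        candidates = [i for i in range(len(vectors)) if i not in chosen]
        nxt = max(candidates, key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[nxt]))
    return chosen


# Made-up database: three near-duplicates and two clearly different entries.
database = [
    [0.0, 0.0],
    [0.1, 0.0],
    [0.0, 0.1],
    [5.0, 5.0],
    [10.0, 0.0],
]

print(select_diverse(database, 3))  # [0, 4, 3]
```

In this toy database the near-duplicates at indices 1 and 2 are skipped, which mirrors the point that screening many similar structures adds little.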

Another exciting application is “scientific archeology”: The researchers used their machine-learning system to identify the MOF structures that, at the time of the study, were published as very different from the ones that are already known.

Smit said, “So we now have a straightforward tool that can tell an experimental group how different their novel MOF is compared to the 90,000 other structures already reported.”
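As a rough illustration of what such a check could look like, the sketch below scores a candidate by its distance to the nearest already-reported structure. The descriptor vectors and the name `novelty_score` are hypothetical, and this is only an assumed simplification, not the tool described by the researchers.

```python
import math


def novelty_score(candidate, known):
    """Distance from a candidate vector to its nearest neighbor among known vectors."""
    if not known:
        raise ValueError("need at least one known structure")
    return min(math.dist(candidate, k) for k in known)


# Made-up descriptor vectors standing in for already-reported structures.
known_structures = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]

duplicate = [3.0, 4.0]   # identical to a known entry
distinct = [3.0, -4.0]   # far from every known entry

print(novelty_score(duplicate, known_structures))
print(novelty_score(distinct, known_structures))
```

A score near zero flags a structure that closely matches something already in the database, while a larger score indicates a more novel candidate.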

Journal Reference:
  1. Seyed Mohamad Moosavi, Aditya Nandy, Kevin Maik Jablonka, Daniele Ongari, Jon Paul Janet, Peter G. Boyd, Yongjin Lee, Berend Smit, Heather J. Kulik. Understanding the diversity of the metal-organic framework ecosystem. Nature Communications 11, 4068 (2020). DOI: 10.1038/s41467-020-17755-8