New system stores data in organic molecules for millennia

A new way to store information in molecules could preserve the contents of the New York Public Library in a teaspoon of protein, without energy, for millions of years.


Information is now ubiquitous, and the technology behind it is arguably among the most advanced that humankind has produced, but that very ubiquity has created new kinds of problems. One of them is storage. Even cloud storage, an opaque place that promises endless space, will eventually run out of room.

Now, Harvard scientists have developed a new way to store information that could stably house data for millions of years, lives outside the hackable internet, and, once written, uses no energy. All you need is a chemist, some cheap molecules, and your precious information. The system reads and writes information using small organic molecules, which are stable for hundreds or thousands of years under suitable conditions.

Researchers have long known that DNA can be used for data storage—after all, it stores the blueprint for making individual humans and transmits it from one generation to the next. It can store huge amounts of data in a tiny space, and is extremely stable, surviving for millennia under the right conditions.

Even so, DNA has its own drawbacks. As molecules go, it is relatively large, and reading and writing it tends to be a fiddly, time-consuming procedure. The scientists therefore developed a strategy that borrows only indirectly from biology. Relying on techniques common in organic and analytical chemistry, they devised a method that uses small, low-molecular-weight molecules to encode information.

To develop this technique, the scientists used oligopeptides instead of DNA. Oligopeptides are small molecules made up of a varying number of amino acids. They are common, stable, and smaller than DNA, RNA, or proteins.

Pairing molecule mass and binary code, the Whitesides team can “write” massive amounts of data. Credit: Michael J. Fink

Making words from these molecular “letters” is a bit complicated. In a microwell plate (like a miniature version of whack-a-mole, but with 384 mole holes), each well contains oligopeptides with varying masses. Just as ink is absorbed on a page, the oligopeptide mixtures are then assembled on a metal surface, where they are stored. If the team wants to read back what they “wrote,” they look at one of the wells through a mass spectrometer, which sorts the molecules by mass. This tells them which oligopeptides are present or absent: their mass gives them away.
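The read-out step can be pictured as matching measured peaks against a list of known masses. The sketch below is only an illustration: the eight masses, the matching tolerance, and the peak list are made-up values, and `detect_present` is a hypothetical helper rather than anything the researchers describe.

```python
# Hypothetical masses (in daltons) of the eight oligopeptides a well may contain.
EXPECTED_MASSES = [600.0, 650.0, 700.0, 750.0, 800.0, 850.0, 900.0, 950.0]
TOLERANCE = 0.5  # assumed matching window, in daltons


def detect_present(peaks, expected=EXPECTED_MASSES, tol=TOLERANCE):
    """Return the expected masses that match at least one measured peak."""
    return [m for m in expected if any(abs(p - m) <= tol for p in peaks)]


# Made-up peak list standing in for the spectrum of a single well.
peaks = [650.2, 800.1, 850.0, 949.9]
present = detect_present(peaks)
absent = [m for m in EXPECTED_MASSES if m not in present]

print("present:", present)
print("absent:", absent)
```

A real spectrum would also carry noise and intensity information, which this sketch ignores.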

Then, to translate the jumble of molecules into letters and words, they borrowed binary code. An “M,” for example, uses four of eight possible oligopeptides, each with a different mass. The four floating in the well receive a “1,” while the missing four receive a “0.” The molecular-binary code points to a corresponding letter or, if the information is an image, a corresponding pixel.
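A minimal sketch of this mapping follows, assuming the letters use standard 8-bit ASCII codes (an assumption; the article says only that binary code is used). Under that assumption “M” is 01001101, which does have four 1s and four 0s. The labels `P0` to `P7` and the bit ordering are hypothetical.

```python
# Hypothetical labels for eight oligopeptides, each with a distinct mass.
# P0 stands for the most significant bit (an assumed ordering).
PEPTIDES = ["P0", "P1", "P2", "P3", "P4", "P5", "P6", "P7"]


def encode_char(ch):
    """Return the set of oligopeptides to place in a well for one character."""
    code = ord(ch)
    if code > 255:
        raise ValueError("character does not fit in one byte")
    bits = format(code, "08b")  # e.g. "M" -> "01001101"
    return {p for p, b in zip(PEPTIDES, bits) if b == "1"}


def decode_well(present):
    """Rebuild the character from the set of oligopeptides found in a well."""
    bits = "".join("1" if p in present else "0" for p in PEPTIDES)
    return chr(int(bits, 2))


well = encode_char("M")
print(sorted(well))       # the four oligopeptides that are present
print(decode_well(well))  # recovers the original character
```

For an image, the same eight presence-or-absence bits could stand for a pixel value instead of a character code.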

With this method, a mixture of eight oligopeptides can store one byte of information; 32 can store four bytes, and larger mixtures could store more still.
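The scaling is simple arithmetic: each oligopeptide that can be present or absent contributes one bit, so n of them give n bits, or n/8 bytes, per well. The sketch below works this through and, as an assumption not stated in the article, multiplies by the 384 wells of a plate to estimate capacity per plate.

```python
def bytes_per_well(n_oligopeptides):
    """One bit per oligopeptide (present or absent), eight bits per byte."""
    return n_oligopeptides / 8


def bytes_per_plate(n_oligopeptides, n_wells=384):
    """Assumes every well of a 384-well plate holds an independent mixture."""
    return n_wells * bytes_per_well(n_oligopeptides)


print(bytes_per_well(8))    # 1.0 byte
print(bytes_per_well(32))   # 4.0 bytes
print(bytes_per_plate(32))  # 1536.0 bytes on one plate, under the assumption above
print(2 ** 8)               # 256 distinct mixtures are possible with eight oligopeptides
```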

During tests, the scientists were able to write, store, and read 400 kilobits of data, including a written transcript of a lecture, a photo, and a painting. They noted, “the average writing speed is eight bits per second and reading takes 20 bits per second, with an accuracy of 99.9 percent.”
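To get a feel for these rates, the sketch below estimates how long a small payload would take to write and read. The 1,000-character message, stored at one byte per character, is a hypothetical example; only the two rates come from the figures quoted above.

```python
WRITE_BITS_PER_SECOND = 8   # average writing speed quoted above
READ_BITS_PER_SECOND = 20   # reading speed quoted above


def transfer_seconds(n_bits, bits_per_second):
    """Time needed to move n_bits at a constant rate."""
    return n_bits / bits_per_second


# Hypothetical payload: 1,000 characters at one byte (eight bits) each.
message_bits = 1000 * 8
write_s = transfer_seconds(message_bits, WRITE_BITS_PER_SECOND)
read_s = transfer_seconds(message_bits, READ_BITS_PER_SECOND)

print(f"write: {write_s:.0f} s (about {write_s / 60:.1f} min)")
print(f"read:  {read_s:.0f} s (about {read_s / 60:.1f} min)")
```

At these speeds the approach suits long-term archiving far better than everyday storage.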

Scientists noted, “Oligopeptides have stabilities of hundreds or thousands of years under suitable conditions. The hardy molecules could endure without light or oxygen, in high heat and drought. And, unlike the cloud, which hackers can access from their favorite easy chair, the molecular storage can only be accessed in person. Even if a thief finds the data stash, a little chemistry is needed to retrieve the code.”

The research was published in the journal ACS Central Science.