New record for storing digital data in DNA


The DNA inside every cell of the body stores biological information. Recently, scientists from Microsoft and the University of Washington set a new record by storing 200 MB of digital data on DNA strands and retrieving it.

Engineers have steadily shrunk the physical size of data storage devices while increasing their capacity, squeezing hundreds of gigabytes onto devices that fit in the palm of the hand. Even so, demand keeps growing: the world’s total data is heading towards an estimated 44 trillion GB by 2020.

Current devices such as hard drives and optical storage are only a temporary solution to the problem. They are vulnerable to damage and degradation, with a life expectancy of a few decades at best. Scientists see DNA as a potential solution to both the capacity and the longevity problems. DNA, nature’s hard disk, can squeeze 125,000 GB of information into a cubic millimeter.

According to Luis Ceze, a professor at the University of Washington, “By that measure, all 700 exabytes of today’s accessible internet would fit into a space the size of a shoebox.”
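
The shoebox comparison can be checked with a quick back-of-the-envelope calculation. The sketch below uses only the two figures quoted above (125,000 GB per cubic millimeter and 700 exabytes) and assumes decimal units, so that 1 exabyte is 10^9 GB. The variable names are illustrative.

```python
# Rough check of the "shoebox" claim using the figures quoted in the article.
# Assumption: decimal units, i.e. 1 exabyte = 10**9 gigabytes.
GB_PER_EXABYTE = 10**9

internet_size_gb = 700 * GB_PER_EXABYTE   # 700 exabytes of accessible internet
density_gb_per_mm3 = 125_000              # quoted DNA storage density

volume_mm3 = internet_size_gb / density_gb_per_mm3
volume_liters = volume_mm3 / 1_000_000    # 1 liter = 1,000,000 cubic millimeters

print(f"Volume needed: {volume_mm3:,.0f} mm^3 (about {volume_liters:.1f} liters)")
```

Under these assumptions the result is 5.6 million cubic millimeters, or about 5.6 liters, which is roughly the volume of a shoebox.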

The data would remain intact even if that shoebox were locked away in a vault for thousands of years. Fossils provide evidence of this: they can still contain fragments of an animal’s genetic code thousands of years after the animal died out. Scientists therefore consider DNA remarkably durable and capable of storing digital data for 1,000 years under the right conditions.

How the new record for storing digital data was set:

In collaboration with Twist Bioscience, the scientists encoded the data onto DNA strands. They took advantage of the similarities between DNA’s natural code and the binary language of computers.

Ceze said, “DNA already has a digital flavour, as it has four bases and molecules that ‘stick’ to each other in a very programmable way. So the first step in storing digital data into DNA is to map strings of 1s and 0s into strings of As, Cs, Gs and Ts.”
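
The mapping step Ceze describes can be illustrated with a minimal sketch. It assumes the simplest possible scheme, two bits per base (00 is A, 01 is C, 10 is G, 11 is T); the article does not describe the team's actual encoding, which is likely more elaborate. The function names are hypothetical.

```python
# Minimal sketch: map binary data to DNA bases, two bits per base.
# Assumption: this fixed table is for illustration only; the article does
# not specify the encoding the researchers actually used.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Turn each byte into 8 bits, then each pair of bits into one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Reverse the mapping: bases back to bits, bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = bytes_to_dna(b"Hi")
print(strand)                # CAGACGGC
print(dna_to_bytes(strand))  # b'Hi'
```

With this table every byte becomes exactly four bases, so a 200 MB file would need on the order of 800 million bases before any addressing or error-correction overhead is added.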

The scientists used Polymerase Chain Reaction (PCR) techniques to assign addresses to the sequences, which helps them find the desired data later. The DNA sequences are produced chemically on a silicon-based DNA synthesis substrate, which allows many different sequences to be made together. Once synthesis is done, the DNA is placed in a test tube and dehydrated. If it is kept away from light and heat, it can potentially last for thousands of years.

DNA sequencing is essential for reading the data back. A sequencer reads the sequence of As, Cs, Gs and Ts, and algorithms translate it back into the original digital data. Some data might be lost during this process, so the researchers used an error-correction technique like those used in computer memory to overcome this barrier.

Ceze said, “Although reliable, DNA writing and reading have errors, just as hard drives and electronic memories have errors. Therefore, we developed error-correcting codes to reliably retrieve data. By doing this, we didn’t lose even a single byte of information.”
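
To give a feel for how an error-correcting code recovers data, here is a small sketch of the classic Hamming(7,4) code, which adds three parity bits to every four data bits and can repair any single flipped bit. This is a textbook example chosen for illustration; the article does not say which codes the team developed, and the function names are hypothetical.

```python
# Sketch of a Hamming(7,4) code: 4 data bits are protected by 3 parity bits,
# so any single flipped bit in the 7-bit codeword can be located and fixed.
# Assumption: illustrative only, not the researchers' actual code.

def hamming_encode(d):
    """d is a list of 4 bits; returns the 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # checks codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # checks codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming_decode(codeword):
    """Fix up to one flipped bit and return the 4 original data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3   # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1          # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]]


data = [1, 0, 1, 1]
received = hamming_encode(data)
received[4] ^= 1                            # simulate one read error
print(hamming_decode(received) == data)     # True
```

The trade-off is overhead: seven bits are stored for every four bits of data, which in a DNA system means more bases to synthesize and sequence in exchange for reliable retrieval.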

The team was also able to access the data randomly, meaning they could detect and retrieve only the desired sequences from a large pool of DNA molecules. Even so, the technique of writing and reading data on DNA strands is still far from being put to everyday use for storing photos and videos.
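
The idea of random access by address can be pictured with a simple software analogy. In the sketch below, every strand starts with a short address sequence identifying the file it belongs to, and retrieval keeps only the strands whose address matches. This is a string-matching stand-in for PCR, not a model of the chemistry; the address sequences, payloads and function names are all hypothetical.

```python
import random

# Analogy for address-based random access in a mixed pool of DNA strands.
# Assumption: each strand is "address + payload"; real PCR selects strands
# chemically, which this prefix match only imitates.
ADDRESS_PHOTO = "ACGTACGT"   # hypothetical address for one file
ADDRESS_VIDEO = "TGCATGCA"   # hypothetical address for another file


def make_strand(address: str, payload: str) -> str:
    """Prefix the payload with the address of the file it belongs to."""
    return address + payload


def retrieve(pool, address: str):
    """Return the payloads of every strand carrying the given address."""
    return [s[len(address):] for s in pool if s.startswith(address)]


pool = [
    make_strand(ADDRESS_PHOTO, "CAGACGGC"),
    make_strand(ADDRESS_VIDEO, "GGTTAACC"),
    make_strand(ADDRESS_PHOTO, "TTAGGCCA"),
    make_strand(ADDRESS_VIDEO, "CCGGAATT"),
]
random.shuffle(pool)         # strands in a test tube have no fixed order

print(sorted(retrieve(pool, ADDRESS_PHOTO)))
```

The point of the analogy is that only the strands for the requested file need to be read, rather than sequencing the entire pool to find them.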

“There are still many challenges in making DNA storage mainstream. We will continue to focus on developing an end-to-end system and work with our Microsoft and Twist Bioscience collaborators to reduce the cost and increase the speed of writing and reading DNA,” said Ceze.
