DNA Data Storage: Synthetic bases store even more information


Researchers have announced a new advance for a possible storage medium of the future based on artificially produced DNA. Adding synthetic bases should make it possible to store far more data in a small space.
For years, researchers have been working on data storage based on deoxyribonucleic acid (DNA), the molecule that stores the genetic material of living beings in nature. An artificial DNA memory is expected to hold huge amounts of data in a tiny space and preserve them for thousands of years or more.

More bases for even more data

US researchers have now presented a method that increases the already huge storage density of DNA even further. Natural DNA stores information in the sequence of the four nucleobases adenine (A), guanine (G), cytosine (C) and thymine (T). A team from the University of Illinois Urbana-Champaign has now succeeded in expanding the DNA memory by adding seven synthetic bases, as described in their article in the scientific journal Nano Letters. As with a larger alphabet, more letters allow more possible combinations, so each position in the strand can encode more ones and zeros.
"Instead of converting zeros and ones into A, G, C and T, we can convert zeros and ones into A, G, C, T and the seven new letters of the memory alphabet," says Kasra Tabatabaei, one of the researchers, illustrating the principle.
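To make the principle concrete, here is a minimal Python sketch of how alphabet size affects the length of an encoded strand. It is an illustration only and not the researchers' encoding scheme: the labels X1 to X7 are placeholders for the seven synthetic bases, which the article does not name, and the simple base conversion ignores error correction and any chemical constraints. Counted per base, four letters carry 2 bits and eleven letters carry about 3.46 bits, a ratio of roughly 1.73.

```python
import math

NATURAL = ["A", "G", "C", "T"]
# Placeholder labels for the seven synthetic bases (assumption: the
# article does not name them).
SYNTHETIC = [f"X{i}" for i in range(1, 8)]
EXTENDED = NATURAL + SYNTHETIC


def bits_per_symbol(alphabet):
    """Upper bound on the information one position can carry."""
    return math.log2(len(alphabet))


def encode(data, alphabet):
    """Write the bytes as one big integer in base len(alphabet)."""
    number = int.from_bytes(data, "big")
    base = len(alphabet)
    symbols = []
    while number > 0:
        number, digit = divmod(number, base)
        symbols.append(alphabet[digit])
    # Digits were produced least significant first, so reverse them.
    return symbols[::-1] or [alphabet[0]]


def decode(symbols, alphabet, length):
    """Invert encode(); the original byte length must be known."""
    base = len(alphabet)
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    number = 0
    for symbol in symbols:
        number = number * base + index[symbol]
    return number.to_bytes(length, "big")


message = b"DNA storage"
natural_strand = encode(message, NATURAL)
extended_strand = encode(message, EXTENDED)
density_ratio = bits_per_symbol(EXTENDED) / bits_per_symbol(NATURAL)

print(len(natural_strand), len(extended_strand))
print(f"bits per base: {bits_per_symbol(NATURAL):.2f} vs {bits_per_symbol(EXTENDED):.2f}")
print(f"density ratio: {density_ratio:.2f}")
```

The same message needs fewer positions in the eleven-letter alphabet than in the four-letter one, which is the effect described in the quote above.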

Deep learning helps with reading

At the same time, however, a new method for reading out the data had to be developed, since not all current systems can handle the modified DNA strands. The team used a combination of nanopore sequencing and deep learning.
“We tried 77 different combinations of the 11 nucleotides, and our method was able to perfectly distinguish each of them […] The deep learning framework, as part of our method for identifying different nucleotides, is universal, which allows our approach to be generalized to many other applications,” the researchers explained.
Their scientific article also mentions the potential for almost doubling the storage density and reducing the latency during recording by the same factor.
“Overall, the extended molecular alphabet may potentially offer a nearly 2-fold increase in storage density and potentially the same order of reduction in the recording latency, thereby enabling new implementations of molecular recorders.”

Excerpt from the article in Nano Letters

The infrastructure still leaves something to be desired

Even before the new method, it was possible in principle to store 1 million terabytes in one cubic millimeter of DNA. According to Golem, the entire Internet could theoretically fit into a storage device the size of a shoebox.
However, the devices for storing and reading out the DNA data are still complicated and expensive.
“Compared to conventional storage media, the process steps are complex and expensive, difficult to automate and difficult to integrate into mobile systems that can be used in practice. They are therefore particularly suitable for the stationary archiving of large amounts of data over long periods of time. From a technical point of view, it should be possible to use them in practice in the medium term,” says an article by the Fraunhofer Institute.
In the summer of 2019, the startup CATALOG managed to store the English edition of Wikipedia, which had a data volume of 16 GB at the time, on DNA. The write speed was specified as 4 megabits per second (0.5 MB/s), which shows that transfer rates are still a bottleneck.
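A quick back-of-the-envelope calculation shows what that write speed means in practice. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 megabit = 10^6 bits) and a constant rate with no overhead; both are simplifications for illustration, not figures from CATALOG.

```python
# Figures quoted in the article, interpreted with decimal units (assumption).
WIKIPEDIA_BYTES = 16 * 10**9        # 16 GB
WRITE_RATE_BITS_PER_S = 4 * 10**6   # 4 megabits per second


def write_time_seconds(size_bytes, rate_bits_per_s):
    """Time to write size_bytes at a constant rate, ignoring overhead."""
    return size_bytes * 8 / rate_bits_per_s


seconds = write_time_seconds(WIKIPEDIA_BYTES, WRITE_RATE_BITS_PER_S)
hours = seconds / 3600
print(f"{seconds:.0f} s, about {hours:.1f} hours")  # 32000 s, about 8.9 hours
```

At this rate, writing the 16 GB would take close to nine hours.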
