The Information Age needs a new data storage powerhouse. With an expanded molecular alphabet and a 21st-century twist, DNA may be just the answer.
Researchers added seven new letters to DNA’s molecular alphabet and developed a letter-perfect sequencing technique. These advances helped the double helix evolve into a reliable, long-lasting data storage platform suited to the Information Age and designed to last well into the 21st century.
Consider playing Bach’s “Cello Suite No. 1” on a strand of DNA.
This possibility isn’t as far-fetched as it seems. Though far too small to withstand a rhythmic strum or a sliding bowstring, DNA is a powerhouse for storing music files and all sorts of other media.
“Nature’s initial data storage method is DNA. It can be used to store any kind of data, including photographs, video, and music,” said Kasra Tabatabaei, a co-author of the paper and a researcher at the Beckman Institute for Advanced Science and Technology.
A multi-institutional team was able to turn the double helix into a strong, long-term data storage platform by expanding DNA’s molecular composition and establishing a precise new sequencing process.
In February 2022, the team’s article was published in Nano Letters.
Anyone bold enough to explore the daily news in the era of digital information feels the global archive becoming heavier by the day. Paper files are increasingly being digitized to save space and preserve data from natural catastrophes.
Anyone with information to keep, from scientists to social media influencers, stands to benefit from a safe, long-lasting data lockbox, and the double helix fits the bill.
“DNA is one of the finest, if not the best, solutions for archival data storage,” said Chao Pan, a Ph.D. student at the University of Illinois Urbana-Champaign and one of the study’s coauthors.
DNA is built to withstand the harshest environmental conditions on Earth, sometimes for tens of thousands of years, and still remain a usable data source. By sequencing ancient strands, scientists can decipher genetic histories and breathe life into long-lost landscapes.
Despite its microscopic size, DNA is a bit like Doctor Who’s famous police box: bigger on the inside than it appears.
“Every day, the internet generates many petabytes of data. Only one gram of DNA would be needed to hold all of that information. As a storage medium, DNA is very dense,” said Tabatabaei, who is also a fifth-year Ph.D. student.
Another key feature of DNA is its natural abundance and near-infinite renewability, a trait not shared by today’s most advanced data storage system: silicon microchips, which generally circulate for just a few decades before ending up buried in a heap of landfilled e-waste.
“The relevance of sustainable storage solutions cannot be overstated at a time when we are confronting tremendous climatic concerns. New, environmentally friendly DNA recording methods are developing, making molecular storage even more significant in the future,” said Olgica Milenkovic, a co-PI on the work and the Franklin W. Woeltge Professor of Electrical and Computer Engineering.
The multidisciplinary team looked to DNA’s millennia-old MO when imagining the future of data storage. The researchers then added their own 21st-century twist.
In nature, every strand of DNA contains four chemical bases: adenine, guanine, cytosine, and thymine, typically abbreviated as A, G, C, and T. The bases are arranged along the double helix in combinations that scientists can decode, or sequence, to read the information they carry.
By adding seven synthetic nucleobases to the original four-letter lineup, the researchers increased DNA’s already vast capacity for information storage.
“Consider the letters of the English alphabet. You could only make so many words if you only had four letters to work with. You could make an infinite number of word combinations if you possessed the whole alphabet. The same is true with DNA. Instead of converting zeros and ones to A, G, C, and T, we can convert them to A, G, C, T, and the seven new letters in the storage alphabet,” Tabatabaei said.
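To make the alphabet analogy concrete, here is a minimal Python sketch of how the same binary data needs fewer letters when written in an 11-letter alphabet than in a 4-letter one. It is an illustration only, not the encoding scheme used in the study: the digits `1` to `7` are hypothetical placeholder labels for the seven synthetic bases, the function names are invented for this example, and practical DNA storage schemes also add error correction and sequence constraints that are left out here.

```python
import math

NATURAL = "ACGT"
# Hypothetical placeholder labels for the seven synthetic bases;
# the article does not name them, so plain digits stand in.
SYNTHETIC = "1234567"
EXTENDED = NATURAL + SYNTHETIC  # 11 letters in total


def bits_per_letter(alphabet: str) -> float:
    """Upper bound on the information one letter can carry."""
    return math.log2(len(alphabet))


def encode(data: bytes, alphabet: str) -> str:
    """Write the bytes as one big number in base len(alphabet)."""
    base = len(alphabet)
    # A leading 0x01 sentinel byte keeps leading zero bytes from being lost.
    n = int.from_bytes(b"\x01" + data, "big")
    letters = []
    while n:
        n, remainder = divmod(n, base)
        letters.append(alphabet[remainder])
    return "".join(reversed(letters))


def decode(strand: str, alphabet: str) -> bytes:
    """Invert encode(): read the letters back into the original bytes."""
    base = len(alphabet)
    index = {letter: i for i, letter in enumerate(alphabet)}
    n = 0
    for letter in strand:
        n = n * base + index[letter]
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return raw[1:]  # drop the sentinel byte


if __name__ == "__main__":
    message = b"Bach"
    for name, alphabet in (("natural", NATURAL), ("extended", EXTENDED)):
        strand = encode(message, alphabet)
        assert decode(strand, alphabet) == message
        print(name, f"{bits_per_letter(alphabet):.2f} bits/letter",
              len(strand), "letters:", strand)
```

Four letters can carry at most 2 bits each, while eleven letters can carry about 3.46 bits each, so under these assumptions the extended alphabet stores the same message in a noticeably shorter strand.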
Because this is the first team to use chemically modified nucleotides for information storage in DNA, its members had to work around a unique problem: not all existing technology can read chemically modified DNA strands. To overcome this challenge, they used machine learning and artificial intelligence to design a first-of-its-kind method for processing DNA sequence readouts.
Their solution can tell modified nucleotides from natural ones and can also distinguish each of the seven new molecules from one another.
“We examined 77 different combinations of the 11 nucleotides, and our approach was able to properly distinguish each of them,” Pan added. “Because the deep learning framework used in our technique to recognize distinct nucleotides is universal, our methodology may be applied to a wide range of applications.”
This letter-perfect translation is made possible by nanopores: proteins with a central opening through which a DNA strand can easily pass. Remarkably, the researchers found that nanopores can detect and differentiate each individual monomer unit along a DNA strand, whether the unit is natural or synthetic.
“This work presents an interesting proof-of-principle demonstration of extending macromolecular data storage to non-natural chemistries, which has the potential to significantly boost storage density in non-traditional storage media,” said Charles Schroeder, a co-PI on the project.
DNA literally made history by storing genetic information. Judging by this research, the future of data storage will be just as double-helical.
Additional UIUC collaborators include Aleksei Aksimentiev of the Center for Biophysics and Quantitative Biology and Alvaro Hernandez of the Roy J. Carver Biotechnology Center. Partner institutions include the University of Massachusetts at Amherst and Stanford University. Please see the published work for a complete list of co-authors and affiliations.