While technology companies are routinely building massive data centers to store their data, a group of researchers has developed a data storage system using DNA molecules.
The new technique, developed by University of Washington (UW) and Microsoft researchers, can drastically reduce the space needed to store digital data. Data that would need an area the size of a Walmart supercenter could be shrunk to the size of a sugar cube.
The team of computer scientists and electrical engineers has detailed one of the first complete systems to encode, store, and retrieve digital data using DNA molecules, which can store information millions of times more compactly than current archival technologies, the University of Washington announced in a statement.
In one experiment, the team successfully encoded digital data from four image files into the nucleotide sequences of synthetic DNA snippets. More surprisingly, they were also able to reverse the process, “retrieving the correct sequences from a larger pool of DNA and reconstructing the images without losing a single byte of information.”
Luis Ceze, UW associate professor of computer science and engineering and co-author of the study, said in a statement:
“Life has produced this fantastic molecule called DNA that efficiently stores all kinds of information about your genes and how a living system works — it’s very, very compact, and very durable.
“We’re essentially repurposing it to store digital data — pictures, videos, documents — in a manageable way for hundreds or thousands of years.”
The researchers believe that DNA molecules could store data many millions of times more densely than any existing technology. Moreover, while existing storage media degrade after a few years or decades, DNA could reliably keep data intact for centuries.
However, DNA is best suited for archival applications rather than cases where files need to be accessed immediately. The team is developing a DNA-based storage system that it expects could address the world’s needs for archival storage.
In developing the system, the researchers first converted the long strings of ones and zeroes in digital data into the four basic building blocks of DNA sequences (adenine, guanine, cytosine, and thymine).
Georg Seelig, a UW associate professor of electrical engineering and of computer science and engineering and co-author, said:
“How you go from ones and zeroes to As, Gs, Cs, and Ts really matters because if you use a smart approach, you can make it very dense and you don’t get a lot of errors, if you do it wrong, you get a lot of mistakes.”
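The conversion from ones and zeroes to bases can be illustrated with a minimal sketch. The mapping below (two bits per base) and the function names `bytes_to_dna` and `dna_to_bytes` are assumptions made for illustration; the article does not describe the team's actual encoding.

```python
# Hypothetical 2-bits-per-base mapping; an illustrative assumption,
# not the encoding used by the UW and Microsoft researchers.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides (two bits per base)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Decode a nucleotide string produced by bytes_to_dna back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


encoded = bytes_to_dna(b"Hi")
print(encoded)                # CAGACGGC
print(dna_to_bytes(encoded))  # b'Hi'
```

A naive mapping like this one turns a zero byte into `AAAA`, a run of identical bases. Long runs of this kind are generally considered harder to synthesize and read accurately, which is one reason the choice of encoding matters.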
The digital data is chopped into pieces that are then stored by synthesizing a massive number of tiny DNA molecules, which can then be dehydrated or preserved for long-term storage.
The researchers are also one of only two teams nationwide to have demonstrated “random access”: the ability to identify and retrieve the correct sequences from this large pool of random DNA molecules. According to the university, the task is similar to reassembling one chapter of a story from a library of torn books.
To access the data, the researchers encoded the equivalent of zip codes and street addresses into the DNA sequences. Using polymerase chain reaction (PCR) techniques, which are commonly used in molecular biology, they can easily identify the zip codes they are looking for.
Then, using DNA sequencing techniques, they can “read” the data and convert it back to a video, image, or document file, using the street addresses to reorder the data.
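The addressing scheme can be sketched in software as follows. This is a simplified simulation under stated assumptions: the `Strand` type, the `to_strands` and `retrieve` functions, and the chunk size are hypothetical, and a plain filter stands in for PCR selection. It is not the researchers' implementation.

```python
import random
from typing import List, NamedTuple


class Strand(NamedTuple):
    """Hypothetical stand-in for one short DNA molecule."""
    file_key: str   # plays the role of the "zip code"
    index: int      # plays the role of the "street address"
    payload: bytes  # the piece of file data this molecule carries


def to_strands(file_key: str, data: bytes, chunk_size: int = 4) -> List[Strand]:
    """Chop data into pieces and tag each piece with its key and position."""
    return [
        Strand(file_key, index, data[start:start + chunk_size])
        for index, start in enumerate(range(0, len(data), chunk_size))
    ]


def retrieve(pool: List[Strand], file_key: str) -> bytes:
    """Select one file's strands from the pool and reassemble them in order."""
    # Filtering by key stands in for PCR selection of the matching zip code.
    selected = [strand for strand in pool if strand.file_key == file_key]
    # Sorting by index uses the street addresses to reorder the data.
    selected.sort(key=lambda strand: strand.index)
    return b"".join(strand.payload for strand in selected)


# Mix the strands of two files into one unordered pool.
pool = to_strands("cat", b"picture of a cat") + to_strands("dog", b"video of a dog")
random.shuffle(pool)

print(retrieve(pool, "cat"))  # b'picture of a cat'
```

Because every piece carries both its key and its position, the pool itself can stay completely unordered, which mirrors the torn-books analogy.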
However, one obstacle that still needs to be overcome is the cost and efficiency with which DNA can be synthesized (or manufactured) and sequenced (or read) on a large scale.