DNA-based storage, which encodes binary data in sequences of the four nucleotides that make up DNA, has been a moonshot for high-density data storage since the 1960s. Since the first successful experiments in the 1980s, researchers have made a series of major strides toward implementing DNA-based storage at scale, including faster write times, higher storage density and easier file identification and extraction. Now, a new $25 million initiative led by the Georgia Tech Research Institute (GTRI) aims to bring scalable DNA-based archival storage closer to reality.
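To make the idea of writing binary data into nucleotides concrete, here is a minimal Python sketch. It assumes a simple mapping of two bits per base (00 to A, 01 to C, 10 to G, 11 to T); that mapping and the function names `encode` and `decode` are illustrative assumptions, not the scheme used by the project described here. Practical systems would also need error correction and would likely avoid sequences that are hard to synthesize or read.

```python
# Hypothetical mapping: each base carries two bits (A=00, C=01, G=10, T=11).
# This is an illustration only, not any project's actual encoding.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Rebuild the original bytes from a strand produced by encode()."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError on any character outside ACGT.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Under this assumed mapping, every byte becomes exactly four bases, so a strand is four times as long in characters as the data is in bytes.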
The Intelligence Advanced Research Projects Activity’s (IARPA) Molecular Information Storage (MIST) program awarded the multi-phase contract to GTRI along with Twist Bioscience, Roswell Biotechnologies and the University of Washington in collaboration with Microsoft. Their joint project – called Scalable Molecular Archival Software and Hardware, or “SMASH” – will aim to engineer a silicon-based DNA synthesis platform that can write data-loaded DNA strands and provide a DNA sequencing technology for reading those strands.
“The goal is to significantly reduce the size, weight and power required for archival data storage,” said Alexa Harter, director of GTRI’s Cybersecurity, Information Protection, and Hardware Evaluation Research (CIPHER) Laboratory. “What would take acres in a data farm today could be kept in a device the size of the tabletop. We want to significantly improve all kinds of metrics for long-term data storage.”
“[DNA storage] is so compact that a practical DNA archive could store an exabyte of data – equivalent to a million terabyte hard drives – in a volume about the size of a sugar cube,” said Nicholas Guise, a senior research scientist at GTRI. “Scientists have been able to read DNA from animals that died centuries ago, so the data lasts essentially forever under the right conditions.”
Because DNA storage is long-lived but slow to read and write, the project emphasizes archival data storage rather than competition with modern server farms, which prioritize quick and frequent access to information. To that end, GTRI outlined a daunting metric for the project’s success: encoding and decoding hundreds of terabytes daily at rates and costs more than 100 times better than current archival data storage technologies.
“We don’t see any killers ahead for this technology,” said Adam Meier, a senior research scientist at GTRI. “There is a lot of emerging technology and doing this commercially will require many orders of magnitude improvement. Magnetic tape for archival storage has been improving steadily for 60 years, and this investment from IARPA will power the advancements needed to make DNA storage competitive with that.”