when relevant content is
added and updated.
We’ve talked before about DNA storage, or the ability to store large amounts of data in DNA. Now there’s another company that’s taking a stab at it.
While DNA storage is incredibly dense and is thought to last longer than traditional magnetic storage, it’s expensive and slow.
So now there’s a Boston-based company, Catalog Technologies, that was founded in 2016 and is working on the technology. It got a flurry of attention last year when it raised $9 million from investors.
Most recently, the company said that it had put all 16 gigabytes of Wikipedia onto DNA strands to demonstrate its technology. Since previous demonstrations were 200 megabytes that Microsoft showed in 2016, this was quite an improvement.
“We encoded the English text version of Wikipedia into synthetic DNA molecules using printer technology, our groundbreaking encoding scheme and chemical protocols,” the company writes. “The total amount of data came to 16 gigabytes, significantly more digital information than has ever been captured into DNA previously – not to mention orders of magnitude faster and cheaper than chemical synthesis approaches.”
The company uses a DNA-building enzyme instead of traditional chemical approaches to rapidly synthesize DNA, wrote Jeff Bauter Engel in Xconomy last year. “ The startup says the key to its approach is separating the process of synthesizing DNA molecules from the process of encoding the digital data,” he writes. “Catalog’s method involves purchasing large quantities of small DNA fragments—about 20 to 30 base pairs long—from synthetic DNA suppliers. Catalog designed a machine that can dispense and stitch the DNA fragments together in programmable ways. The idea is that Catalog’s process uses a relatively small number of DNA molecules—fewer than 200—which can be combined in an exponential number of ways.”
“Essentially, it’s like a language: in English, there are only 26 letters, but through various arrangements we can make, theoretically, make an infinite number of different words,” wrote Katherine Ellen Foley in Quartz last year. “Catalog estimates that it will cost less than three thousands of a cent to store one MB of data. For context, on Spotify, a minute of stereo sound is about 2.4 MB at the highest quality.”
The company had said at the time that by next year – that is to say, this year – it would be able to encode 1 terabyte of information per day in DNA, for several thousand dollars, Engel wrote.
Now, 16 gb isn’t 1 TB, but it’s certainly better than 200 mb. “By comparison, a silicon-based portable hard drive with 1 terabyte of storage capacity typically costs less than $100, and the process of saving 1 terabyte of data on it would only take a few hours,” Engel writes. “The bottom line is even if Catalog’s system performs as well as advertised, the company and its rivals are still a long way from being able to compete with the lower costs and faster data transfer speeds of hard drives.”
Nonetheless, it’s a start. The company told MIT Technology Review last year that it would have a commercial system of a single machine or a group of them able to store a petabit of data per day by 2021.
“Let’s face it, this thing is huge,” wrote Antonio Regalado. “It’s no flash drive. The rendering shows a door and room enough inside for a couple of technicians. Inside there will need to be a hundred bags or bottles of ready-made DNA, and then an automated laboratory to mix the strands together and perform billions of reactions. You’ll also have to squeeze in a DNA sequencing machine—maybe a couple of them—to retrieve the data.”
In addition to raising money, Catalog is also said to have done a good job assembling talent. Funny how those two things go together.
Meanwhile, Microsoft hasn’t yet announced the DNA storage engine it promised to have by the end of the decade.