Yottabytes: Storage and Disaster Recovery

Apr 26 2017   11:16PM GMT

Can Google Books Turn Into Spotify for Books?

Sharon Fisher


Isn’t it cool how you can search for a genre of music, run across a new-to-you recording artist, and then listen to a whole lot of their work? And if you like them, you can buy their CDs or higher-quality recordings and keep them forever?

Wouldn’t it be cool to do that with books? Not just current ones, but old ones too?

You almost could. And at some point, you might still be able to. A couple of articles are reminding us of what Google Books was supposed to be: an online repository of all the books in all the libraries, in and especially out of print, where you could search for a phrase or a fact, get a list of books about it, and read the relevant section. And then, if you wanted, you could buy electronic or even printed copies of the books.

“You were going to get one-click access to the full text of nearly every book that’s ever been published,” writes James Somers in the Atlantic. “Books still in print you’d have to pay for, but everything else—a collection slated to grow larger than the holdings at the Library of Congress, Harvard, the University of Michigan, at any of the great national libraries of Europe—would have been available for free at terminals that were going to be placed in every local library that wanted one. At the terminal you were going to be able to search tens of millions of books and read every page of any book you found. You’d be able to highlight passages and make annotations and share them; for the first time, you’d be able to pinpoint an idea somewhere inside the vastness of the printed record, and send somebody straight to it with a link. Books would become as instantly available, searchable, copy-pasteable—as alive in the digital world—as web pages.”

Granted, you can still do some of that now, after a fashion, but it’s just a whisper of the original plan. Starting in 2004, the project had Google employees manually scanning what turned into 25 million books from libraries such as Michigan, Harvard, Stanford, Oxford, and the New York Public Library, at a cost of $400 million, into a massive database of some 50 to 60 petabytes, Somers writes.

“Approximately twenty per cent of all books are in the public domain; these include books that were never copyrighted, like government publications, and works whose copyrights have expired,” wrote Jeffrey Toobin in the New Yorker in 2007. “Google has simply copied such books and made them available on the Web. Roughly ten per cent of books are copyrighted and in print—that is, actively being sold by publishers. Many of these books are covered by Google’s arrangement with its publisher partners, which allows the company to scan and display parts of the works. The vast majority of books belong to a third category: still protected by copyright, or of uncertain status, and out of print.”

And those books, known as “orphan works,” were the problem. As in the early days of Napster and music sharing, the issue was copyright and the rights of artists to control their work. While that has largely been worked out for music, there’s not yet a Spotify for books.

“There’s actually a long tradition of technology companies disregarding intellectual-property rights as they invent new ways to distribute content,” Somers writes. “What usually becomes of these battles—what happened with piano rolls, with records, with radio, and with cable—isn’t that copyright holders squash the new technology. Instead, they cut a deal and start making money from it. But even if everyone typically ends up ahead, each new cycle starts with rightsholders fearful they’re being displaced by the new technology.”

Google Books was no different, and litigation over the subject went on for years, with the upshot that people don’t have access to the database of 25 million books that the company had scanned thus far. There was also the interesting nuance mentioned in Backchannel that people don’t have access, but AI systems might. “We know Google can’t legally make its millions of books available for anyone to read in full — but what if it made them available for machines to read?” writes Scott Rosenberg.

And, who knows? Perhaps with the examples of the music industry before it, and in a different political climate, the computer industry can figure out how to do Spotify for books after all. Interestingly, one of the Google executives in charge back in the day was Marissa Mayer; maybe she’d like to get involved again?

“Google Books could turn out to be for out-of-print books what the VCR had been for movies out of the theater,” writes Somers. “The greatest tragedy is we are still exactly where we were on the orphan works question,” Lateef Mtima, a copyright scholar at Howard University Law School, tells him. “That stuff is just sitting out there gathering dust and decaying in physical libraries, and with very limited exceptions, nobody can use them. So everybody has lost and no one has won.”

That said, Somers was wistfully hoping that access to the Google Books database could be opened, or that the files could somehow become available. “What would it take to make the books viewable in full to everybody? What’s standing between us and a digital public library of 25 million volumes?” he writes. “All you’d have to do, more or less, is write a single database query. You’d flip some access control bits from off to on. It might take a few minutes for the command to propagate.”

Perhaps he was wishing Aaron Swartz were still around.

2  Comments on this Post

  • Blackowl7
    Hopefully they can get this figured out before the books decay away.