Posted by: Jack Vaughan
data architecture, web applications
Innovative messaging and data architectures are being widely applied in Web applications these days – but approaches that work for the top-tier sites may not work well for others. While traditional RDBMSs may not be the best path, the effort involved in making next-generation NoSQL DBs work may be too much for typical shops, one noted database expert says. Perhaps not surprisingly, the expert, Michael Stonebraker, is presently touting an alternative to both traditional RDBs and upstart NoSQL DBs. He calls that alternative “NewSQL.”
NoSQL may have a lot of momentum just now, but it requires some pretty extensive programming capabilities – enough to limit the extent of its eventual use for large-scale online transaction processing (OLTP), in the estimation of Stonebraker, CTO, VoltDB and adjunct professor, MIT. Meanwhile, conventional relational databases will also fail to keep up as the number of online transactions “goes through the roof,” Stonebraker told an audience this week at the Usenix Large Installation System Administration (LISA) 2011 conference in Boston.
OLTP data handling alternatives break down into the categories of OldSQL, NoSQL and NewSQL, says Stonebraker. OldSQL, or traditional SQL databases, suffer from code bloat after years of wide use and adaptation, claims Stonebraker. These general-purpose RDBMSs spend too much time and effort on locking, latching and buffer management, and spend only a small portion of their time doing “useful work,” he claims. They do benefit from high-level SQL programming support, which is an area where NoSQL databases fall short, in Stonebraker’s opinion. The NewSQL approach, which he attributes to VoltDB and others, claims to keep high-level SQL programming support while dramatically increasing transaction speed through a more scalable architecture and built-in high availability.
“The NoSQL guys give up SQL and ACID to get scalability and performance,” he said. But, he asserted, SQL is not part of the problem with overhead that established RDBMSs exhibit. Keeping SQL but cutting conventional RDBMS overhead is a worthwhile goal, he maintains.
“There is no reason you shouldn’t run DB systems with a high-level language,” he said. “With NoSQL, you get to do ACID in user-level code. That is a ‘tear your hair out kind of thing,’” he said. [Stonebraker does, however, admit that NoSQL players are working to improve their SQL support.]
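To make the “ACID in user-level code” complaint concrete, here is a minimal sketch, not taken from Stonebraker's talk. It assumes Python's standard sqlite3 module as a stand-in for any transactional SQL engine and a plain dict as a stand-in for a key-value store; the accounts table and the function names are hypothetical. It covers atomicity only. Isolation and durability would need still more hand-written code on the key-value side.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    with conn:
        conn.execute(
            "CREATE TABLE accounts ("
            "id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
        )
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
        )
    return conn


def balances(conn):
    """Return all balances as a dict, for inspection."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))


def transfer_sql(conn, src, dst, amount):
    """Credit then debit inside one transaction; the engine undoes partial work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # An overdraft violates the CHECK constraint and raises here,
            # which also discards the credit applied just above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def transfer_kv(store, src, dst, amount):
    """Same transfer on a dict; the application keeps its own undo log."""
    undo = []  # (key, previous value) pairs, maintained by hand
    try:
        undo.append((dst, store[dst]))
        store[dst] += amount
        if store[src] - amount < 0:
            raise ValueError("insufficient funds")
        undo.append((src, store[src]))
        store[src] -= amount
        return True
    except ValueError:
        # Hand-written rollback: restore old values in reverse order.
        for key, old in reversed(undo):
            store[key] = old
        return False
```

In the SQL version the rollback is one line of scaffolding; in the key-value version every write path has to remember to log and restore its own changes, which is the kind of burden the quote describes.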
Stonebraker has a long history in databases in academia, as well as in industry, where he has been something of a serial entrepreneur. He was the major force behind the Ingres relational database, and subsequently started innovative data-related concerns such as Illustra, Cohera, StreamBase Systems and Vertica. Now he is with VoltDB.
The advent of the smartphone will boost transaction rates, which have already had to accelerate drastically with the move from customer service representatives entering data to end users entering it themselves over the Web. Stonebraker actually puts forward a fourth alternative, besides traditional RDBs, upstart NoSQL DBs and NewSQL, for meeting the new OLTP mandate. That alternative is “roll your own.” It is used by some big sites but, in Stonebraker’s estimation, is not an appropriate strategy for typical IT shops.
In his view, a top-tier site like Amazon is mostly a traditional shop using “old SQL” for purchases and data warehousing. Amazon does, however, make use of the SimpleDB non-relational data store.
“Most of the big Web properties have [created] ‘purpose-built’ database systems. Most of them have simply rolled their own,” he said. “SimpleDB was a purpose-built application. Cassandra for Facebook and Bigtable for Google, too, are purpose-built ‘roll your own’ [applications].”
“If you are running at the unbelievable volumes, the ‘build versus buy’ decision is tilted toward build, especially if you have highly competent programmers.”
Rolling your own is not the best approach for most shops, however. “The big Web properties are generally rolling their own DB systems because they have the volume and the personnel. There are only about six companies that are in that state,” Stonebraker told the LISA Usenix crowd. – Jack Vaughan