Beyond the RDBMS: the Brave New (Old) World of NoSQL

noelwelsh · on Jan 14, 2011

He misses the point. Do you want your database to scale in units of $20'000 Dell servers + $50'000 Oracle licenses that take a week to deliver and install, or $500/yr EC2 instances you can have up and running in under a minute? Do you want to pay some more money and time to get "Oracle Data Guard" for redundancy, or just boot up another Riak instance? In case of failure would you rather have writes always go through (e.g. purchases are never refused) or have customers wait while "the hot standby machine ... roll[s] forward from the transaction logs and, as soon as it is up to date with the last committed transaction, take the place of the dead server"? There are good reasons for NoSQL databases.

lwat · on Jan 14, 2011

That has nothing to do with this article. I can install EC2 instances of PostgreSQL or any other good free database too if I want. Hell, there are EC2 instances with built-in SQL Server for minimal extra cost if you want the real high-end stuff. That's not an advantage of NoSQL.

As for your other argument, I don't understand it either. With SQL servers you can have a hot spare with automatic failover and no 'roll forward from transaction logs'; that's just pure FUD.

There are indeed good reasons for NoSQL data stores but those are not it.

noelwelsh · on Jan 14, 2011

"Roll forward" is a quote from the article (hence the quote marks). Here is the full quote:

"The simplest approach to redundancy is a "hot standby" server that has access to the transaction logs of the production machine. If the production server dies for any reason, the hot standby machine can roll forward from the transaction logs and, as soon as it is up to date with the last committed transaction, take the place of the dead server. As a bonus, the hot standby machine can be used for complex queries that don't need to include up-to-the-second changes. For Oracle 11g, look for "Oracle Data Guard" to learn more about this approach."

If you think that is FUD, take it up with the author.

The $20'000 Dell server is (guess what!) a quote from the article:

"It is difficult to find realistic benchmarks for the kind of database activity imposed by an Internet application. The TPC-E benchmark is probably the closest among industry standards. TPC-E is a mixture of fairly complex reads (SELECTs) and writes (INSERTs and UPDATEs) into an indexed database that gets larger with the number of SQL statements that are attempted for processing. The "transactions per second" figures put out by the TPC include both reads and writes. In 2010, a moderately priced Dell 4-CPU (32 core) server hooked up to, literally, more than 1000 hard disk drives, processed about 2,000 transactions per second into an 8-terabyte database.

The authors crowd-sourced the question in this blog posting and for smaller databases that can be stored mostly on solid-state drives, it seems as though a modest $10,000 or $20,000 computer should be capable of 10,000 or more SQL queries and updates per second."

So the argument is that an expensive server is the RDBMS solution for scalability. This is the argument I addressed.