Rounded Corners – 174 (SimpleDB overload)

Related. RDBMS vendors are not having the best day today. SantaAmazon delivers SimpleDB. Scalable, reliable, pay by the use, Do Your Own Consistency. And that’s after open-source came in to undermine the pricing structure. Big news.

SOAP without the envelope. On a sour note, and I quote from the SimpleDB documentation:

Amazon SimpleDB REST calls are made using HTTP GET requests. The Action query parameter provides the method called and the URI specifies the target of the call. Additional call parameters are specified as HTTP query parameters. The response is an XML document that conforms to a schema.

It gets better:

The following shows a REST request that puts three attributes and values for an item named Item123 into the domain named MyDomain.

https://sdb.amazonaws.com/?Action=PutAttributes&DomainName=MyDomain

And the response?

<PutAttributesResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07">
  <ResponseMetadata>
    <StatusCode>Success</StatusCode>
    <RequestId>f6820318-9658-4a9d-89f8-b067c90904fc</RequestId>
    <BoxUsage>0.0000219907</BoxUsage>
  </ResponseMetadata>
</PutAttributesResponse>

Where did the REST go?
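
To make the complaint concrete, here is a minimal Python sketch that takes apart the documentation's own sample request and response. Nothing here talks to Amazon. The helper names describe_call and read_metadata are mine and not part of any SimpleDB library, and the URL and XML are simply the samples quoted above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, parse_qs

# Sample request and response, copied from the documentation excerpts above.
REQUEST_URL = "https://sdb.amazonaws.com/?Action=PutAttributes&DomainName=MyDomain"
RESPONSE_XML = """<PutAttributesResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07">
  <ResponseMetadata>
    <StatusCode>Success</StatusCode>
    <RequestId>f6820318-9658-4a9d-89f8-b067c90904fc</RequestId>
    <BoxUsage>0.0000219907</BoxUsage>
  </ResponseMetadata>
</PutAttributesResponse>"""

NS = {"sdb": "http://sdb.amazonaws.com/doc/2007-11-07"}


def describe_call(url):
    """Split a request URL into the resource path and the call named in the query."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "resource": parts.path,            # the only thing the URI itself identifies
        "action": query["Action"][0],      # the method being invoked
        "domain": query["DomainName"][0],  # the real target, passed as an argument
    }


def read_metadata(xml_text):
    """Return the ResponseMetadata children as a plain dict, namespace stripped."""
    root = ET.fromstring(xml_text)
    meta = root.find("sdb:ResponseMetadata", NS)
    return {child.tag.split("}", 1)[1]: child.text for child in meta}


print(describe_call(REQUEST_URL))
print(read_metadata(RESPONSE_XML))
```

The resource comes back as "/" for every call. The method name, the domain and the success status all travel in the query string and the XML body rather than in the HTTP verb, the URI or the status line.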

Eventually misconstrued. A lot of people are tripping over this. Here’s one example:

This “eventual consistency” greatly limits what SimpleDB can be used for. Don’t try to use it to store any sort of accounting information, for example: If you adjust an account balance twice in quick succession (with each transaction being performed as a read-modify-write sequence) there’s a good chance that you’ll lose the first transaction because it won’t have propagated by the time that you read data for the second transaction.

Actually, no. What’s going to happen in this scenario is that you’re going to end up reading both values. Unless you mark it for replacement, each write will add a value to the attribute and the following read will return both, which you can then reconcile. It’s just a matter of handling consistency at read time.
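
A rough sketch of what reconciling at read time could look like, in Python. It is a toy in-memory model, not a SimpleDB client: the MultiValueItem class, the "adjustment" attribute and the "txid:delta" value format are all assumptions of mine for illustration. The one behaviour it borrows from the paragraph above is that a write without the replace flag adds a value instead of overwriting.

```python
from collections import defaultdict


class MultiValueItem:
    """Toy stand-in for one item whose attributes hold a set of values."""

    def __init__(self):
        self._attrs = defaultdict(set)

    def put(self, name, value, replace=False):
        if replace:
            # Replacement discards whatever values were there before.
            self._attrs[name] = {value}
        else:
            # Default: the new value is added alongside the existing ones.
            self._attrs[name].add(value)

    def get(self, name):
        return set(self._attrs[name])


def balance(item):
    """Reconcile at read time by summing every recorded adjustment."""
    total = 0
    for entry in item.get("adjustment"):
        # Hypothetical encoding: a transaction id keeps equal deltas distinct.
        _txid, delta = entry.split(":", 1)
        total += int(delta)
    return total


# Two adjustments in quick succession, neither marked for replacement.
account = MultiValueItem()
account.put("adjustment", "tx-001:100")
account.put("adjustment", "tx-002:-30")
print(balance(account))  # both writes survive, so the balance is 70

# The lost-update case only shows up when a write replaces the attribute.
overwritten = MultiValueItem()
overwritten.put("adjustment", "tx-001:100")
overwritten.put("adjustment", "tx-002:-30", replace=True)
print(balance(overwritten))  # only the second write remains: -30
```

The design choice is to record each change as its own value and compute the balance when reading, rather than doing a read-modify-write on a single number.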

Caching, (de)mystified. There’s another post in the queue about caching (if_modified is making a comeback), but meanwhile, do have a read through Mark Nottingham’s excellent tutorial on caching, survey of XHR caching support in browsers (guess which ones are broken?), and overview of ETags.

And now, for your moment of Zen. Annotation-Oriented Programming:

@ImplementedBy(ServiceImpl.class)
public interface Service {

2 thoughts on “Rounded Corners – 174 (SimpleDB overload)”

  1. “This “eventual consistency” greatly limits what SimpleDB can be used for. Don’t try to use it to store any sort of accounting information, for example: If you adjust an account balance twice in quick succession (with each transaction being performed as a read-modify-write sequence) there’s a good chance that you’ll lose the first transaction because it won’t have propagated by the time that you read data for the second transaction.”

    Actually, no. What’s going to happen in this scenario is that you’re going to end up reading both values. Unless you mark it for replacement, each write will add a value to the attribute and the following read will return both, which you can then reconcile. It’s just a matter of handling consistency at read time.

    ??? You can’t be serious! Eventual consistency is a much bigger issue than you’re acknowledging.

    Let’s take another example: A database update gets posted. Now try to query it back from 2 different machines. One of them sees it; the other doesn’t. The one then proceeds to continue on its merry way performing its processing with incorrect data. The consequences of this can of course vary from trivial to critical, depending on the criticality of the application and the data.

    Your solution: “It’s just a matter of handling consistency at read time.”

    Huh? How exactly would you suggest this consistency problem get handled at read time?

  2. DAR, eventual consistency is trickier than write consistency, just like relations are trickier than hierarchical, and multi-threading is trickier than single-threaded.

    Imagine a write updating a post. Imagine two readers querying it. All this happens concurrently with an RDBMS in the middle. Timing could very well mean one reader gets the post before the update, and one reader after the update.

    If you’re that sensitive to timing issues then you have a concurrency problem with *any* database you’re going to use.
