MAY

30th

Experimenting with and improving the Ruby interface to Amazon SimpleDB

I have recently started playing with Amazon SimpleDB, part of Amazon’s cloud computing offering and roughly comparable to Google Bigtable. Being a Ruby enthusiast, I decided to use the right_aws gem from RightScale. The only problem is that right_aws still does not support batch attribute insertion, which severely limits performance. Since I’m planning to use Amazon SDB for BayesFor, where we need performance, I implemented it. You can find the code in sdb_batchput.rb.

Using it is extremely simple. Just require sdb_batchput.rb and you’ll get a new method, batch_put_attributes, on your SdbInterface. Here is an example:


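require 'right_aws'
require 'sdb_batchput'  # adds batch_put_attributes to RightAws::SdbInterface

# Credentials are read from the environment here; the variable names are
# only a convention, so supply your keys however you prefer.
access_key = ENV['AWS_ACCESS_KEY_ID']
secret_key = ENV['AWS_SECRET_ACCESS_KEY']

sdb = RightAws::SdbInterface.new(access_key, secret_key)

# Build a hash mapping each item name to its attributes.
# SimpleDB accepts at most 25 items per BatchPutAttributes request.
items = {}
25.times do |i|
  attributes = {
    'foo' => 'bar',
    'baz' => 'bat'
  }
  items["item#{i}"] = attributes
end

# Store all 25 items in the "Test" domain with a single request.
sdb.batch_put_attributes("Test", items)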
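Since SimpleDB caps a single BatchPutAttributes request at 25 items, larger sets have to be split into several calls. Here is a minimal sketch of one way to do that. The put_in_batches helper and the FakeSdb class are hypothetical names made up for this illustration (they are not part of right_aws or sdb_batchput.rb); FakeSdb only stands in for the patched SdbInterface so the snippet runs without AWS credentials.

```ruby
# SimpleDB accepts at most 25 items per BatchPutAttributes request.
SDB_BATCH_LIMIT = 25

# Hypothetical helper: sends `items` (item name => attributes hash) in
# slices of at most `batch_size` items. `sdb` is any object responding to
# batch_put_attributes(domain, items). Returns the number of requests made.
def put_in_batches(sdb, domain, items, batch_size = SDB_BATCH_LIMIT)
  requests = 0
  items.each_slice(batch_size) do |slice|
    # each_slice on a Hash yields arrays of [name, attributes] pairs,
    # so turn each slice back into a Hash before sending it.
    sdb.batch_put_attributes(domain, slice.to_h)
    requests += 1
  end
  requests
end

# Stand-in for the real interface: it only records the calls it receives.
class FakeSdb
  attr_reader :calls

  def initialize
    @calls = []
  end

  def batch_put_attributes(domain, items)
    @calls << [domain, items]
  end
end

items = {}
1000.times { |i| items["item#{i}"] = { 'foo' => 'bar', 'baz' => 'bat' } }

fake = FakeSdb.new
request_count = put_in_batches(fake, "Test", items)
puts "#{items.size} items sent in #{request_count} requests"
```

With the real interface you would pass your SdbInterface instance instead of FakeSdb. For 1000 items this makes 40 requests rather than 1000 individual PutAttributes calls.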

Grab sdb_batchput.rb from here

This file contains a somewhat longer example (unfortunately embedded in the BayesFor infrastructure, but still readable) that benchmarks the new method.

A simple benchmark, inserting 1000 items with 2 attributes each, shows a roughly 18x improvement:


PutAttributes:
Total save time: 288s. Per item: 0.288s

BatchPutAttributes:
Total save time: 16.29s. Per item: 0.016s
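To make the arithmetic behind these figures explicit, here is a small sketch that recomputes the per-item times and the speedup from the two totals reported above. The time_per_item helper is a hypothetical illustration of how such totals could be measured with Ruby's standard Benchmark module; it is not the code used for the actual benchmark.

```ruby
require 'benchmark'

# Totals taken from the benchmark output above.
put_total   = 288.0   # seconds for 1000 PutAttributes calls
batch_total = 16.29   # seconds using BatchPutAttributes
item_count  = 1000

per_item_put   = put_total / item_count
per_item_batch = batch_total / item_count
speedup        = put_total / batch_total

puts format("PutAttributes per item:      %.3fs", per_item_put)
puts format("BatchPutAttributes per item: %.3fs", per_item_batch)
puts format("Speedup: %.1fx", speedup)

# Hypothetical helper: times the given block and derives the per-item cost.
def time_per_item(item_count)
  total = Benchmark.realtime { yield }
  { total: total, per_item: total / item_count }
end
```

The ratio comes out at about 17.7, which rounds to the 18x quoted above.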

Alternatively, if you don’t want to use my patch for right_aws, the ruby-aws project fork implemented the same feature (with a slightly different signature) just a few days ago.

Riccardo Govoni, last modified on May 30, 2009 - 09:54