Chapter 13. SimpleDB (Beta)

It is your responsibility to encode your data and ensure that it can be sorted correctly. You also need to be careful to avoid ruining the integrity of your database by introducing data that is not encoded or that is encoded differently than existing attribute values. Because there is no predefined schema in the service, no mechanism exists to enforce type correctness, so SimpleDB will not complain if you provide an integer value for an attribute that should only contain dates.
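As a minimal sketch of what such an encoding might look like, the Python below zero-pads integers and renders dates in a fixed UTC format, so that plain string comparison gives the same order as numeric or chronological comparison. The helper names and the 10-digit width are assumptions made for illustration; they are not part of the service, and whatever scheme you choose must be applied identically to every value of an attribute.

```python
from datetime import datetime, timezone

INT_WIDTH = 10  # assumed fixed width; must be the same for every value of an attribute


def encode_int(value: int, width: int = INT_WIDTH) -> str:
    """Zero-pad a non-negative integer so string order matches numeric order."""
    if value < 0:
        raise ValueError("negative values need an offset scheme, not shown here")
    encoded = str(value).zfill(width)
    if len(encoded) > width:
        raise ValueError(f"{value} does not fit in {width} digits")
    return encoded


def encode_date(value: datetime) -> str:
    """Render a timezone-aware datetime as a fixed-format UTC string that sorts chronologically."""
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Without padding, the strings "100", "25", and "9" sort in that order, which is the reverse of their numeric order; the padded forms sort as 9, 25, 100.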

Limited query capabilities

The query language provided by SimpleDB is much simpler than SQL or similar, traditional query languages, so you may have to code your application to make up for the lack of power in this language. For example, SimpleDB does not support the sorting of query results, the inclusion of attribute values in a result set, or the comparison of one attribute against another within the same item. For a complete summary of the features and shortcomings of the SimpleDB query language, see “SimpleDB Query Expression Syntax.”

Data consistency may suffer due to propagation delays

SimpleDB only guarantees eventual consistency for your data, which means that the data you retrieve from the service at any particular time may be slightly out of date.

The service is implemented as a distributed system, in which information is stored redundantly across multiple physical servers and potentially across multiple data centers. This strategy ensures that your data is kept safe and readily accessible, but it also means there will be a delay before any addition, alteration, or deletion operation you perform is propagated through the entire SimpleDB system. Your data will eventually be globally consistent, but until then, you could potentially retrieve outdated information from the service.

Amazon states that global consistency is usually achieved “within seconds”; however, this timeframe can depend on the processing and the network load the service is under when you make a change. If data consistency is important to your application, you may need to consider adding an intermediate caching layer that can respond to changes more quickly. For an example of how to use the memcached tool as a cache for the service, see “Caching SimpleDB with Memcached.”
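The idea of an intermediate caching layer can be illustrated with a small write-through cache. This is only a sketch under stated assumptions: the backend object and its put and get methods are hypothetical stand-ins for your own SimpleDB client code, a plain dictionary stands in for memcached, and LaggingBackend is a test double that never shows writes, to mimic propagation delay.

```python
class LaggingBackend:
    """Hypothetical test double: accepts writes but never returns them on read."""

    def __init__(self):
        self.writes = []

    def put(self, item_name, attributes):
        self.writes.append((item_name, attributes))

    def get(self, item_name):
        return {}  # pretend the write has not propagated yet


class WriteThroughCache:
    """Serve this application's own recent writes from a local cache."""

    def __init__(self, backend):
        self.backend = backend  # assumed to expose put(item_name, attributes) and get(item_name)
        self.cache = {}         # stand-in for memcached

    def put(self, item_name, attributes):
        # Write to the service first, then remember the new value locally.
        self.backend.put(item_name, attributes)
        self.cache[item_name] = dict(attributes)

    def get(self, item_name):
        if item_name in self.cache:
            return self.cache[item_name]
        # On a miss, read through without caching, so a possibly stale
        # answer from the service is not pinned in the cache.
        return self.backend.get(item_name)
```

A cache like this only hides the delay for changes made through it; writes made by other clients are still subject to the service's propagation time.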

Limited attribute sizes

The maximum size for any attribute value in SimpleDB is only 1024 bytes. This limit is clearly far too small to store binary data, such as images or files. This severe data-storage size restriction is deliberate, because Amazon intends that you store large data items in the S3 service, rather than in SimpleDB. The recommended way to integrate your SimpleDB database with content in S3 is to store a reference to an object’s bucket and key names in SimpleDB, then retrieve this object’s data from S3 when it is needed.

This arrangement makes more efficient use of Amazon’s storage resources and will end up saving you money: the SimpleDB data storage expenses are ten times higher than those in S3. However, this means your application must perform an extra step to integrate small pieces of data from SimpleDB with larger pieces from S3.
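One way to sketch the reference approach is a helper that builds the attributes pointing at an S3 object and checks them against the 1024-byte limit. The attribute names S3Bucket and S3Key are hypothetical, chosen only for this illustration, and sizes are measured as UTF-8 bytes, which is an assumption about how values are encoded.

```python
MAX_VALUE_BYTES = 1024  # attribute value limit stated in the text


def s3_reference_attributes(bucket: str, key: str) -> dict:
    """Build SimpleDB attributes that point at an S3 object instead of holding its data."""
    attributes = {"S3Bucket": bucket, "S3Key": key}  # hypothetical attribute names
    for name, value in attributes.items():
        size = len(value.encode("utf-8"))
        if size > MAX_VALUE_BYTES:
            raise ValueError(f"{name} is {size} bytes; the limit is {MAX_VALUE_BYTES}")
    return attributes
```

The application would store these two attributes with the item, then use them to fetch the object from S3 when the large data is actually needed.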

Pricing

SimpleDB account holders are billed monthly for their service usage based on three criteria: the amount of storage they have used, the amount of data transferred into or out of the service, and the number of hours of machine utilization time their operations have consumed.

Storage

Storage space in SimpleDB is charged at $1.50 per GB per month. In addition to the storage space consumed by your own data, you will be charged for the space required to store the indexing information the service automatically generates for your data. The space overhead for index information is 45 bytes for each item, attribute name, and attribute value you store. This is illustrated in Table 13-1.

Table 13-1. Storage space for an example item

Item Name	Attribute Name	Value	Data Storage	Index Overhead
Item-01			7 bytes	45 bytes
	Description		11 bytes	45 bytes
		Béret	6 bytes (2 bytes per non-ASCII character)	45 bytes
	Colors		6 bytes	45 bytes
		Green	5 bytes	45 bytes
		Purple	6 bytes	45 bytes
		Total data size: 41 bytes		Total with overheads: 311 bytes

To estimate the amount of storage your SimpleDB database consumes, add the total number of bytes of your own data plus an extra 45 bytes for every item, attribute name, and attribute value in your data set.
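The estimate can be sketched in a few lines of Python. The function name and the dictionary layout are illustrative choices, and measuring sizes as UTF-8 bytes is an assumption; it matches the 2 bytes shown for the accented character in Table 13-1, but other non-ASCII characters can take more.

```python
INDEX_OVERHEAD_BYTES = 45  # per item name, attribute name, and attribute value


def billable_bytes(item_name: str, attributes: dict) -> tuple:
    """Return (raw data bytes, data bytes plus index overhead) for one item.

    `attributes` maps each attribute name to a list of its values.
    """
    strings = [item_name]
    for name, values in attributes.items():
        strings.append(name)
        strings.extend(values)
    data = sum(len(s.encode("utf-8")) for s in strings)  # assumes UTF-8 storage
    overhead = INDEX_OVERHEAD_BYTES * len(strings)
    return data, data + overhead


# The example item from Table 13-1.
data, total = billable_bytes(
    "Item-01",
    {"Description": ["B\u00e9ret"], "Colors": ["Green", "Purple"]},
)
print(data, total)  # 41 311
```

The six stored strings contribute 6 × 45 = 270 bytes of overhead, which added to the 41 bytes of data gives the 311 bytes shown in the table.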

Data transferred

Data received by SimpleDB when you store information costs 10¢ per gigabyte. Fees for data sent by SimpleDB in response to queries or data retrieval requests are charged on a sliding scale, depending on the volume of data transferred during the month: 18¢/GB from 0 to 10 TB, 16¢/GB from 10 to 50 TB, and 13¢/GB for any amount over 50 TB.
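The sliding scale can be sketched as follows. Two assumptions are built in and are not confirmed by the text: that each rate applies only to the gigabytes falling inside its own band (graduated pricing), and that 1 TB is counted as 1024 GB. Costs are kept in cents to avoid rounding surprises.

```python
GB_PER_TB = 1024  # assumption: binary terabytes

INBOUND_CENTS_PER_GB = 10

# (upper bound of the band in GB, price in cents per GB); None means no upper bound
OUTBOUND_TIERS = [
    (10 * GB_PER_TB, 18),
    (50 * GB_PER_TB, 16),
    (None, 13),
]


def inbound_cost_cents(gb_in):
    """Cost of data received by the service."""
    return gb_in * INBOUND_CENTS_PER_GB


def outbound_cost_cents(gb_out):
    """Cost of data sent by the service, assuming graduated tiers."""
    cost = 0
    lower = 0
    for upper, cents in OUTBOUND_TIERS:
        if upper is None or gb_out < upper:
            cost += (gb_out - lower) * cents  # partial use of this band
            break
        cost += (upper - lower) * cents       # the whole band is used
        lower = upper
    return cost
```

For example, under these assumptions 100 GB sent in a month costs 1800 cents ($18.00), all of it charged at the first-band rate.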

Machine utilization

Amazon tracks the machine utilization for each SimpleDB operation you perform and charges 14¢ per machine hour you consume. The usage measurements are normalized to the hourly capacity of a circa 2007 1.7 GHz Xeon processor. The machine utilization value of each operation is returned in the response message, so you can monitor your usage on a per-request basis (see “Service Response Messages”).

The machine usage for each request tends to depend on the volume of data processed in the request. The more attributes you upload, retrieve, or query, the higher the usage is likely to be. The exceptions to this rule are the operations for creating and deleting domains; these operations consume a relatively constant, and large, amount of machine time.
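Because each response reports its own machine utilization, a running total is easy to keep. The sketch below assumes you have already extracted the usage value, in machine hours, from each response; the UsageMeter class and its method names are hypothetical and only illustrate the arithmetic.

```python
CENTS_PER_MACHINE_HOUR = 14  # rate stated in the text


class UsageMeter:
    """Accumulate per-request machine utilization and estimate its cost."""

    def __init__(self):
        self.hours = 0.0
        self.requests = 0

    def record(self, usage_hours: float) -> None:
        """Add the utilization value reported for one request (in machine hours)."""
        if usage_hours < 0:
            raise ValueError("usage cannot be negative")
        self.hours += usage_hours
        self.requests += 1

    def cost_cents(self) -> float:
        """Estimated charge, in cents, for the usage recorded so far."""
        return self.hours * CENTS_PER_MACHINE_HOUR
```

Calling record after every request gives a per-request view of where machine time goes, which helps identify the operations worth optimizing.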

Note

These prices are correct as of February 2008. Please refer to the service’s web site to confirm the latest pricing.