Case Study: Amazon S3

Amazon S3 (Simple Storage Service) is an online file-storage web service provided by Amazon. It is unique among online storage services in several ways:

The full documentation for the S3 API is at http://aws.amazon.com/s3. We will now look into the basic architecture of S3, its concepts, and its set of operations.

S3 is used to store objects, which are streams of data with a key (a name) and attached metadata. They are like files in many ways. Objects are stored in buckets, which also have a key. Buckets are like filesystem directories, with a few differences:

Amazon provides three different URI templates by which objects can be accessed. These are genuine RESTful URIs; they refer to the resources themselves, and nothing else:

This last URI is an example of a virtual hosted bucket: if you use a DNS name as a bucket key and point that DNS name at s3.amazonaws.com. with a CNAME record, S3 will recognize the bucket key in the Host header and serve the appropriate object. This makes it possible to serve an entire domain from S3, nearly transparently. If we create a bucket called images.example.com, place a JPEG photo in it as an object called hello.jpg, and set up the proper CNAME pointing images.example.com. to s3.amazonaws.com., then our image is accessible at http://images.example.com/hello.jpg with a standard web browser, just as if we had an HTTP server serving that URI.

Because Amazon was not tied to the limitations of existing HTTP clients, it did not have to bow to the limitations of HTTP Basic or Digest authentication in web browsers when creating S3. The S3 authentication protocol is a thin layer, adding an HMAC signature to each request. After the message is signed, a header is added to the HTTP request as follows:

Authorization: AWS AWSAccessKeyId:Signature

The AWSAccessKeyId value indicates the ID of the access key that the bucket owner generated; it is tantamount to a user ID. The Signature value is the Base64-encoded result of the HMAC calculation.
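The signing process can be sketched in a few lines of Ruby. The key ID, secret, and resource path below are made-up placeholders, and the exact layout of the string being signed follows S3's signing rules:

```ruby
require 'openssl'
require 'base64'

# Illustrative sketch of S3 request signing; the key ID and secret
# here are placeholders, not real credentials.
access_key_id     = 'AKIAIOSFODNN7EXAMPLE'
secret_access_key = 'MySecretKey'

# S3 signs a canonical string built from the request: the HTTP verb,
# the Content-MD5 and Content-Type headers, the Date header, and the
# resource path, each separated by a newline.
string_to_sign = [
  'GET',                              # HTTP verb
  '',                                 # Content-MD5 (empty for a GET)
  '',                                 # Content-Type (empty for a GET)
  'Tue, 27 Mar 2007 19:36:42 +0000',  # Date header
  '/images.example.com/hello.jpg'     # canonicalized resource
].join("\n")

hmac      = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'), secret_access_key, string_to_sign)
signature = Base64.encode64(hmac).strip

authorization = "AWS #{access_key_id}:#{signature}"
```

The resulting string is sent as the value of the Authorization header.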

S3 is a closed system; the owner of a bucket is billed for most operations on it. Therefore, all requests to S3 must be signed or otherwise authorized by the bucket owner, as he is the one ultimately responsible for payment.

However, signing each request can be inconvenient in some situations. A common example is when an organization uses S3 as an asset server; usually the organization would want the corresponding bucket to be world-readable. S3 includes access control lists (ACLs) for this purpose. As long as the owner is comfortable with being charged for operations by anonymous users, he can give READ access to the AllUsers group, which will eliminate the need for a signature.

Another option, which can be incredibly useful, is to delegate access control by including the authentication information in the query string of the object's URI. This is most useful when the object is still private but there are designated users without an AWS account who should be allowed to retrieve it via plain HTTP or BitTorrent. Basecamp uses this approach to store a company's files. The files are kept on S3 with a locked-down ACL, and when an authorized user requests the file, he is sent to a URI including a signature, which is valid for a limited period of time. These URIs have the following format:

/objectkey?AWSAccessKeyId=AWSAccessKeyId&Expires=Expires&Signature=Signature

The AWSAccessKeyId and Signature values are as described previously, while the Expires value is a POSIX timestamp (seconds since the epoch) indicating when the authorization lapses. The Expires value is covered by the HMAC signature, so the recipient cannot modify it undetected.
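Generating such a time-limited URI can be sketched as follows; the bucket, key, and credentials are illustrative, and the Expires value takes the place of the Date header in the string being signed:

```ruby
require 'openssl'
require 'base64'
require 'cgi'

# Illustrative presigned-URI sketch; credentials and names are made up.
access_key_id     = 'AKIAIOSFODNN7EXAMPLE'
secret_access_key = 'MySecretKey'
bucket, key       = 'files.example.com', 'quarterly-report.pdf'

expires = Time.now.to_i + 300   # POSIX time; the link lapses in five minutes

# For query-string authorization, Expires replaces the Date header
# in the canonical string being signed.
string_to_sign = "GET\n\n\n#{expires}\n/#{bucket}/#{key}"
hmac      = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'), secret_access_key, string_to_sign)
signature = CGI.escape(Base64.encode64(hmac).strip)

uri = "http://s3.amazonaws.com/#{bucket}/#{key}" \
      "?AWSAccessKeyId=#{access_key_id}&Expires=#{expires}&Signature=#{signature}"
```

Anyone holding this URI can fetch the object until the expiration time passes, with no further credentials required.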

S3 has a truly RESTful HTTP interface, in which the URIs correspond to resources only, the proper HTTP methods are used according to their semantics, and status codes are used appropriately. There are three types of resources in the S3 system:

Service

Represents the Amazon S3 service; its well-known URI is http://s3.amazonaws.com/. This resource supports only one HTTP method:

GET service

Returns a list of all buckets owned by the currently authenticated user.

Bucket

Represents one bucket belonging to the authenticated user. Can be accessed through the following URIs:

A bucket resource supports the following three methods:

PUT bucket

Creates a bucket with the given name (as the client gets to choose the name, this is accomplished with PUT to the resource itself, rather than POST to the parent). Attempting to create a bucket that already exists will return an HTTP 409 Conflict error code.

GET bucket

Retrieves a list of the objects contained in the specified bucket. An optional prefix parameter in the query string restricts the listing to keys that begin with the given string.

DELETE bucket

Deletes the specified bucket. Only the bucket's owner may delete a bucket. A bucket can be deleted only if it is empty; attempting to delete a nonempty bucket will cause an error with an HTTP status code of 409 Conflict.
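The three bucket operations can be sketched as raw HTTP requests. The bucket name below is illustrative, and a real request would also carry the Authorization header described earlier; here we only build the requests rather than send them:

```ruby
require 'net/http'

# Building (not sending) the three bucket requests against a
# hypothetical bucket named images.example.com.
create = Net::HTTP::Put.new('/images.example.com')              # PUT bucket
list   = Net::HTTP::Get.new('/images.example.com?prefix=hello') # GET bucket, keys starting with "hello"
remove = Net::HTTP::Delete.new('/images.example.com')           # DELETE bucket

requests = [create, list, remove].map { |r| "#{r.method} #{r.path}" }
```

Each request would be sent to host s3.amazonaws.com, and S3 would answer with 200 OK on success or, for the conflict cases above, 409 Conflict.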

Object

Represents an object stored within a bucket. Accessible at the following URIs:

All object keys, as seen above, are qualified with their bucket key. An object resource supports the following four methods:

PUT object

Stores the given data at the location specified, creating a new object or overwriting an existing object.

GET object

Retrieves and returns the object at the specified location.

HEAD object

Returns the headers that would be returned from a GET request on this object, with no body.

DELETE object

Deletes the object at the given location. By analogy to Unix file permissions, you must have WRITE access on a bucket to delete objects within it. Deleting a nonexistent object is not an error, but is effectively a no-op.
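Likewise, the four object operations can be sketched as raw requests against a hypothetical bucket and key (a signed Authorization header would be needed for private objects):

```ruby
require 'net/http'

# Building (not sending) the four object requests; the bucket and key
# are illustrative.
path = '/images.example.com/hello.jpg'

store = Net::HTTP::Put.new(path)      # PUT object
store['Content-Type'] = 'image/jpeg'
store.body = 'placeholder bytes'      # real code would send the file's contents

fetch  = Net::HTTP::Get.new(path)     # GET object
probe  = Net::HTTP::Head.new(path)    # HEAD object: headers only, no body
remove = Net::HTTP::Delete.new(path)  # DELETE object

methods = [store, fetch, probe, remove].map(&:method)
```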

Marcel Molina, Jr.'s AWS::S3 library (http://amazon.rubyforge.org/) is the most popular client for S3. Its design was inspired by ActiveRecord, and it is simple and elegant:

	require 'aws/s3' # gem install aws-s3

	AWS::S3::Base.establish_connection!(
	  :access_key_id     => 'MyAWSAccessKeyId',
	  :secret_access_key => 'MyAWSSecretAccessKey'
	)

	include AWS::S3 # bring Bucket and S3Object into the top-level namespace

	image_bucket = Bucket.create "images.example.com"

	S3Object.store(
	  'hello.jpg',            # key
	  File.read('hello.jpg'), # value
	  'images.example.com',   # bucket name
	  :content_type => 'image/jpeg',
	  :access => :public_read
	)

The s3fuse project (http://sourceforge.net/projects/s3fuse/) is an implementation of an S3 client using FUSE (Filesystem in Userspace, a Linux framework for implementing filesystems in user space rather than in the kernel). This makes it possible to mount an S3 bucket as a Linux filesystem and use it transparently within unmodified applications.

Park Place, by why the lucky stiff (http://code.whytheluckystiff.net/parkplace), is a nearly complete clone of the Amazon S3 web service. It is perfect for developing and testing S3 applications without requiring an S3 account or incurring charges. It does not support S3's SOAP interface, but it supports almost everything else, including distributing objects with BitTorrent.

Tip

Park Place is written using the excellent Camping web microframework, also by why the lucky stiff (http://code.whytheluckystiff.net/camping). Camping is a very stripped-down Ruby framework modeled after Rails, weighing in at less than 4 KB of (packed) source.

Incidentally, the Camping source is a great place to learn Ruby meta-programming inside and out.