Interacting with S3

The S3 web service application program interface (API) is made available through two interfaces: REST and SOAP. In this book we will use the REST interface.

The S3 service implementation presented in this chapter uses the REST API functionality in the AWS Ruby module. The AWS module includes methods, presented in “REST API Implementation” in Chapter 2, that perform authentication, transmission, and response checking of REST API requests.

The REST API for the S3 service uses five HTTP methods to perform API operations: GET, HEAD, PUT, DELETE, and POST. The meaning of each method varies slightly, depending on what kind of S3 resource the operation is targeting: an object, a bucket, an Access Control List (ACL), or the S3 service itself. Table 3-1 lists some of the operations you can perform on S3 resources using different HTTP methods.

The most recent S3 API version available when this book was written was 2006-03-01. This version number is used as a component of the XML namespace of documents provided to and produced by the service: http://s3.amazonaws.com/doc/2006-03-01/.

In this chapter, we will gradually build up a complete implementation class called “S3” that you can use to interact with the S3 service. Example 3-1 shows a basic Ruby code stub that defines the S3 class, to which we will add API implementation methods as we proceed through the chapter. Save this code to a file named S3.rb in the same directory as the AWS module file AWS.rb, which we defined in Chapter 2.

This class will rely on the communication library implementation defined in the AWS module, which it includes as a mixin module. A mixin is a feature of Ruby that makes the methods and constants defined in a module available to any class that includes it.

The S3 class defines two constant values. The S3_ENDPOINT constant defines the default endpoint hostname for S3 service requests. The XMLNS constant defines the XML namespace that is used in documents received from, or sent to, the service.
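The structure described above can be sketched as follows. This is a minimal illustration rather than the chapter's own listing: the AWS module here is a tiny stand-in defined inline so that the sketch runs on its own, and its describe_endpoint method is hypothetical. In the real project you would load the Chapter 2 module with require 'AWS' instead.

```ruby
# Stand-in for the AWS module from Chapter 2 (an assumption, for
# illustration only). The describe_endpoint method is hypothetical.
module AWS
  def describe_endpoint(hostname)
    "requests will be sent to #{hostname}"
  end
end

class S3
  include AWS # Include the AWS module as a mixin

  # Default endpoint hostname for S3 service requests
  S3_ENDPOINT = 's3.amazonaws.com'

  # XML namespace of documents sent to, or received from, the service
  XMLNS = 'http://s3.amazonaws.com/doc/2006-03-01/'
end

# Methods defined in the mixin can be called on S3 instances
s3 = S3.new
puts s3.describe_endpoint(S3::S3_ENDPOINT)
```

Because the module is mixed in with include, its methods become instance methods of the S3 class, while the two constants are reached through the class itself (S3::S3_ENDPOINT and S3::XMLNS).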

Resources in the S3 service are identified using URIs that include three components:

To perform an action on an S3 resource, you must first construct a URI string that identifies that resource. Constructing this URI is not a trivial task, because its content can vary a great deal depending on what kind of resource is involved and whether the request will use the standard S3 service domain or an alternative hostname instead. Table 3-2 shows some example URIs that may be used to represent resources in S3.

S3 understands three different URI formats:

We will discuss how you can use these different URI formats in “Alternative Hostnames” later in this chapter. For the time being, we must decide which URI format to use for our S3 request messages. Unfortunately, we cannot simply choose one format that will work in all cases. Instead, we must choose between the subdomain format and the default format, depending on the name of the S3 bucket we are addressing.

The best URI construction to use is the subdomain format, because URIs of this type will work with buckets stored in any geographical location (see “Bucket Locations” for more information). Unfortunately, this format cannot be used when a bucket’s name is incompatible with DNS naming rules, for example, if it contains uppercase characters or underscores. In these cases, we must use the default S3 hostname format, in which the bucket name is contained in the resource path instead of the hostname.

In summary, we will use the subdomain format provided the name of the bucket we are addressing can be included in a valid hostname. Example 3-2 defines a method that determines whether we can build a valid subdomain URI for a given bucket name.
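A check of this kind might look like the sketch below. The method name valid_dns_name comes from the text, but the specific rules are assumptions based on common DNS hostname conventions (a length of 3 to 63 characters, lowercase letters, digits, periods, and hyphens only, and no fragment that is empty or that starts or ends with a hyphen). They are not necessarily the exact rules the service applies.

```ruby
# Sketch of a DNS-compatibility check for bucket names.
# The rules below are assumptions for illustration.
def valid_dns_name(bucket_name)
  # Assumed length limits for a name used in a hostname
  return false if bucket_name.size < 3 || bucket_name.size > 63

  # Only lowercase letters, digits, periods, and hyphens are allowed,
  # and the name must start with a letter or digit. This rules out
  # uppercase characters and underscores.
  return false unless bucket_name =~ /\A[a-z0-9][a-z0-9.-]+\z/

  # Require at least one letter so the name cannot look like an IP address
  return false unless bucket_name =~ /[a-z]/

  # Each period-separated fragment must be non-empty and must not
  # begin or end with a hyphen
  bucket_name.split('.', -1).each do |fragment|
    return false if fragment.empty?
    return false if fragment.start_with?('-') || fragment.end_with?('-')
  end

  true
end

puts valid_dns_name('bucketname')    # a simple lowercase name
puts valid_dns_name('bucket_name')   # contains an underscore
puts valid_dns_name('bucket-.-name') # fragments touch a hyphen
```

An empty string fails the length test, so a request that names no bucket falls back to the default hostname format.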

This method returns a true response if the bucket name will make a valid subdomain hostname. To make a valid subdomain, a bucket name must:

Now that we have this method to determine when we can create subdomain URIs and when we cannot, we can move on to the code that will generate our URIs.

Example 3-3 defines a method that constructs URIs for S3 service requests. The URI will include a hostname that is either the default S3 service domain name or a subdomain of it, based on the result of the valid_dns_name method. The URI will also include a path specifying the bucket or object resource the request will act upon, if any, and any additional request parameters provided to the method.

The URI generated by this method will use either the standard HTTP protocol or the secure HTTPS protocol, depending on whether the @secure_http variable is set. If the S3 bucket name and object name parameters are not empty strings, they are added to the URI path. If the URI follows the subdomain format, only the object name is added to the path, because the bucket name is already part of the hostname.
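The behavior just described can be sketched as a self-contained class. This is an illustration under stated assumptions, not the chapter's listing: the constructor argument, the compact valid_dns_name check, and the escape_component and escape_path helpers are hypothetical additions that let the sketch run without the AWS module. The params argument is assumed to be an array of hashes, matching the calls in the examples in this section.

```ruby
require 'uri'

class S3
  S3_ENDPOINT = 's3.amazonaws.com'

  # Hypothetical constructor; in the chapter, @secure_http is assumed
  # to be provided by the AWS module.
  def initialize(secure_http = true)
    @secure_http = secure_http
  end

  # Compact stand-in for the DNS check (assumed rules)
  def valid_dns_name(bucket_name)
    return false unless bucket_name =~ /\A[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]\z/
    bucket_name.split('.', -1).none? do |f|
      f.empty? || f.start_with?('-') || f.end_with?('-')
    end
  end

  def generate_s3_uri(bucket_name = '', object_name = '', params = [])
    # Choose between the subdomain and default hostname formats
    use_subdomain = valid_dns_name(bucket_name)
    hostname = use_subdomain ? "#{bucket_name}.#{S3_ENDPOINT}" : S3_ENDPOINT

    request_uri = (@secure_http ? 'https://' : 'http://') + hostname

    # The bucket name goes in the path only for the default format
    if !use_subdomain && !bucket_name.empty?
      request_uri += '/' + escape_path(bucket_name)
    end
    request_uri += '/' + escape_path(object_name) unless object_name.empty?

    # Parameters with a nil value are added as a bare name (e.g. "acl")
    query = params.flat_map do |hash|
      hash.map do |name, value|
        value.nil? ? name.to_s : "#{name}=#{escape_component(value)}"
      end
    end.join('&')
    request_uri += '?' + query unless query.empty?

    URI.parse(request_uri)
  end

  private

  # Percent-encode everything except RFC 3986 unreserved characters
  def escape_component(text)
    text.to_s.gsub(/[^A-Za-z0-9\-._~]/) do |ch|
      ch.bytes.map { |b| format('%%%02X', b) }.join
    end
  end

  # Escape each path segment but keep "/" separators intact
  def escape_path(path)
    path.split('/', -1).map { |segment| escape_component(segment) }.join('/')
  end
end

puts S3.new.generate_s3_uri('bucketname', 'objectkey')
```

The sketch escapes path segments with a hand-written helper because URI.escape, which older Ruby code often relied on for this purpose, was removed in Ruby 3.0.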

The following examples demonstrate how this method can be used to construct URIs corresponding to the examples in Table 3-2 earlier in this chapter.

# Load the S3 class and instantiate it in a variable
irb> require 'S3'
irb> s3 = S3.new

# S3 Service URI
irb> s3.generate_s3_uri().to_s
=> "https://s3.amazonaws.com"

# Bucket URI (where the bucket name is a valid DNS name)
irb> s3.generate_s3_uri('bucketname').to_s
=> "https://bucketname.s3.amazonaws.com"

# Bucket URI (where the bucket name is not a valid DNS name)
irb> s3.generate_s3_uri('bucket_name').to_s
=> "https://s3.amazonaws.com/bucket_name"

# Object URI (where the bucket name is a valid DNS name)
irb> s3.generate_s3_uri('bucket.name','objectkey').to_s
=> "https://bucket.name.s3.amazonaws.com/objectkey"

# Object URI (where the bucket name is not a valid DNS name)
irb> s3.generate_s3_uri('bucket-.-name','objectkey').to_s
=> "https://s3.amazonaws.com/bucket-.-name/objectkey"

# Bucket's ACL URI
irb> s3.generate_s3_uri('bucketname', '', [:acl=>nil]).to_s
=> "https://bucketname.s3.amazonaws.com?acl"