Items and Attributes

Information is stored in the SimpleDB service as a collection of attribute values that are grouped together as items, where each item comprises one or more attributes. Items and attributes are inseparable resources; they cannot exist independently, nor can they be manipulated in isolation. The SimpleDB API operations we will discuss in this section all operate on items and attributes at the same time.

Items

An item is a collection of attributes with a name that is unique within the domain to which it belongs. The name can contain any text in UTF-8 format and may be between 1 and 1024 bytes long. An item’s name serves as its unique identifier. Although the item name is similar in concept to a primary key column in a traditional database, there is an important difference: you cannot refer to item names in SimpleDB query statements.

Each item may contain up to 256 attribute values. Note that this limit applies to attribute name and value pairs, not just attribute names. This means that attributes with multiple values will use up more than one of the 256 spaces allowed. Because an item must always contain at least one attribute value, you cannot create an item without specifying at least one; and if all of an item’s attribute values are deleted, the item is automatically deleted as well.

You may alter the attributes stored in an item at any time, and because the SimpleDB service does not impose a schema, there is no requirement that the items in a domain contain matching attribute names.

Attributes

An attribute is a named collection of one or more text values. Both attribute names and value strings must contain text in UTF-8 format and be between 1 and 1,024 bytes long.
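Because these limits are measured in bytes rather than characters, multibyte UTF-8 text reaches them sooner than its character count suggests. The following is a minimal sketch of a client-side check, assuming the limits quoted in this section and strings that are already UTF-8 encoded. The constant and method names are hypothetical and are not part of the SimpleDB class developed in this chapter.

```ruby
# Hypothetical helpers; the limits are the ones quoted in the text above.
MAX_BYTES = 1024  # longest item name, attribute name, or attribute value
MAX_PAIRS = 256   # most attribute name and value pairs in one item

# True if the text occupies between 1 and 1,024 bytes.
# Assumes the string is already UTF-8 encoded.
def valid_length?(text)
  text.bytesize.between?(1, MAX_BYTES)
end

# Count name and value pairs: every value of a multi-valued
# attribute counts separately toward the item limit.
def pair_count(attribs)
  attribs.values.inject(0) { |n, v| n + (v.is_a?(Array) ? v.size : 1) }
end

# True if the attributes could be stored in a single item.
def fits_in_item?(attribs)
  pair_count(attribs).between?(1, MAX_PAIRS)
end
```

With these definitions, an attribute set with one Model value and two Color values counts as three pairs, and a string of 513 two-byte characters is rejected even though it is well under 1,024 characters long.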

You can think about attributes in two ways: as a category of information or as a set of name and value pairs. An attribute behaves as a category when you refer to it using only its name, in which case you can retrieve or modify all of the values associated with that name at once. If you need to reference a specific value for an attribute instead of all the values, you can refer to the value by using a pairing of the attribute’s name and the specific value you wish to modify.

Warning

SimpleDB is intended to support UTF-8 text values for item names, attribute names, and attribute values. However, when this book was written, an issue existed that prevented Query API requests containing non-ASCII characters from being accepted and authorized by the service. The issue was known to Amazon and was to be fixed in a later release. In the meantime, the only ways to populate SimpleDB with non-ASCII data are to encode it, to use the SOAP API interface, or to use the weak Query signature version (0).

Attribute Parameters

The SimpleDB Query API interface uses a special parameter-naming convention to represent attribute name and value pairs in requests. The API operations that set and retrieve attributes use the parameter format shown in Table 13-5 to represent the name and value of each attribute.

Within API request messages, the name and value of each attribute is represented by two separate parameters that are associated with each other by using the same integer value for the i offset component of the parameter name. This offset value is only relevant within a single request message and has no significance outside of the request. The offset should start at zero for the first attribute name and value pair included in a request, and it should be incremented by one for each subsequent pair.

Here is an example showing how three attribute name and value pairs might be represented by request parameters. In this example there are two attribute categories, named Model and Color, and the Color attribute has two values. Notice that each name and value pair is assigned its own offset value.

Attribute.0.Name=Model
Attribute.0.Value=MacBook
Attribute.1.Name=Color
Attribute.1.Value=White
Attribute.2.Name=Color
Attribute.2.Value=Black

Example 13-6 defines a utility method that converts a dictionary of attribute name and value pairs into a dictionary of request parameters that will be understood by the SimpleDB service. We will take advantage of this method later on, when we implement the operations to modify and retrieve attribute data.

This method performs two tasks in addition to creating the name and value parameters necessary to describe attributes. The method invokes another method, called encode_attribute_value, to encode an attribute’s values before generating the parameter. We will discuss the encoding of attribute values further in “Representing Data in SimpleDB.” The method also creates an additional request parameter called Attribute.i.Replace for each attribute pair, if the optional replace parameter is set to true. We will discuss how the Attribute.i.Replace parameter is used in “Create or Update an Item.”
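A minimal sketch of how a method with this behavior might be written follows. It is an illustration under stated assumptions rather than the listing in Example 13-6: it is written as a standalone method instead of a member of the SimpleDB class, and the encode_attribute_value shown here is a placeholder that only converts values to strings.

```ruby
# Placeholder for the encoding step (an assumption for this sketch);
# the real encode_attribute_value may do considerably more.
def encode_attribute_value(value)
  value.to_s
end

# Convert a dictionary of attribute names and values into request
# parameters, assigning one offset to each name and value pair.
def build_attribute_params(attributes = {}, replace = false)
  params = {}
  index = 0
  attributes.each do |name, values|
    # Treat a single value (including nil) as a one-element list
    values = [values] unless values.is_a?(Array)
    values.each do |value|
      params["Attribute.#{index}.Name"] = name
      # A nil value produces a name parameter with no value parameter
      unless value.nil?
        params["Attribute.#{index}.Value"] = encode_attribute_value(value)
      end
      params["Attribute.#{index}.Replace"] = 'true' if replace
      index += 1
    end
  end
  params
end
```

The offset counter is shared across all attributes, so a multi-valued attribute consumes one offset per value rather than one per name.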

To demonstrate, let us use this method to generate the same parameter values we listed above for the Model and Color attribute categories.

# Create a dictionary of attribute names and values. If an attribute
# has multiple values, we represent the values as an array.
irb> attribs = {'Model' => 'MacBook', 'Color' => ['White','Black']}

# Generate the parameters that will represent these attributes
irb> sdb.build_attribute_params(attribs)
{"Attribute.0.Name"=>"Model",
 "Attribute.0.Value"=>"MacBook",
 "Attribute.1.Name"=>"Color",
 "Attribute.1.Value"=>"White",
 "Attribute.2.Name"=>"Color",
 "Attribute.2.Value"=>"Black"}

The build_attribute_params method also recognizes when an attribute name is mapped to a nil value. If that is the case, it will not generate an Attribute.i.Value request parameter for that attribute. This behavior will be important when we need to delete attributes from an item in “Delete an Item’s Attributes.”

# Compare the parameters generated for nil values versus empty strings
irb> sdb.build_attribute_params({'NilAttr'=>nil, 'EmptyAttr'=>''})
{"Attribute.0.Name"=>"NilAttr", 
 "Attribute.1.Name"=>"EmptyAttr",
 "Attribute.1.Value"=>""}

The GetAttributes operation shown in Table 13-6 retrieves the attributes stored in a named item. The operation returns all of an item’s attributes by default, though it can also return only the values for a single attribute. The SimpleDB service does not perform any ordering of the attributes returned by this operation.

If you perform this operation on an item that does not exist, or if you refer to an attribute that is not present in an item, the service will return an empty result set.

Here is an XML document returned by the operation. The document includes a series of Attribute elements, each of which describes an attribute name and value pair with the elements Name and Value. This response document includes a Color attribute with two values; notice that each of these values is listed as a separate pair of Name and Value elements.

<GetAttributesResponse xmlns='http://sdb.amazonaws.com/doc/2007-11-07/'>
  <GetAttributesResult>
    <Attribute>
      <Name>Color</Name>
      <Value>White</Value>
    </Attribute>
    <Attribute>
      <Name>Color</Name>
      <Value>Black</Value>
    </Attribute>
    <Attribute>
      <Name>Model</Name>
      <Value>MacBook</Value>
    </Attribute>
  </GetAttributesResult>
  <ResponseMetadata>
    <RequestId>83cf0f6c-c505-4146-9fe7-de262f371051</RequestId>
    <BoxUsage>0.0000093522</BoxUsage>
  </ResponseMetadata>
</GetAttributesResponse>
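A response of this shape can be folded into a hash that maps each attribute name to an array of its values. The sketch below shows one way to do this with the REXML library distributed with Ruby. The method name parse_attributes is hypothetical, and the code illustrates only the grouping step; it does not perform the request or decode the values.

```ruby
require 'rexml/document'

# Hypothetical helper: collect the Attribute elements of a
# GetAttributes response into a hash of name => [values].
def parse_attributes(xml_text)
  doc = REXML::Document.new(xml_text)
  attributes = {}
  # Walk every element in the document and pick out the Attribute nodes
  doc.each_recursive do |element|
    next unless element.name == 'Attribute'
    # Read the Name and Value children of this Attribute element
    fields = {}
    element.each_element { |child| fields[child.name] = child.text }
    # Each pair is listed separately, so append to the name's value list
    (attributes[fields['Name']] ||= []) << fields['Value']
  end
  attributes
end
```

Applied to the document above, this groups the two Color pairs under a single key and leaves Model with a one-element array.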

Example 13-7 defines a method that retrieves an item’s attributes from SimpleDB. If the optional attribute_name parameter is included, only the named attribute values will be returned as an array; otherwise all of the item’s attributes will be returned as a hash object.

This method also invokes the decode_attribute_value method, if it is available in the SimpleDB class, to decode the text data stored in SimpleDB into other data types. We will discuss the encoding and decoding of attribute values in “Representing Data in SimpleDB.”

Here is an example command that will return all of the attributes that belong to the item TestItem, which describes a song by the Beatles. See “Create or Update an Item” to see how we created this item in SimpleDB.

# Retrieve all the attributes for "TestItem" in "test-domain"
irb> sdb.get_attributes('test-domain', 'TestItem')
=> {"Name"=>["Tomorrow Never Knows"],
    "Time"=>["177"],
    "Album"=>["Revolver", "The Beatles Box Set"],
    "Artist"=>["The Beatles"]}

If we are only interested in the albums on which this song appears, we can specify that only the values of the Album attribute should be retrieved.

# Retrieve the "Album" attribute from "TestItem"
irb> sdb.get_attributes('test-domain', 'TestItem', 'Album')
=> ["Revolver", "The Beatles Box Set"]

If we try to retrieve an attribute or item that does not exist, we simply get an empty result set, not an error.

# Retrieve all attributes for an item that does not exist
irb> sdb.get_attributes('test-domain', 'NonExistentItem')
=> {}

# Retrieve an attribute that does not exist in "TestItem"
irb> sdb.get_attributes('test-domain', 'TestItem', 'NonExistentAttribute')
=> []

Create or Update an Item

The PutAttributes operation shown in Table 13-7 creates or replaces attributes in a named SimpleDB item. If the named item does not yet exist, this operation will create a new item with the supplied attributes. If the item already exists, this operation will either add new attributes to the item, or it will replace existing attributes, depending on the parameters included in the request.

Although an item can contain 256 attribute name and value pairs, this operation can include no more than 100 attributes in a single request. If you wish to create or update more than 100 attributes, you must perform multiple requests.
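One way to stay within the per-request limit is to expand the attributes into individual name and value pairs and send them in groups of at most 100. The sketch below shows only the grouping step; the method name attribute_batches is hypothetical, and each resulting batch would still need to be sent in its own PutAttributes request.

```ruby
# Hypothetical helper: expand a dictionary of attributes into
# individual [name, value] pairs, then group the pairs into
# batches that respect the per-request limit.
def attribute_batches(attribs, limit = 100)
  pairs = attribs.flat_map do |name, values|
    values = [values] unless values.is_a?(Array)
    values.map { |value| [name, value] }
  end
  pairs.each_slice(limit).to_a
end
```

Take care when combining this approach with the Attribute.i.Replace parameter: if the values of one attribute are split across two requests and both requests replace that attribute, the second request will discard the values written by the first.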

This operation describes attributes using the parameter format we discussed above in “Attribute Parameters.” In addition to the Attribute.i.Name and Attribute.i.Value parameters we have already discussed, the PutAttributes request operation can accept an additional parameter called Attribute.i.Replace for each of the attributes in the request. This parameter is given a Boolean value to indicate what should happen if the item the operation is acting on already includes an attribute with the same name. If the Attribute.i.Replace parameter is set to false, the value will be added to the existing attribute; if it is set to true, any existing values belonging to the attribute will be replaced with the new value. If you attempt to replace an attribute that does not already exist in an item, the service will simply add the attribute without complaining.

The Attribute.i.Replace parameter provides a convenient shortcut for replacing the attributes in your items. Rather than having to first delete an attribute then add new content for it, you can perform both steps in one operation.
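The difference between adding and replacing can be modeled locally with plain hashes. The sketch below is only a model of the behavior described above, not code that talks to the service: the method name apply_put is hypothetical, an item is represented as a hash of attribute names to arrays of values, and the model assumes that an identical name and value pair is never stored twice.

```ruby
# Local model only: returns the item as it would look after a
# PutAttributes request, without modifying the original hash.
def apply_put(item, attribs, replace = false)
  result = {}
  item.each { |name, values| result[name] = values.dup }
  attribs.each do |name, values|
    values = [values] unless values.is_a?(Array)
    if replace || !result.key?(name)
      # Replace discards existing values; a missing attribute is simply added
      result[name] = values.dup
    else
      # Without replace, new values join the existing ones
      # (assumption: duplicate pairs are not stored twice)
      result[name] = result[name] | values
    end
  end
  result
end
```

In this model, putting a new Time value without replace yields two values for Time, while the same call with replace set to true leaves only the new one.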

Here is an XML document returned by the operation. The response contains no more information than a standard SimpleDB service response.

<PutAttributesResponse xmlns='http://sdb.amazonaws.com/doc/2007-11-07/'>
  <ResponseMetadata>
    <RequestId>fe317580-242a-46eb-9b7a-7a4d9b523795</RequestId>
    <BoxUsage>0.0000219909</BoxUsage>
  </ResponseMetadata>
</PutAttributesResponse>

Example 13-8 implements a method that will create a new item in SimpleDB, or will add to or replace the attributes in an existing item. In addition to the required domain and item name parameters, the method takes a dictionary object with mappings of attribute names to values. If the replace parameter is set to true, the request message will include Attribute.i.Replace parameters with the value true for every attribute included in the request.

Let us run through some examples that demonstrate how to use the PutAttributes operation to create a new item in SimpleDB and how to add or replace attributes in an existing item. We will start by creating a new item in test-domain to represent a Beatles song. Because this song appears on at least two albums, we will create an item with two values for the Album attribute.

# Define a dictionary of attributes to describe a song
irb> attribs = {'Name'=>'Tomorrow Never Knows', 
                'Artist'=>'The Beatles', 
                'Time'=>'177', 
                'Album'=>['Revolver','The Beatles Box Set']} 
                
# Create a new item in SimpleDB called TestItem to store the attributes
irb> sdb.put_attributes('test-domain', 'TestItem', attribs)
=> true

# Retrieve the item to confirm that it has been created. This operation may  
# not return the item's attributes immediately due to propagation delays
irb> sdb.get_attributes('test-domain', 'TestItem')
=> {"Name"=>["Tomorrow Never Knows"],
    "Time"=>["177"],
    "Album"=>["Revolver", "The Beatles Box Set"],
    "Artist"=>["The Beatles"]}

To add a new value to an existing attribute, we simply provide a new attribute name and value pair. In our example we have specified the song’s duration in seconds (177); we will now specify it in minutes and seconds (2:57) as well.

# Add a new value to an existing attribute
irb> sdb.put_attributes('test-domain', 'TestItem', {'Time'=>'2:57'})

# Confirm that the item's "Time" attribute now includes the value '2:57'
irb> sdb.get_attributes('test-domain', 'TestItem')
=> {"Name"=>["Tomorrow Never Knows"],
    "Time"=>["177", "2:57"],
    "Album"=>["Revolver", "The Beatles Box Set"],
    "Artist"=>["The Beatles"]}

To replace an existing attribute with new values, we provide a name and value pair and also set the method’s optional replace parameter to true. Because it does not make much sense to have the song’s Time attribute specified in two different ways, we will replace this attribute with only the duration in minutes and seconds. Notice that the request parameters listed below include the parameter Attribute.0.Replace with a value of true.

# Replace any current values of the "Time" attribute with new values
irb> sdb.put_attributes('test-domain', 'TestItem', {'Time'=>'2:57'}, true)
REQUEST
=======
Method : POST
URI    : https://sdb.amazonaws.com/
Query Parameters:
  Action=PutAttributes
  ItemName=TestItem
  Attribute.0.Name=Time
  Attribute.0.Value=2:57
  Attribute.0.Replace=true
. . .

# Confirm that the "Time" attribute now contains only the value '2:57'
irb> sdb.get_attributes('test-domain', 'TestItem')
=> {"Name"=>["Tomorrow Never Knows"],
    "Time"=>["2:57"],
    "Album"=>["Revolver", "The Beatles Box Set"],
    "Artist"=>["The Beatles"]}

Of course, it is not necessary to add or replace a single attribute at a time. Here we will modify the item to store the details of an entirely different song that appears on the same Beatles albums. We will also add an extra Composer attribute to the item to prove that you can safely add brand new attributes when the Attribute.i.Replace parameter is applied.

# Replace the values for the "Name" and "Time" attributes, 
# and add a new "Composer" attribute as well.
irb> attribs = {'Name'=>'Taxman', 'Time'=>'2:39', 
                'Composer'=>'George Harrison'}
irb> sdb.put_attributes('test-domain', 'TestItem', attribs, true)

# Confirm that the item's attributes have been added or replaced
irb> sdb.get_attributes('test-domain', 'TestItem')
=> {"Name"=>["Taxman"],
    "Composer"=>["George Harrison"],
    "Time"=>["2:39"],
    "Album"=>["Revolver", "The Beatles Box Set"],
    "Artist"=>["The Beatles"]}

Delete an Item’s Attributes

The DeleteAttributes operation shown in Table 13-8 deletes one or more attributes from an item. If all of an item’s attributes are deleted, the item itself ceases to exist. Performing this operation on an attribute or item that does not exist will not result in an error.

This operation can be used in three ways, depending on how many of the optional parameters you provide:

  1. Delete the entire item by specifying only the item’s name.

  2. Delete an entire attribute by specifying the attribute’s name, in addition to the item name.

  3. Delete a specific attribute name and value pair by specifying the attribute’s name and the exact value to be removed.

This operation uses the special attribute parameters discussed in “Attribute Parameters” to represent attributes in request messages.
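The three kinds of deletion listed above can be modeled locally with plain hashes. The sketch below is only a model of the described behavior, not code that calls the service: the method name apply_delete is hypothetical, an item is a hash of attribute names to arrays of values, and an empty hash stands for an item that no longer exists.

```ruby
# Local model only: returns the item as it would look after a
# DeleteAttributes request, without modifying the original hash.
def apply_delete(item, deletes = nil)
  # 1. No attributes given: the entire item is deleted
  return {} if deletes.nil? || deletes.empty?
  result = {}
  item.each { |name, values| result[name] = values.dup }
  deletes.each do |name, value|
    # Deleting something that does not exist is not an error
    next unless result.key?(name)
    if value.nil?
      # 2. Name only: remove the entire attribute
      result.delete(name)
    else
      # 3. Name and value: remove only the matching pairs
      result[name] -= (value.is_a?(Array) ? value : [value])
      # An attribute with no values left disappears from the item
      result.delete(name) if result[name].empty?
    end
  end
  result
end
```

Mapping an attribute name to nil mirrors the request form that sends an Attribute.i.Name parameter with no matching Attribute.i.Value parameter.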

Here is an XML document returned by the operation. The response contains no more information than a standard SimpleDB service response.

<DeleteAttributesResponse xmlns='http://sdb.amazonaws.com/doc/2007-11-07/'>
  <ResponseMetadata>
    <RequestId>f316c744-182e-49eb-871f-429345fe9c3d</RequestId>
    <BoxUsage>0.0000219907</BoxUsage>
  </ResponseMetadata>
</DeleteAttributesResponse>

Example 13-9 defines a method that deletes an item or its attributes. If the optional attributes parameter is provided, only the attributes matching the given attribute names or name and value pairs will be deleted.

Let us work through the three different ways of using the DeleteAttributes operation. First, we will use the most precise kind of deletion to remove specific attribute values.

# Check the attribute contents of the "TestItem" item
irb> sdb.get_attributes('test-domain', 'TestItem')
=> {"Name"=>["Taxman"],
    "Composer"=>["George Harrison"],
    "Time"=>["2:39"],
    "Album"=>["Revolver", "The Beatles Box Set"],
    "Artist"=>["The Beatles"]}
    
# Remove the attribute "Album" where the value is "Revolver", and the
# attribute "Composer" where the value is "George Harrison"
irb> deletes = {'Composer'=>'George Harrison', 'Album'=>'Revolver'}
irb> sdb.delete_attributes('test-domain', 'TestItem', deletes)
=> true

# Confirm that the specific attribute name and value pairs have been removed
irb> sdb.get_attributes('test-domain', 'TestItem')
=> {"Name"=>["Taxman"],
    "Time"=>["2:39"],
    "Album"=>["The Beatles Box Set"],
    "Artist"=>["The Beatles"]}

Notice that because we deleted the one and only value of the Composer attribute, this attribute was removed from the item entirely. The Album attribute remains, however, because we only deleted one of the two values stored by this attribute.

Second, to delete an entire attribute, rather than specific attribute values, we invoke the operation without including any specific values. To achieve this with our delete_attributes method, we must provide a dictionary that maps the attribute’s name to the nil value.

# Delete the entire "Album" attribute
irb> deletes = {'Album'=>nil}
irb> sdb.delete_attributes('test-domain', 'TestItem', deletes)

# Confirm that the attribute has been removed
irb> sdb.get_attributes('test-domain', 'TestItem')
=> {"Name"=>["Taxman"], "Time"=>["2:39"], "Artist"=>["The Beatles"]}

Finally, we can skip the attributes altogether and delete the item as a whole by specifying only the item’s name in the request message.

# Delete all the attributes in "TestItem" and thereby the item itself
irb> sdb.delete_attributes('test-domain', 'TestItem')

# Confirm that the item has been removed
irb> sdb.get_attributes('test-domain', 'TestItem')
=> {}