A Client Against the Amazon E-Commerce Service

Amazon Web Services is an umbrella for Amazon’s pioneering contributions, in infrastructure and applications, to web services. From early on, Amazon pushed hard to make its websites for shopping, storage (S3, Simple Storage Service), utility-priced cloud computing (EC2), and so on available as web services, too. Among the prominent hosts of web services, Amazon is unusual in offering both SOAP-based and REST-style versions of such services. This chapter and later ones have code examples that involve Amazon’s E-Commerce or shopping service (see Registering with Amazon), which requires an accessId and a secretKey for access. The accessId is inserted, as is, into any request against the E-Commerce service; the secretKey is used to create what Amazon calls a signature, which is likewise inserted into every request and then verified on the Amazon side. The secretKey itself is never inserted into a request.

The RestfulAmazon client (see Example 3-2) is relatively clean code but only because the messy details are isolated in the utility class RequestHelper. Amazon’s RESTful service is fussy about the format of requests against it: in a RESTful request for item lookups against the E-Commerce service, the verb must be GET and the required data must be in a strictly formatted query string. The utility class RequestHelper ensures that a GET request against the E-Commerce service has the required query string format.

Example 3-2. The RestfulAmazon client against the Amazon E-Commerce web service

package restful;

import java.net.URL;
import java.net.URLConnection;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.ByteArrayInputStream;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class RestfulAmazon {
    private static final String endpoint = "ecs.amazonaws.com";
    private static final String itemId = "0545010225"; // Harry Potter        1

    public static void main(String[ ] args) {
        if (args.length < 2) {
            System.err.println("RestfulAmazon <accessKeyId> <secretKey>");
            return;
        }
        new RestfulAmazon().lookupStuff(args[0].trim(), args[1].trim());
    }
    private void lookupStuff(String accessKeyId, String secretKey) {
        RequestHelper helper = new RequestHelper(endpoint, accessKeyId, secretKey);
        String requestUrl = null;
        String title = null;
        // Store query string params in a hash.
        Map<String, String> params = new HashMap<String, String>();
        params.put("Service", "AWSECommerceService");
        params.put("Version", "2009-03-31");
        params.put("Operation", "ItemLookup");                                2
        params.put("ItemId", itemId);
        params.put("ResponseGroup", "Small");
        params.put("AssociateTag", "kalin");  // any string should do
        requestUrl = helper.sign(params);
        String response = requestAmazon(requestUrl);
        // requestAmazon starts from an empty string, so the XML needs no cleanup.
        System.out.println("Raw xml:\n" + response);
        System.out.println("Author: " + getAuthor(response));
    }
    private String requestAmazon(String stringUrl) {
        String response = "";
        try {
            URL url = new URL(stringUrl);
            URLConnection conn = url.openConnection();
            conn.setDoInput(true);
            BufferedReader in =
                new BufferedReader(new InputStreamReader(conn.getInputStream()));
            String chunk = null;
            while ((chunk = in.readLine()) != null) response += chunk;
            in.close();
        }
        catch(Exception e) { throw new RuntimeException("Arrrg! " + e); }
        return response;
    }
    private String getAuthor(String xml) {
        String author = null;
        try {
            ByteArrayInputStream bais = new ByteArrayInputStream(xml.getBytes());
            DocumentBuilderFactory fact = DocumentBuilderFactory.newInstance();
            fact.setNamespaceAware(true);
            DocumentBuilder builder = fact.newDocumentBuilder();
            Document doc = builder.parse(bais);
            NodeList results = doc.getElementsByTagName("Author");
            for (int i = 0; i < results.getLength(); i++) {
                Element e = (Element) results.item(i);
                NodeList nodes = e.getChildNodes();
                for (int j = 0; j < nodes.getLength(); j++) {
                    Node child = nodes.item(j);
                    if (child.getNodeType() == Node.TEXT_NODE)
                        author = child.getNodeValue();
                }
            }
        }
        catch(Exception e) { throw new RuntimeException("Xml bad!", e); }
        return author;
    }
}

The RestfulAmazon application expects two command-line arguments: an Amazon accessId and secretKey, in that order. The client application then sets various query string parameters such as the requested Amazon operation (in this example, ItemLookup in line 2), the item’s identifier (in this example, 0545010225 in line 1, which is a Harry Potter novel), the Amazon associate’s tag, and so on. After the RequestHelper utility formats the request according to Amazon’s requirements, the RestfulAmazon client then opens a URLConnection to the Amazon service, sends the GET request, and reads the response one line at a time. The relevant code segment is:

URL url = new URL(stringUrl);
URLConnection conn = url.openConnection();                            1
conn.setDoInput(true);
BufferedReader in =                                                   2
  new BufferedReader(new InputStreamReader(conn.getInputStream()));
String chunk = null;
while ((chunk = in.readLine()) != null) response += chunk;            3

The code first creates a URLConnection (line 1) and then wraps a BufferedReader around the connection’s InputStream (line 2). A while loop reads the Amazon response one line at a time, appending each line to the response string (line 3). On a successful GET request, the payload in the HTTP response body is an XML document. Here is a slice from a sample run:
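The read loop can also be written against any InputStream, which makes it easy to try without a network connection. The following is only a sketch: the class name ResponseReader and the choice of UTF-8 are assumptions for illustration, not part of the RestfulAmazon listing. It collects the text in a StringBuilder, which starts empty and avoids repeated string concatenation, and it reads raw characters rather than lines so that line breaks in the payload are preserved.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Example 3-2.
class ResponseReader {
    // Reads the whole stream as text. UTF-8 is an assumption here; a real
    // client might take the charset from the HTTP Content-Type header.
    static String readAll(InputStream stream) {
        StringBuilder text = new StringBuilder();
        try (Reader in = new InputStreamReader(stream, StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int count;
            while ((count = in.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // A stand-in for conn.getInputStream(), so the sketch runs offline.
        byte[] fake = "<Author>J. K. Rowling</Author>".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(fake)));
    }
}
```

In a client such as RestfulAmazon, the argument would be the stream returned by the URLConnection; the try-with-resources block closes the reader even if the read fails.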

<?xml version="1.0" ?>
<ItemLookupResponse
   xmlns="http://webservices.amazon.com/AWSECommerceService/2011-08-01">
  <OperationRequest>
    <HTTPHeaders>
      <Header Name="UserAgent" Value="Java/1.7"></Header>
    </HTTPHeaders>
    <RequestId>591ac8db-0435-4c53-9b01-e3756ea9c55d</RequestId>
    <Arguments>
      <Argument Name="Operation" Value="ItemLookup"></Argument>
      <Argument Name="Service" Value="AWSECommerceService"></Argument>
      ...
      <Argument Name="ResponseGroup" Value="Small"></Argument>
    </Arguments>
    <RequestProcessingTime>0.0083090000000000</RequestProcessingTime>
  </OperationRequest>
  <Items>
    <Request>
      <IsValid>True</IsValid>
      <ItemLookupRequest>
        <IdType>ASIN</IdType>
        <ItemId>0545010225</ItemId>
        ...
      </ItemLookupRequest>
    </Request>
    <Item>
      <ASIN>0545010225</ASIN>
      <DetailPageURL>
        http://www.amazon.com/Harry-Potter-Deathly-Hallows-Book...
      </DetailPageURL>
      <ItemLinks>
        <ItemLink>
          <Description>Technical Details</Description>
          <URL>http://www.amazon.com/Harry-Potter-Deathly-Hallows-Book...</URL>
        </ItemLink>
        ...
        <ItemLink>
          <Description>Add To Wedding Registry</Description>               1
          <URL>http://www.amazon.com/gp/registry/wedding/add-item.html...</URL>
        </ItemLink>
        ...
      </ItemLinks>
      <ItemAttributes>
        <Author>J. K. Rowling</Author>
        <Creator Role="Illustrator">Mary GrandPré</Creator>
        <Manufacturer>Arthur A. Levine Books</Manufacturer>
        <ProductGroup>Book</ProductGroup>
        <Title>Harry Potter and the Deathly Hallows (Book 7)</Title>
      </ItemAttributes>
    </Item>
  </Items>
</ItemLookupResponse>

Even a cursory look at the XML makes clear, to anyone who has searched on the Amazon website, that the web service response contains essentially the same information as the corresponding HTML page viewed in a browser visit. For example, one of the ItemLink elements has a Description element whose text is:

Add To Wedding Registry (line 1)

Amazon’s goal is to make the website and the web service deliver the same information and the same functionality but in different formats: the website delivers HTML documents, whereas the web service delivers XML documents.

With the response XML in hand, the RestfulAmazon client then parses the document to extract, as proof of concept, the author’s name, J. K. Rowling. The code uses the relatively old-fashioned DOM parser, implemented as the standard Java DocumentBuilder class. Here is the relevant code segment:

Document doc = builder.parse(bais);                        1
NodeList results = doc.getElementsByTagName("Author");     2
for (int i = 0; i < results.getLength(); i++) {
    Element e = (Element) results.item(i);
    NodeList nodes = e.getChildNodes();
    for (int j = 0; j < nodes.getLength(); j++) {
        Node child = nodes.item(j);
        if (child.getNodeType() == Node.TEXT_NODE)         3
            author = child.getNodeValue();
    }
}

The code first builds the DOM tree structure from the Amazon response bytes (line 1) and then gets a list, in this case a list of one element, of the DOM elements tagged as Author (line 2). The author’s name, J. K. Rowling, occurs as the contents of a TEXT_NODE (line 3). The nested loops deal with the usual complexities of the tree structure that a DOM represents: each Author element is itself a node whose text is held in a child node. Similar DOM searches could extract from Amazon’s XML response document any other information of interest, for example, the book’s title or ASIN.
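The same kind of DOM search can be packaged as a small, reusable lookup. The sketch below is an illustration only: the class name ElementText and its method are hypothetical, and, unlike the loop above, it returns the text of the first matching element rather than the last. It uses the standard DOM calls getElementsByTagNameNS, with "*" as a wildcard for the namespace, and getTextContent, which gathers the text of an element’s children so that no inner loop is needed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Example 3-2.
class ElementText {
    // Returns the text of the first element with the given local name,
    // in any namespace, or null if there is no such element.
    static String first(String xml, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList hits = doc.getElementsByTagNameNS("*", localName);
            return hits.getLength() == 0 ? null : hits.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        // A cut-down stand-in for the Amazon response; the namespace URI is made up.
        String xml =
            "<ItemLookupResponse xmlns='urn:example'>" +
            "<Items><Item><ASIN>0545010225</ASIN>" +
            "<ItemAttributes><Author>J. K. Rowling</Author></ItemAttributes>" +
            "</Item></Items></ItemLookupResponse>";
        System.out.println("Author: " + first(xml, "Author"));
        System.out.println("ASIN: " + first(xml, "ASIN"));
    }
}
```

Parsing from a StringReader rather than from bytes sidesteps any mismatch between the platform’s default charset and the document’s encoding.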

The RequestHelper class (see Example 3-3) has one job: format the HTTP GET request in accordance with Amazon’s strict requirements. This class acts as a utility that hides many low-level details, but a quick overview should provide some insight into what the E-Commerce service requires in a well-formed request. Recall that a request against the E-Commerce service requires both an accessId and a secretKey but the two play quite different roles in the request. The accessId occurs as a value in a key/value pair, with AWSAccessKeyId as the key (line 2). There is also a key/value pair for the timestamp that the E-Commerce service requires (line 3); hence, the accessId and the timestamp are peers. Amazon uses the timestamp to ensure that the requests are timely, that is, recently constructed.
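The timestamp format can be examined on its own. RequestHelper builds the value with SimpleDateFormat; the sketch below produces the same pattern with the java.time classes available from Java 8 on. The class name AmazonTimestamp is hypothetical, and the code is an alternative for illustration rather than what the listing uses.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; RequestHelper uses SimpleDateFormat instead.
class AmazonTimestamp {
    // The literal 'Z' marks the value as UTC, so the formatter is pinned to UTC.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss'Z'").withZone(ZoneOffset.UTC);

    static String format(Instant when) {
        return FORMAT.format(when);
    }

    public static void main(String[] args) {
        System.out.println("Timestamp=" + format(Instant.now()));
    }
}
```

A DateTimeFormatter is immutable and thread-safe, so a single shared instance suffices, whereas a SimpleDateFormat instance should not be shared across threads.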

Example 3-3. The utility class RequestHelper, which supports the RestfulAmazon class

package restful;

import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.SortedMap;
import java.util.TimeZone;
import java.util.TreeMap;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import org.apache.commons.codec.binary.Base64;

public class RequestHelper {
    private static final String utf8 = "UTF-8";
    private static final String hmacAlg = "HmacSHA256";
    private static final String requestUri = "/onca/xml";
    private static final String requestMethod = "GET";
    private String endpoint = null;
    private String accessKeyId = null;
    private String secretKey = null;
    private SecretKeySpec secretKeySpec = null;
    private Mac mac = null;

    public RequestHelper(String endpoint,
                         String accessKeyId,
                         String secretKey) {
        if (endpoint == null || endpoint.length() == 0)
            throw new RuntimeException("The endpoint is null or empty.");
        if (null == accessKeyId || accessKeyId.length() == 0)
            throw new RuntimeException("The accessKeyId is null or empty.");
        if (null == secretKey || secretKey.length() == 0)
            throw new RuntimeException("The secretKey is null or empty.");
        this.endpoint = endpoint.toLowerCase();
        this.accessKeyId = accessKeyId;
        this.secretKey = secretKey;
        try {                                                                    1
            byte[ ] secretKeyBytes = this.secretKey.getBytes(utf8);
            this.secretKeySpec = new SecretKeySpec(secretKeyBytes, hmacAlg);
            this.mac = Mac.getInstance(hmacAlg);
            this.mac.init(this.secretKeySpec);
        }
        catch(Exception e) { throw new RuntimeException(e); }
    }
    public String sign(Map<String, String> params) {
        params.put("AWSAccessKeyId", this.accessKeyId);                          2
        params.put("Timestamp", this.timestamp());                               3
        // The parameters need to be sorted by name in byte order (uppercase
        // before lowercase): a TreeMap is perfect for this.
        SortedMap<String, String> sortedParamMap =                               4
           new TreeMap<String, String>(params);
        // Ensure canonical form of the query string, as Amazon REST is fussy.
        String canonicalQS = this.canonicalize(sortedParamMap);                  5
        // Prepare the signature with grist for the mill.
        String toSign =
            requestMethod + "\n"
            + this.endpoint + "\n"
            + requestUri + "\n"
            + canonicalQS;
        String hmac = this.hmac(toSign);
        String sig = null;
        try {
            sig = URLEncoder.encode(hmac, utf8);
        }
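        catch(UnsupportedEncodingException e) {
            // Fail here rather than build a URL that ends in "Signature=null".
            throw new RuntimeException(utf8 + " is unsupported!", e);
        }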
        String url =                                                             6
            "http://" + this.endpoint + requestUri + "?" + canonicalQS +
            "&Signature=" + sig;
        return url;
    }
    public String sign(String queryString) {
        Map<String, String> params = this.createParameterMap(queryString);
        return this.sign(params);
    }
    private String hmac(String stringToSign) {
        String signature = null;
        byte[ ] data;
        byte[ ] rawHmac;
        try {
            data = stringToSign.getBytes(utf8);
            rawHmac = mac.doFinal(data);
            Base64 encoder = new Base64();                                      7
            signature = new String(encoder.encode(rawHmac));
        }
        catch (UnsupportedEncodingException e) {
            throw new RuntimeException(utf8 + " is unsupported!", e);
        }
        return signature;
    }
    // Amazon requires an ISO-8601 timestamp.
    private String timestamp() {
        String timestamp = null;
        Calendar cal = Calendar.getInstance();
        DateFormat dfm = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        dfm.setTimeZone(TimeZone.getTimeZone("GMT"));
        timestamp = dfm.format(cal.getTime());
        return timestamp;
    }
    private String canonicalize(SortedMap<String, String> sortedParamMap) {
        if (sortedParamMap.isEmpty()) return "";
        StringBuffer buffer = new StringBuffer();
        Iterator<Map.Entry<String, String>> iter =
                 sortedParamMap.entrySet().iterator();
        while (iter.hasNext()) {
            Map.Entry<String, String> kvpair = iter.next();
            buffer.append(encodeRfc3986(kvpair.getKey()));
            buffer.append("=");
            buffer.append(encodeRfc3986(kvpair.getValue()));
            if (iter.hasNext()) buffer.append("&");
        }
        return buffer.toString();
    }
    // Amazon requires RFC 3986 encoding, which the URLEncoder may not get right.
    private String encodeRfc3986(String s) {
        String out;
        try {
            out = URLEncoder.encode(s, utf8)
                .replace("+", "%20")
                .replace("*", "%2A")
                .replace("%7E", "~");
        }
        catch (UnsupportedEncodingException e) { out = s; }
        return out;
    }
    private Map<String, String> createParameterMap(String queryString) {
        Map<String, String> map = new HashMap<String, String>();
        String[ ] pairs = queryString.split("&");
        for (String pair : pairs) {
            if (pair.length() < 1) continue;
            String[ ] tokens = pair.split("=", 2);
            for(int j = 0; j < tokens.length; j++) {
                try {
                    tokens[j] = URLDecoder.decode(tokens[j], utf8);
                }
                catch (UnsupportedEncodingException e) { }
            }
            switch (tokens.length) {
                case 1: {
                    if (pair.charAt(0) == '=') map.put("", tokens[0]);
                    else map.put(tokens[0], "");
                    break;
                }
                case 2: {
                    map.put(tokens[0], tokens[1]);
                    break;
                }
            }
        }
        return map;
    }
}

The secretKey plays a different role from the accessId. The secretKey is used to initialize a message authentication code (MAC), which the Java javax.crypto.Mac class represents. (The initialization occurs in the try block that begins on line 1.) Amazon requires a particular type of MAC, an HMAC (Hash-based Message Authentication Code) that uses the SHA-256 algorithm (a Secure Hash Algorithm that generates a 256-bit hash). The important security point is that the secretKey itself does not go over the wire from the client to Amazon. Instead, the secretKey is used to initialize the process that generates a keyed hash value over the request method, the endpoint, the request URI, and the canonical query string. This hash value is then encoded in base64 (line 7) and URL-encoded. Amazon calls the result a signature, which occurs as the value in a key/value pair whose key is Signature (line 6).
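The signing step can be tried in isolation. The following sketch computes an HMAC-SHA256 over a string and encodes the result in base64; it uses java.util.Base64 from the standard library (Java 8 and later) in place of the Apache Commons codec that RequestHelper imports. The class name SignatureSketch, the key, and the sample query string are all made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper for illustration; not part of Example 3-3.
class SignatureSketch {
    private static final String HMAC_ALG = "HmacSHA256";

    // Returns the base64-encoded HMAC-SHA256 of stringToSign under secretKey.
    static String sign(String secretKey, String stringToSign) {
        try {
            Mac mac = Mac.getInstance(HMAC_ALG);
            mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), HMAC_ALG));
            byte[] rawHmac = mac.doFinal(stringToSign.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(rawHmac);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same four-part layout that RequestHelper.sign assembles; the values are fake.
        String toSign = "GET\n"
            + "ecs.amazonaws.com\n"
            + "/onca/xml\n"
            + "AWSAccessKeyId=fakeId&Operation=ItemLookup";
        System.out.println(sign("not-a-real-secret", toSign));
    }
}
```

Because SHA-256 yields 32 bytes, the base64 text is always 44 characters long and ends in a single = pad character. The text can also contain + and /, which is why the signature must still be URL-encoded before it goes into the query string.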

There is a final preparatory step. The E-Commerce service expects, in a GET request, that the query string key/value pairs be in sorted order. The RequestHelper uses a TreeMap, as this data structure is ideally suited for the task (line 4). The properly formatted query string results from a call to canonicalize (line 5); this query string, followed by the signature, is then appended to the base URL for the Amazon E-Commerce service (line 6).
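The sorting and encoding rules can be seen in a compact, standalone form. The sketch below mirrors the approach of canonicalize and encodeRfc3986 in RequestHelper, but the class name CanonicalQuery and the sample values are hypothetical. Note that a TreeMap orders String keys by character code, so uppercase letters sort before lowercase ones: AWSAccessKeyId comes before AssociateTag.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; it mirrors RequestHelper's approach.
class CanonicalQuery {
    // URLEncoder targets HTML forms, so three of its choices are adjusted
    // to match RFC 3986: space, asterisk, and tilde.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8")
                .replace("+", "%20")
                .replace("*", "%2A")
                .replace("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is unsupported", e);
        }
    }

    // Joins the encoded key/value pairs in sorted key order.
    static String build(Map<String, String> params) {
        StringBuilder qs = new StringBuilder();
        for (Map.Entry<String, String> pair : new TreeMap<>(params).entrySet()) {
            if (qs.length() > 0) qs.append('&');
            qs.append(encode(pair.getKey())).append('=').append(encode(pair.getValue()));
        }
        return qs.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("Operation", "ItemLookup");
        params.put("AssociateTag", "any tag");
        params.put("AWSAccessKeyId", "fakeId");
        System.out.println(build(params));
    }
}
```

The order in which the entries are put into the HashMap does not matter; copying them into a TreeMap fixes the order before the string is built.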

Not every commercial site is as fussy as Amazon when it comes to request formatting. This first Amazon example shows that generating a correctly formatted RESTful request may be nontrivial. This client also explicitly parses the returned XML. The next client addresses the issue of how to avoid such parsing. Before looking at the code for the second client, however, it will be useful to focus on the JAXB utilities used in the second client.