Amazon’s Simple Queue Service (SQS) provides reliable storage and delivery of messages between any clients or computers with access to the Internet. It allows message senders and recipients to interact without having to communicate directly with each other and without requiring that either side be always available or connected to the network.
SQS combines the key advantages of a conventional messaging architecture—loosely-coupled and fault-tolerant communication—with a reliable and flexible distributed infrastructure that stores messages redundantly over multiple data centers. Because SQS is accessible to clients on any platform that can send and receive HTTP requests, the service makes it possible to build distributed applications with truly heterogeneous components using a range of platforms and development languages.
As this book went to press, Amazon Web Services released a new version of the SQS service API: 2008-01-01. This new API includes an updated pricing model that is intended to make the service cheaper for most users; however, it also includes significant changes that are not compatible with previous API versions or the third-party libraries and tools we use in this book.
The previous APIs will remain available until May 6, 2009, after which all SQS users must migrate to the 2008-01-01 version. In this book we describe the older 2007-05-01 API and we will not discuss the new API in depth. For more information about the benefits of the new API and the main differences between this and previous versions, refer to “API Version 2008-01-01.”
SQS is composed of two main resources that can be acted on through the application programming interface (API): messages and queues.
A message is a piece of textual data up to 256 KB in size that can be sent to SQS and stored in the service until it has been delivered to one or more receivers.
Messages are stored in queues. Queues serve to group related messages together and provide configuration options for message delivery and access control.
SQS is implemented as a distributed system in which copies of each message are stored in multiple physical servers and potentially across multiple data centers. This strategy provides benefits in terms of redundancy, reliability, and scalability; it also results in some drawbacks that would not apply to a more centralized messaging system. These drawbacks are not serious and can be avoided, provided you bear the following points in mind when designing your SQS-based applications.
When a message receiver asks SQS to return the messages available in a queue, the system only samples a subset of its physical servers for messages that belong in the queue. Only the messages stored on the sampled servers are returned to the receiver, though there may be other messages stored on servers that were not sampled. This server sampling technique is represented in Figure 8-1.
The service uses what Amazon calls a weighted random distribution algorithm to determine which servers to sample in each retrieval request. If the first retrieval request does not find all the available messages, subsequent retrievals will eventually return all the messages from all the servers. However, you cannot assume that the result of any particular retrieval request contains the total number of messages available in the queue.
You are most likely to receive a limited subset of messages when there are fewer than several hundred messages in a queue. If you query a queue that contains relatively few messages, you may receive no messages at all in some results, even though there are messages available in the queue.
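The effect of server sampling can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: each message lives on exactly one simulated server, and servers are sampled uniformly at random rather than with Amazon's weighted algorithm, whose details are not described here. The function names are hypothetical.

```python
import random


def make_servers(message_ids, server_count, rng):
    """Place each message on one randomly chosen simulated server."""
    servers = [[] for _ in range(server_count)]
    for message_id in message_ids:
        servers[rng.randrange(server_count)].append(message_id)
    return servers


def receive_sample(servers, sample_size, rng):
    """Return only the messages held by a random subset of the servers."""
    sampled = rng.sample(servers, sample_size)
    return [message_id for server in sampled for message_id in server]


def poll_until_complete(servers, sample_size, expected_count, rng, max_polls=1000):
    """Poll repeatedly and return how many requests it took to see every message."""
    seen = set()
    for polls in range(1, max_polls + 1):
        seen.update(receive_sample(servers, sample_size, rng))
        if len(seen) == expected_count:
            return polls
    raise RuntimeError("not all messages were seen within max_polls requests")
```

With only a handful of messages spread over many servers, a single call to `receive_sample` will often return an empty list, while `poll_until_complete` shows that repeated requests eventually find everything.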
You cannot rely on SQS in scenarios where messages must be delivered immediately or very quickly. The delivery of SQS messages is delayed by at least the amount of time it takes for receivers to poll the service for new messages. In some cases, messages will be delayed further if they are stored on a server that has become temporarily unavailable, or which is not in the subset sampled by a retrieval request. As a general guide, you can expect messages to take from 2 to 10 seconds to be delivered by SQS under normal circumstances.
SQS cannot guarantee that messages will be delivered in the same order they were sent. Although the service will attempt to keep your messages in order, this is not always possible in a distributed system. If your application requires messages to be processed in a particular order, you will have to build order-checking functionality into your application to use SQS safely.
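One way to add order checking, shown here only as a minimal sketch, is for the sender to include a sequence number in each message and for the receiver to hold back any message that arrives early. The class name and the idea of a sender-assigned sequence number are assumptions made for illustration; SQS itself does not provide them.

```python
class OrderedReceiver:
    """Release messages in sequence order, buffering any that arrive early."""

    def __init__(self, first_sequence=0):
        self.next_sequence = first_sequence  # sequence number we are waiting for
        self.pending = {}                    # early arrivals, keyed by sequence number

    def accept(self, sequence, body):
        """Accept one delivery and return the bodies that are now in order."""
        if sequence < self.next_sequence or sequence in self.pending:
            return []  # duplicate delivery: already released or already buffered
        self.pending[sequence] = body
        ready = []
        while self.next_sequence in self.pending:
            ready.append(self.pending.pop(self.next_sequence))
            self.next_sequence += 1
        return ready
```

For example, if message 1 arrives before message 0, `accept(1, ...)` returns an empty list and both bodies are released together once message 0 arrives.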
SQS decides whether or not to deliver a particular message based on two criteria: whether it still exists in the system and its visibility state, a property we will describe shortly. Because it is impossible to guarantee that information about a message’s state or life-cycle status will always be synchronized between all the servers in the distributed SQS system, your application must gracefully handle the redelivery of messages that ought to be invisible or deleted.
Here are the most important guidelines that SQS application developers should follow to take advantage of the service’s strengths and avoid the drawbacks of its distributed architecture:
Design your application so there will be no problem if a message is delivered to more than one recipient, or if it is redelivered after it has already been processed. In other words, message processing in your application should be idempotent.
Design your application to cope with delayed message delivery. For the most part, your messages will be delivered within seconds, but they may occasionally be delayed for minutes or hours if an SQS server component fails.
Avoid using SQS messaging for highly transactional processes in which it is vital for messages to be delivered in order and on time.
Most importantly, try to forget any preconceptions you may have about how messaging systems work. SQS has different capabilities from most other messaging systems, and it does not aim to provide a solution for timely, ordered, and once-only delivery of messages between application components. If you need a messaging system that provides these features, you will either have to implement them yourself by adding a layer of business logic on top of SQS, or you will need to use an alternative system that already does all this work for you.
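The first guideline above, idempotent processing, can be sketched as a wrapper that remembers which message IDs have already been handled. This is an illustration rather than a complete solution: the class name is hypothetical, and the in-memory set would have to be replaced by storage shared between all receivers in a real distributed application.

```python
class IdempotentProcessor:
    """Run a handler at most once per message ID, ignoring redeliveries."""

    def __init__(self, handler):
        self.handler = handler
        self.processed_ids = set()  # assumption: in-memory only, not shared

    def process(self, message_id, body):
        """Return True if the handler ran, or False for a redelivered message."""
        if message_id in self.processed_ids:
            return False
        self.handler(body)
        # Record the ID only after the handler succeeds, so that a failed
        # attempt can be retried when the message is redelivered.
        self.processed_ids.add(message_id)
        return True
```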
One situation in which SQS works particularly well is when it is used to deliver work items to the components of a distributed application. For example, a director component might send messages with task instructions to a pool of worker components that receive and process these messages as they arrive. In this scenario, the order and time frame of message delivery is not as important as ensuring that all messages are eventually delivered at least once.
SQS is designed to allow multiple clients to receive messages from a single queue. The service aims to deliver each message only once, though it will deliver a message multiple times if this is necessary to ensure that the message is properly processed and acknowledged. This approach means that messages are not lost, even if a message-receiving component crashes or loses network connectivity before it has finished processing a message.
To manage the delivery of messages, the service maintains state information about each message that indicates whether or not it should be delivered to potential receivers. This state information is called the message’s visibility. A message may be visible or invisible. While a message is invisible, it remains in the queue but will not be delivered to message receivers until it becomes visible again. The state of a message is changed from visible to invisible each time the message is received by a client; this prevents the message from being received by another client straightaway. The change to the invisible state is only temporary, and after a set amount of time SQS will make the message visible again. Figure 8-2 shows the main events in a message’s life cycle.
The time interval for which a message will remain invisible is called its visibility timeout. The visibility timeout of a message is managed automatically by SQS queues, or it may be modified directly by API operations on the service.
The visibility timeout of a message is measured in seconds, and it can be a value from zero—in which case the message is in the visible state—up to 86,400 seconds, which means the message will remain invisible for a full day (86,400 seconds equals 24 hours). The duration of the timeout may be set on a per-queue or per-message basis. A queue’s default visibility timeout setting determines how long a message remains invisible when it is delivered, though this value can be overridden with a message-specific timeout at any point in the message’s lifetime. Because a message only remains invisible for a limited time, the only way to prevent the message from being eventually redelivered is to delete it from the queue.
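The visibility rules described above can be modeled in a few lines. The following is a toy in-memory model, not the SQS API: the class and method names are hypothetical, time is passed in explicitly as a number of seconds, and server sampling is ignored.

```python
class SimulatedQueue:
    """In-memory model of a queue whose messages have a visibility timeout."""

    def __init__(self, default_visibility_timeout=30):
        self.default_visibility_timeout = default_visibility_timeout
        self.bodies = {}           # message ID -> body
        self.invisible_until = {}  # message ID -> time it becomes visible again
        self.next_id = 0

    def send(self, body):
        """Store a new, immediately visible message and return its ID."""
        message_id = str(self.next_id)
        self.next_id += 1
        self.bodies[message_id] = body
        self.invisible_until[message_id] = 0
        return message_id

    def receive(self, now, visibility_timeout=None):
        """Return (ID, body) pairs for visible messages and make them invisible."""
        if visibility_timeout is None:
            visibility_timeout = self.default_visibility_timeout
        delivered = []
        for message_id, body in self.bodies.items():
            if self.invisible_until[message_id] <= now:
                self.invisible_until[message_id] = now + visibility_timeout
                delivered.append((message_id, body))
        return delivered

    def change_visibility(self, message_id, now, visibility_timeout):
        """Override the timeout of one message; zero makes it visible at once."""
        self.invisible_until[message_id] = now + visibility_timeout

    def delete(self, message_id):
        """Remove a message so that it can never be redelivered."""
        del self.bodies[message_id]
        del self.invisible_until[message_id]
```

In this model a message received at time 0 with a 30-second timeout is hidden from a second `receive` at time 10, is delivered again at time 30, and stops reappearing only after `delete` is called.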
The content of a message can be viewed at any time using a peek operation, even when the message is invisible and cannot be retrieved with a standard receive request.
To make the most efficient use of SQS messaging in an application, it is vital to apply the appropriate visibility timeout values to your messages. If the timeout is too short, a message could be redelivered before the original recipient has had enough time to process and delete it, resulting in unnecessary and wasteful reprocessing. If the timeout is too long, the redelivery and processing of messages will be delayed unnecessarily when a component that has already received some messages fails or loses connectivity. Ideally, a message’s visibility timeout setting should match the time it will take to process that message and delete it from the queue.
To understand how a distributed application may be built around SQS messaging, we should look at the different roles that may be played by SQS clients. A client may perform one or more of the following tasks: send messages, receive messages, or manage and monitor SQS resources.
A message sender contacts SQS, asks it to create a new message in a specific queue, and provides the data that will make up the content of the message. Once the service has acknowledged the receipt of the message, the sender can be certain that the message will be delivered at least once to a message receiver watching that queue.
When a message is sent, the sender is provided with an ID string that uniquely identifies the message in the target queue. This ID can be used to perform operations on the message, such as viewing its contents, changing its visibility state, or deleting the message altogether.
SQS is available via a web service interface that requires clients to initiate a connection to the service to perform actions or receive information. This means that clients of SQS must actively contact the service to receive messages; there is no mechanism for notifying message receivers when new messages become available. Message receivers must poll the service at intervals to receive new messages.
A message receiver client contacts the service and asks it to provide one or more messages from a specific queue. If there are messages in the queue that are visible and are stored on one of the SQS servers sampled by the operation, these messages are returned to the receiver. If there are no messages available, the receiver will generally wait for some amount of time before contacting the service again.
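A receiver's polling loop might look something like the sketch below. The `receive` and `handle` callables are placeholders for whatever client library you use, and doubling the wait after each empty response is one possible design choice, not something SQS requires.

```python
import time


def poll_for_messages(receive, handle, max_polls,
                      base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Poll a queue, waiting longer after each consecutive empty response.

    receive: callable returning a (possibly empty) list of messages.
    handle:  callable invoked once for every message received.
    Returns the number of messages handled.
    """
    delay = base_delay
    handled = 0
    for _ in range(max_polls):
        messages = receive()
        if messages:
            for message in messages:
                handle(message)
                handled += 1
            delay = base_delay  # the queue is active again, so poll quickly
        else:
            sleep(delay)
            delay = min(delay * 2, max_delay)
    return handled
```

Backing off while the queue is empty keeps the number of requests down, at the cost of slower delivery for the first message that arrives after a quiet period.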
When there are messages available, the receiver can obtain the data content of the message from the service’s response and process this information. The receiver also obtains the identifier for each message it receives, and it can use this identifier to perform a follow-up operation on the message. Here are the actions a message receiver may perform on a message, depending on whether the message was successfully processed.
If a message is processed successfully, it has fulfilled its purpose, and the receiver can delete the message from the service’s queue so that it will not be redelivered.
If the message was not processed, the receiver may do nothing and allow the message to be redelivered automatically once its visibility timeout has expired, or it may modify the message’s visibility timeout so that it can be redelivered to another receiver straightaway.
If the message receiver is unable to process the message before its visibility timeout expires, the receiver can extend the message’s visibility timeout to gain more time and prevent the message from being re-delivered too early. If the receiver can estimate how long the task will take when it first receives a message, the timeout can be set to an appropriate value before it starts processing it.
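The first two follow-up actions above can be expressed as a small helper. This is a sketch only: `process`, `delete`, and `release` are hypothetical callables that you would bind to your own processing code and to the corresponding operations of your SQS client library.

```python
def handle_delivery(body, process, delete, release):
    """Process one received message, then delete it or release it.

    process: callable that does the work and raises an exception on failure.
    delete:  callable that deletes the message from its queue.
    release: callable that makes the message visible again, for example by
             setting its visibility timeout to zero.
    """
    try:
        process(body)
    except Exception:
        release()  # let another receiver try straightaway
        return "released"
    delete()       # success: make sure the message is not redelivered
    return "deleted"
```

A receiver that prefers to let the visibility timeout expire naturally could pass a `release` callable that does nothing.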
An administrator client performs the management tasks necessary to keep the messaging infrastructure running smoothly. These tasks may include creating new queues, defining the default visibility timeout settings for a queue, and configuring a queue’s access control settings.
In addition to one-off management tasks like these, an administrator may undertake monitoring and maintenance tasks. For example, an administrator may monitor the number of messages stored in a queue to determine whether the messages are being processed quickly enough to keep up with demand.
SQS account holders are billed monthly for their usage of the service based on the number of messages they have sent and the amount of data transferred into and out of the service. There is no separate fee for the number of API requests you have performed.
SQS incurs a fee of 0.01¢ for each message sent (10¢ for 1,000 messages).
Data received by SQS in message-sending operations costs 10¢ per GB. Data transmitted by SQS when messages are delivered is charged on a sliding scale, depending on how much data was transferred during the month: 18¢/GB from 0 to 10 TB, 16¢/GB from 10 to 50 TB, and 13¢/GB for any amount over 50 TB.
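The fees above can be combined into a rough monthly estimate, as in the following sketch. It assumes that the sliding scale is graduated (each rate applies only to the data that falls within its band) and that 1 TB means 1,024 GB; neither point is confirmed here, and the function names are our own.

```python
from decimal import Decimal

GB_PER_TB = 1024  # assumption: 1 TB is treated as 1,024 GB

# (upper bound of the band in GB, price in cents per GB); None means unbounded
TRANSFER_OUT_TIERS = [
    (10 * GB_PER_TB, Decimal("18")),
    (50 * GB_PER_TB, Decimal("16")),
    (None, Decimal("13")),
]


def transfer_out_cents(gigabytes):
    """Charge for data delivered by SQS, applying each rate to its own band."""
    remaining = Decimal(gigabytes)
    lower = Decimal(0)
    total = Decimal(0)
    for upper, rate in TRANSFER_OUT_TIERS:
        if remaining <= 0:
            break
        if upper is None:
            width = remaining
        else:
            width = min(remaining, Decimal(upper) - lower)
            lower = Decimal(upper)
        total += width * rate
        remaining -= width
    return total


def monthly_cost_cents(messages_sent, gb_in, gb_out):
    """Estimated monthly charge in cents under the pricing described above."""
    return (Decimal(messages_sent) * Decimal("0.01")
            + Decimal(gb_in) * Decimal("10")
            + transfer_out_cents(gb_out))
```

For example, sending one million messages with 1 GB transferred in and 2 GB transferred out comes to 10,000¢ + 10¢ + 36¢, or $100.46.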
Amazon automatically debits the credit card attached to the Amazon Web Services (AWS) account. All charges are in U.S. dollars.
In February of 2008, Amazon Web Services released a redesigned version of the SQS API that provides for lower usage costs. This new API, version 2008-01-01, offers an updated pricing model that will make the service cheaper for most users. However, it also includes significant changes that are incompatible with the service’s previous APIs and any libraries or tools based on them. Between February 6, 2008 and May 6, 2009, SQS developers can use either the previous API versions or the newest API. After May 6, 2009, only the 2008-01-01 API version will remain available.
Because the new API was released late in this book’s production process, our discussion of SQS will be limited to the superseded API version 2007-05-01, for which third-party libraries and tools were available at the time of writing. In this section we will briefly describe the new API and how it differs from previous versions. Readers who wish to take advantage of the new pricing model, or who are updating their applications in preparation for the mandatory switch-over on May 6, 2009, should bear these differences in mind when reading Chapters 8 and 9.
For a detailed description of the new API and how it differs from the previous version 2007-05-01 as discussed within this book, refer to “Migrating to Amazon SQS API Version 2008-01-01” here: http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1148.
SQS account holders are billed differently depending on whether they use the 2008-01-01 API version or previous versions. The pricing schedule for previous API versions is described in “Pricing” above.
The API version 2008-01-01 incurs fees based on the number of requests performed and the amount of data transferred into and out of Amazon’s network.
SQS incurs a fee of 0.0001¢ per request (1¢ for 10,000 requests).
This per-request fee replaces the per-message fee imposed by the older APIs. The request fee is only one hundredth of the prior message fee; however, it is charged for every API operation, including message retrieval requests. To take advantage of the new pricing model, minimize the number of requests you perform by polling for messages no more often than is strictly necessary.
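The trade-off between the two fee structures can be worked through with a little arithmetic. The sketch below uses only the two fees quoted above; the helper names and the 30-day month are assumptions made for illustration.

```python
from decimal import Decimal

OLD_FEE_PER_MESSAGE = Decimal("0.01")    # cents per message sent, older APIs
NEW_FEE_PER_REQUEST = Decimal("0.0001")  # cents per request, version 2008-01-01


def old_fee_cents(messages_sent):
    """Message fee under the older APIs."""
    return Decimal(messages_sent) * OLD_FEE_PER_MESSAGE


def new_fee_cents(total_requests):
    """Request fee under the 2008-01-01 API; every operation counts."""
    return Decimal(total_requests) * NEW_FEE_PER_REQUEST


def break_even_requests_per_message():
    """Requests per message at which the two fees are equal."""
    return OLD_FEE_PER_MESSAGE / NEW_FEE_PER_REQUEST


def polling_requests_per_month(poll_interval_seconds, days=30):
    """Requests made by one receiver that polls at a fixed interval."""
    return (days * 24 * 60 * 60) // poll_interval_seconds
```

The break-even point is 100 requests per message. A single receiver that polls once per second makes 2,592,000 requests in 30 days, costing about 259¢ whether or not any messages arrive; polling every 10 seconds reduces this to about 26¢.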
The data transfer rates for the new API are identical to the rates for previous versions; however, when you use the 2008-01-01 API there are no fees for data transferred between the EC2 and SQS services.
These prices are correct as of February 2008; refer to Amazon’s web site to confirm the latest pricing.
In order to lower the fees it charges for the SQS service, Amazon made changes to the service’s API to reduce its internal running costs. A number of service features and API operations are deprecated in the new version, the structure and content of the service’s input and output messages have changed, and some data and time limits have been tightened. These changes are not compatible with previous versions of the API. Libraries, tools and applications that were designed to work with the previous APIs will not work with the 2008-01-01 version. These changes may require you to design and manage your SQS application differently from the examples presented in this book.
Table 8-1 describes the limits that have changed between API versions 2007-05-01 and 2008-01-01, and the bulleted list below lists some of the other major changes applied in the new version. For a complete list of changes, refer to the article “Migrating to Amazon SQS API Version 2008-01-01” and to the latest API documentation available on Amazon’s web site.
Table 8-1. Limits in API version 2008-01-01
Limit | Previous Limit |
---|---|
Messages can contain no more than 8 KB of data. | 256 KB |
Messages are automatically deleted from SQS after 4 days. | 15 days |
The maximum visibility timeout for a message is 2 hours. If a message takes longer than 2 hours to process and delete, it will inevitably be redelivered. | 24 hours |
A maximum of 10 messages can be received from the service at once. | 256 messages |
Queues that have no activity may be deleted after 30 days. | - |
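An application that is being migrated could check its requests against the tightened limits in Table 8-1 before sending them. The following sketch is illustrative only: the function and constant names are hypothetical, and it assumes that 1 KB means 1,024 bytes and that message size is measured on the UTF-8 encoding of the body.

```python
MAX_MESSAGE_BYTES = 8 * 1024                  # assumption: 1 KB = 1,024 bytes
MAX_VISIBILITY_TIMEOUT_SECONDS = 2 * 60 * 60  # 2 hours
MAX_MESSAGES_PER_RECEIVE = 10


def find_limit_problems(body, visibility_timeout_seconds, messages_requested):
    """Return a list describing each 2008-01-01 limit that would be exceeded."""
    problems = []
    body_bytes = len(body.encode("utf-8"))  # assumption: size of UTF-8 encoding
    if body_bytes > MAX_MESSAGE_BYTES:
        problems.append("message body is %d bytes; the limit is %d"
                        % (body_bytes, MAX_MESSAGE_BYTES))
    if visibility_timeout_seconds > MAX_VISIBILITY_TIMEOUT_SECONDS:
        problems.append("visibility timeout is %d seconds; the limit is %d"
                        % (visibility_timeout_seconds,
                           MAX_VISIBILITY_TIMEOUT_SECONDS))
    if messages_requested > MAX_MESSAGES_PER_RECEIVE:
        problems.append("requested %d messages; the limit is %d"
                        % (messages_requested, MAX_MESSAGES_PER_RECEIVE))
    return problems
```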
The entire REST API interface has been deprecated, leaving only the Query and SOAP interfaces available. Amazon is “actively considering” reinstating this interface in a future release.
Queues created with the 2008-01-01 API are not accessible using the previous APIs, and vice versa. Note that although queues are not accessible between the API versions, your queue names must still be unique over both versions.
The ChangeMessageVisibility operation has been removed. This means that the visibility timeout of a message can be controlled only at the queue level or at the instant when the message is received. It is no longer possible to adjust a message’s visibility timeout after it has been received.
SQS applications that use the new API must estimate in advance how long it will take to process and delete a message, and then send the message to a queue that has an appropriate default visibility timeout setting. The technique we discuss in “Distributed Application Services with BOTO” for preventing the redelivery of a message by dynamically adjusting its visibility will no longer be available.
The PeekMessage operation has been removed. This means that it is no longer possible to query the service and discover whether a particular message has been deleted.
The access control functionality has been removed entirely, as have the ListGrants, AddGrant, and RemoveGrant operations. Amazon intends to provide a mechanism to allow SQS queues to be shared in the new API; however, at the time of writing there was no timeline for this feature to be made available.
Messages may only be deleted using a new receipt handle identifier that the service provides to a message’s receiver. This means that messages cannot be deleted from SQS until they have been received at least once, and that senders cannot delete their sent messages.
SQS provides an MD5 digest of a message’s body to the sender and receiver of the message. This digest allows applications to confirm that the message’s body data has been correctly transmitted to or from the service.
The DeleteQueue operation deletes queues immediately, even when they contain messages. In previous versions, the parameter ForceDeletion had to be included in DeleteQueue requests to make the service delete a non-empty queue.
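The MD5 digest mentioned in the list above can be checked with a few lines of code. This sketch assumes that the digest is reported as a hexadecimal string and is computed over the UTF-8 bytes of the message body; confirm both points against the API documentation before relying on them. The function names are our own.

```python
import hashlib


def body_md5(body):
    """Hex MD5 digest of a message body (assumes UTF-8 encoding)."""
    return hashlib.md5(body.encode("utf-8")).hexdigest()


def body_matches_digest(body, reported_digest):
    """Compare a locally computed digest with the one reported by the service."""
    return body_md5(body) == reported_digest.lower()
```

A sender could call `body_matches_digest` with the body it sent and the digest returned by the service, and treat a mismatch as a failed transmission to be retried.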