Chapter 6. Using EC2 Instances and Images

A common workaround for this problem is to use a dynamic DNS service to assign domain names to your instances, instead of the standard DNS system. For an example, refer to “Dynamic DNS” in Chapter 7. The two main advantages of dynamic DNS services are that they generally provide a mechanism for IP addresses to be updated automatically using simple APIs, and that they propagate the new address much more quickly than standard DNS.^[*] Dynamic DNS services do not completely solve the addressing problem, however, because it can still take time for your users’ computers to become aware of address changes; in the meantime your application may appear to be broken.

The lack of static addressing in EC2 is especially problematic for publicly accessible applications that require high availability, because the majority of techniques for ensuring high availability rely on the presence of a public-facing server that has a static IP address. At present, there is no real solution for the issue beyond running an additional, statically addressed server outside EC2 as a frontend for your application.

Instance Data

EC2 provides contextual data to each instance running in the EC2 environment. This data includes a set of metadata about the instance itself, and it can also include arbitrary custom user data that you provide to an instance when you launch it. Both the metadata and the user data are very useful for configuring the behavior of your instances.

The contextual information provided by the EC2 environment is known as Instance Data. It is made available to instances through a simple HTTP interface available inside the EC2 environment at the location http://169.254.169.254. Instances can retrieve this information by sending standard GET requests to Uniform Resource Identifier (URI) paths at this location; for example, to list the instance data versions available, you can run the curl command on your instance as follows:

ec2$ curl http://169.254.169.254/ 
1.0
2007-01-19
2007-03-01
2007-08-29

Metadata is returned from the instance data service as plain text strings with the MIME type text/plain. Metadata values may comprise single items or a set of multiple items. Items are listed as strings that are delimited by ASCII line-feed and carriage-return characters. If you provide custom data when you start an instance, that data will be returned as binary data with the MIME type application/x-octet-stream. If you request a piece of instance data that is not available, the instance data service will return an HTTP 404: Not Found error response. If you request a multi-value item but leave off the trailing slash in the URI, the service will return a 301: Moved Permanently response indicating the correct URI you should use.
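The request and response behavior described above can be sketched in Python using only the standard library. This is a minimal illustration rather than an official client: the helper names `build_uri`, `split_items`, and `fetch_instance_data` are hypothetical, and the fetch function assumes it runs on an instance where the service at 169.254.169.254 is reachable.

```python
import urllib.error
import urllib.request

# Location of the instance data service, as given in the text.
BASE_URI = "http://169.254.169.254"


def build_uri(version, path):
    """Join a version name and a resource path into a full URI."""
    return "%s/%s/%s" % (BASE_URI, version.strip("/"), path.lstrip("/"))


def split_items(body):
    """Split a text/plain response into its individual items.

    str.splitlines() accepts line feeds, carriage returns, or both,
    so the exact delimiter the service uses does not matter here.
    """
    return [item for item in body.splitlines() if item]


def fetch_instance_data(version, path, timeout=2.0):
    """Return the response body as bytes, or None if the item is missing.

    urllib follows the 301 redirect issued for a multi-value path that
    lacks its trailing slash, so no special handling is needed for it.
    """
    uri = build_uri(version, path)
    try:
        with urllib.request.urlopen(uri, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # the requested item is not available
            return None
        raise
```

On an instance, a call such as `fetch_instance_data("2007-08-29", "meta-data/instance-id")` would return the raw bytes of the instance identifier, which you can decode and pass to `split_items` when the item holds multiple values.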

Note

Instance data text values do not include new-line characters at the end of the string. Single-value items return only the item text without a terminating new-line character, and multi-value lists include ASCII line-feed and carriage-return characters between each value in the list, but not after the last item.

Instance data versioning

The instance data service made available in the EC2 environment is versioned, so although the structure and content of the available information may change over time, your scripts will not be affected provided they refer to a specific version of the service.

Table 6-1 lists URI resource paths that retrieve information about the different versions of instance data available from the instance data service. These resource paths are appended to the service’s location URI, http://169.254.169.254.

Table 6-1. Resource paths to access versioned instance data

Resource Path	Description	Example Response
/	Returns a list of the instance data versions available	As of January 2008, there were four versions: `1.0` `2007-01-19` `2007-03-01` `2007-08-29`
/latest/	Automatically retrieves the most recent version of instance data; for example, this resource path is an alias for `2007-08-29` for as long as that is the latest version	`user-data` `meta-data/`
/2007-08-29/	Use a specific named version of instance data	`user-data` `meta-data/`
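To show why a script might pin a named version instead of relying on `/latest/`, here is a small sketch. The function name `choose_version` is hypothetical, and the version list is copied from Table 6-1 instead of being fetched from the service.

```python
def choose_version(available, pinned):
    """Return the pinned version if the service offers it, else raise.

    `available` is the list of version names returned by the "/" path.
    """
    if pinned not in available:
        raise LookupError("instance data version %r is not available" % pinned)
    return pinned


# Version names as listed in Table 6-1.
versions = ["1.0", "2007-01-19", "2007-03-01", "2007-08-29"]

# Build the metadata base path for the pinned version.
base_path = "/%s/meta-data/" % choose_version(versions, "2007-08-29")
print(base_path)
```

Failing loudly when the pinned version is missing is one possible design choice; a script could equally fall back to an older version it also understands.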

Instance Metadata

A set of metadata information is available to every instance running in EC2 to help the instance learn about itself and the environment it is running in.

Table 6-2 lists resource path fragments that can be added to the base URI that refers to the instance data version you wish to access. For example, http://169.254.169.254/2007-08-29/meta-data/ will be the base URI to retrieve metadata information from the 2007-08-29 version.

Table 6-2. Resource path fragments to access instance metadata

Path Fragment	Description	Example Response
reservation-id	The identifier of the reservation set in which the instance was started.	`r-d97e99b0`
ami-id	The identifier of the AMI from which the instance was launched.	`ami-889075e1`
ami-manifest-path	The location in S3 of the AMI’s manifest file.	oreilly-aws/ami-fedora7-base.img.manifest.xml
ami-launch-index	A zero-based offset value that uniquely identifies each instance in a batch started at the same time.	The first instance launched: 0 The third instance launched: 2
instance-id	The instance’s identifier.	i-b434d4dd
instance-type	The type of the instance: `m1.small`, `m1.large`, or `m1.xlarge`. This metadata field may not be made available to instances launched from older AMIs. If this value is not present, you can assume that the instances are `m1.small`.	`m1.small`
local-hostname	The instance’s private DNS name address.	domU-12-31-36-00-3C-A2.z-1.compute-1.internal
public-hostname	The instance’s public DNS name address.	ec2-72-44-57-154.z-1.compute-1.amazonaws.com
local-ipv4	The instance’s private IP address.	10.253.67.80
public-ipv4	The instance’s public IP address.	72.44.57.154
public-keys/	Lists the keypair names provided to the instance, if any. This listing is the root of a hierarchy described further in the next two items.	0=ec2-private-key
public-keys/`n`/	A list of the formats in which the keypair’s public key component is available.	openssh-key
public-keys/`n`/`openssh-key`	The public key data for the `n`th public key item in the `openssh-key` format.	ssh-rsa AAA...ZEf ec2-private-key
security-groups	A list of the security groups the instance belongs in.	default
product-codes	A list of the product codes associated with the instance.	`A79EC0DB`
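The `public-keys/` hierarchy in Table 6-2 takes a little more work to navigate than the single-value items. The following sketch parses a listing in the `index=name` form shown in the table and builds the path fragment for each key's `openssh-key` data. The function names are hypothetical, and the sample listing is the example response from the table.

```python
def parse_public_keys(listing):
    """Map each key index to its keypair name.

    Each line of the public-keys/ listing has the form "index=name".
    """
    keys = {}
    for line in listing.splitlines():
        if not line:
            continue
        index, _, name = line.partition("=")
        keys[int(index)] = name
    return keys


def openssh_key_path(index):
    """Build the path fragment for a key's openssh-key data."""
    return "public-keys/%d/openssh-key" % index


# Example response for public-keys/ taken from Table 6-2.
keys = parse_public_keys("0=ec2-private-key")
paths = [openssh_key_path(index) for index in sorted(keys)]
```

Each entry in `paths` can then be appended to the metadata base URI for the version you are using.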

Instance user data

You can provide EC2 with up to 16 KB of arbitrary data when you launch an instance. This information is called user data and it is made available to the instance by the instance data service. By passing data to your instance with this mechanism, you can easily supply your instances with custom configuration information or instructions.

The user data you provide is made available to your instances as binary data from the instance data service at the URI resource-path fragment user-data. For example, to obtain this data from the 2007-08-29 version of the service, you would use the URI http://169.254.169.254/2007-08-29/user-data. The instance data service returns exactly the same bytes as you supplied to the RunInstances operation, so it is possible to use binary information instead of plain text; for example, you could provide encrypted binary content or images as user data. You must Base-64-encode the user data you supply to the RunInstances operation; EC2 will automatically decode the user data before making it available via the URI.
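As an illustration of the encoding step, the sketch below prepares raw bytes for the RunInstances operation using Python's standard `base64` module. The function name `encode_user_data` is hypothetical, and the code assumes that the 16 KB limit applies to the raw data before encoding. If you use a client library, check whether it already performs this encoding for you.

```python
import base64

# The 16 KB limit stated in the text; assumed to apply before encoding.
MAX_USER_DATA_BYTES = 16 * 1024


def encode_user_data(raw):
    """Base64-encode raw user data bytes for the RunInstances operation."""
    if len(raw) > MAX_USER_DATA_BYTES:
        raise ValueError("user data exceeds 16 KB")
    return base64.b64encode(raw).decode("ascii")


raw_data = b"role=webserver\n"
encoded = encode_user_data(raw_data)

# Decoding yields the original bytes, which is what the instance
# would read back from the user-data path.
decoded = base64.b64decode(encoded)
```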

When you start multiple AMI instances at the same time, each instance receives exactly the same user data. In some circumstances you may want to provide different information to each instance in a set, even though they are launched at the same time. This is possible by taking advantage of the unique launch index offset value that is assigned to each instance you launch in a set, and which is available from the ami-launch-index item in the instance data service. If you provide user data that can be interpreted differently by your instance, depending on that instance’s launch index number, you can supply slightly different data to each instance.
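One way to apply this idea is sketched below. It assumes a hypothetical convention in which the shared user data holds one configuration line per instance, in launch order, and each instance picks the line matching its own `ami-launch-index` value. The function name and the sample data are illustrative only.

```python
def config_for_instance(user_data, launch_index):
    """Pick this instance's line from user data shared by the whole batch.

    Assumes a hypothetical convention: one configuration line per
    instance, in launch order.
    """
    lines = user_data.decode("utf-8").splitlines()
    if not 0 <= launch_index < len(lines):
        raise IndexError("no configuration line for launch index %d" % launch_index)
    return lines[launch_index]


# Every instance in the batch receives these same bytes.
shared = b"role=master\nrole=worker\nrole=worker"

first = config_for_instance(shared, 0)  # first instance launched
third = config_for_instance(shared, 2)  # third instance launched
```

In practice the instance would read `shared` from the `user-data` path and the index from the `ami-launch-index` metadata item, converting the latter to an integer first.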

Performance

A broad range of factors will influence the level of performance you can expect from a particular EC2 instance. Some of these factors are consistent and well-known, such as the network bandwidth you can expect within the EC2 environment; others are ill-defined or subject to change between different instances or even on a single instance over time. Overall, you should not rely too much on the stated performance characteristics of instances to indicate how well your own application will perform. The only way to know for sure how your application will perform in EC2 is to test it thoroughly for yourself.

Bearing that in mind, here is a list of the main factors that affect the performance of EC2 instances.

Instance type

There are three different instance types available in EC2 with different performance characteristics and platforms; these were summarized in Table 5-1. These instance types determine the amount of CPU processing power, RAM, and storage space allocated to an instance.

Shared subsystems

Each instance runs as a virtual machine on underlying physical hardware, which may run multiple instances at a time. Although CPU, RAM, and storage resources are dedicated to each instance and perform at a constant rate, the networking and disk I/O subsystems are shared between all the instances running on an underlying machine, and their performance may fluctuate. Fortunately, this fluctuation will always be to your benefit.

The networking and I/O resources of the underlying hardware are shared equitably, such that each instance is guaranteed at least a baseline performance level; however, if an instance is fortunate enough to be running on a host where the other instances are idle, it can access more of these resources than the guaranteed baseline. This means that some of your instances may occasionally perform better than expected, though this benefit is entirely unpredictable.

Network bandwidth

Instances are allowed 250 Mbps of network bandwidth within the EC2 network. Bandwidth outside of the EC2 environment will vary.

Storage space initialization

The first write operation to any location on the virtualized disks used by EC2 instances will be slow. Only after an initial write has been performed to a specific disk location will subsequent writes to that location run at full speed. For applications where write operations must occur at top speed from the very beginning, you will have to initialize your instance's storage space by writing once to every location. This task can take hours, depending on the instance type you are using and how many partitions you must initialize.
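The initialization pass described above amounts to writing once over every byte of the storage you intend to use. The sketch below shows the idea with a hypothetical `initialize_storage` function that writes zeros in fixed-size chunks. It is an illustration under stated assumptions, not a tuned tool: you must supply the path and the size yourself, and the write destroys any data already stored at that path.

```python
import os


def initialize_storage(path, total_bytes, chunk_size=1024 * 1024):
    """Write zeros once over the first total_bytes of path.

    WARNING: this overwrites whatever is stored at path.
    Returns the number of bytes written.
    """
    zeros = bytes(chunk_size)
    written = 0
    with open(path, "wb") as target:
        while written < total_bytes:
            count = min(chunk_size, total_bytes - written)
            target.write(zeros[:count])
            written += count
        # Push the data out of Python's and the kernel's buffers.
        target.flush()
        os.fsync(target.fileno())
    return written
```

You would normally run a pass like this once, before putting an application with demanding write requirements into service.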

RAID

To improve the performance of software-based RAID on an instance, you can increase the default minimum reconstruction speed from 1 MBps to 30 MBps. This will help to overcome the initial write-speed penalty imposed by the virtualized disks.
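The text does not say where this setting lives. On Linux systems that use the md software RAID driver, the minimum reconstruction speed is commonly exposed at `/proc/sys/dev/raid/speed_limit_min`, expressed in kilobytes per second. The sketch below assumes that location and unit; the function names are hypothetical, and writing the real setting requires root privileges.

```python
# Assumed location of the Linux md driver's minimum reconstruction
# speed setting; the value is in kilobytes per second.
SPEED_LIMIT_MIN = "/proc/sys/dev/raid/speed_limit_min"


def set_min_reconstruction_speed(kb_per_sec, path=SPEED_LIMIT_MIN):
    """Write a new minimum reconstruction speed to the setting file."""
    with open(path, "w") as setting:
        setting.write("%d\n" % kb_per_sec)


def get_min_reconstruction_speed(path=SPEED_LIMIT_MIN):
    """Read the current minimum reconstruction speed as an integer."""
    with open(path) as setting:
        return int(setting.read().strip())
```

Under these assumptions, calling `set_min_reconstruction_speed(30000)` would raise the minimum to roughly 30 MBps. The change does not persist across reboots unless you also record it in your system configuration.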

^[2] We recommend you find a good book or two on Linux server administration. Two worthy offerings from O’Reilly are Linux System Administration by Tom Adelstein and Bill Lubanovic, and Linux in a Nutshell by Ellen Siever et al.

^[*] Some dynamic DNS services provide additional features, such as basic load balancing for multiple instances, which is performed by providing the IP address of a different instance for each domain name look-up query the service answers.