In-place upgrades

The first type of deployment that we'll cover is the in-place upgrade. This style of deployment operates on infrastructure that already exists in order to upgrade the existing application. It is the traditional model, dating from a time when creating new infrastructure was a costly endeavor in terms of both time and money.

A general design pattern to minimize the downtime during this type of upgrade is to deploy the application across multiple hosts, behind a load balancer. The load balancer will act as a gateway between users of the application and the servers that run the application. Requests for the application will come to the load balancer, and, depending on the configuration, the load balancer will decide which backend server to direct the requests to.

To perform a rolling in-place upgrade of an application deployed with this pattern, each server (or a small subset of the servers) will be disabled at the load balancer, upgraded, and then re-enabled to take on new requests. This process will be repeated for the remaining servers in the pool, until all servers have been upgraded. As only a portion of the available application servers are taken offline to be upgraded, the application as a whole remains available for requests. Of course, this assumes that an application can perform well with mixed versions running at the same time.

Let's build a playbook to upgrade a fictional application. Our fictional application will run on servers foo-app01 through foo-app08, which exist in the foo-app group. These servers will host a simple website that's served via the nginx web server, with the content coming from a foo-app Git repository, defined by the foo_app.repo variable (hyphens aren't valid in Ansible variable names, so we use an underscore). A load balancer server, foo-lb, running the haproxy software, will front these app servers.
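
An inventory to support this example might look something like the following sketch, using the YAML inventory format; the repository URL and version value are placeholders, and defining foo_app and foo_version as group variables here is just one possible arrangement:

all:
  hosts:
    foo-lb:                # the haproxy load balancer
  children:
    foo-app:               # the application servers behind the load balancer
      hosts:
        foo-app01:
        foo-app02:
        foo-app03:
        foo-app04:
        foo-app05:
        foo-app06:
        foo-app07:
        foo-app08:
      vars:
        foo_app:
          repo: "https://git.example.com/foo-app.git"   # placeholder repository URL
        foo_version: "1.0.0"                            # placeholder version to deploy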

In order to operate on a subset of our foo-app servers, we need to employ the serial mode. This mode changes how Ansible will execute a play. By default, Ansible will execute the tasks of a play across each host in the order that the tasks are listed. Ansible executes each task of the play on every host before it moves on to the next task in the play. If we were to use the default method, our first task would remove every server from the load balancer, which would result in the complete outage of our application. Instead, the serial mode lets us operate on a subset, so that the application as a whole stays available, even if some of the members are offline. In our example, we'll use a serial amount of 2, in order to keep the majority of the application members online:

--- 
- name: Upgrade foo-app in place 
  hosts: foo-app 
  serial: 2 

Ansible 2.2 introduced the concept of serial batches: a list of numbers that can increase the number of hosts addressed in each pass through the play. This allows the size of each batch to grow as confidence in the upgrade increases. When a list of numbers is provided to the serial keyword, the last number provided is used as the size of any remaining batches, until all of the hosts in the play have been processed.
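
For instance, a play header like the following hypothetical one would upgrade a single host first, then three at a time, and then six at a time until every host in the group has been processed:

---
- name: Upgrade foo-app in increasing batches
  hosts: foo-app
  serial:
    - 1
    - 3
    - 6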

Now, we can start to create our tasks. The first task will be to disable the host from the load balancer. The load balancer runs on the foo-lb host; however, we're operating on the foo-app hosts. Therefore, we need to delegate the task by using the delegate_to task operator. This operator redirects where Ansible will connect to in order to execute the task, but it keeps all of the variable contexts of the original host. We'll use the haproxy module to disable the current host from the foo-app backend pool:

  tasks: 
  - name: disable member in balancer 
    haproxy: 
      backend: foo-app 
      host: "{{ inventory_hostname }}" 
      state: disabled 
    delegate_to: foo-lb 

With the host disabled, we can now update the foo-app content. We'll use the git module to update the content path with the desired version, defined as foo_version. We'll add a notify handler to this task to reload the nginx server if the content update results in a change. The reload could be triggered on every run, but we're using it here as an example usage of notify:

  - name: pull stable foo-app 
    git: 
      repo: "{{ foo-app.repo }}" 
      dest: /srv/foo-app/ 
      version: "{{ foo-version }}" 
    notify: 
      - reload nginx 

Our next step would be to re-enable the host in the load balancer; however, if we did that task next, we'd put the old version back in place, as our notified handler hasn't run yet. So, we need to trigger our handlers early, by way of the meta: flush_handlers call, which you learned about in Chapter 9, Extending Ansible:

  - meta: flush_handlers 

Now, we can re-enable the host in the load balancer. We could just enable it right away and rely on the load balancer to wait until the host is healthy before sending requests to it. However, because we are running with a reduced number of available hosts, we need to ensure that all of the remaining hosts are healthy. We can make use of a wait_for task to wait until the nginx service is once again serving connections. The wait_for module waits for a condition on either a port or a file path. In our example, we will wait for port 80 to reach the started state (the default), which means the port is accepting connections:

  - name: ensure healthy service 
    wait_for: 
      port: 80 
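
The wait_for module also accepts parameters beyond the port; if nginx needs a moment before it starts accepting connections again, a variation along these lines could be used instead (the delay and timeout values here are illustrative, not prescriptive):

  - name: ensure healthy service
    wait_for:
      port: 80
      state: started   # the default; wait until the port accepts connections
      delay: 5         # pause before the first check
      timeout: 60      # give up if the port is not open within 60 seconds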

Finally, we can re-enable the member within haproxy. Once again, we'll delegate the task to foo-lb:

  - name: enable member in balancer 
    haproxy: 
      backend: foo-app 
      host: "{{ inventory_hostname }}" 
      state: enabled 
    delegate_to: foo-lb 

Of course, we still need to define our reload nginx handler:

  handlers: 
  - name: reload nginx 
    service: 
      name: nginx 
      state: reloaded 

This playbook, when run, will now perform a rolling in-place upgrade of our application.