Building container images using best practices

Dockerfiles are like recipes for your applications, but you can't just throw in the ingredients and hope for the best. Creating an efficient image requires you to be careful about how you utilize the tools at your disposal.

The whole point of containers is to have a small footprint; having a 1 GB+ image for a 100 MB application is not indicative of a small footprint, nor is it efficient at all. Microservices are all about this as well: small container images for your microservices not only improve performance, but also reduce storage utilization, security vulnerabilities, and points of failure, and they save you money.

Container images are stored locally on your host and remotely in a container registry. Public cloud providers charge you for the storage utilization of your registry, not for the number of images that you have stored there. Think of a registry as the GitHub of containers. Let's say that you have to pull an image from your cloud provider's registry; which image do you think will be faster to pull: a 1 GB image or a 100 MB image? The image size is essential.
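
To put rough numbers on that question, here is a minimal sketch that estimates transfer time from image size and link speed. The function name estimate_pull_seconds and the 100 Mbps link are assumptions made purely for illustration; a real pull also depends on layer compression, registry latency, and which layers are already cached locally.

```shell
# estimate_pull_seconds SIZE_MB LINK_MBPS
# Rough transfer time in seconds: megabytes * 8 bits / megabits per second.
# Hypothetical helper; it ignores compression, latency, and cached layers.
estimate_pull_seconds() {
    LC_ALL=C awk -v size_mb="$1" -v mbps="$2" \
        'BEGIN { printf "%.0f\n", size_mb * 8 / mbps }'
}

# Assumed link speed of 100 Mbps; 1 GB is treated as 1000 MB.
echo "1 GB image:   $(estimate_pull_seconds 1000 100) s"
echo "100 MB image: $(estimate_pull_seconds 100 100) s"
```

Under these assumptions the larger image takes ten times as long to pull, and that cost is paid again on every host that needs the image.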

The first thing to consider when building an image is the base image that you are going to use. Instead of using large images (such as full Linux distributions like Ubuntu, Debian, or CentOS) that have a lot of tools and executables that your application does not need in order to run, use smaller ones such as Alpine:

REPOSITORY    SIZE
centos        200 MB
ubuntu        83.5 MB
debian        101 MB
alpine        4.41 MB

You will find that most of the images have a slimmer version of themselves, for example, httpd and nginx:

REPOSITORY    TAG       SIZE
httpd         alpine    91.4 MB
httpd         latest    178 MB
nginx         alpine    18.6 MB
nginx         latest    109 MB

As you can see, httpd:alpine is almost 50% smaller than httpd:latest, while nginx:alpine is more than 80% smaller!
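
The percentages can be checked with a little arithmetic. The following is a small sketch, with percent_smaller being a hypothetical helper name, that takes the sizes in MB from the listing above and prints how much smaller the slim tag is than the full one.

```shell
# percent_smaller FULL_MB SLIM_MB
# Prints the size reduction of the slim image relative to the full one.
# Hypothetical helper, used here only to verify the numbers in the table.
percent_smaller() {
    LC_ALL=C awk -v full="$1" -v slim="$2" \
        'BEGIN { printf "%.1f\n", (full - slim) / full * 100 }'
}

echo "httpd:alpine vs httpd:latest: $(percent_smaller 178 91.4)% smaller"
echo "nginx:alpine vs nginx:latest: $(percent_smaller 109 18.6)% smaller"
```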

Smaller images will not only reduce your storage consumption, but they will also reduce your attack surface. This is because a smaller image ships fewer packages and executables that could contain vulnerabilities; let's take a look at the latest Ubuntu image versus the latest Alpine.

For Ubuntu, we can see an increased count for vulnerabilities as per the Docker Hub page for the latest tag; this is captured in the following screenshot:

For Alpine Linux, the count goes down to zero, as demonstrated in the following screenshot:

In the preceding screenshot, we can see how low the vulnerability count is when compared to Ubuntu. At the time of writing, the latest Alpine image has no reported vulnerabilities whatsoever. In comparison, Ubuntu has seven vulnerable components that are not even needed for our application to run.

Another thing to take into account is the layering of your image; each RUN instruction in the build adds one more layer, and with it more size, to your final image. Reducing the number of RUN instructions, and what you run in them, helps decrease your image size.

Let's take our first Dockerfile, as follows: 

FROM ubuntu:latest
LABEL maintainer="WebAdmin@company.com"

RUN apt update
RUN apt install -y apache2
RUN mkdir /var/log/my_site

ENV APACHE_LOG_DIR=/var/log/my_site
ENV APACHE_RUN_DIR=/var/run/apache2
ENV APACHE_RUN_USER=www-data
ENV APACHE_RUN_GROUP=www-data

COPY /my_site/ /var/www/html/

EXPOSE 80

CMD ["/usr/sbin/apache2","-D","FOREGROUND"]

We can rewrite the three RUN instructions in the following way:

RUN apt update && \
    apt install -y apache2 --no-install-recommends && \
    apt clean && \
    mkdir /var/log/my_site

Now instead of creating three layers, we will be producing only one, by running all our commands in a single statement.
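
As a quick sanity check on the layer count, the sketch below counts the RUN instructions in two trimmed-down copies of the Dockerfiles discussed above. The helper name count_run_instructions and the temporary file names are assumptions for illustration only; the Dockerfiles are reduced to their FROM and RUN lines, and the script only counts instructions, it does not build anything.

```shell
# count_run_instructions DOCKERFILE
# Counts lines that start a RUN instruction; continuation lines ending in
# a backslash belong to the same instruction and are not counted again.
count_run_instructions() {
    grep -ciE '^[[:space:]]*RUN[[:space:]]' "$1"
}

workdir=$(mktemp -d)

# Trimmed copy of the first Dockerfile: three separate RUN instructions.
cat > "$workdir/Dockerfile.before" <<'EOF'
FROM ubuntu:latest
RUN apt update
RUN apt install -y apache2
RUN mkdir /var/log/my_site
EOF

# Trimmed copy of the rewritten version: one chained RUN instruction.
cat > "$workdir/Dockerfile.after" <<'EOF'
FROM ubuntu:latest
RUN apt update && \
    apt install -y apache2 --no-install-recommends && \
    apt clean && \
    mkdir /var/log/my_site
EOF

before_count=$(count_run_instructions "$workdir/Dockerfile.before")
after_count=$(count_run_instructions "$workdir/Dockerfile.after")
echo "RUN instructions before: $before_count, after: $after_count"

rm -rf "$workdir"
```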

Remember that everything you do in RUN is executed with /bin/sh -c, or with any other shell that you specified with SHELL, so &&, ;, and \ are accepted as they would be in a regular shell.
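
The choice of operator matters. The following sketch runs a few throwaway commands through sh -c, the same way a shell-form RUN instruction would, to show how ; and && differ when a command fails and how \ joins lines. The commands and variable names are placeholders chosen for illustration.

```shell
# With ';' the second command runs even though the first one failed.
semicolon_output=$(sh -c 'false; echo "ran anyway"')

# With '&&' the chain stops at the first failure and reports its status.
and_status=0
and_output=$(sh -c 'false && echo "never printed"') || and_status=$?

# A backslash at the end of a line continues the command on the next line.
joined_output=$(sh -c 'echo one \
two')

echo "';'  output: $semicolon_output"
echo "'&&' output: '$and_output' (exit status $and_status)"
echo "'\\'  output: $joined_output"
```

In a Dockerfile this is why chained commands usually use &&: if the package index update fails, the whole RUN instruction fails and the build stops, instead of carrying on with a broken layer.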

However, we didn't only remove the extra RUN instructions; we also added apt clean to clean the package cache before the layer is committed, and used the --no-install-recommends flag to avoid installing any unnecessary packages, thus reducing both storage space and the attack surface.

Here are the details of the original image:

REPOSITORY      SIZE
bigimage        221 MB

Here are the details of the smaller image:

REPOSITORY      SIZE
smallerimage    214 MB

Of course, this is not a huge difference, but this is only an example and no real application was being installed. In a production image, you will have to do more than just install apache2.

Now let's use both of the techniques that we have learned and slim our image down:

FROM alpine

RUN apk update && \
    apk add mini_httpd && \
    mkdir /var/log/my_site

COPY /my_site/ /var/www/localhost/htdocs/
EXPOSE 80

CMD ["/usr/sbin/mini_httpd", "-D", "-d", "/var/www/localhost/htdocs/"]

Here is the final size of the image:

REPOSITORY      SIZE
finalimage      5.79 MB

Now you can see that there is a great difference in sizes: we went from 221 MB to 214 MB, and finally ended up with a 5.79 MB image! All three images did the exact same thing, which was to serve a web page, but with an entirely different footprint.
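
To see the whole progression at a glance, here is a minimal sketch that takes the three sizes from the image listings above and reports each one relative to the first. The function name report_sizes is hypothetical, and the input format (name and size in MB, one image per line) is an assumption for this example rather than real docker output.

```shell
# report_sizes reads "NAME SIZE_MB" lines on standard input and prints each
# size as a percentage of the first line, which is treated as the baseline.
report_sizes() {
    LC_ALL=C awk '
        NR == 1 { baseline = $2 }
        { printf "%-13s %7.2f MB  %5.1f%% of original\n", $1, $2, $2 / baseline * 100 }
    '
}

# Sizes copied from the three image listings in this section.
report_sizes <<'EOF'
bigimage 221
smallerimage 214
finalimage 5.79
EOF
```

Merging the RUN instructions alone keeps the image at roughly 97% of its original size, while switching the base image brings it down to under 3%.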