Dockerfiles are like recipes for your applications, but you can't just throw in the ingredients and hope for the best. Creating an efficient image requires you to be careful about how you utilize the tools at your disposal.
The whole point of containers is to have a small footprint; a 1 GB+ image for a 100 MB application is neither small nor efficient. The same goes for microservices: small container images not only improve performance, but also reduce storage utilization, security vulnerabilities, and points of failure, and they save you money.
Container images are stored locally on your host and remotely in a container registry. Public cloud providers charge you for the storage that your registry consumes, not for the number of images that you have stored there. Think of a registry as the GitHub of containers. Let's say that you have to pull an image from your cloud provider's registry; which image do you think will be faster to pull, a 1 GB image or a 100 MB image? Image size is essential.
The first thing to consider when building an image is the base image that you are going to use. Instead of using large images (such as full Linux distributions like Ubuntu, Debian, or CentOS) that ship many tools and executables your application does not need in order to run, use smaller ones such as Alpine:
REPOSITORY | SIZE |
centos | 200 MB |
ubuntu | 83.5 MB |
debian | 101 MB |
alpine | 4.41 MB |
You will find that many popular images also publish a slimmer variant of themselves, for example, httpd and nginx:
REPOSITORY | TAG | SIZE |
httpd | alpine | 91.4 MB |
httpd | latest | 178 MB |
nginx | alpine | 18.6 MB |
nginx | latest | 109 MB |
As you can see, httpd:alpine is almost 50% smaller than httpd:latest, while nginx:alpine is more than 80% smaller than nginx:latest!
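As a minimal sketch of switching to a slimmer variant, the following Dockerfile builds on httpd:alpine rather than httpd:latest. The ./my_site/ source directory is a hypothetical path in the build context, and the destination is assumed to be the document root used by the official httpd image; adjust both to match your own setup.

```dockerfile
# Sketch only: serve static files from the Alpine-based httpd variant.
# ./my_site/ is a hypothetical directory in the build context.
FROM httpd:alpine

# /usr/local/apache2/htdocs/ is the document root of the official httpd image.
COPY ./my_site/ /usr/local/apache2/htdocs/
```

Because the base image already defines how the server starts, no CMD is needed here; only the tag in the FROM line changes between the large and the small variant.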
Smaller images will not only reduce your storage consumption, but they will also reduce your attack surface, because an image that ships fewer packages has fewer components that can carry vulnerabilities. Let's take a look at the latest Ubuntu image versus the latest Alpine image.
For Ubuntu, the Docker Hub page for the latest tag reports a number of vulnerabilities; this is captured in the following screenshot:
For Alpine Linux, the count goes down to zero, as demonstrated in the following screenshot:
The preceding screenshot shows Alpine's vulnerability count for comparison with Ubuntu's. At the time of writing, the latest Alpine image reported no known vulnerabilities, whereas Ubuntu had seven vulnerable components, none of which our application needs in order to run.
Another thing to take into account is the layering of your image: each RUN instruction in the build adds one more layer, and with it more size, to your final image. Reducing the number of RUN instructions, and what you run in them, can noticeably decrease your image size.
Let's take our first Dockerfile, as follows:
FROM ubuntu:latest
LABEL maintainer="WebAdmin@company.com"
RUN apt update
RUN apt install -y apache2
RUN mkdir /var/log/my_site
ENV APACHE_LOG_DIR /var/log/my_site
ENV APACHE_RUN_DIR /var/run/apache2
ENV APACHE_RUN_USER www-data
ENV APACHE_RUN_GROUP www-data
COPY /my_site/ /var/www/html/
EXPOSE 80
CMD ["/usr/sbin/apache2","-D","FOREGROUND"]
We can combine the three RUN instructions into one, as follows:
RUN apt update && \
    apt install -y apache2 --no-install-recommends && \
    apt clean && \
    mkdir /var/my_site/ /var/log/my_site
Now instead of creating three layers, we will be producing only one, by running all our commands in a single statement.
Remember that everything you do in RUN is executed with /bin/sh -c, or with whatever shell you specified using SHELL, so &&, ;, and \ behave as they would in a regular shell.
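The choice between && and ; matters more than it may seem. The following sketch, which assumes an Alpine base image and uses an illustrative log file name, shows how each operator behaves inside a RUN instruction.

```dockerfile
# Sketch only: how && and ; differ inside RUN (default shell: /bin/sh -c).
FROM alpine

# With &&, the chain stops at the first failing command, so the build
# fails instead of committing a half-finished layer.
RUN mkdir /var/log/my_site && \
    touch /var/log/my_site/access.log

# With ;, later commands run even if an earlier one fails. Here mkdir fails
# because the directory already exists, yet the step still succeeds because
# ls, the last command, exits with status 0.
RUN mkdir /var/log/my_site ; \
    ls /var/log/my_site
```

For installation steps, && is usually the safer choice, since a failed download or install then stops the build immediately.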
However, we didn't only remove the extra RUN instructions; we also added apt clean to clear the package cache before the layer is committed, and used the --no-install-recommends flag to avoid installing any unnecessary packages, thus reducing both storage space and the attack surface:
Here are the details of the original image:
REPOSITORY | SIZE |
bigimage | 221 MB |
Here are the details of the smaller image:
REPOSITORY | SIZE |
smallerimage | 214 MB |
Of course, this is not a huge difference, but this is only an example and no real application was being installed. In a production image, you will have to do more than just install apache2.
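One likely reason the saving is modest is that, according to Docker's build best-practices guide, the official Ubuntu and Debian images already run the equivalent of apt clean automatically. A cleanup that the same guide recommends is removing the package lists that apt update downloads. The following is a sketch of that variation, not a measured result; it also swaps apt for apt-get, which is intended for scripted use.

```dockerfile
# Sketch only: same packages as before, with the apt package lists removed
# in the same layer that created them.
FROM ubuntu:latest
RUN apt-get update && \
    apt-get install -y --no-install-recommends apache2 && \
    rm -rf /var/lib/apt/lists/* && \
    mkdir /var/log/my_site
```

The removal has to happen in the same RUN instruction as the update; deleting the lists in a later instruction would leave them stored in the earlier layer.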
Now let's use both of the techniques that we have learned and slim our image down:
FROM alpine
RUN apk update && \
    apk add mini_httpd && \
    mkdir /var/log/my_site
COPY /my_site/ /var/www/localhost/htdocs/
EXPOSE 80
CMD ["/usr/sbin/mini_httpd", "-D", "-d", "/var/www/localhost/htdocs/"]
Here is the final size of the image:
REPOSITORY | SIZE |
finalimage | 5.79 MB |
Now you can see that there is a great difference in sizes: we went from 221 MB to 214 MB, and finally ended up with a 5.79 MB image! All three images do the exact same thing, which is to serve a web page, but with an entirely different footprint.
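The Alpine build can probably be trimmed a little further. The sketch below assumes the same mini_httpd setup as above and uses the --no-cache option of apk add, which fetches the package index on the fly and does not keep it in the image, so the separate apk update step is no longer needed. The resulting size was not measured here.

```dockerfile
# Sketch only: the final image again, without a stored apk index.
FROM alpine
RUN apk add --no-cache mini_httpd && \
    mkdir /var/log/my_site
COPY /my_site/ /var/www/localhost/htdocs/
EXPOSE 80
CMD ["/usr/sbin/mini_httpd", "-D", "-d", "/var/www/localhost/htdocs/"]
```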