CHAPTER 4 Configuring a Reverse Proxy with Nginx

This chapter aims to explain how a reverse proxy is used and what it does. You will also learn how to create a simple page in Python using the framework Flask and make it available using the web server Gunicorn.

Structure

In this chapter, we will discuss the following topics:

Installing Nginx
Installing Python
Creating a simple page using Flask
Configuring reverse proxy

Objectives

After studying this unit, you should be able to answer:

What is Nginx and how it works
What is a reverse proxy
The basics of a web page in Python
How to deploy an application using Gunicorn

Installing Nginx

The name Nginx comes from the way you pronounce "Engine X". It is one of the most popular web servers and is frequently used as a load balancer or as a reverse proxy. One of its best capabilities is handling a large number of requests in parallel: instead of dedicating a thread to each connection, it uses a small number of worker processes, each of which handles many connections through an event-driven loop. It is frequently used with Python applications, where it receives the requests and passes them to a Python server such as Gunicorn, as we will see in the following pages.

To begin with, let’s install Nginx by running the following command:

root@devops:~# apt install nginx -y

Reading package lists… Done

Building dependency tree

Reading state information… Done

The following additional packages will be installed:

fontconfig-config fonts-dejavu-core libfontconfig1 libgd3 libjbig0 libjpeg-turbo8 libjpeg8 libnginx-mod-http-geoip libnginx-mod-http-image-filter

libnginx-mod-http-xslt-filter libnginx-mod-mail libnginx-mod-stream libtiff5 libwebp6 libxpm4 nginx-common nginx-core

Suggested packages:

libgd-tools fcgiwrap nginx-doc

The following NEW packages will be installed:

fontconfig-config fonts-dejavu-core libfontconfig1 libgd3 libjbig0 libjpeg-turbo8 libjpeg8 libnginx-mod-http-geoip libnginx-mod-http-image-filter

libnginx-mod-http-xslt-filter libnginx-mod-mail libnginx-mod-stream libtiff5 libwebp6 libxpm4 nginx nginx-common nginx-core

0 upgraded, 18 newly installed, 0 to remove and 72 not upgraded.

Need to get 2,461 kB of archives.

After the installation finishes, you can list the configuration files in the /etc/nginx folder with the following command:

root@devops:~# ls /etc/nginx/

conf.d fastcgi_params koi-win modules-available nginx.conf scgi_params sites-enabled uwsgi_params

fastcgi.conf koi-utf mime.types modules-enabled proxy_params sites-available snippets win-utf

You probably noticed some similarities with Apache. For example, here we also have the folders sites-available and sites-enabled, so you already know what these directories do. However, the most important file is nginx.conf, which stores the main configuration of the web server. We can use the grep command to show only the enabled lines, in other words, the lines that are not comments:

root@devops:~# grep -v "#" /etc/nginx/nginx.conf

user www-data;

worker_processes auto;

pid /run/nginx.pid;

include /etc/nginx/modules-enabled/*.conf;

events {

worker_connections 768;

}

http {

sendfile on;

tcp_nopush on;

tcp_nodelay on;

keepalive_timeout 65;

types_hash_max_size 2048;

include /etc/nginx/mime.types;

default_type application/octet-stream;

ssl_prefer_server_ciphers on;

access_log /var/log/nginx/access.log;

error_log /var/log/nginx/error.log;

gzip on;

include /etc/nginx/conf.d/*.conf;

include /etc/nginx/sites-enabled/*;

}

This file shows some important settings. The worker_connections directive sets the number of connections per worker, so each worker process can handle up to 768 simultaneous connections. The worker_processes directive sets the number of workers. When it is omitted, the value is 1; here it is set to auto, which lets Nginx start one worker per CPU processor it detects. If you want to check how many processors you have, run the following command:

root@devops:~# cat /proc/cpuinfo | grep processor

processor : 0

The output shows only one processor because I ran this command on my VM, which has a single processor. If I run it on my physical machine, the result is as follows:

alisson@avell:~$ cat /proc/cpuinfo | grep processor

processor : 0

processor : 1

processor : 2

processor : 3

processor : 4

processor : 5

processor : 6

processor : 7

Therefore, in this case, I can configure Nginx to use 8 workers. If you want to do that, change the worker_processes line at the top of nginx.conf, outside the events block, to the following:

worker_processes 8;

Restart Nginx and the new configuration takes effect.
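As a rough cross-check of the numbers above, the following Python sketch counts the processor entries in /proc/cpuinfo-style text and multiplies workers by connections per worker to get an upper bound on simultaneous connections. The function names count_processors and max_connections are made up for this illustration, and the bound ignores other limits such as the number of open file descriptors.

```python
import os


def count_processors(cpuinfo_text):
    """Count the lines whose key is exactly 'processor' in /proc/cpuinfo text."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


def max_connections(worker_processes, worker_connections):
    """Upper bound on simultaneous connections across all workers."""
    return worker_processes * worker_connections


if __name__ == "__main__":
    # os.cpu_count() asks the operating system directly, without parsing files
    print("CPUs reported by Python:", os.cpu_count())
    # 8 workers with 768 connections each, as in the example configuration
    print("Upper bound on connections:", max_connections(8, 768))
```

With the values used in this chapter, 8 workers with 768 connections each give an upper bound of 6144 simultaneous connections.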

We just saw some important files. Now it is time to test our web server. In the last chapter, we installed the Apache web server in the virtual machine, so it is probably still running. Let's check whether any process is listening on port 80 (for example, with ss -ntpl).

The command prints all the services that are currently listening on my VM. If Apache is one of them, let's stop it, disable the service, and start Nginx:

root@devops:~# systemctl stop apache2

root@devops:~# systemctl disable apache2

Synchronizing state of apache2.service with SysV service script with /lib/systemd/systemd-sysv-install.

Executing: /lib/systemd/systemd-sysv-install disable apache2

Now, even if you restart your VM, Apache will remain stopped. To start and enable Nginx, run the following commands:

root@devops:~# systemctl start nginx

root@devops:~# systemctl enable nginx

Synchronizing state of nginx.service with SysV service script with /lib/systemd/systemd-sysv-install.

Executing: /lib/systemd/systemd-sysv-install enable nginx
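Before opening the browser, you can confirm from Python that something is accepting connections on port 80. This is only a minimal sketch: the helper name is_port_open is hypothetical, and it checks that a TCP connection succeeds, not which service is answering.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Run this on the VM itself; replace 127.0.0.1 with the VM address
    # to check from another machine
    print("Port 80 open:", is_port_open("127.0.0.1", 80))
```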

Open the IP address in the browser to check the index page, http://192.168.178.62/:

Figure 4.1

It is interesting that it is displaying the Apache example page. This happens because Nginx is using the same folder as Apache for the document root. You can check that using the following command:

root@devops:~# grep -v "#" /etc/nginx/sites-enabled/default

server {

listen 80 default_server;

listen [::]:80 default_server;

root /var/www/html;

index index.html index.htm index.nginx-debian.html;

server_name _;

location / {

try_files $uri $uri/ =404;

}

}

The grep command showed you that the document root is defined as /var/www/html, which is the same as Apache. If you want to see the default index page of Nginx, run the following command:

root@devops:~# cp -v /usr/share/nginx/html/index.html /var/www/html/

'/usr/share/nginx/html/index.html' -> '/var/www/html/index.html'

Figure 4.2

Now, you can see the Nginx default page.

Installing Python

Python is nowadays one of the most popular programming languages, mainly in areas like Data Science and DevOps. It is frequently used to automate Linux tasks and integrate systems using REST APIs. Some years ago, I worked on a system called BeavOps. I was responsible for enrolling the students for the consulting company I used to work for.

That system received a JSON file from the CRM with the student data, the respective course, and the start and end dates. The mission was to create routines that used this data to integrate many different tools, like GitLab, Jenkins, Apache, MongoDB, and so on, aiming to create a development environment with a CI/CD pipeline for the students. That system was built entirely with Python and Flask, using MongoDB to store the data.

To install Python, we basically need two packages:

root@devops:~# apt install python3 python3-pip -y

Reading package lists… Done

Building dependency tree

Reading state information… Done

python3 is already the newest version (3.6.7-1~18.04).

python3 set to manually installed.

The package names are self-explanatory: python3 is the interpreter and python3-pip is its package installer. After the installation finishes, install the Flask framework using pip (for example, pip3 install flask).

Now, your environment is ready to run a Python application.

Creating a simple page using Flask

Flask is a micro-framework for Python, which is perfect for you to create your APIs. Now, you will learn how to create a simple Hello World using this micro-framework and Python. You will also learn all the steps to deploy your application using Nginx and Gunicorn.

Let's create a file called app.py with the following content:

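from flask import Flask

app = Flask(__name__)

# Answer requests for the root URL with a plain text body
@app.route("/")
def index():
    return "DevOps with Linux"

if __name__ == "__main__":
    # Development server only: listens on all interfaces, port 5000 by default
    app.run(debug=True, host="0.0.0.0")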

Now, run the following command:

root@devops:~# python3 app.py

* Serving Flask app "app" (lazy loading)

* Environment: production

WARNING: This is a development server. Do not use it in a production deployment.

Use a production WSGI server instead.

* Debug mode: on

* Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)

* Restarting with stat

* Debugger is active!

* Debugger PIN: 179-281-442

Now, you can see the development server running. As we did with PHP, get your IP address and access it from your web browser, this time on port 5000. You should see a page like the following one:

Figure 4.3

As the Flask log itself says, this server is just for development purposes. We must use a WSGI (Web Server Gateway Interface) server instead. If you want to learn more about that, you can access the PEP 333 page: https://www.python.org/dev/peps/pep-0333/. It explains the topic in more detail, but for now, our focus is to deploy the application.
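To give an idea of what the interface looks like, here is a minimal sketch of a WSGI application written with the standard library only. A WSGI server calls the application with the request environment and a start_response callback, and the application returns the response body. The Flask app object is itself a callable of this kind, which is why a WSGI server can run it. The names application and call_app are chosen just for this illustration.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI callable: receives the request, returns the body."""
    body = b"DevOps with Linux"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process, the way a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a fake but valid request
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_app(application))
```

The call_app helper plays the role of the server here. In a real deployment that role belongs to the WSGI server, which also handles sockets, worker processes, and HTTP parsing.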

The WSGI server that we will use is Gunicorn. It is just one of the many options that are available. To install it, use pip (for example, pip3 install gunicorn).

Now, we can deploy the application again, but using the WSGI server. To do this, run the following command:

root@devops:~# gunicorn --chdir /root/ app:app -b "0.0.0.0"

[2020-03-23 18:34:13 +0000] [23783] [INFO] Starting gunicorn 20.0.4

[2020-03-23 18:34:13 +0000] [23783] [INFO] Listening at: http://0.0.0.0:8000 (23783)

[2020-03-23 18:34:13 +0000] [23783] [INFO] Using worker: sync

[2020-03-23 18:34:13 +0000] [23786] [INFO] Booting worker with pid: 23786

Notice that the port has changed. Now, you must use port 8000:

Figure 4.4

It is as simple as that. However, we also want to run this application behind a reverse proxy. So, let's see how we are going to configure that.

Configuring the Reverse Proxy

We have already installed Nginx in our VM and we already know the configuration files. The configuration for the pages is stored in the virtual hosts, and the default one can be found in the following file:

root@devops:~# ls /etc/nginx/sites-available/default

/etc/nginx/sites-available/default

Open the file and edit it so that it has the following content:

server {
    listen 80 default_server;
    listen [::]:80 default_server;

    server_name _;

    location / {
        proxy_pass http://localhost:8000/;
    }
}

Restart the server to enable the configuration:

root@devops:~# systemctl restart nginx

root@devops:~#

If you try to access the server using the default port 80, you will see the following message:

Figure 4.5

It happens because the server behind our proxy is not running yet. So, let's run it and check the changes:

root@devops:~# gunicorn --chdir /root/ app:app

[2020-03-23 18:52:26 +0000] [24025] [INFO] Starting gunicorn 20.0.4

[2020-03-23 18:52:26 +0000] [24025] [INFO] Listening at: http://127.0.0.1:8000 (24025)

[2020-03-23 18:52:26 +0000] [24025] [INFO] Using worker: sync

[2020-03-23 18:52:26 +0000] [24028] [INFO] Booting worker with pid: 24028

This time we do not need to pass the bind option, -b, to make the server available on all IP addresses. In the current case, Gunicorn is listening only on localhost. Therefore, Nginx is the one open to the world, receiving the connections and forwarding them to Gunicorn.
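To make the forwarding idea concrete, here is a toy reverse proxy written with the Python standard library. It is only a sketch of the concept and not how Nginx is implemented: BackendHandler is a stand-in for Gunicorn, and the names make_proxy_handler, start_server, and OPENER are invented for this example. When the backend cannot be reached, the proxy answers 502 Bad Gateway, which is also the status Nginx reports when its upstream is down.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Ignore any http_proxy environment settings so we talk to localhost directly
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class BackendHandler(BaseHTTPRequestHandler):
    """Stand-in for Gunicorn: answers every GET with a fixed body."""

    def do_GET(self):
        body = b"DevOps with Linux"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def make_proxy_handler(upstream):
    """Build a handler class that forwards every GET to `upstream`."""

    class ProxyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            try:
                with OPENER.open(upstream + self.path, timeout=5) as resp:
                    status, body = resp.status, resp.read()
            except urllib.error.HTTPError as err:
                # The backend answered with an error status: pass it through
                status, body = err.code, err.read()
            except OSError:
                # The backend could not be reached at all
                status, body = 502, b"Bad Gateway"
            self.send_response(status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass

    return ProxyHandler


def start_server(handler, port=0):
    """Start an HTTP server on localhost in a background thread."""
    server = HTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    backend = start_server(BackendHandler)
    upstream = "http://127.0.0.1:%d" % backend.server_address[1]
    proxy = start_server(make_proxy_handler(upstream))
    url = "http://127.0.0.1:%d/" % proxy.server_address[1]
    with OPENER.open(url, timeout=5) as resp:
        print(resp.status, resp.read())
    proxy.shutdown()
    backend.shutdown()
```

The client only ever talks to the proxy address, just as the browser only talks to Nginx on port 80 while Gunicorn stays reachable on localhost alone.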

If you try to access the server again using the default port, you will see that the service is now running:

Figure 4.6

The environment is ready. Reverse proxies are used in many strategies. One of the most common is serving and caching static files, like JavaScript, CSS, and HTML. In that setup, all the incoming connections are answered by Nginx and only the requests for dynamic content are forwarded to Gunicorn, which costs less CPU than having the application server process everything. In future chapters, we will see how to apply the same logic using Cloud services, which are the most common option today. But if you are using on-premises infrastructure, this can be a good strategy to increase performance.
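The split between static and dynamic content can be pictured as a simple decision made for each request path. The sketch below is only an illustration of that decision in Python. The extension list and the names is_static and choose_handler are assumptions made for this example, and in a real setup the rule would live in the Nginx configuration rather than in application code.

```python
import posixpath

# Assumed list of static file types, taken from the examples in the text
STATIC_EXTENSIONS = {".js", ".css", ".html"}


def is_static(path):
    """Return True when the request path points to a static file type."""
    path_only = path.split("?", 1)[0]  # drop the query string, if any
    extension = posixpath.splitext(path_only)[1].lower()
    return extension in STATIC_EXTENSIONS


def choose_handler(path):
    """Static files stay on the proxy; everything else goes to the backend."""
    return "nginx" if is_static(path) else "gunicorn"


if __name__ == "__main__":
    for request_path in ("/static/app.js", "/style.css?v=2", "/", "/api/users"):
        print(request_path, "->", choose_handler(request_path))
```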