AsyncIO for networking

AsyncIO was specifically designed for use with network sockets, so let's implement a DNS server. More accurately, let's implement one extremely basic feature of a DNS server.

The basic purpose of DNS is to translate domain names, such as www.python.org, into IP addresses, such as IPv4 addresses (for example, 23.253.135.79) or IPv6 addresses (such as 2001:4802:7901:0:e60a:1375:0:6). A real DNS server has to be able to perform many types of queries and know how to contact other DNS servers if it doesn't have the answer required. We won't be implementing any of this, but the following example is able to respond directly to a standard DNS query to look up IPs for a few sites:

    import asyncio
    from contextlib import suppress

    ip_map = {
        b"facebook.com.": "173.252.120.6",
        b"yougov.com.": "213.52.133.246",
        b"wipo.int.": "193.5.93.80",
        b"dataquest.io.": "104.20.20.199",
    }


    def lookup_dns(data):
        # The question name starts at byte 12, right after the DNS header.
        domain = b""
        pointer, part_length = 13, data[12]
        while part_length:
            domain += data[pointer : pointer + part_length] + b"."
            pointer += part_length + 1
            part_length = data[pointer - 1]

        # Unknown domains fall back to the loopback address.
        ip = ip_map.get(domain, "127.0.0.1")

        return domain, ip


    def create_response(data, ip):
        ba = bytearray
        packet = ba(data[:2]) + ba([129, 128]) + data[4:6] * 2
        packet += ba(4) + data[12:]
        packet += ba([192, 12, 0, 1, 0, 1, 0, 0, 0, 60, 0, 4])
        for x in ip.split("."):
            packet.append(int(x))
        return packet


    class DNSProtocol(asyncio.DatagramProtocol):
        def connection_made(self, transport):
            self.transport = transport

        def datagram_received(self, data, addr):
            print("Received request from {}".format(addr[0]))
            domain, ip = lookup_dns(data)
            print(
                "Sending IP {} for {} to {}".format(
                    domain.decode(), ip, addr[0]
                )
            )
            self.transport.sendto(create_response(data, ip), addr)


    # Create the loop explicitly; recent Python versions no longer create
    # one implicitly when asyncio.get_event_loop() is called.
    loop = asyncio.new_event_loop()
    transport, protocol = loop.run_until_complete(
        loop.create_datagram_endpoint(
            DNSProtocol, local_addr=("127.0.0.1", 4343)
        )
    )
    print("DNS Server running")

    with suppress(KeyboardInterrupt):
        loop.run_forever()
    transport.close()
    loop.close()

This example sets up a dictionary that dumbly maps a few domains to IPv4 addresses. It is followed by two functions that extract information from a binary DNS query packet and construct the response. We won't be discussing these; if you want to know more about DNS, read RFC (Request for Comments, the format for defining most Internet protocols) 1034 and 1035.

You can test this service by running the following command in another terminal:

    nslookup -port=4343 facebook.com localhost  
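If nslookup isn't installed, a few lines of Python can send a similar query. The following is only a rough sketch: build_query, extract_ipv4, and resolve are hypothetical helper names, and it assumes the server above is listening on 127.0.0.1 port 4343 and, like create_response, puts the four address bytes at the very end of its reply.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal standard query: one question, type A, class IN."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [part for part in name.encode("ascii").split(b".") if part]
    # Each label is sent as a length byte followed by the label itself;
    # a zero byte terminates the name.
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    return header + qname + struct.pack(">HH", 1, 1)


def extract_ipv4(response):
    """Assumes the reply ends with the 4-byte address, as our server's does."""
    return socket.inet_ntoa(response[-4:])


def resolve(name, server=("127.0.0.1", 4343), timeout=2.0):
    """Send one query over UDP and return the address from the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), server)
        response, _ = sock.recvfrom(512)
    return extract_ipv4(response)
```

With the server running, resolve("facebook.com") should return the address stored in ip_map for that domain, and any name that isn't in the dictionary should come back as 127.0.0.1.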

Let's get on with the entree. AsyncIO networking revolves around the intimately linked concepts of transports and protocols. A protocol is a class that has specific methods that are called when relevant events happen. Since DNS runs on top of UDP (User Datagram Protocol), we build our protocol class as a subclass of DatagramProtocol. There are a variety of events this class can respond to. We are specifically interested in connection_made, which is called once the transport is ready (we use it solely to store the transport for future use), and the datagram_received event. For DNS, each received datagram must be parsed and responded to, at which point the interaction is over.

So, when a datagram is received, we process the packet, look up the IP, and construct a response using the functions we aren't talking about (they're black sheep in the family). Then, we instruct the underlying transport to send the resulting packet back to the requesting client using its sendto method.
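Because a protocol is just a class whose methods get called when events happen, this flow can be tried out without touching the network at all. The sketch below is illustrative only: UppercaseProtocol and RecordingTransport are made-up names, and RecordingTransport is a stand-in that merely records what would have been sent. We play the role of the event loop by calling the two event methods ourselves.

```python
import asyncio


class UppercaseProtocol(asyncio.DatagramProtocol):
    """Hypothetical protocol: replies with the upper-cased datagram."""

    def connection_made(self, transport):
        # Store the transport so later events can send data through it.
        self.transport = transport

    def datagram_received(self, data, addr):
        self.transport.sendto(data.upper(), addr)


class RecordingTransport:
    """Stand-in for a real UDP transport; it only records sendto calls."""

    def __init__(self):
        self.sent = []

    def sendto(self, data, addr=None):
        self.sent.append((data, addr))


# Call the event methods by hand, in the order the event loop would.
protocol = UppercaseProtocol()
transport = RecordingTransport()
protocol.connection_made(transport)
protocol.datagram_received(b"hello", ("127.0.0.1", 50000))
print(transport.sent)
```

The same trick works for DNSProtocol: hand it a recording transport, feed it a query packet, and inspect the response it tried to send.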

The transport essentially represents a communication stream. In this case, it abstracts away all the fuss of sending and receiving data on a UDP socket on an event loop. There are similar transports for interacting with TCP sockets and subprocesses, for example.

The UDP transport is constructed by calling the loop's create_datagram_endpoint coroutine. This constructs the appropriate UDP socket and starts listening on it. We pass it the address that the socket needs to listen on and, importantly, the protocol class we created so that the transport knows what to call when it receives data.

Since the process of initializing a socket takes a non-trivial amount of time and would block the event loop, the create_datagram_endpoint function is a coroutine. In our example, we don't need to do anything while we wait for this initialization, so we wrap the call in loop.run_until_complete. The event loop takes care of managing the future, and when it's complete, it returns a tuple of two values: the newly initialized transport and the protocol object that was constructed from the class we passed in.
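The same returned pair is available if the endpoint is created from inside a coroutine instead of through run_until_complete. As a hedged sketch (EchoProtocol, ClientProtocol, and round_trip are made-up names, and port 0 simply asks the operating system for any free port), the following awaits create_datagram_endpoint twice, once for a tiny echo server and once for a client, and sends a single datagram between them:

```python
import asyncio


class EchoProtocol(asyncio.DatagramProtocol):
    """Hypothetical server protocol: sends every datagram straight back."""

    def connection_made(self, transport):
        self.transport = transport

    def datagram_received(self, data, addr):
        self.transport.sendto(data, addr)


class ClientProtocol(asyncio.DatagramProtocol):
    """Hypothetical client protocol: sends one message, stores the reply."""

    def __init__(self, message, done):
        self.message = message
        self.done = done

    def connection_made(self, transport):
        # remote_addr was given when the endpoint was created,
        # so sendto needs no explicit address here.
        transport.sendto(self.message)

    def datagram_received(self, data, addr):
        if not self.done.done():
            self.done.set_result(data)


async def round_trip(message):
    loop = asyncio.get_running_loop()
    # Awaiting the coroutine yields the (transport, protocol) pair.
    server_transport, server_protocol = await loop.create_datagram_endpoint(
        EchoProtocol, local_addr=("127.0.0.1", 0)
    )
    host, port = server_transport.get_extra_info("sockname")[:2]
    done = loop.create_future()
    client_transport, _ = await loop.create_datagram_endpoint(
        lambda: ClientProtocol(message, done), remote_addr=(host, port)
    )
    try:
        return await asyncio.wait_for(done, timeout=2)
    finally:
        client_transport.close()
        server_transport.close()


reply = asyncio.run(round_trip(b"ping"))
print(reply)
```

Note that the first argument is a factory rather than an instance: the event loop calls it to build the protocol object, which is why the client is passed as a lambda that closes over its arguments.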

Behind the scenes, the transport has registered its socket with the event loop so that it is notified of incoming UDP datagrams. All we have to do, then, is start the event loop running with the call to loop.run_forever() so that these packets can be processed. When the packets arrive, they are handed to the protocol and everything just works.

The only other major thing to pay attention to is that transports (and, indeed, event loops) are supposed to be closed when we are finished with them. In this case, the code runs just fine without the two calls to close(), but if we were constructing transports on the fly (or just doing proper error handling!), we'd need to be quite a bit more conscious of it.
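One way to be more conscious of it is to tie the cleanup to a with block and a try...finally, so that the transport and the loop are closed even if something goes wrong in between. This is only a sketch under a few assumptions: TrackingProtocol and serve_briefly are hypothetical names, port 0 asks the operating system for any free port, and a single asyncio.sleep(0) stands in for the real work that run_forever() would do.

```python
import asyncio
from contextlib import closing


class TrackingProtocol(asyncio.DatagramProtocol):
    """Hypothetical protocol that records when its transport goes away."""

    def __init__(self):
        self.lost = False

    def connection_lost(self, exc):
        self.lost = True


def serve_briefly():
    loop = asyncio.new_event_loop()
    try:
        transport, protocol = loop.run_until_complete(
            loop.create_datagram_endpoint(
                TrackingProtocol, local_addr=("127.0.0.1", 0)
            )
        )
        # closing() calls transport.close() when the block exits,
        # whether normally or because of an exception.
        with closing(transport):
            loop.run_until_complete(asyncio.sleep(0))  # stand-in for real work
        # connection_lost is scheduled on the loop rather than called
        # immediately, so let the loop run once more to deliver it.
        loop.run_until_complete(asyncio.sleep(0))
        return transport.is_closing(), protocol.lost
    finally:
        loop.close()


print(serve_briefly())
```

The protocol's connection_lost method is the natural place to release anything the protocol itself is holding on to.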

You may have been dismayed to see how much boilerplate is required in setting up a protocol class and the underlying transport. AsyncIO provides an abstraction on top of these two key concepts, called streams. We'll see streams at work in the TCP server in the next example.