Using service mesh for shared concerns

As frameworks and standards for web services evolve, the amount of boilerplate and the number of shared application concerns that each team must handle are reduced. This is because, collectively, we figure out which parts of our applications are universal and therefore shouldn't need to be re-implemented by every programmer or team. When people first started networking computers, programmers writing network-aware applications had to worry about many low-level details that are now abstracted away by the operating system's networking stack. Similarly, there are certain universal concerns that all microservices share. Frameworks such as Twitter's Finagle wrap all network calls in a circuit breaker, increasing fault tolerance and isolating failures in systems. Finagle and Spring Boot, the Java framework we've been using for most of these recipes, both support exposing a standard metrics endpoint that reports basic network, JVM, and application metrics collected for microservices.

Every microservice should consider a number of shared application concerns. From an observability perspective, services should strive to emit consistent metrics and structured logs. To improve the reliability of our systems, services should wrap network calls in circuit breakers and implement consistent retry and back-off logic. To support changes in network and service topology, services should consider implementing client-side load balancing and use centralized service discovery.
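To make the reliability concern concrete, here is a minimal sketch of a circuit breaker in plain Java. It is not Finagle's or Spring's implementation; the class name `SimpleCircuitBreaker`, the failure threshold, and the cool-down period are all assumptions chosen for illustration. The breaker opens after a number of consecutive failures and then fails fast until the cool-down has elapsed, at which point it lets a single trial call through.

```java
import java.util.function.Supplier;

/**
 * Illustrative circuit breaker (hypothetical class, not a library API).
 * Opens after a number of consecutive failures and fails fast until a
 * cool-down period has elapsed.
 */
public class SimpleCircuitBreaker {
    private final int failureThreshold; // consecutive failures before opening
    private final long openMillis;      // how long to fail fast once open
    private int consecutiveFailures = 0;
    private long openedAt = 0L;
    private boolean open = false;

    public SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    public synchronized <T> T call(Supplier<T> action) {
        if (open) {
            if (System.currentTimeMillis() - openedAt < openMillis) {
                // Still cooling down: reject without touching the network.
                throw new IllegalStateException("circuit open: failing fast");
            }
            // Cool-down elapsed: allow one trial call. The failure count is
            // kept, so a single failure re-opens the circuit immediately.
            open = false;
        }
        try {
            T result = action.get();
            consecutiveFailures = 0; // any success resets the count
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                open = true;
                openedAt = System.currentTimeMillis();
            }
            throw e;
        }
    }

    public synchronized boolean isOpen() {
        return open;
    }
}
```

A caller would wrap each outbound request, for example `breaker.call(() -> client.fetch(id))`, where `client.fetch` stands in for whatever network call the service makes. Every service that wants this behavior has to carry and maintain code like this, along with matching retry and back-off logic.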

Instead of implementing all of these features in each of our services, it would be ideal to abstract them out to something outside our application code that could be maintained and operated separately. As with the features of our operating system's network stack, if each of these features were implemented by something our application could rely on being present, we would not have to implement or maintain them ourselves. This is the idea behind a service mesh.

Running a service mesh involves running each microservice in your system behind a network proxy. Instead of services speaking directly to one another, they communicate via their respective proxies, which are installed as sidecars. Practically speaking, your service communicates with its own proxy running on localhost. Because network requests are sent through a service's proxy, the proxy can control what metrics are emitted and what log messages are output. The proxy can also integrate directly with your service registry and distribute requests evenly among active nodes, keeping track of failures and opting to fail fast when a certain threshold has been reached. Running your system in this kind of configuration can ease its operational complexity while improving the reliability and observability of your architecture.
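From the application's point of view, talking to a sidecar can be as simple as pointing the HTTP client at a proxy on localhost. The following sketch uses the standard `java.net.http.HttpClient` from Java 11. The class name `SidecarClient`, the logical service name `media-service`, and the `/photos` path are hypothetical, and port 4140 is an assumption; use whatever port your proxy is actually configured to listen on.

```java
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;

/** Illustrative helper (hypothetical) for sending requests via a local sidecar proxy. */
public class SidecarClient {
    // Assumption: the sidecar proxy listens for outgoing HTTP on localhost:4140.
    static final String PROXY_HOST = "localhost";
    static final int PROXY_PORT = 4140;

    /** Builds an HTTP client that sends every request through the local proxy. */
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .proxy(ProxySelector.of(new InetSocketAddress(PROXY_HOST, PROXY_PORT)))
                .build();
    }

    /**
     * Builds a GET request addressed to a logical service name. The proxy,
     * not the application, is expected to resolve that name to a live instance.
     */
    static HttpRequest requestFor(String logicalService, String path) {
        return HttpRequest.newBuilder(URI.create("http://" + logicalService + path))
                .GET()
                .build();
    }
}
```

With a proxy running, a call such as `SidecarClient.newClient().send(SidecarClient.requestFor("media-service", "/photos"), HttpResponse.BodyHandlers.ofString())` would be routed through the sidecar. The application code contains no service discovery, load balancing, or circuit breaking logic; those responsibilities move to the proxy.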

As with most of the recipes discussed in this chapter, there are numerous open source solutions for running a service mesh. We'll focus on Linkerd, an open source proxy server built and maintained by Buoyant. The original authors of Linkerd worked at Twitter before founding Buoyant, and as such, Linkerd incorporates many of the lessons learned by teams at Twitter. It shares many features with the Finagle Scala framework, but can be used with services written in any language. In this recipe, we'll walk through installing and configuring Linkerd and discuss how we can use it to control communication between our Ruby on Rails monolith API and our newly developed media service.