Summary

After installing ZFS, creating the zpools, installing GlusterFS, and creating the volumes, we now have a storage solution that delivers respectable performance and can sustain a node failure while still serving data to its clients.

For the setup, we used Azure as the cloud provider. While each provider has its own set of configuration challenges, the core concepts apply to other cloud providers as well.

However, this design has a disadvantage. When new disks are added to the zpools, the stripes don't align, so new reads and writes perform worse. This problem can be avoided by adding an entire set of disks at once. In addition, the lower read performance is mostly masked by the read cache in RAM (ARC) and the cache disk (L2ARC).

For GlusterFS, we used a dispersed layout that balances performance with high availability. In this three-node cluster setup, we can sustain a node failure without interrupting I/O from the clients.
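
To make the trade-off of the dispersed layout concrete, the following sketch works through the capacity arithmetic. It assumes one brick per node and a redundancy of 1, which matches a three-node cluster that tolerates a single node failure; the function name and the 10 TB brick size are hypothetical values chosen only for illustration.

```python
def dispersed_usable_capacity(bricks: int, redundancy: int, brick_size_tb: int) -> int:
    """Return the usable capacity (in TB) of a GlusterFS dispersed volume.

    A dispersed volume stores erasure-coded fragments across all bricks.
    The space equivalent of `redundancy` bricks is used for recovery data,
    so usable space is (bricks - redundancy) * brick size.
    """
    # GlusterFS requires redundancy > 0 and bricks > 2 * redundancy.
    if redundancy < 1 or bricks <= 2 * redundancy:
        raise ValueError("need redundancy >= 1 and bricks > 2 * redundancy")
    return (bricks - redundancy) * brick_size_tb


# Assumed example: three nodes, one 10 TB brick each, redundancy of 1.
usable_tb = dispersed_usable_capacity(bricks=3, redundancy=1, brick_size_tb=10)
print(f"Usable capacity: {usable_tb} TB, tolerated brick failures: 1")
```

Under these assumptions, two-thirds of the raw capacity remains usable while the volume keeps serving data with any one node offline. A replicated three-way layout would tolerate more failures but leave only one-third of the raw capacity usable.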

The main takeaway is to keep a critical mindset when designing a solution. In this example, we worked with the resources that we had available to achieve a configuration that would perform to specification and utilize what we provided. Always ask yourself how each setting will impact the result, and how you can change it to be more efficient.

In the next chapter, we'll go through testing and validating the performance of the setup.