Chapter 3: Architecting a Storage Cluster 

  1. It depends on the volume type used, but 2 CPUs and 4 GB or more of RAM are a good starting point.
  2. GlusterFS would use the brick’s filesystem cache mechanism.
  3. It is a fast storage layer where I/Os will be served instead of going to slower storage. Cache can be RAM or faster storage media, such as SSDs.
  4. As more concurrency is achieved, the software will require more CPU cycles to serve the requests.
  5. Distributed aggregates space. Replicated mirrors data, hence “halving” usable space when two copies are kept. Dispersed aggregates space but consumes one node's worth of capacity for parity. Think of it as RAID 5.
  6. It depends on many variables, such as retention periods and data ingress.
  7. The expected amount of data growth.
  8. Throughput is the amount of data transferred over a given amount of time, normally expressed in MB/s (megabytes per second).
    Input/output operations per second (IOPS) is the number of I/O operations completed per second.
    I/O size refers to the size of each request issued by the appliance.
  9. The layout of the storage locations used by GlusterFS.
  10. GlusterFS’s process of replicating data from one cluster to another, normally located in a different geographic location.
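
The arithmetic behind answers 5 and 8 can be sketched in a few lines of Python. This is a rough illustration, not a sizing tool: the function names are hypothetical, every node is assumed to contribute one brick of equal size, the dispersed case assumes a configurable number of redundancy bricks (1 matches the RAID 5 analogy), and 1 MB is taken as 1024 KB.

```python
def usable_capacity_tb(volume_type, nodes, brick_tb, replica=2, redundancy=1):
    """Approximate usable space, assuming one equally sized brick per node."""
    raw = nodes * brick_tb
    if volume_type == "distributed":
        # Space is simply aggregated across all bricks.
        return raw
    if volume_type == "replicated":
        # Every file is stored `replica` times, so usable space shrinks accordingly.
        return raw / replica
    if volume_type == "dispersed":
        # `redundancy` bricks' worth of space holds parity, as in RAID 5 when it is 1.
        return (nodes - redundancy) * brick_tb
    raise ValueError(f"unknown volume type: {volume_type}")


def throughput_mb_s(iops, io_size_kb):
    """Throughput is the number of operations per second times the size of each one."""
    return iops * io_size_kb / 1024  # assumes 1 MB = 1024 KB


if __name__ == "__main__":
    for vtype in ("distributed", "replicated", "dispersed"):
        print(vtype, usable_capacity_tb(vtype, nodes=6, brick_tb=10), "TB")
    print(throughput_mb_s(iops=1000, io_size_kb=64), "MB/s")
```

With six nodes of 10 TB each, the sketch gives 60 TB for distributed, 30 TB for replicated with two copies, and 50 TB for dispersed with one redundancy brick. It also shows why IOPS alone says little about throughput: 1,000 IOPS at a 64 KB I/O size is 62.5 MB/s, while the same IOPS at 4 KB is under 4 MB/s.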