Chapter Summary
Figure 6.3 extends Figure 6.2 with additional commentary summarizing basic Amazon Redshift concepts, along with Redshift's similarities to and differences from Aurora, Neptune, and DocumentDB.
Figure 6.3 Redshift Summary Contrasting Aurora, Neptune and DocumentDB
Amazon Redshift (Redshift) -- like RDS, Aurora, Neptune and DocumentDB -- is another managed, platform-based database service offered by AWS. Like RDS and Aurora, Redshift offers a relational database service, based on PostgreSQL, and appeals to apps with very large data sets (e.g., BIDW, OLAP, ETL, and AI). Like Aurora, Neptune and DocumentDB, Redshift enables clusters of database instances. Redshift clusters, however, are composed of nodes rather than DB instances. A Redshift cluster consists of a leader node assisted by one or more compute nodes, whereas the other DB clusters consist of a primary DB instance with one or more read replica DB instances.
Unlike the additional DB instances in Aurora, Neptune and DocumentDB clusters, Redshift compute nodes do not serve as read replicas. Instead, compute nodes enable massively parallel processing (MPP), which is crucial for optimizing query performance across very large partitioned data sets (i.e., table rows are distributed across compute nodes). Redshift-specific SQL syntax enables data partitioning: Redshift’s CREATE TABLE statement features a DISTSTYLE parameter for specifying how a table’s rows are distributed (e.g., by designating a table column as the distribution key for horizontal partitioning).
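The following is a minimal sketch of key-based row distribution, not a description of Redshift internals. The DDL string shows the DISTSTYLE KEY form of CREATE TABLE for a hypothetical sales table (the table and column names are assumptions), and the Python functions imitate the effect: rows sharing a distribution-key value always land in the same partition. CRC32 is used only as a deterministic stand-in hash; the hash Redshift actually applies is internal.

```python
import zlib
from collections import defaultdict

# Redshift DDL of the kind the text describes. It is held in a string for
# illustration only and is not executed. Table and column names are hypothetical.
CREATE_SALES = """
CREATE TABLE sales (
    sale_id     BIGINT,
    customer_id INTEGER,
    amount      DECIMAL(10, 2)
)
DISTSTYLE KEY
DISTKEY (customer_id);
"""


def partition_for(dist_key_value, num_partitions):
    """Map a distribution-key value to a partition number.

    CRC32 is an assumed stand-in hash, chosen because it is deterministic.
    """
    digest = zlib.crc32(str(dist_key_value).encode("utf-8"))
    return digest % num_partitions


def distribute(rows, dist_key, num_partitions):
    """Group rows by the partition their distribution-key value hashes to."""
    placement = defaultdict(list)
    for row in rows:
        placement[partition_for(row[dist_key], num_partitions)].append(row)
    return dict(placement)


rows = [
    {"sale_id": 1, "customer_id": 101, "amount": 25.00},
    {"sale_id": 2, "customer_id": 102, "amount": 40.00},
    {"sale_id": 3, "customer_id": 101, "amount": 15.50},
    {"sale_id": 4, "customer_id": 103, "amount": 60.00},
    {"sale_id": 5, "customer_id": 102, "amount": 5.25},
]

placement = distribute(rows, "customer_id", num_partitions=4)
for partition, members in sorted(placement.items()):
    print(partition, [r["sale_id"] for r in members])
```

Because every row for a given customer_id hashes to the same partition, a join or aggregation on that column can be carried out locally without moving rows between partitions, which is the usual motivation for choosing a distribution key.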
Redshift accomplishes data partitioning by dividing each compute node’s memory and storage capacity into two or more slices. This enables additional parallelism within each compute node. The leader node disseminates data to compute node slices and delegates work (i.e., code based on execution plans) to be performed by each compute node slice. The leader node then coordinates and aggregates the result sets returned by the compute node slices.
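The leader-and-slices division of labor can be pictured as a scatter-gather pattern. The sketch below is a simplified illustration under assumed names (partial_sum, leader_average) and an assumed layout of two compute nodes with two slices each. A thread pool stands in for the slices, and no real execution plans are involved.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(slice_rows):
    """Work delegated to one slice: aggregate only the rows stored there."""
    total = sum(amount for _, amount in slice_rows)
    return total, len(slice_rows)


def leader_average(slices):
    """Leader role: fan work out to every slice, then combine the partial results."""
    if not slices:
        return None
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(partial_sum, slices))
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


# Hypothetical layout: two compute nodes with two slices each, giving four
# slices that each hold (sale_id, amount) rows.
slices = [
    [(1, 10.0), (2, 20.0)],
    [(3, 30.0)],
    [(4, 40.0), (5, 50.0)],
    [(6, 60.0)],
]

result = leader_average(slices)
print(result)  # 35.0
```

Each slice returns a partial sum and row count rather than an average, because averages of unevenly sized slices cannot simply be averaged again; the leader needs the raw components to produce a correct final result.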
Redshift achieves additional OLAP performance efficiencies by exploiting columnar data storage for database tables. OLAP queries often access a small number of columns at a time (e.g., in Dimension and Fact Tables for Star Schemas). Each data block therefore stores the values of a single column for many rows, in contrast to the row-wise data blocks that benefit OLTP. Because all values in a columnar data block share a common data type, a single compression scheme can be applied to the block, thereby decreasing space requirements and reducing I/O operations.
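A small sketch can make the row-wise versus columnar contrast concrete. The sample rows and column names below are assumptions, and run-length encoding is used only as one simple example of a compression scheme that benefits from a column holding similar values of one type; it is not meant to represent the specific encodings Redshift chooses.

```python
from itertools import groupby

# Row-wise layout: each tuple holds every column of one row (OLTP-friendly).
rows = [
    (1, "US", 10.0),
    (2, "US", 12.5),
    (3, "US", 7.0),
    (4, "DE", 3.0),
    (5, "DE", 8.5),
]
COLUMNS = ("sale_id", "country", "amount")


def to_columnar(rows, column_names):
    """Pivot row tuples into one list of values per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(column_names)}


def run_length_encode(values):
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columnar(rows, COLUMNS)

# An aggregate over one column reads only that column's values.
total_amount = sum(columns["amount"])

# A column of one data type with repeated values compresses well.
encoded_country = run_length_encode(columns["country"])

print(total_amount)     # 41.0
print(encoded_country)  # [('US', 3), ('DE', 2)]
```

In the row-wise layout, summing the amounts would require reading every full row; in the columnar layout only the amount values are touched, which is the I/O reduction the text describes.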
Redshift also offers node type options for using solid-state storage devices and -- like RDS and Aurora -- enables the purchase of reserved instances for compute nodes, which offer considerable cost savings compared to on-demand pricing.
Redshift clusters are similar to Aurora, Neptune and DocumentDB clusters -- but noteworthy relationship and attribute differences justify differentiating Redshift clusters from the other DB clusters. A Redshift cluster -- like an Aurora, Neptune or DocumentDB cluster -- is a kind of DB cluster, which in turn is a kind of cluster. Because it is a kind of DB cluster, a Redshift cluster inherits a KMS customer master key (i.e., a CMK for data encryption), IAM roles, and cluster snapshots. Because a DB cluster is a kind of cluster, a Redshift cluster also indirectly inherits many security groups, a subnet group, a parameter group, an engine version, and the role of a source for event notifications. After a Redshift cluster is created, customary SQL client tools (i.e., tools compatible with PostgreSQL or other SQL client tools that are DBMS and platform independent) can be used to connect to the cluster database.
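The "is a kind of" relationships above can be sketched as a class hierarchy. This is only an illustrative model of the summary diagram; the class and attribute names are assumptions chosen for readability and do not correspond to any AWS SDK types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    """Attributes shared by every kind of cluster."""
    security_groups: List[str] = field(default_factory=list)
    subnet_group: str = ""
    parameter_group: str = ""
    engine_version: str = ""

    def event_source(self) -> str:
        """Every cluster can act as a source for event notifications."""
        return type(self).__name__


@dataclass
class DBCluster(Cluster):
    """Attributes shared by every kind of DB cluster."""
    kms_key: str = ""
    iam_roles: List[str] = field(default_factory=list)
    snapshots: List[str] = field(default_factory=list)


@dataclass
class RedshiftCluster(DBCluster):
    """Redshift-specific structure: a leader node plus compute nodes."""
    leader_node: str = "leader"
    compute_nodes: List[str] = field(default_factory=list)


# Hypothetical values for illustration only.
rs = RedshiftCluster(
    security_groups=["sg-example"],
    subnet_group="example-subnets",
    kms_key="example-key",
    compute_nodes=["compute-0", "compute-1"],
)

print(isinstance(rs, DBCluster), isinstance(rs, Cluster))  # True True
print(rs.event_source())                                   # RedshiftCluster
```

The Redshift class declares only what is specific to it (leader and compute nodes); the encryption key, IAM roles, and snapshots come from the DB cluster level, and the networking and parameter attributes come from the cluster level.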
Unlike other DB clusters …
- Snapshots for a Redshift cluster can be automatically copied to a different region (i.e., cross-region copies) for supporting disaster recovery.
- Redshift provides the ability to individually restore database tables from database snapshots and maintains the history of table restore requests.
- Redshift does not support exportable log types.
- Redshift -- like DocumentDB and Neptune, but unlike Aurora -- does not support backtracking, serverless clusters, cross-region replication, or option groups.