- Let's create an S3 bucket for our data. Go to the S3 console, click Create bucket, and name it cookbook-spectrum.
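Alternatively, the same bucket can be created from the command line (assuming the AWS CLI is installed and configured with credentials allowed to create buckets):

```shell
# Create the recipe's bucket. Bucket names are globally unique,
# so you may need a different name than cookbook-spectrum.
aws s3 mb s3://cookbook-spectrum
```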
- Now we can unload data into this bucket by running the following command:
unload ('select * from lineorder')
to 's3://cookbook-spectrum/'
iam_role 'arn:aws:iam::615381814665:role/RedshiftS3Access'
delimiter '\t';
- Let's create the external schema, as follows:
create external schema datalake
from data catalog
database 'spectrumdb'
iam_role 'arn:aws:iam::615381814665:role/RedshiftS3Access'
create external database if not exists;
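To confirm that the external schema was registered (and that the spectrumdb database now exists in the data catalog), you can query Redshift's SVV_EXTERNAL_SCHEMAS system view:

```sql
-- List registered external schemas and their catalog databases
select schemaname, databasename
from svv_external_schemas;
```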
- Let's create an external table in this schema, as follows:
create external table datalake.lineorder
(
lo_orderkey INTEGER
,lo_linenumber INTEGER
,lo_custkey INTEGER
,lo_partkey INTEGER
,lo_suppkey INTEGER
,lo_orderdate INTEGER
,lo_orderpriority VARCHAR(15)
,lo_shippriority VARCHAR(1)
,lo_quantity INTEGER
,lo_extendedprice INTEGER
,lo_ordertotalprice INTEGER
,lo_discount INTEGER
,lo_revenue INTEGER
,lo_supplycost INTEGER
,lo_tax INTEGER
,lo_commitdate INTEGER
,lo_shipmode VARCHAR(10)
)
row format delimited
fields terminated by '\t'
stored as textfile
location 's3://cookbook-spectrum/'
table properties ('numRows'='172000');
- We can test this table by running queries against it, such as select * from datalake.lineorder, or we can point our Tableau data source at the Spectrum table instead of the original one. However, it is better to tune the external table before use; otherwise, Spectrum will scan the entire table on every query.
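One common tuning technique is partitioning, so that Spectrum only reads the files relevant to a query's filter. The following is an illustrative sketch only: the lo_orderdate=... S3 prefix layout is hypothetical (the UNLOAD above does not produce it), and the partition column must be declared in the partitioned by clause rather than in the column list:

```sql
-- Hypothetical partitioned variant of the lineorder external table
create external table datalake.lineorder_part
(
lo_orderkey INTEGER
,lo_linenumber INTEGER
-- ... remaining columns as in datalake.lineorder, except lo_orderdate ...
)
partitioned by (lo_orderdate INTEGER)
row format delimited
fields terminated by '\t'
stored as textfile
location 's3://cookbook-spectrum/partitioned/';

-- Register one partition per S3 prefix
alter table datalake.lineorder_part
add partition (lo_orderdate=19960101)
location 's3://cookbook-spectrum/partitioned/lo_orderdate=19960101/';
```

With partitions in place, a query filtering on the partition column (for example, where lo_orderdate = 19960101) scans only that partition's files; the reduction in bytes scanned can be checked in the SVL_S3QUERY_SUMMARY system view. Converting the files to a columnar format such as Parquet reduces scanning further.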