Loading a cumulative fact table

In order to load a fact table, you have to look at the grain, or level, of detail. For example, looking at our model, you can see that the maximum level of detail is a city for the REGION dimension; the maximum level of detail is a day for the time dimension. This information is then used to aggregate the measures. For example, suppose that we want to load a sales fact table with the following definition:

CREATE TABLE FT_SALES (
    date integer NOT NULL,    
    id_region integer NOT NULL,
    id_puzzle integer NOT NULL,
    quantity integer default 0 NOT NULL,
    sales numeric(8,2) default 0 NOT NULL
   );

In this sample fact table, we have foreign keys to three dimensions—TIME, REGION, and PUZZLES—and two measures—quantity representing the number of products sold and sales representing the amounts. In order to feed the table, what you need to take from the source is the sum of quantity and the sum of sales for every combination of day and city as well as the puzzle.

You do this with a regular query using a Table input step. The output of the step will be the aggregated data by business keys.

As this is a cumulative load, as part of the query, you should also limit the results with a WHERE constraint specifying a data range for the sales.

After getting the data from the source, you have to translate the business keys into surrogate keys. You do it in different ways depending on the kind of each related dimension.

Finally, you are ready to insert the data into the fact table using a regular Table Output step.

The following screenshot shows you a draft of what you would have in the main Transformation:

Loading a cumulative fact table