How it works...

In R10_Table, we create the table variable, __Table. The __Table variable creates a 1 million row table with each row having a unique index value and unique date and time stamp using GENERATESERIES and ADDCOLUMNS. We then use SELECTCOLUMNS to only select our TimeStamp column from __Table.

In R10_Table2, we also create the table variable, __Table. __Table starts out with the same calculation as used in R10_Table. However, we use ADDCOLUMNS to add four columns, Date, Hour, Minute, and Second. We then use SELECTCOLUMNS to only select our Date, Hour, Minute, and Second columns from __Table.

Although we have exactly the same information within both tables, R10_Table and R10_Table2, storing R10_Table requires nearly 12 MB of disk space (which equates to required memory storage space as well), while table R10_Table2 only requires 39 KB (51 KB-12 KB) of disk space. This means that R10_Table requires over 300 times the amount of storage as R10_Table2. What is going on?

At issue here is the underlying data model behind DAX, the in-memory tabular cube. The tabular data model utilizes columnar compression to compress data to a fraction of its original size under the right set of circumstances. To understand how this works, consider the simple scenario of a dataset that includes the following values in a column:

Without delving into the gory details of how column compression actually operates, you can essentially visualize the compression process by understanding that in the circumstance provided by the example table, the optimization implemented by the tabular data model is to only store unique values within the column, 111, 222, and 333, instead of storing each value within each row. In the example provided, this effectively cuts the amount of data required to be stored in order to express the data model in half.

Thus, the problem with the first table, R10_Table, is that each row value is unique, and so compression cannot occur. However, in the second table, R10_Table2, by splitting out the information in the single column into four different columns, there are lots of repeating values within the columns of R10_Table2, so columnar compression has a huge impact on the amount of storage required to express the data.