Structuring data for Tableau

We've already seen that Tableau can connect to nearly any data source. Whether it's a built-in direct connection, ODBC, or using the Tableau data extract API to generate an extract, no data is off limits. However, there are certain structures that make data easier to work with in Tableau.

There are two keys to ensuring a good data structure that works well with Tableau:

For example, let's say you have a table of test scores with one record per classroom in a school. Within the record, you may have three measures: the average GPA for the classroom, the number of students in the class, and the average GPA of the school:

School

Classroom

Average GPA

Number of Students

Number of Students (School)

Pickaway Elementary

4th Grade

3.78

153

1,038

Pickaway Elementary

5th Grade

3.73

227

1,038

Pickaway Elementary

6th Grade

3.84

227

1,038

McCord Elementary

4th Grade

3.82

94

915

McCord Elementary

5th Grade

3.77

89

915

McCord Elementary

6th Grade

3.84

122

915

 

The first two measures (Average GPA and Number of Students) are at the same level of detail as the individual record of data (per classroom in the school). Number of Students (School) is at a higher level of detail (per school). As long as you are aware of this, you can do careful analysis. However, you would have a data structure issue if you tried to store each student's GPA in the class record. If the data were structured in an attempt to store all the students' GPA per grade level (maybe with a column for each student, or a single field containing a comma-separated list of student scores), we'd need to do some work to make the data more usable in Tableau.

Understanding the level of detail of the source (often referred to as granularity) is vital. Every time you connect to a data source, the very first question you should ask and answer is: what does a single record represent? If, for example, you were to drag and drop the Number of Records field into the view and observed 1,000 records, then you should be able to complete the statement, I have 1,000 _____. It could be 1,000 students, 1,000 test scores, or 1,000 schools. Having a good grasp of the granularity of the data will help you avoid poor analysis and allow you to determine if you even have the data that's necessary for your analysis.

A quick way to find the level of detail of your data is to put the Number of Records on the Text shelf, and then try different dimensions on the Rows shelf. When all of the rows display a 1 and the total that's displayed in the lower left status bar equals the number of records in the data, then that dimension (or combination of dimensions) uniquely identifies a record and defines the lowest level of detail of your data.

With an understanding of the overarching principles regarding the granularity of the data, we'll move on and understand certain data structures that allow you to work seamlessly and efficiently in Tableau. Sometimes, it may be preferable to restructure the data at the source using tools such as Alteryx or Tableau Prep Builder. At times, restructuring the source data isn't possible or is not feasible. We'll take a look at some options in Tableau for those cases. For now, let's consider what kinds of data structures work well with Tableau.