In the modern world, we often meet semi-structured data that is generated by machines, apps, sensors, and so on. Two main attributes distinguish semi-structured data from structured data:
- Semi-structured data can contain n-level hierarchies of nested information
- Structured data always needs a defined schema before it is loaded. Semi-structured data doesn't, so we can create the schema on the fly
Although Tableau supports a direct connection to the JSON format, we still have the same issue as with big data: we need more compute resources than Tableau allows us to use. We may also collect data in other formats, such as Avro, ORC, Parquet, and XML.
Usually, we would have to parse semi-structured data and write it into a table. But not with Snowflake: it has a special data type, VARIANT, that allows us to store semi-structured data as is. Moreover, we can easily parse its key-value pairs. You can run the following SQL and check how it looks:
select * from "SNOWFLAKE_SAMPLE_DATA"."WEATHER"."WEATHER_14_TOTAL" limit 1
The query above reads a single row from the Snowflake sample database so that we can see what a VARIANT value looks like. Unfortunately, Tableau can't parse the VARIANT data type, which is why we should write SQL for Tableau that extracts the fields we need from the VARIANT column.
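As a minimal sketch of such a query, we can use Snowflake's colon-and-dot path notation to reach into the VARIANT value and the `::` operator to cast each element to a regular type. The column name `v` and the key names (`time`, `city.name`, `city.country`, `city.coord.lat`, `city.coord.lon`, `main.temp`) are assumptions about how the sample weather JSON is laid out; check them against the output of the `select *` query and adjust the paths to match:

```sql
-- Sketch only: v and every key path below are assumed, not confirmed.
-- Compare with the row returned by the select * query before relying on them.
select
    v:time::timestamp        as observation_time, -- top-level key
    v:city.name::string      as city_name,        -- key nested one level down
    v:city.country::string   as country,
    v:city.coord.lat::float  as city_lat,         -- key nested two levels down
    v:city.coord.lon::float  as city_lon,
    v:main.temp::float       as temperature
from "SNOWFLAKE_SAMPLE_DATA"."WEATHER"."WEATHER_14_TOTAL"
limit 10
```

Because every output column is cast to an ordinary type such as timestamp, string, or float, Tableau sees the result as a regular flat table and doesn't have to deal with VARIANT at all.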