To create the Data Model relationships identified in the Data Warehouse Bus Matrix image:
- Click Manage Relationships from the Modeling tab in Report View.
- From the Manage Relationships dialog, click the New command button at the bottom to open the Create relationship interface. Choose the fact table, such as Internet Sales, for the top table via the dropdown and then select the dimension table as shown in the following screenshot:
data:image/s3,"s3://crabby-images/6e4f8/6e4f8fa5019497757c049dc5011c7c46a15c98b6" alt=""
If the relationship columns have the same name, such as Currency Key in this example, Power BI will automatically select the columns to define the relationship. Almost all relationships will follow this Many to one(*:1) or fact-to-dimension pattern with the Cross-filter direction property set to Single and the relationship set to active.
The two columns used for defining each relationship should be of the same data type. In most relationships, both columns will be of the whole number data type as only a numeric value can be used with slowly changing dimensions. For example, a Product Key column could use the values 12, 17, and 27 to represent three time periods for a single product as certain attributes of the product changed over time.
As more relationships are created, it can be helpful to switch to the Relationships view and move or organize the dimension tables around the fact table. Relationships view can make it clear when additional relationships need to be defined and can be useful in explaining the model to report authors and users.
Click OK to create the relationship and repeat this process to build the planned star schema relationships for both the Internet Sales and Reseller Sales fact tables, as shown in the following screenshot of Internet Sales:
data:image/s3,"s3://crabby-images/f570b/f570b986d39696eeeca20b1a0449c81c650223c0" alt=""
All relationships from Internet Sales to a dimension table are active (solid line) except for two additional relationships to the Date dimension table. In this dataset, the Order Date is used as the active relationship, but two additional inactive (dotted line) relationships are created based on the Due Date and Ship Date columns of the fact table. DAX Measures can be created to invoke these alternative relationships via the USERELATIONSHIP() DAX function, as shown in the following example:
Internet Net Sales (Due Date) =
CALCULATE([Internet Net Sales], USERELATIONSHIP('Internet Sales'[Due Date Key],'Date'[Date Key]))
Internet Net Sales (Ship Date) =
CALCULATE([Internet Net Sales],USERELATIONSHIP('Internet Sales'[Ship Date Key],'Date'[Date Key]))
The inactive relationships and their corresponding measures enable report visualizations based on a single-date dimension table, such as in the following table:
data:image/s3,"s3://crabby-images/24293/24293c2c6148a6a4e1d5b3f69786a9b6ee56e270" alt=""
In this scenario, the Internet Net Sales measure uses the active relationship based on Order Date by default, but the other measures override this relationship via the CALCULATE() and USERELATIONSHIP() functions.
For DirectQuery datasets, the Assume referential integrity relationship property is critical for performance as this determines whether inner- or outer-join SQL statements are generated to resolve report queries. When enabled, as shown in the following screenshot, inner-join SQL queries will be passed to the source system when report queries require columns or logic from both tables of the relationship:
data:image/s3,"s3://crabby-images/1b75c/1b75c8458f1cbbc729e13590ec412ddf259d8de8" alt=""
If Assume referential integrity is not enabled, outer-join SQL queries will be generated to ensure that all necessary rows from the fact table or many sides of the relationship are retrieved to resolve the report query. The query optimizers within supported DirectQuery sources, such as SQL Server and Oracle, are able to produce much more efficient query execution plans when presented with inner-join SQL statements. Of course, improved performance is of no value if the outer join is necessary to return the correct results, thus it's essential for referential integrity violations in the source system to be addressed.