Common Challenges with Star Schema Modeling

2 minute read

A few articles ago, we wrote about data modeling then data warehousing using star schema for slowly changing dimension (SCD) Type 2 and most recently bitemporality. In this article, we continue our series on data modeling and data warehousing with a discussion of common challenges with star schema modeling.

Dimension Changes Where Facts Reference Current

With a star schema data warehouse, there are 3 types for how slowly changing dimensions are handled. In our previous articles we have focused on Type 1 and Type 2. Type 1 updates dimension records in place. Type 2, which retains the full history of values, adds new dimension records, updating the previous dimension record to no longer be current.

You will have many more fact records than dimension records. Many fact records will have foreign key relationships to the same set of current dimension records. 

For Type 1 SCD, when an update occurs to a dimension record, it is made in place. This means that fact records referencing the updated dimension record will still be correct.

For Type 2 SCD, when an update occurs to a dimension record, the update is made by: 1. adding a new dimension record, which is marked as current; then 2. updating the previous dimension record so that is no longer current. A problem occurs if fact records should be pointing to the current dimension records. In this case, many fact records foreign key references to dimension records would need to be updated.

If your design does not require bitemporality and calls for fact records to reference the current dimension record, Type 1 SCD would be the better choice.

Bitemporality With Backdated Changes

Bitemporality in your data warehouse allows you to make backdated changes. Here is an example dimension table:

The second row reflects the pay rate for the employee being updated effective May 1st. The third row reflects the pay rate for the employee being updated effective May 1st, which is the same effective date as the second row. This row is a backdated or corrective change. The second row had a pay rate from May 1st that was incorrect. However, we didn’t correct this error until October 31st. Fact records which reference the second dimension row should be updated to instead reference the third corrected record. Depending on the frequency of your fact record creation and how long in the past the corrective record applies to, large numbers of fact records may need to be updated.

Summary

Star schema designs for data marts or data warehouses are powerful but there are complexities that must be considered in order to provide the desired functionality with acceptable performance.

Twitter

LinkedIn