Skip to content

Data Vault Satellite Stored Procedure

This code-generation template creates a SQL Server stored procedure that loads a Data Vault Satellite table from a Persistent Staging Area (PSA) source — the physical, table-based counterpart to the Satellite view template.

A Satellite stores the time-variant descriptive attributes attached to a Hub or Link. This procedure compacts incoming data so that only changed records are written, preserving full history without redundancy.

In a physical Data Vault deployment, a Satellite must:

  • Detect which incoming PSA rows represent a real change to the attributes (versus a re-arrival of the same values).
  • Hash the parent Hub/Link business key into the Satellite’s surrogate key.
  • Calculate a row checksum so subsequent loads can compare incoming values to the most recent stored values.
  • Insert only the changed rows, with their inscription timestamps.

The generated procedure implements all of that against the PSA, using window functions (LAG) to compare each incoming row against the previous arrival.

  • SQL Server family databases; the template uses Microsoft SQL (T-SQL) syntax.

This template is intended to be assigned to Data Object Mappings whose target is a Satellite Data Object hanging off a Hub.

The generated procedure:

  1. Reads an audit-trail @ModuleInstanceId parameter and evaluates its load window via CreateLoadWindow.
  2. Builds the parent surrogate key by hashing the business-key columns.
  3. Builds a row checksum by hashing the change-data column together with every mapped descriptive attribute.
  4. Uses a LAG-based window function (compacting mechanism) to compare each incoming row’s checksum against the previous arrival for the same parent key.
  5. Writes only the rows where the checksum changed.
  • The compacting (LAG-based change detection) means the procedure can be replayed safely — re-running it for the same window inserts nothing new, because the previous-row comparison already excluded unchanged records.
  • The procedure preserves the change-data column (typically C for insert/update, D for delete) so downstream consumers can interpret the row’s intent.
  • MD5 is used for both the surrogate key and the row checksum.
  • The Satellite is loaded by a single mapping at a time. A Satellite that aggregates data from multiple sources will need one procedure per source mapping.

The conventions auditTrailIdColumn, inscriptionTimeStampColumn, inscriptionRecordIdColumn, changeDataColumn, checksumColumn, sourceTimestampColumn, and controlFramework* must all be defined on the project.