Skip to content

Data Vault Link Satellite View

This code-generation template creates a Data Vault Link Satellite object; an option for the physical representation of a Context Entity.

With this template, the Link Satellite is created as a database view, for easy refactoring and quick prototyping. The view can be used the same way as a database table, and it shows the data as a Link Satellite table would be loaded by various data logistics (‘ETL’) processes.

The intent for this template is to be represent (parts of) the data solution as views, in line with the virtual data warehouse concept. This approach enables data teams to deliver the data solution exactly the same way as when database tables or files would be created and managed through data load processes.

[!NOTE] Depending on your philosophy on Data Vault implementation, using Link Satellites may not be relevant or applicable. There are very viable considerations to implement a Data Vault model without Link-Satellites. This is outlined in more detail in the associated design pattern.

  • SQL Server family databases; the template uses Microsoft SQL (T-SQL) syntax.

The Data Vault Link Satellite view template represents the represents the modeled, time-variant, representation of the underlying (raw) Persistent Staging Area (PSA) data as a Data Vault Link Satellite.

The template selects the data items from the underlying data object(s), applies any defined transformations and column renames, and represents the data the same way as the user would create a table and data load process, and then run this process to populate the table with data.

To accurately represent the data, the template contains a compacting (condensing) mechanism to remove data redundancy. This is necessary since typically only a few columns from a given data object are required in the Link Satellite. Any records that contain data changes over time that are not relevant for the end result are filtered out using the compacting mechanism.

The template does not directly support end-dating. However, commented-out code is provided to enable a ‘current record’ indicator and an end-date based on the inscription timestamp.

This is provided as an example only. Generally speaking, the evaluation of timelines will be implemented in the selection of data from the Link Satellite.

The general recommendation is to not use the inscription timestamp / assertion timeline for end-dating and evaluation of what data is considered ‘current’. Instead, the Agnostic Data Labs samples contain a standard STATE_TIMESTAMP column which is used to standardize the business, or functional ‘state’ timeline. It is recommended to define date ranges and assert validity using this column.

As a view, the generated code is fully idempotent and deterministic. The view can be used in subsequent data load processes as input object, if required.

  • This template requires a Persistent Staging Area (PSA) or equivalent immutable transaction data set.
  • The view will be created directly on a PSA dataset, and therefore typically located in the same database as where the PSA resides. If the view is intended to be created in a different database, the template requires both the source and target databases to be accessible. This can be done either via a Linked Server, on-premise cross-database query feature.
  • Using views, performance is always a consideration. Each time the view is called the underlying query is executed. For large data sets, or with insufficient resources, the time it takes to return the data might not meet requirements.
  • The template uses a MD5 hashing algorithm to implement hash keys and record comparison checksums.

Not applicable.

{{#each dataObjectMappings}}
CREATE OR ALTER VIEW [{{../conventions.vdwSchemaName}}].[{{targetDataObject.name}}]
AS
--
-- Link Satellite View definition for {{../dataObjectMappings.0.targetDataObject.name}}.
--
-- Generated from template '200 Data Vault Link-Satellite View'.
--
SELECT
HASHBYTES('MD5', {{#each businessKeyDefinitions}}{{#unless @first}} +{{/unless}}{{!-- Create the Surrogate Key using the Business Key and components --}}{{#each businessKeyComponentMappings}}
ISNULL(RTRIM(CONVERT(NVARCHAR(100), {{targetDataItem.name}})), 'N/A') + '#~!'{{#unless @last}} +{{/unless}}{{/each}}{{/each}}{{#each businessKeyDefinitions}}{{#if @first}}
) AS {{surrogateKey}},{{/if}}{{/each}}
[{{../conventions.loadDateTimeAttribute}}],
[{{../conventions.sourceRowIdAttribute}}],
--COALESCE (
-- LEAD (DATEADD(mcs,[{{../conventions.sourceRowIdAttribute}}], {{../conventions.loadDateTimeAttribute}}) ) OVER
-- ( PARTITION BY {{#each businessKeyDefinitions}} {{!-- The Hub business key --}}{{#unless @first}},{{/unless}}
-- {{#each businessKeyComponentMappings}}{{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}}
-- {{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../sourceDataItems.0.name}}{{/if}}{{/each}}{{/each}}
-- ORDER BY {{../conventions.loadDateTimeAttribute}}),
-- CAST( '9999-12-31' AS DATETIME)
--) AS [{{../conventions.expiryDateTimeAttribute}}],
--CASE
-- WHEN ( RANK() OVER (PARTITION BY {{#each businessKeyDefinitions}}{{#unless @first}},{{/unless}}{{#each businessKeyComponentMappings}}
-- {{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}}
-- {{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../sourceDataItems.0.name}}{{/if}}{{/each}}{{/each}}
-- ORDER BY {{../conventions.loadDateTimeAttribute}} desc )) = 1
-- THEN 'Y'
-- ELSE 'N'
--END AS [CURRENT_RECORD_INDICATOR],
-1 AS {{../conventions.etlProcessAttribute}}, {{!-- List out the Control Framework attributes --}}
{{../conventions.changeDataCaptureAttribute}},
--CASE
-- WHEN {{../conventions.changeDataCaptureAttribute}} = 'D' THEN 'Y'
-- ELSE 'N'
--END AS [DELETED_RECORD_INDICATOR],
HASHBYTES('MD5',
ISNULL(RTRIM(CONVERT(NVARCHAR(100),{{../conventions.changeDataCaptureAttribute}})), 'N/A') + '#~!' +{{#each dataItemMappings}}
ISNULL(RTRIM(CONVERT(NVARCHAR(100),{{targetDataItem.name}})), 'N/A') + '#~!'{{#unless @last}} +{{/unless}}{{/each}}
) AS [{{../conventions.recordChecksumAttribute}}],
{{#each dataItemMappings}}
[{{targetDataItem.name}}]{{#unless @last}},{{/unless}}
{{/each}}
FROM
(
SELECT
[{{../conventions.loadDateTimeAttribute}}],
[{{../conventions.eventDateTimeAttribute}}],
[{{../conventions.sourceRowIdAttribute}}],
[{{../conventions.changeDataCaptureAttribute}}],{{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}
[{{targetDataItem.name}}],{{/each}}{{/each}}
{{#each dataItemMappings}}
[{{targetDataItem.name}}],
{{/each}}
[COMBINED_VALUE],
CASE
WHEN LAG(COMBINED_VALUE,1,0x00000000000000000000000000000000) OVER (PARTITION BY {{#each businessKeyDefinitions}}{{#unless @first}},{{/unless}}{{#each businessKeyComponentMappings}}
{{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}}
{{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../targetDataItem.name}}{{/if}}{{/each}}{{/each}}
ORDER BY [{{../conventions.loadDateTimeAttribute}}] ASC, [{{../conventions.eventDateTimeAttribute}}] ASC, [{{../conventions.changeDataCaptureAttribute}}] DESC) = COMBINED_VALUE
THEN 'Same'
ELSE 'Different'
END AS [VALUE_CHANGE_INDICATOR],
CASE
WHEN LAG([{{../conventions.changeDataCaptureAttribute}}],1,'') OVER (PARTITION BY {{#each businessKeyDefinitions}}{{#unless @first}},{{/unless}}{{#each businessKeyComponentMappings}}
{{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}}
{{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../targetDataItem.name}}{{/if}}{{/each}}{{/each}}
ORDER BY [{{../conventions.loadDateTimeAttribute}}] ASC, [{{../conventions.eventDateTimeAttribute}}] ASC, [{{../conventions.changeDataCaptureAttribute}}] ASC) = [{{../conventions.changeDataCaptureAttribute}}]
THEN 'Same'
ELSE 'Different'
END AS [CDC_CHANGE_INDICATOR],
CASE
WHEN LEAD([{{../conventions.loadDateTimeAttribute}}],1,'9999-12-31') OVER (PARTITION BY {{#each businessKeyDefinitions}}{{#unless @first}},{{/unless}}{{#each businessKeyComponentMappings}}
{{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}}
{{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../targetDataItem.name}}{{/if}}{{/each}}{{/each}}
ORDER BY [{{../conventions.loadDateTimeAttribute}}] ASC, [{{../conventions.eventDateTimeAttribute}}] ASC, [{{../conventions.changeDataCaptureAttribute}}] ASC) = [{{../conventions.loadDateTimeAttribute}}]
THEN 'Same'
ELSE 'Different'
END AS [TIME_CHANGE_INDICATOR]
FROM
(
SELECT
[{{../conventions.loadDateTimeAttribute}}],
[{{../conventions.eventDateTimeAttribute}}],
[{{../conventions.sourceRowIdAttribute}}],
[{{../conventions.changeDataCaptureAttribute}}],{{#each businessKeyDefinitions}} {{#each businessKeyComponentMappings}}
CAST ({{sourceDataItems.0.name}} AS NVARCHAR(100)) AS [{{targetDataItem.name}}],{{/each}}{{/each}}
{{#each dataItemMappings}}
[{{sourceDataItems.0.name}}] AS [{{targetDataItem.name}}],
{{/each}}
HASHBYTES('MD5',{{#each dataItemMappings}}
ISNULL(RTRIM(CONVERT(NVARCHAR(100),[{{sourceDataItems.0.name}}])), 'N/A') + '#~!'{{#unless @last}} +{{/unless}}{{/each}}
) AS [COMBINED_VALUE]
FROM [{{lookupExtension sourceDataObjects.0.extensions "datastore"}}].[{{lookupExtension sourceDataObjects.0.extensions "location"}}].[{{sourceDataObjects.0.name}}]
{{#if filterCriterion}}{{#stringcompare filterCriterion ""}}{{else}}WHERE {{filterCriterion}}{{/stringcompare}}{{/if}}
) sub
) combined_value
WHERE
([VALUE_CHANGE_INDICATOR] = 'Different' AND [{{../conventions.changeDataCaptureAttribute}}] IN ('C'))
OR
([CDC_CHANGE_INDICATOR] = 'Different' AND [TIME_CHANGE_INDICATOR] = 'Different')
{{/each}}