Skip to content

Data Vault Satellite View

This code-generation template creates a Data Vault Satellite object; the physical representation of a Context Entity.

With this template, the Satellite is created as a database view, for easy refactoring and quick prototyping. The view can be used the same way as a database table, and it shows the data as a Satellite table would be loaded by various data logistics (‘ETL’) processes.

The intent for this template is to be represent (parts of) the data solution as views, in line with the virtual data warehouse concept. This approach enables data teams to deliver the data solution exactly the same way as when database tables or files would be created and managed through data load processes.

  • SQL Server family databases; the template uses Microsoft SQL (T-SQL) syntax.

The Data Vault Satellite view template represents the represents the modeled, time-variant, representation of the underlying (raw) Persistent Staging Area (PSA) data as a Data Vault Satellite.

The template selects the data items from the underlying data object(s), applies any defined transformations and column renames, and represents the data the same way as the user would create a table and data load process, and then run this process to populate the table with data.

To accurately represent the data, the template contains a compacting (condensing) mechanism to remove data redundancy. This is necessary since typically only a few columns from a given data object are required in the Satellite. Any records that contain data changes over time that are not relevant for the end result are filtered out using the compacting mechanism.

The template does not directly support end-dating. However, commented-out code is provided to enable a ‘current record’ indicator and an end-date based on the inscription timestamp.

This is provided as an example only. Generally speaking, the evaluation of timelines will be implemented in the selection of data from the Satellite.

The general recommendation is to not use the inscription timestamp / assertion timeline for end-dating and evaluation of what data is considered ‘current’. Instead, the Agnostic Data Labs samples contain a standard STATE_TIMESTAMP column which is used to standardize the business, or functional ‘state’ timeline. It is recommended to define date ranges and assert validity using this column.

As a view, the generated code is fully idempotent and deterministic. The view can be used in subsequent data load processes as input object, if required.

  • This template requires a Persistent Staging Area (PSA) or equivalent immutable transaction data set.
  • The view will be created directly on a PSA dataset, and therefore typically located in the same database as where the PSA resides. If the view is intended to be created in a different database, the template requires both the source and target databases to be accessible. This can be done either via a Linked Server, on-premise cross-database query feature.
  • Using views, performance is always a consideration. Each time the view is called the underlying query is executed. For large data sets, or with insufficient resources, the time it takes to return the data might not meet requirements.
  • The template uses a MD5 hashing algorithm to implement hash keys and record comparison checksums.

Not applicable.

{{#each dataObjectMappings}}
{{#if @first}}
CREATE OR ALTER VIEW [{{../conventions.vdwSchemaName}}].[{{targetDataObject.name}}]
AS
--
-- Satellite View definition for {{../dataObjectMappings.0.targetDataObject.name}}.
--
-- This template represents a standard Data Vault style 'Satellite' table as a view.
-- The view shows the same data as would otherwise be the case if the table would be created and all data logistics processes run.
--
-- Generated from template '200 Data Vault Satellite View'.
--
SELECT
HASHBYTES('MD5', {{#each businessKeyDefinitions}} {{!-- Create the Surrogate Key using the Business Key and components --}}
{{#each businessKeyComponentMappings}}
ISNULL(RTRIM(CONVERT(NVARCHAR(MAX), {{targetDataItem.name}})), 'N/A') + '#~!'{{#unless @last}} +{{/unless}}
{{/each}}
) AS [{{surrogateKey}}],{{/each}}
[{{../conventions.loadDateTimeAttribute}}],
[{{../conventions.sourceRowIdAttribute}}],
--COALESCE (
-- LEAD (DATEADD(mcs,[{{../conventions.sourceRowIdAttribute}}], {{../conventions.loadDateTimeAttribute}}) ) OVER
-- ( PARTITION BY {{#each businessKeyDefinitions}}
-- {{#each businessKeyComponentMappings}}
-- {{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}}
-- {{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../sourceDataItems.0.name}}{{/if}}{{/each}}{{/each}}
-- ORDER BY {{../conventions.loadDateTimeAttribute}}),
-- CAST( '9999-12-31' AS DATETIME)
--) AS [{{../conventions.expiryDateTimeAttribute}}],
--CASE
-- WHEN ( RANK() OVER (PARTITION BY {{#each businessKeyDefinitions}} {{#each businessKeyComponentMappings}}
-- {{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}}
-- {{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../sourceDataItems.0.name}}{{/if}}{{/each}}{{/each}}
-- ORDER BY {{../conventions.loadDateTimeAttribute}} desc )) = 1
-- THEN 'Y'
-- ELSE 'N'
--END AS [CURRENT_RECORD_INDICATOR],
-1 AS {{../conventions.etlProcessAttribute}}, {{!-- List out the Control Framework attributes --}}
{{../conventions.changeDataCaptureAttribute}},
HASHBYTES('MD5',
ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),{{../conventions.changeDataCaptureAttribute}})), 'N/A') + '#~!' +{{#each dataItemMappings}}
ISNULL(RTRIM(CONVERT(NVARCHAR(MAX),{{targetDataItem.name}})), 'N/A') + '#~!'{{#unless @last}} +{{/unless}}{{/each}}
) AS [{{../conventions.recordChecksumAttribute}}],
{{#each dataItemMappings}}
[{{targetDataItem.name}}]{{#unless @last}},{{/unless}}
{{/each}}
FROM
(
SELECT
[{{../conventions.loadDateTimeAttribute}}],
[{{../conventions.eventDateTimeAttribute}}],
[{{../conventions.sourceRowIdAttribute}}],
[{{../conventions.changeDataCaptureAttribute}}],{{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}
{{targetDataItem.name}},{{/each}}{{/each}}
{{#each dataItemMappings}}
[{{targetDataItem.name}}],
{{/each}}
[COMBINED_VALUE],
CASE
WHEN LAG(COMBINED_VALUE,1,0x00000000000000000000000000000000) OVER (PARTITION BY {{#each businessKeyDefinitions}} {{#each businessKeyComponentMappings}}
{{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}}
{{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../targetDataItem.name}}{{/if}}{{/each}}{{/each}}
ORDER BY [{{../conventions.loadDateTimeAttribute}}] ASC, [{{../conventions.eventDateTimeAttribute}}] ASC, [{{../conventions.changeDataCaptureAttribute}}] DESC) = COMBINED_VALUE
THEN 'Same'
ELSE 'Different'
END AS [VALUE_CHANGE_INDICATOR],
CASE
WHEN LAG([{{../conventions.changeDataCaptureAttribute}}],1,'') OVER (PARTITION BY {{#each businessKeyDefinitions}} {{#each businessKeyComponentMappings}}
{{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}}
{{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../targetDataItem.name}}{{/if}}{{/each}}{{/each}}
ORDER BY [{{../conventions.loadDateTimeAttribute}}] ASC, [{{../conventions.eventDateTimeAttribute}}] ASC, [{{../conventions.changeDataCaptureAttribute}}] ASC) = [{{../conventions.changeDataCaptureAttribute}}]
THEN 'Same'
ELSE 'Different'
END AS [CDC_CHANGE_INDICATOR],
CASE
WHEN LEAD([{{../conventions.loadDateTimeAttribute}}],1,'9999-12-31') OVER (PARTITION BY {{#each businessKeyDefinitions}} {{#each businessKeyComponentMappings}}
{{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}}
{{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../targetDataItem.name}}{{/if}}{{/each}}{{/each}}
ORDER BY [{{../conventions.loadDateTimeAttribute}}] ASC, [{{../conventions.eventDateTimeAttribute}}] ASC, [{{../conventions.changeDataCaptureAttribute}}] ASC) = [{{../conventions.loadDateTimeAttribute}}]
THEN 'Same'
ELSE 'Different'
END AS [TIME_CHANGE_INDICATOR]
FROM
(
{{/if}}
{{/each}}
{{#each dataObjectMappings}}
-- The inner-most subquery is specific for each data object mapping.
SELECT
[{{../conventions.loadDateTimeAttribute}}],
[{{../conventions.eventDateTimeAttribute}}],
--[{{../conventions.recordSourceAttribute}}],
[{{../conventions.sourceRowIdAttribute}}],
[{{../conventions.changeDataCaptureAttribute}}],{{#each businessKeyDefinitions}} {{#each businessKeyComponentMappings}}
CAST ({{sourceDataItems.0.name}} AS NVARCHAR(100)) AS [{{targetDataItem.name}}],{{/each}}{{/each}}
{{#each dataItemMappings}}
[{{sourceDataItems.0.name}}] AS [{{targetDataItem.name}}],
{{/each}}
HASHBYTES('MD5',{{#each dataItemMappings}}
ISNULL(RTRIM(CONVERT(NVARCHAR(100),[{{sourceDataItems.0.name}}])), 'N/A') + '#~!'{{#unless @last}} +{{/unless}}{{/each}}
) AS [COMBINED_VALUE]
FROM [{{lookupExtension sourceDataObjects.0.extensions "datastore"}}].[{{lookupExtension sourceDataObjects.0.extensions "location"}}].[{{sourceDataObjects.0.name}}]
{{#if filterCriterion}}{{#stringcompare filterCriterion ""}}{{else}}WHERE {{filterCriterion}}{{/stringcompare}}{{/if}}
{{/each}}
{{#each dataObjectMappings}}
{{#if @first}}
) sub
) combined_value
WHERE
([VALUE_CHANGE_INDICATOR] = 'Different' AND [{{../conventions.changeDataCaptureAttribute}}] IN ('C'))
OR
([CDC_CHANGE_INDICATOR] = 'Different' AND [TIME_CHANGE_INDICATOR] = 'Different')
GO
{{/if}}
{{/each}}