Data Vault Link Satellite Driving Key View
Purpose
Section titled “Purpose”This code-generation template creates a Data Vault Link Satellite object that implements a ‘driving key’ mechanism; an option for the physical representation of a Context Entity.
With this template, the Link Satellite is created as a database view, for easy refactoring and quick prototyping. The view can be used the same way as a database table, and it shows the data as a Link Satellite table would be loaded by various data logistics (‘ETL’) processes.
Motivation
Section titled “Motivation”The intent for this template is to be represent (parts of) the data solution as views, in line with the virtual data warehouse concept. This approach enables data teams to deliver the data solution exactly the same way as when database tables or files would be created and managed through data load processes.
[!NOTE] Depending on your philosophy on Data Vault implementation, using Link Satellites may not be relevant or applicable. There are very viable considerations to implement a Data Vault model without Link-Satellites, and this sometimes considered especially true for the driving key concept. This is outlined in more detail in the associated design pattern.
A driving key mechanism manages the validity of a relationship (Link) or keyed-instance (relationship-describing Hub). This can be implemented if there is no natural data-driven way to evaluate when a relationship should be considered active, or not.
Applicability
Section titled “Applicability”- SQL Server family databases; the template uses Microsoft SQL (T-SQL) syntax.
Design Pattern
Section titled “Design Pattern”Schema Type
Section titled “Schema Type”Output Type
Section titled “Output Type”Implementation guidelines
Section titled “Implementation guidelines”The Data Vault Link Satellite view template represents the represents the modeled, time-variant, representation of the underlying (raw) Persistent Staging Area (PSA) data as a Data Vault Link Satellite.
The template selects the data items from the underlying data object(s), applies any defined transformations and column renames, and represents the data the same way as the user would create a table and data load process, and then run this process to populate the table with data.
To accurately represent the data, the template contains a compacting (condensing) mechanism to remove data redundancy. This is necessary since typically only a few columns from a given data object are required in the Link Satellite. Any records that contain data changes over time that are not relevant for the end result are filtered out using the compacting mechanism.
The template does not directly support end-dating. However, commented-out code is provided to enable a ‘current record’ indicator and an end-date based on the inscription timestamp.
This is provided as an example only. Generally speaking, the evaluation of timelines will be implemented in the selection of data from the Link Satellite.
The general recommendation is to not use the inscription timestamp / assertion timeline for end-dating and evaluation of what data is considered ‘current’. Instead, the Agnostic Data Labs samples contain a standard STATE_TIMESTAMP column which is used to standardize the business, or functional ‘state’ timeline. It is recommended to define date ranges and assert validity using this column.
As a view, the generated code is fully idempotent and deterministic. The view can be used in subsequent data load processes as input object, if required.
Considerations and consequences
Section titled “Considerations and consequences”- This template requires a Persistent Staging Area (PSA) or equivalent immutable transaction data set.
- The view will be created directly on a PSA dataset, and therefore typically located in the same database as where the PSA resides. If the view is intended to be created in a different database, the template requires both the source and target databases to be accessible. This can be done either via a Linked Server, on-premise cross-database query feature.
- Using views, performance is always a consideration. Each time the view is called the underlying query is executed. For large data sets, or with insufficient resources, the time it takes to return the data might not meet requirements.
- The template uses a MD5 hashing algorithm to implement hash keys and record comparison checksums.
Extensions
Section titled “Extensions”Not applicable.
{{#each dataObjectMappings}}CREATE OR ALTER VIEW [{{lookupExtension extensions "vdwSchemaName"}}].[{{targetDataObject.name}}]AS---- Link Satellite View definition for {{../dataObjectMappings.0.targetDataObject.name}}---- Generated from template '{{templateMetadata.name}}'.--
SELECT HASHBYTES('MD5', {{#each businessKeyDefinitions}}{{#unless @first}} +{{/unless}}{{!-- Create the Surrogate Key using the Business Key components --}}{{#each businessKeyComponentMappings}} ISNULL(RTRIM(CONVERT(NVARCHAR(100), {{targetDataItem.name}})), 'N/A') + '#~!'{{#unless @last}} +{{/unless}}{{/each}}{{/each}}{{#each businessKeyDefinitions}}{{#if @first}} ) AS {{surrogateKey}},{{/if}}{{/each}}{{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}--[{{targetDataItem.name}}],{{/each}}{{/each}}[{{../conventions.loadDateTimeAttribute}}], [{{../conventions.sourceRowIdAttribute}}], --COALESCE ( -- LEAD (DATEADD(mcs,[{{../conventions.sourceRowIdAttribute}}], {{../conventions.loadDateTimeAttribute}}) ) OVER -- ( PARTITION BY -- {{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}{{#each sourceDataItems}}{{#each extensions}}{{#stringcompare key "isDrivingKey"}} [{{../../../targetDataItem.name}}]{{#unless @last}},{{/unless}}{{/stringcompare}}{{/each}}{{/each}}{{/each}}{{/each}} -- ORDER BY {{../conventions.loadDateTimeAttribute}}), -- CAST( '9999-12-31' AS DATETIME) --) AS [{{../conventions.expiryDateTimeAttribute}}], --CASE -- WHEN ( LEAD ({{../conventions.loadDateTimeAttribute}}) OVER (PARTITION BY {{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}{{#each sourceDataItems}}{{#stringcompare classifications.0.classification "DrivingKey"}}{{../../targetDataItem.name}}{{#unless @last}},{{/unless}}{{/stringcompare}}{{/each}}{{/each}}{{/each}} -- ORDER BY {{../conventions.loadDateTimeAttribute}})) IS NULL -- THEN 'Y' -- ELSE 'N' --END AS CURRENT_RECORD_INDICATOR, -1 AS {{../conventions.etlProcessAttribute}}, {{!-- List out the Control Framework attributes --}}{{../conventions.changeDataCaptureAttribute}}, --CASE -- WHEN {{../conventions.changeDataCaptureAttribute}} = 'D' THEN 'Y' -- ELSE 'N' --END AS [DELETED_RECORD_INDICATOR], HASHBYTES('MD5', ISNULL(RTRIM(CONVERT(NVARCHAR(100),{{../conventions.changeDataCaptureAttribute}})), 'N/A') + '#~!' +{{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}ISNULL(RTRIM(CONVERT(NVARCHAR(100),{{targetDataItem.name}})), 'N/A') + '#~!'{{#unless @last}} +{{/unless}}{{/each}}{{/each}}) AS [{{../conventions.recordChecksumAttribute}}],{{#each dataItemMappings}} [{{targetDataItem.name}}]{{# unless @last}},{{/unless}}{{/each}}FROM( SELECT [{{../conventions.loadDateTimeAttribute}}], [{{../conventions.sourceRowIdAttribute}}], [{{../conventions.eventDateTimeAttribute}}], [{{../conventions.changeDataCaptureAttribute}}],{{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}} [{{sourceDataItems.0.name}}] AS [{{targetDataItem.name}}],{{/each}}{{/each}}{{#each dataItemMappings}} [{{sourceDataItems.0.name}}] AS [{{targetDataItem.name}}],{{/each}}{{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}{{#each sourceDataItems}}{{#stringdiff extensions.0.key "isDrivingKey"}}LAG ([{{../name}}], 1, '0') OVER ( PARTITION BY {{#each ../../../../businessKeyDefinitions}}{{#each businessKeyComponentMappings}}{{#each sourceDataItems}}{{#each extensions}}{{#stringcompare key "isDrivingKey"}} [{{../../name}}]{{#unless @last}},{{/unless}}{{/stringcompare}}{{/each}}{{/each}}{{/each}}{{/each}} ORDER BY [{{../../../../../conventions.loadDateTimeAttribute}}] ) AS PREVIOUS_FOLLOWER_KEY{{@index}} {{/stringdiff}}{{/each}}{{/each}}{{/each}} FROM [{{lookupExtension sourceDataObjects.0.extensions "datastore"}}].[{{lookupExtension sourceDataObjects.0.extensions "location"}}].[{{sourceDataObjects.0.name}}] WHERE NOT ([{{../conventions.sourceRowIdAttribute}}]>1 AND [{{../conventions.changeDataCaptureAttribute}}] ='D'){{#if filterCriterion}}{{#stringcompare filterCriterion ""}}{{else}}AND {{filterCriterion}}{{/stringcompare}}{{/if}}
/* -- Dummy record only for example purposes. UNION SELECT [{{../conventions.loadDateTimeAttribute}}], [{{../conventions.sourceRowIdAttribute}}], [{{../conventions.eventDateTimeAttribute}}], [{{../conventions.changeDataCaptureAttribute}}],{{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}} [{{targetDataItem.name}}],{{/each}}{{/each}}{{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}{{#each sourceDataItems}}{{#stringdiff extensions.0.key "isDrivingKey"}}'{{@index}}' AS PREVIOUS_FOLLOWER_KEY{{@index}}{{/stringdiff}}{{/each}}{{/each}}{{/each}} FROM ( SELECT '1900-01-01' AS [{{../conventions.loadDateTimeAttribute}}], 1 AS [{{../conventions.sourceRowIdAttribute}}], '1900-01-01' AS [{{../conventions.eventDateTimeAttribute}}], --'Data Warehouse' AS [{{../conventions.recordSourceAttribute}}], 'C' AS [{{../conventions.changeDataCaptureAttribute}}],{{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}[{{sourceDataItems.0.name}}] AS [{{targetDataItem.name}}],{{/each}}{{/each}}DENSE_RANK() OVER ( PARTITION BY {{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}{{#each sourceDataItems}}{{#each extensions}}{{#stringcompare key "isDrivingKey"}} [{{../../name}}]{{#unless @last}},{{/unless}}{{/stringcompare}}{{/each}}{{/each}}{{/each}}{{/each}} ORDER BY [{{../conventions.loadDateTimeAttribute}}], [{{../conventions.sourceRowIdAttribute}}], {{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}{{#each sourceDataItems}}{{#stringcompare classifications.0.classification "DrivingKey"}}{{name}}{{#unless @last}},{{/unless}}{{/stringcompare}}{{/each}}{{/each}}{{/each}} ASC ) AS ROWVERSION FROM [{{lookupExtension sourceDataObjects.0.extensions "datastore"}}].[{{lookupExtension sourceDataObjects.0.extensions "location"}}].[{{sourceDataObjects.0.name}}] ) dummysub WHERE ROWVERSION=1 */) subWHERE{{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}{{#each sourceDataItems}}{{#stringdiff extensions.0.key "isDrivingKey"}}{{../../targetDataItem.name}} != PREVIOUS_FOLLOWER_KEY{{@index}}{{/stringdiff}}{{/each}}{{/each}}{{/each}}-- ORDER BY{{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}}{{#each sourceDataItems}}{{#stringcompare extensions.0.key "isDrivingKey"}}-- [{{../../targetDataItem.name}}],{{/stringcompare}}{{/each}}{{/each}}{{/each}}-- [{{../conventions.loadDateTimeAttribute}}]
{{/each}}