Data Vault Link Satellite View
Purpose
Section titled “Purpose”This code-generation template creates a Data Vault Link Satellite object; an option for the physical representation of a Context Entity.
With this template, the Link Satellite is created as a database view, for easy refactoring and quick prototyping. The view can be used the same way as a database table, and it shows the data as a Link Satellite table would be loaded by various data logistics (‘ETL’) processes.
Motivation
Section titled “Motivation”The intent for this template is to be represent (parts of) the data solution as views, in line with the virtual data warehouse concept. This approach enables data teams to deliver the data solution exactly the same way as when database tables or files would be created and managed through data load processes.
[!NOTE] Depending on your philosophy on Data Vault implementation, using Link Satellites may not be relevant or applicable. There are very viable considerations to implement a Data Vault model without Link-Satellites. This is outlined in more detail in the associated design pattern.
Applicability
Section titled “Applicability”- SQL Server family databases; the template uses Microsoft SQL (T-SQL) syntax.
Design Pattern
Section titled “Design Pattern”Schema Type
Section titled “Schema Type”Output Type
Section titled “Output Type”Implementation guidelines
Section titled “Implementation guidelines”The Data Vault Link Satellite view template represents the represents the modeled, time-variant, representation of the underlying (raw) Persistent Staging Area (PSA) data as a Data Vault Link Satellite.
The template selects the data items from the underlying data object(s), applies any defined transformations and column renames, and represents the data the same way as the user would create a table and data load process, and then run this process to populate the table with data.
To accurately represent the data, the template contains a compacting (condensing) mechanism to remove data redundancy. This is necessary since typically only a few columns from a given data object are required in the Link Satellite. Any records that contain data changes over time that are not relevant for the end result are filtered out using the compacting mechanism.
The template does not directly support end-dating. However, commented-out code is provided to enable a ‘current record’ indicator and an end-date based on the inscription timestamp.
This is provided as an example only. Generally speaking, the evaluation of timelines will be implemented in the selection of data from the Link Satellite.
The general recommendation is to not use the inscription timestamp / assertion timeline for end-dating and evaluation of what data is considered ‘current’. Instead, the Agnostic Data Labs samples contain a standard STATE_TIMESTAMP column which is used to standardize the business, or functional ‘state’ timeline. It is recommended to define date ranges and assert validity using this column.
As a view, the generated code is fully idempotent and deterministic. The view can be used in subsequent data load processes as input object, if required.
Considerations and consequences
Section titled “Considerations and consequences”- This template requires a Persistent Staging Area (PSA) or equivalent immutable transaction data set.
- The view will be created directly on a PSA dataset, and therefore typically located in the same database as where the PSA resides. If the view is intended to be created in a different database, the template requires both the source and target databases to be accessible. This can be done either via a Linked Server, on-premise cross-database query feature.
- Using views, performance is always a consideration. Each time the view is called the underlying query is executed. For large data sets, or with insufficient resources, the time it takes to return the data might not meet requirements.
- The template uses a MD5 hashing algorithm to implement hash keys and record comparison checksums.
Extensions
Section titled “Extensions”Not applicable.
{{#each dataObjectMappings}}CREATE OR ALTER VIEW [{{../conventions.vdwSchemaName}}].[{{targetDataObject.name}}]AS---- Link Satellite View definition for {{../dataObjectMappings.0.targetDataObject.name}}.---- Generated from template '200 Data Vault Link-Satellite View'.--
SELECT HASHBYTES('MD5', {{#each businessKeyDefinitions}}{{#unless @first}} +{{/unless}}{{!-- Create the Surrogate Key using the Business Key and components --}}{{#each businessKeyComponentMappings}} ISNULL(RTRIM(CONVERT(NVARCHAR(100), {{targetDataItem.name}})), 'N/A') + '#~!'{{#unless @last}} +{{/unless}}{{/each}}{{/each}}{{#each businessKeyDefinitions}}{{#if @first}} ) AS {{surrogateKey}},{{/if}}{{/each}} [{{../conventions.loadDateTimeAttribute}}], [{{../conventions.sourceRowIdAttribute}}], --COALESCE ( -- LEAD (DATEADD(mcs,[{{../conventions.sourceRowIdAttribute}}], {{../conventions.loadDateTimeAttribute}}) ) OVER -- ( PARTITION BY {{#each businessKeyDefinitions}} {{!-- The Hub business key --}}{{#unless @first}},{{/unless}} -- {{#each businessKeyComponentMappings}}{{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}} -- {{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../sourceDataItems.0.name}}{{/if}}{{/each}}{{/each}} -- ORDER BY {{../conventions.loadDateTimeAttribute}}), -- CAST( '9999-12-31' AS DATETIME) --) AS [{{../conventions.expiryDateTimeAttribute}}], --CASE -- WHEN ( RANK() OVER (PARTITION BY {{#each businessKeyDefinitions}}{{#unless @first}},{{/unless}}{{#each businessKeyComponentMappings}} -- {{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}} -- {{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../sourceDataItems.0.name}}{{/if}}{{/each}}{{/each}} -- ORDER BY {{../conventions.loadDateTimeAttribute}} desc )) = 1 -- THEN 'Y' -- ELSE 'N' --END AS [CURRENT_RECORD_INDICATOR], -1 AS {{../conventions.etlProcessAttribute}}, {{!-- List out the Control Framework attributes --}} {{../conventions.changeDataCaptureAttribute}}, --CASE -- WHEN {{../conventions.changeDataCaptureAttribute}} = 'D' THEN 'Y' -- ELSE 'N' --END AS [DELETED_RECORD_INDICATOR], HASHBYTES('MD5', ISNULL(RTRIM(CONVERT(NVARCHAR(100),{{../conventions.changeDataCaptureAttribute}})), 'N/A') + '#~!' +{{#each dataItemMappings}} ISNULL(RTRIM(CONVERT(NVARCHAR(100),{{targetDataItem.name}})), 'N/A') + '#~!'{{#unless @last}} +{{/unless}}{{/each}} ) AS [{{../conventions.recordChecksumAttribute}}], {{#each dataItemMappings}} [{{targetDataItem.name}}]{{#unless @last}},{{/unless}} {{/each}}FROM ( SELECT [{{../conventions.loadDateTimeAttribute}}], [{{../conventions.eventDateTimeAttribute}}], [{{../conventions.sourceRowIdAttribute}}], [{{../conventions.changeDataCaptureAttribute}}],{{#each businessKeyDefinitions}}{{#each businessKeyComponentMappings}} [{{targetDataItem.name}}],{{/each}}{{/each}} {{#each dataItemMappings}} [{{targetDataItem.name}}], {{/each}} [COMBINED_VALUE], CASE WHEN LAG(COMBINED_VALUE,1,0x00000000000000000000000000000000) OVER (PARTITION BY {{#each businessKeyDefinitions}}{{#unless @first}},{{/unless}}{{#each businessKeyComponentMappings}} {{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}} {{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../targetDataItem.name}}{{/if}}{{/each}}{{/each}} ORDER BY [{{../conventions.loadDateTimeAttribute}}] ASC, [{{../conventions.eventDateTimeAttribute}}] ASC, [{{../conventions.changeDataCaptureAttribute}}] DESC) = COMBINED_VALUE THEN 'Same' ELSE 'Different' END AS [VALUE_CHANGE_INDICATOR], CASE WHEN LAG([{{../conventions.changeDataCaptureAttribute}}],1,'') OVER (PARTITION BY {{#each businessKeyDefinitions}}{{#unless @first}},{{/unless}}{{#each businessKeyComponentMappings}} {{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}} {{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../targetDataItem.name}}{{/if}}{{/each}}{{/each}} ORDER BY [{{../conventions.loadDateTimeAttribute}}] ASC, [{{../conventions.eventDateTimeAttribute}}] ASC, [{{../conventions.changeDataCaptureAttribute}}] ASC) = [{{../conventions.changeDataCaptureAttribute}}] THEN 'Same' ELSE 'Different' END AS [CDC_CHANGE_INDICATOR], CASE WHEN LEAD([{{../conventions.loadDateTimeAttribute}}],1,'9999-12-31') OVER (PARTITION BY {{#each businessKeyDefinitions}}{{#unless @first}},{{/unless}}{{#each businessKeyComponentMappings}} {{targetDataItem.name}}{{#unless @last}},{{/unless}}{{/each}}{{/each}} {{#each dataItemMappings}}{{#each targetDataItem.classifications}}{{#if classification}},{{../targetDataItem.name}}{{/if}}{{/each}}{{/each}} ORDER BY [{{../conventions.loadDateTimeAttribute}}] ASC, [{{../conventions.eventDateTimeAttribute}}] ASC, [{{../conventions.changeDataCaptureAttribute}}] ASC) = [{{../conventions.loadDateTimeAttribute}}] THEN 'Same' ELSE 'Different' END AS [TIME_CHANGE_INDICATOR] FROM ( SELECT [{{../conventions.loadDateTimeAttribute}}], [{{../conventions.eventDateTimeAttribute}}], [{{../conventions.sourceRowIdAttribute}}], [{{../conventions.changeDataCaptureAttribute}}],{{#each businessKeyDefinitions}} {{#each businessKeyComponentMappings}} CAST ({{sourceDataItems.0.name}} AS NVARCHAR(100)) AS [{{targetDataItem.name}}],{{/each}}{{/each}} {{#each dataItemMappings}} [{{sourceDataItems.0.name}}] AS [{{targetDataItem.name}}], {{/each}} HASHBYTES('MD5',{{#each dataItemMappings}} ISNULL(RTRIM(CONVERT(NVARCHAR(100),[{{sourceDataItems.0.name}}])), 'N/A') + '#~!'{{#unless @last}} +{{/unless}}{{/each}} ) AS [COMBINED_VALUE] FROM [{{lookupExtension sourceDataObjects.0.extensions "datastore"}}].[{{lookupExtension sourceDataObjects.0.extensions "location"}}].[{{sourceDataObjects.0.name}}] {{#if filterCriterion}}{{#stringcompare filterCriterion ""}}{{else}}WHERE {{filterCriterion}}{{/stringcompare}}{{/if}} ) sub) combined_valueWHERE ([VALUE_CHANGE_INDICATOR] = 'Different' AND [{{../conventions.changeDataCaptureAttribute}}] IN ('C')) OR ([CDC_CHANGE_INDICATOR] = 'Different' AND [TIME_CHANGE_INDICATOR] = 'Different'){{/each}}