Data Vault Hub SQL Server Table

Purpose

This code-generation template creates the physical SQL Server table for a Data Vault Hub — the persisted store of distinct business keys for a Core Business Entity.

Unlike the Hub view template (which represents the Hub virtually on top of a Persistent Staging Area), this template materialises the Hub as a real table, suitable for a physical data warehouse deployment.

Motivation

A physical Hub table is the canonical Data Vault implementation. It gives you a stable, queryable list of business keys and their assigned surrogate (hash) keys, independent of the source PSA. Subsequent Hub-load processes target this table, and downstream Satellites and Links use it as their integration anchor.

Applicability

SQL Server family databases; the template uses Microsoft SQL (T-SQL) syntax.

Implementation guidelines

This template is intended to be assigned to Hub Data Objects in the Integration Layer. It reads the Data Object’s dataItems and emits a CREATE TABLE statement with:

One column per Data Item, using the declared SQL data type and (optionally) character length or numeric precision/scale.
A non-clustered primary key on the columns flagged isPrimaryKey — typically the Hub hash key.
A unique clustered index on columns carrying the Alternate Key classification — typically the business key columns.

The clustering choice (NONCLUSTERED PK + CLUSTERED business-key index) follows the standard Data Vault Hub recommendation: business-key lookups are the dominant access pattern, so the clustered index serves them directly.

The script is idempotent at the table-creation level — DROP TABLE IF EXISTS precedes the CREATE TABLE, so re-running the generated script will rebuild the structure (note: this will drop any data; manage that with your deployment process).

Considerations and consequences

The data type emitted comes verbatim from each Data Item’s dataType, optionally suffixed with (characterLength) or (numericPrecision, numericScale). Cleanse or normalise data types at the metadata level before generating.
All columns are emitted as NOT NULL. For Hubs this is correct (the business key and audit columns are always populated); if you adapt this template, revisit nullability.
The Alternate Key classification must be applied to the business-key columns for the clustered index to be generated correctly.

Extensions

The template reads datastore and location extensions on the Data Object’s Data Connection to build the fully qualified table name. Both must be set on the connection.

Code

--
-- Table creation statement for [{{lookupExtension dataConnection.extensions "datastore"}}].[{{lookupExtension dataConnection.extensions "location"}}].[{{name}}].
--
-- Generated from template '{{templateMetadata.name}}'.
--

DROP TABLE IF EXISTS [{{lookupExtension dataConnection.extensions "datastore"}}].[{{lookupExtension dataConnection.extensions "location"}}].[{{name}}];

CREATE TABLE [{{lookupExtension dataConnection.extensions "datastore"}}].[{{lookupExtension dataConnection.extensions "location"}}].[{{name}}]
(
{{#each dataItems}}
  [{{name}}] {{stringUpper dataType}}{{#if characterLength}}({{characterLength}}){{/if}}{{#if numericPrecision}}({{numericPrecision}}{{#if numericScale}},{{numericScale}}{{/if}}){{/if}} NOT NULL,
{{/each}}
  CONSTRAINT [PK_{{name}}] PRIMARY KEY NONCLUSTERED
  (
{{#each dataItems}}
{{#if isPrimaryKey}}    {{#if @index}},{{/if}}[{{stringUpper name}}]
{{/if}}
{{/each}}
  )
);

-- Create a clustered index on the business key column(s).
CREATE UNIQUE CLUSTERED INDEX IX_{{name}}
ON [{{lookupExtension dataConnection.extensions "datastore"}}].[{{lookupExtension dataConnection.extensions "location"}}].[{{name}}]
(
{{#each dataItems}}
{{#hasClassification classifications "Alternate Key"}}
{{name}}{{#unless @last}},{{/unless}}
{{/hasClassification}}
{{/each}}
);
GO