Data Vault Hub Stored Procedure
Purpose
Section titled “Purpose”This code-generation template creates a SQL Server stored procedure that loads a Data Vault Hub table from a Persistent Staging Area (PSA) source — the physical, table-based counterpart to the Hub view template.
The procedure handles one source-to-target mapping per data object pair, so a Hub fed by multiple PSA tables results in multiple procedures — one per source.
Motivation
Section titled “Motivation”In a physical Data Vault deployment, a Hub is populated by a load process that inserts only previously unseen business keys. This template generates exactly that — an idempotent, incrementally driven stored procedure that:
- Asks a control framework for its load window (start/end timestamps).
- Reads new PSA rows whose inscription timestamp falls in that window.
- Hashes the business-key components into a surrogate key.
- Inserts the resulting rows into the Hub, suppressing keys that already exist there.
Applicability
Section titled “Applicability”- SQL Server family databases; the template uses Microsoft SQL (T-SQL) syntax.
Design Pattern
Section titled “Design Pattern”Schema Type
Section titled “Schema Type”Output Type
Section titled “Output Type”Implementation guidelines
Section titled “Implementation guidelines”This template is intended to be assigned to Data Object Mappings whose target is a Hub Data Object.
The generated procedure:
- Takes an audit-trail
@ModuleInstanceIdparameter (defaulted to 0 for ad-hoc runs). - Calls
CreateLoadWindowin the configured control-framework database to evaluate the date range it should process. - Reads candidate rows from the PSA, filtering on the inscription timestamp falling within the load window.
- Hashes the business-key columns with MD5 to produce the Hub surrogate key.
- Uses a
ROW_NUMBERpartition on the business key to keep only the earliest arrival per key. - Uses
NOT EXISTSagainst the Hub to insert only previously unseen keys.
The procedure is idempotent: repeated runs with the same load window have no effect after the first.
Considerations and consequences
Section titled “Considerations and consequences”- The template requires a Persistent Staging Area (PSA) as its source — the
dataObjectMappings.0.sourceDataObjects.0is expected to be a PSA-style table with aloadDateTime/ inscription timestamp column. - The template requires a control framework to be present in the configured location, exposing
CreateLoadWindow(and optionallyRegisterModule/RunModule). - MD5 is used for hash-key generation. If you need a stronger hash, replace
HASHBYTES('MD5', …)in both the surrogate-key construction and theNOT EXISTSlookup — they must stay aligned.
Extensions
Section titled “Extensions”If the mapping carries a hasControlFramework extension, the template appends a commented-out RegisterModule / RunModule block to the output. Strip the comment markers to register the procedure as a control-framework module.
The conventions controlFrameworkDatastore, controlFrameworkLocation, persistentStagingAreaObjectDatastore, persistentStagingAreaObjectLocation, integrationLayerObjectDatastore, integrationLayerObjectLocation, auditTrailIdColumn, and inscriptionTimeStampColumn must all be defined on the project.