Snowflake PSA
The Snowflake Persistent Staging Area (PSA) starter solution introduces two ways to derive data delta / differential from a sample operational system using Stored Procedures.
The PSA is a historized (time-variant) archive of all data changes that were presented to the data solution. It is a foundational part of many data solutions. Much like a transaction log, the PSA ensures that downstream solutions can be easily modified if and when required.
By default, changes are detected by performing a direct comparison, a Full Outer Join, between the PSA and the sample operational source system. Any changes detected will be stored in the PSA. Alternatively, a second template can be used to first land (‘stage’) the data in the landing area, and perform the Full Outer Join there.
These are just two of many ways that technical change data capture can be implemented. Templates can be added and modified to achieve the desired outcome for different approaches.
The sample data and generated code apply to a Snowflake database.
This sample also includes the generation of:
- Sample data
- Deployment script (for SnowSQL)
- Documentation
Deploying the sample
Section titled “Deploying the sample”When deploying this sample, the contents are deployed in a single Snowflake database called SAMPLE with the various object types separated by schemas. Please check the Connections to make sure these are referring to the intended schema structure.
After generating the output using the ADL Code Generator, a SnowSQL deployment script (deployment.sql) is available in the output/databases folder.
Running this script, for example using the Visual Studio Code terminal (or equivalent), will create the tables, stored procedures, and deploy the sample data.
snowsql -c adl -d SAMPLE -f .\Databases\Deployment\Deployment.sql
Please note that snowsql requires to be configured first. The above example uses the ‘adl’ connection and SAMPLE database.
After this, the procedures are ready to be run, for example by running the following SQL:
CALL STG.SP_STG_PROFILER_CUSTOMER_CONTACT(0)
[!NOTE]
The Snowflake Stored Procedure examples require an integer value to be passed as input parameter. This represents the pointer to the control framework, and is mandatory for this sample.