Hi everyone
Wanted - trial users for Calon Backbone, our AI-enabled data warehouse / data vault automation solution. The first 5 trial users will get all the generated dbt code for free, so they can keep running it themselves.
We are currently at alpha stage and looking to test Backbone out in the wild.
Our bet is that data teams can speed up the journey from multiple source systems, through integration into a data vault, to metrics displayed in dashboards by at least 20-40x. That's not proven yet, hence this trial.
Ideal profile - a data team at a consumer products company with multiple channels, data already on Snowflake, and at least 2 source systems available. Consumer products isn't essential, though. Snowflake is.
We need… raw data from e.g. Shopify, HubSpot, Xero, QuickBooks, or anything else in your Snowflake account (or we can pipe it in using Fivetran or Matillion, into our account or yours).
You get a new end-to-end (i.e. backbone) data pipeline within a few days, starting from your raw Snowflake data and going all the way to a set of basic Streamlit dashboards.
What we do in Backbone…
- 1a) For each system you have, read the Snowflake metadata and turn it into the instructions needed to build a data vault model. This is AI-driven, with API-based agentic workflows (a rough sketch is at the end of this post). By default we map to the Microsoft Common Data Model where possible, but if you have your own ontology or data model we can read that in. We also automatically detect potential PII fields, amounts, dates and keys.
(Doing this for a system like Shopify costs about $30 in Claude tokens and takes about 30 minutes in its current state. Mapping each table to instructions manually takes about 20 minutes to an hour per table.)
- 1b) Agree priority systems and rules for handling duplicates and other conflicts between systems. Agree priority business metric rules, e.g. COGS allocation, pre/post tax, inventory rules and so on (see the example rules config at the end).
- 2) Make adjustments to the AI-generated integration instructions where needed for the raw vault layer: implement some hard rules, check that things are correct, and check that we have matching keys across systems (e.g. SKU). Feed these corrections back into the AI agent workflow so they are added to its input (sketched at the end). The first time we do this for a particular system it is usually slower and needs a bit of work.
- 3) Run the data vault automation on the Snowflake data. This builds dbt models, firstly covering the raw vault and point-in-time tables, and includes a first cut of dbt tests (sketched at the end). Generating the code usually takes about 2 minutes; the initial load then depends on how much data you have. Iterate 2) and 3) until we are happy with the results - usually a few runs over the course of a couple of days. At some point, start on 4) too.
- 4) Tweak the unit economics, customer profiles, SKU profiles and so on (i.e. the dbt code for business vault sats and links) to meet your requirements.
- 5) Define what bridges are needed and use another agentic workflow to build them.
- 6) Using yet another AI agent workflow, define the metrics, charts and dashboards you need (a minimal Streamlit example is at the end).
- 7) Run the whole pipeline end to end again on a full refresh and hope it works. (Sorry, I mean: test it against expected results - also sketched at the end.)
- 8) Then, for the first 5 testers, you get the dbt code as a freebie to continue running yourself.
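
For the curious, here's roughly what a few of these steps look like in Python. These are sketches, not Backbone's actual code - schema names, prompts, file names and the model string are all placeholders.

Step 1a, sketched: pull column metadata from information_schema, do a cheap first pass for PII / key / date / amount hints, then ask Claude to draft the mapping instructions (assumes snowflake-connector-python and anthropic are installed):

```python
# Step 1a sketch: read column metadata, flag PII / keys / dates / amounts,
# and ask Claude to draft data vault mapping instructions.
# Schema name, prompt and model string are placeholders.
import os
import re

import anthropic
import snowflake.connector

PII_HINTS = re.compile(r"email|phone|first_name|last_name|address|dob|tax_id", re.I)

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    database="RAW",
)
cur = conn.cursor()
cur.execute(
    """
    SELECT table_name, column_name, data_type
    FROM information_schema.columns
    WHERE table_schema = %s
    ORDER BY table_name, ordinal_position
    """,
    ("SHOPIFY",),
)

# Cheap heuristics first, so the LLM gets hints rather than guessing cold.
flagged = [
    {
        "table": t,
        "column": c,
        "type": d,
        "maybe_pii": bool(PII_HINTS.search(c)),
        "maybe_key": c.upper().endswith("_ID") or c.upper() == "ID",
        "maybe_date": d.upper().startswith(("DATE", "TIMESTAMP")),
        "maybe_amount": d.upper().startswith(("NUMBER", "FLOAT")),
    }
    for t, c, d in cur.fetchall()
]

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder - use whatever model you like
    max_tokens=4096,
    messages=[{
        "role": "user",
        "content": (
            f"Snowflake column metadata for one source system:\n{flagged}\n\n"
            "Draft data vault mapping instructions: identify hubs (business keys), "
            "links (relationships) and satellites (descriptive attributes), and map "
            "entities to the Microsoft Common Data Model where possible."
        ),
    }],
)
print(response.content[0].text)
```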
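Step 1b mostly produces a small set of agreed rules. Something like this (every value below is a made-up example for one hypothetical client, not a shipped default):

```python
# The kind of rules agreed in step 1b, captured as config.
RULES = {
    "system_priority": ["shopify", "hubspot", "xero"],  # who wins on duplicates
    "shared_keys": ["SKU", "ORDER_ID"],        # business keys expected across systems
    "cogs_allocation": "landed_cost",          # vs. "standard_cost"
    "revenue_basis": "pre_tax",                # vs. "post_tax"
    "inventory_valuation": "fifo",             # vs. "weighted_average"
}
```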
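Step 2's feedback loop, sketched: hand corrections are stored and prepended to the agent's input on every rerun, so fixes stick. The file path and correction text are invented for illustration:

```python
# Step 2 feedback loop sketch: corrections made by hand are saved and
# prepended to the agent prompt on the next run, so fixes persist.
import json
import pathlib

CORRECTIONS = pathlib.Path("corrections/shopify.json")  # illustrative path

def add_correction(note: str) -> None:
    notes = json.loads(CORRECTIONS.read_text()) if CORRECTIONS.exists() else []
    notes.append(note)
    CORRECTIONS.parent.mkdir(parents=True, exist_ok=True)
    CORRECTIONS.write_text(json.dumps(notes, indent=2))

add_correction("ORDERS.SKU and PRODUCTS.SKU are the same business key - hub on SKU.")

# Prepended to the instruction prompt on every rerun:
prompt_prefix = "Apply these human corrections first:\n" + "\n".join(
    json.loads(CORRECTIONS.read_text())
)
```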
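Step 3, sketched: running the generated dbt project programmatically. Assumes dbt-core 1.5+ (which added the programmatic dbtRunner API) and that the generated models carry a raw_vault tag - the tag is this sketch's naming, not a dbt built-in:

```python
# Step 3 sketch: run from inside the generated dbt project directory.
from dbt.cli.main import dbtRunner, dbtRunnerResult

runner = dbtRunner()

# "build" runs the raw vault + point-in-time models and the generated dbt tests in one go.
result: dbtRunnerResult = runner.invoke(["build", "--select", "tag:raw_vault"])

if not result.success:
    raise SystemExit("dbt build failed - fix or feed back into step 2 before rerunning")
```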
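Step 6's output is basic Streamlit dashboards. A minimal example of the kind of thing generated (table and column names invented; fetch_pandas_all needs the pandas extra of the Snowflake connector):

```python
# Minimal Streamlit dashboard sketch reading one metric table from Snowflake.
import os

import snowflake.connector
import streamlit as st

st.title("Revenue by channel")

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    database="ANALYTICS",
)

# Snowflake returns column names in uppercase by default.
df = conn.cursor().execute(
    "SELECT order_date, channel, revenue FROM mart_revenue_daily ORDER BY order_date"
).fetch_pandas_all()

st.line_chart(df, x="ORDER_DATE", y="REVENUE", color="CHANNEL")
```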
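And step 7's "test against expected results", sketched as a simple metric diff against figures already signed off (file names and the 0.5% tolerance are invented):

```python
# Step 7 sketch: compare headline metrics from the rebuilt warehouse
# against expected figures exported after the full refresh.
import pandas as pd

expected = pd.read_csv("expected_metrics.csv")   # columns: metric, period, value
actual = pd.read_csv("warehouse_metrics.csv")    # same shape, from the new build

merged = expected.merge(actual, on=["metric", "period"], suffixes=("_expected", "_actual"))
merged["diff_pct"] = (
    (merged["value_actual"] - merged["value_expected"]).abs()
    / merged["value_expected"].abs()
)

failures = merged[merged["diff_pct"] > 0.005]  # tolerance agreed per metric in practice
if not failures.empty:
    print(failures.to_string(index=False))
    raise SystemExit("Full refresh does not match expected results")
print("All metrics within tolerance")
```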