dbt Integration

Your dbt run succeeded.
But the output has already drifted.

A source table changed upstream. Your model compiled fine. But the output has a new null pattern, a type shift, a missing column. Catch drift in transformed data before it reaches downstream consumers.

Free tier · 500K rows/month · no credit card required

Baselines your model schema on first run — every subsequent run is compared automatically for drift.

Common dbt data failures

  • dbt run succeeds but model output has drifted from baseline
  • A source table changed — model compiles fine but produces wrong results
  • Null rate crept up 2% per week — nobody noticed for a month
  • A field type changed upstream — downstream aggregations are silently wrong
  • dbt tests only catch what you wrote rules for — drift you didn't anticipate goes undetected
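The null-rate example above can be made concrete. Here is a minimal sketch in plain Python (no SDK) of why a fixed-threshold test misses gradual creep while a baseline comparison catches it; the 20% rule and the 1.5-point delta limit are illustrative assumptions, not the product's actual checks:

```python
# Weekly null rates for one column, creeping up ~2 points per week.
weekly_null_rates = [0.03, 0.05, 0.07, 0.09, 0.11]

FIXED_THRESHOLD = 0.20        # a typical hand-written dbt-style rule
BASELINE_DELTA_LIMIT = 0.015  # drift check: flag any >1.5-point jump vs. the prior run

# The fixed rule compares each run against an absolute limit.
fixed_alerts = [r for r in weekly_null_rates if r > FIXED_THRESHOLD]

# The drift check compares each run against the previous run's baseline.
drift_alerts = [
    (prev, curr)
    for prev, curr in zip(weekly_null_rates, weekly_null_rates[1:])
    if curr - prev > BASELINE_DELTA_LIMIT
]

print(fixed_alerts)       # []  (the fixed rule never fires)
print(len(drift_alerts))  # 4   (every week-over-week jump is flagged)
```

A month of steady degradation passes every hand-written rule; a baseline comparison flags it on the first jump.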
The problem
Before vs after
Before

Drift in transformed data goes undetected

A source table changes upstream. Your dbt model transforms it successfully — no compilation errors. But the output has a new null pattern, a type change, or a column that disappeared. dbt tests only catch what you wrote rules for.

dbt run ✓ → model output → downstream consumers
[ type changed in output — no alert ]
[ ML model training on bad features ]
After

Drifted output never reaches downstream consumers

DataScreenIQ screens the model output right after dbt run. A BLOCK fails the step instantly, so downstream jobs and consumers never see the drifted data, and the issue is alerted for review.

dbt run ✓ → screen output → PASS → downstream ✓
                       → BLOCK → alert + stop
Quick start

Add this after dbt run

python
import datascreeniq as dsiq
import pandas as pd

# conn is your existing warehouse connection (a DB-API connection or SQLAlchemy engine)
df = pd.read_sql("SELECT * FROM analytics.fct_orders LIMIT 50000", conn)
report = dsiq.Client().screen_dataframe(df, source="fct_orders")
report.raise_on_block()  # raises if data has drifted critically
Run this after every dbt run. Your model output is now screened before downstream consumers see it.
Setup guide
Get running in minutes

Install the SDK, drop in the integration, get PASS / WARN / BLOCK on every run.

01

Install the SDK

Add datascreeniq to your dbt project's Python environment.

02

Set your API key

Export DATASCREENIQ_API_KEY in your environment or add it to your secrets manager.
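If you load the key in Python, failing fast when it is missing gives a clearer error than an auth failure mid-pipeline. A small sketch; the helper name is ours, not part of the SDK:

```python
import os

def require_api_key() -> str:
    # Fail fast with a clear message instead of an auth error later in the run.
    key = os.environ.get("DATASCREENIQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DATASCREENIQ_API_KEY is not set; export it or add it to your secrets manager."
        )
    return key
```

Call this once at the top of your post-run script so misconfigured CI fails immediately.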

03

Add the post-run hook script

Create a Python script that reads model output from your warehouse and screens it. Run it after dbt run in your CI or orchestration step.

04

Set per-model thresholds

Use source="model_name" to track baselines per model. Set tighter thresholds for critical models like fct_revenue or dim_customers.
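One simple way to keep tighter limits for critical models is a small per-model config consulted before calling the SDK. A sketch under our own assumptions: the threshold key below is illustrative, not an SDK parameter.

```python
# Shared default for ordinary models; tighter overrides for critical ones.
DEFAULT_THRESHOLDS = {"max_null_rate_jump": 0.05}
PER_MODEL_THRESHOLDS = {
    "fct_revenue":   {"max_null_rate_jump": 0.01},  # critical: tightest limit
    "dim_customers": {"max_null_rate_jump": 0.02},
}

def thresholds_for(source: str) -> dict:
    # Merge: per-model values win, everything else falls back to the default.
    return {**DEFAULT_THRESHOLDS, **PER_MODEL_THRESHOLDS.get(source, {})}

print(thresholds_for("fct_revenue"))  # {'max_null_rate_jump': 0.01}
```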

scripts/dbt_quality_check.py
import datascreeniq as dsiq
import pandas as pd
from datascreeniq.exceptions import DataQualityError

# Models to screen after dbt run
MODELS = [
    ("analytics.fct_orders", "fct_orders"),
    ("analytics.fct_revenue", "fct_revenue"),
    ("analytics.dim_customers", "dim_customers"),
]

client = dsiq.Client()

for table, source in MODELS:
    # conn is your warehouse connection; alert_team is your own notification hook
    df = pd.read_sql(f"SELECT * FROM {table} LIMIT 50000", conn)
    try:
        report = client.screen_dataframe(df, source=source)
        print(f"{source}: {report.summary()}")
        report.raise_on_block()
    except DataQualityError as e:
        alert_team(f"dbt model {source} failed quality gate: {e}")
        raise  # re-raise to fail the CI step
Makefile / CI step
dbt run --select +fct_orders
python scripts/dbt_quality_check.py  # runs after dbt
Drift detection: After the first run, DataScreenIQ baselines each model's schema. Subsequent runs compare against that baseline — catching field additions, removals, type changes, and null rate spikes automatically.
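Conceptually, the schema side of that comparison looks like this minimal sketch (plain Python, not the SDK's internals): store the first run's column-to-type mapping as the baseline, then diff every later snapshot against it. Null-rate baselining works the same way, with numbers instead of type names.

```python
def diff_schema(baseline: dict, current: dict) -> dict:
    """Compare a current schema snapshot against a baseline.

    Schemas are {column_name: type_name} mappings; returns the three
    drift classes described above: added, removed, and retyped columns.
    """
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "type_changed": sorted(
            col for col in set(baseline) & set(current)
            if baseline[col] != current[col]
        ),
    }

baseline = {"order_id": "int", "amount": "float", "status": "str"}
current  = {"order_id": "int", "amount": "str", "channel": "str"}

print(diff_schema(baseline, current))
# {'added': ['channel'], 'removed': ['status'], 'type_changed': ['amount']}
```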
How it works
Every batch returns a verdict

DataScreenIQ runs 18 quality checks in a single pass — null rates, type mismatches, schema drift, outliers, duplicate rates, and more. The result is one of three verdicts.

PASS

Data is clean

All checks within thresholds. Pipeline proceeds to load. No action needed.

WARN

Issues detected

Quality degraded but above BLOCK threshold. Load proceeds, issue flagged for review.

BLOCK

Pipeline stopped

Critical quality issue detected. Row load prevented. Dead-letter queue or alert triggered.
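The three verdicts above map naturally onto a two-threshold rule. A toy sketch of that decision logic; the severity score and the 0.3/0.7 cutoffs are our illustration, not the service's actual internals:

```python
def verdict(worst_severity: float, warn_at: float = 0.3, block_at: float = 0.7) -> str:
    """Map the worst check severity (0.0 = clean, 1.0 = critical) to a verdict."""
    if worst_severity >= block_at:
        return "BLOCK"  # critical issue: stop the pipeline, alert
    if worst_severity >= warn_at:
        return "WARN"   # degraded: load proceeds, flagged for review
    return "PASS"       # all checks within thresholds

print(verdict(0.1), verdict(0.5), verdict(0.9))  # PASS WARN BLOCK
```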

More integrations
Works with your whole stack

DataScreenIQ drops into any pipeline that can make an HTTP call.

Start screening in minutes

Free tier: 500K rows/month. No credit card. API key in 30 seconds.

Get a free API key →