Prefect Integration

Your Prefect flow succeeded.
But your data is already broken.

A null spike, a type mismatch, a schema change — your flow loads it anyway. Debugging takes hours. Block bad data before it loads.

Free tier · 500K rows/month · no credit card required

Catches schema drift, null spikes, and type mismatches in milliseconds — before your load task runs.

Common Prefect data failures

  • Prefect flow succeeds but downstream data is wrong
  • Null values suddenly spike in production — no alert fires
  • Schema drift breaks dashboards after a successful run
  • Downstream flow fails mysteriously — root cause was at ingest hours earlier
  • dbt tests catch the issue after data is already loaded

The problem

Before vs after

Before

Pipeline fails after damage is done

Your Prefect flow extracts, transforms, and loads successfully. But a null spike crept in upstream. The warehouse has bad rows. Your downstream flow fails mysteriously. Root cause takes hours.

extract → transform → load → ✓ (bad rows loaded)
[ downstream flow fails with no clear cause ]
[ 3 hours of debugging ]

After

Bad data never reaches your warehouse

DataScreenIQ screens the payload before storage. A BLOCK stops the pipeline instantly — bad rows go to a dead-letter queue, not your database.

extract → transform → screen → PASS → load ✓
                            → BLOCK → clear failure + report

Quick start

Add this to your Prefect flow

import datascreeniq as dsiq

client = dsiq.Client()  # reads DATASCREENIQ_API_KEY from your environment

report = client.screen(rows, source="orders")
if report.is_blocked:
    raise ValueError(report.summary())  # flow fails fast with a clear report

That's it. Your pipeline now fails fast on bad data instead of corrupting your warehouse.

Setup guide
Get running in minutes

Install the SDK, drop in the integration, get PASS / WARN / BLOCK on every run.

01

Install the SDK

Add datascreeniq to your Prefect deployment's dependencies.

02

Store your API key

Add DATASCREENIQ_API_KEY as a Prefect Secret Block or environment variable.

03

Add the quality gate task

Insert a @task between your extract/transform and load tasks. The screen call returns a report — raise on BLOCK to fail the flow cleanly.

04

Configure alerting

Use Prefect automations or the report summary to route alerts to Slack, PagerDuty, or email when quality degrades.
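When a BLOCK fails the flow, the report summary is what you want in the alert body. A minimal sketch of that routing, assuming a Slack incoming webhook; the helper names and webhook URL here are illustrative, not part of the DataScreenIQ SDK:

```python
import json
import urllib.request

def format_block_alert(flow_name: str, summary: str) -> dict:
    # Build a Slack incoming-webhook payload from the report summary.
    return {"text": f"Quality gate BLOCKED in {flow_name}: {summary}"}

def send_alert(webhook_url: str, payload: dict) -> None:
    # POST the payload to a Slack incoming webhook (hypothetical URL).
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

payload = format_block_alert("orders-pipeline", "null rate spike in column email")
```

In practice a Prefect automation on flow-run failure can do this routing for you; the sketch just shows the shape of the message you'd send.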

prefect_flow.py

from prefect import flow, task

import datascreeniq as dsiq


@task(retries=0)  # no retries on quality failures
def screen_data(rows: list, source: str) -> list:
    client = dsiq.Client()
    report = client.screen(rows, source=source)
    if report.is_warn:
        print(f"⚠ Quality warning for {source}: {report.summary()}")
        # continue but flag — adjust threshold in dashboard to BLOCK if needed
    if report.is_blocked:
        raise ValueError(
            f"Quality gate BLOCKED {source}: {report.summary()} "
            f"Issues: {report.issues}"
        )
    return rows


@task
def extract():
    return fetch_rows_from_api()


@task
def load(rows):
    insert_to_warehouse(rows)


@flow(name="orders-pipeline")
def orders_pipeline():
    rows = extract()
    clean = screen_data(rows, source="orders")  # fails fast on BLOCK
    load(clean)


if __name__ == "__main__":
    orders_pipeline()

Prefect automations: Pair this with a Prefect automation on flow run failures to route BLOCK events to Slack or PagerDuty. The report.summary() message gives you the failure reason directly in the alert.

How it works
Every batch returns a verdict

DataScreenIQ runs 18 quality checks in a single pass — null rates, type mismatches, schema drift, outliers, duplicate rates, and more. The result is one of three verdicts.

PASS

Data is clean

All checks within thresholds. Pipeline proceeds to load. No action needed.

WARN

Issues detected

Quality degraded, but not past the BLOCK threshold. Load proceeds; the issue is flagged for review.

BLOCK

Pipeline stopped

Critical quality issue detected. Row load prevented. Dead-letter queue or alert triggered.
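In a screening task, the three verdicts map to three branches: raise on BLOCK, flag on WARN, pass through on PASS. A minimal sketch; the Report class below is a stand-in for illustration, since the real report object comes from client.screen():

```python
from dataclasses import dataclass, field

@dataclass
class Report:
    # Stub standing in for the SDK's screen() response, for illustration only.
    verdict: str                      # "PASS", "WARN", or "BLOCK"
    issues: list = field(default_factory=list)

    @property
    def is_warn(self) -> bool:
        return self.verdict == "WARN"

    @property
    def is_blocked(self) -> bool:
        return self.verdict == "BLOCK"

    def summary(self) -> str:
        return f"{self.verdict}: {len(self.issues)} issue(s)"

def handle(report: Report, rows: list) -> list:
    """Return rows to load, or raise to stop the flow before the load task."""
    if report.is_blocked:
        # BLOCK: critical issue; stop before load (rows go to a dead-letter queue)
        raise ValueError(f"Quality gate blocked load: {report.summary()}")
    if report.is_warn:
        # WARN: degraded but loadable; proceed and flag for review
        print(f"Quality warning: {report.summary()}")
    return rows
```

Raising inside the task is what makes the flow fail cleanly in Prefect: the exception message carries the verdict and issue count straight into the flow-run logs.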

More integrations
Works with your whole stack

DataScreenIQ drops into any pipeline that can make an HTTP call.

Start screening in minutes

Free tier: 500K rows/month. No credit card. API key in 30 seconds.

Get a free API key →