Prefect Integration

Your Prefect flow succeeded.
But your data is already broken.

A null spike, a type mismatch, a schema change — your flow loads it anyway. Debugging takes hours. Block bad data before it loads.

Free tier · 500K rows/month · no credit card required

Catches schema drift, null spikes, and type mismatches in milliseconds — before your load task runs.

Common Prefect data failures

  • Prefect flow succeeds but downstream data is wrong
  • Null values suddenly spike in production — no alert fires
  • Schema drift breaks dashboards after a successful run
  • Downstream flow fails mysteriously — root cause was at ingest hours earlier
  • dbt tests catch the issue after data is already loaded

The problem

Before vs after

Before

Pipeline fails after damage is done

Your Prefect flow extracts, transforms, and loads successfully. But a null spike crept in upstream. The warehouse has bad rows. Your downstream flow fails mysteriously. Root cause takes hours.

extract → transform → load → ✓ (bad rows loaded)
[ downstream flow fails with no clear cause ]
[ 3 hours of debugging ]

After

Bad data never reaches your warehouse

DataScreenIQ screens the payload before storage. A BLOCK stops the pipeline instantly — bad rows go to a dead-letter queue, not your database.

extract → transform → screen → PASS → load ✓
                            → BLOCK → clear failure + report

Quick start

Add this to your Prefect flow

import datascreeniq as dsiq

client = dsiq.Client()  # reads DATASCREENIQ_API_KEY from your environment

report = client.screen(rows, source="orders")
if report.is_blocked:
    raise ValueError(report.summary())  # flow fails fast with a clear report

That's it. Your pipeline now fails fast on bad data instead of corrupting your warehouse.

Setup guide
Get running in minutes

Install the SDK, drop in the integration, get PASS / WARN / BLOCK on every run.

01

Install the SDK

Add datascreeniq to your Prefect deployment's dependencies.

02

Store your API key

Add DATASCREENIQ_API_KEY as a Prefect Secret Block or environment variable.

03

Add the quality gate task

Insert a @task between your extract/transform and load tasks. The screen call returns a report — raise on BLOCK to fail the flow cleanly.

04

Configure alerting

Use Prefect automations or the report summary to route alerts to Slack, PagerDuty, or email when quality degrades.
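When a BLOCK fails the flow, the report summary is what you want in the alert body. A minimal sketch of that routing, assuming a Slack incoming webhook; the helper names and webhook URL here are illustrative, not part of the DataScreenIQ SDK:

```python
import json
import urllib.request

def format_block_alert(flow_name: str, summary: str) -> dict:
    # Build a Slack incoming-webhook payload from the report summary.
    return {"text": f"Quality gate BLOCKED in {flow_name}: {summary}"}

def send_alert(webhook_url: str, payload: dict) -> None:
    # POST the payload to a Slack incoming webhook (hypothetical URL).
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

payload = format_block_alert("orders-pipeline", "null rate spike in column email")
```

In practice a Prefect automation on flow-run failure can do this routing for you; the sketch just shows the shape of the message you'd send.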

prefect_flow.py

from prefect import flow, task

import datascreeniq as dsiq


@task(retries=0)  # no retries on quality failures
def screen_data(rows: list, source: str) -> list:
    client = dsiq.Client()
    report = client.screen(rows, source=source)
    if report.is_warn:
        print(f"⚠ Quality warning for {source}: {report.summary()}")
        # continue but flag — adjust threshold in dashboard to BLOCK if needed
    if report.is_blocked:
        raise ValueError(
            f"Quality gate BLOCKED {source}: {report.summary()} "
            f"Issues: {report.issues}"
        )
    return rows


@task
def extract():
    return fetch_rows_from_api()


@task
def load(rows):
    insert_to_warehouse(rows)


@flow(name="orders-pipeline")
def orders_pipeline():
    rows = extract()
    clean = screen_data(rows, source="orders")  # fails fast on BLOCK
    load(clean)


if __name__ == "__main__":
    orders_pipeline()

Prefect automations: Pair this with a Prefect automation on flow run failures to route BLOCK events to Slack or PagerDuty. The report.summary() message gives you the failure reason directly in the alert.

How it works
Every batch returns a verdict

DataScreenIQ runs 18 quality checks in a single pass — null rates, type mismatches, schema drift, outliers, duplicate rates, and more. The result is one of three verdicts.

PASS

Data is clean

All checks within thresholds. Pipeline proceeds to load. No action needed.

WARN

Issues detected

Quality degraded, but not past the BLOCK threshold. Load proceeds; the issue is flagged for review.

BLOCK

Pipeline stopped

Critical quality issue detected. Row load prevented. Dead-letter queue or alert triggered.
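In a screening task, the three verdicts map to three branches: raise on BLOCK, flag on WARN, pass through on PASS. A minimal sketch; the Report class below is a stand-in for illustration, since the real report object comes from client.screen():

```python
from dataclasses import dataclass, field

@dataclass
class Report:
    # Stub standing in for the SDK's screen() response, for illustration only.
    verdict: str                      # "PASS", "WARN", or "BLOCK"
    issues: list = field(default_factory=list)

    @property
    def is_warn(self) -> bool:
        return self.verdict == "WARN"

    @property
    def is_blocked(self) -> bool:
        return self.verdict == "BLOCK"

    def summary(self) -> str:
        return f"{self.verdict}: {len(self.issues)} issue(s)"

def handle(report: Report, rows: list) -> list:
    """Return rows to load, or raise to stop the flow before the load task."""
    if report.is_blocked:
        # BLOCK: critical issue; stop before load (rows go to a dead-letter queue)
        raise ValueError(f"Quality gate blocked load: {report.summary()}")
    if report.is_warn:
        # WARN: degraded but loadable; proceed and flag for review
        print(f"Quality warning: {report.summary()}")
    return rows
```

Raising inside the task is what makes the flow fail cleanly in Prefect: the exception message carries the verdict and issue count straight into the flow-run logs.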

More integrations
Works with your whole stack

DataScreenIQ drops into any pipeline that can make an HTTP call.

Start screening in minutes

Free tier: 500K rows/month. No credit card. API key in 30 seconds.

Get a free API key →