Google Colab

Your notebook ran without errors.
But your DataFrame was broken from the start.

A type mismatch on row 847. A null spike in a key column. Your model trains on garbage. Screen your data before you process it.

Free tier · 500K rows/month · no credit card required

No API key needed to start. Works with any pandas DataFrame. Result in under 10ms.

Common notebook data failures

  • model.fit() fails 40 cells in — the CSV had a type mismatch from the start
  • Null values in a key column silently corrupt feature engineering
  • A DataFrame loaded from an API has schema drift nobody caught
  • Training results are wrong — the data was bad, not the model
  • KeyError three cells in — a column was renamed upstream
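The first failure mode above is easy to reproduce: when a numeric CSV column contains one stray string, pandas silently loads the whole column as object dtype, and nothing fails until much later. A minimal pandas sketch (the column names and stray value are made up):

```python
import io

import pandas as pd

# A numeric column with one stray string on the last row
csv = io.StringIO("user_id,amount\n1,19.99\n2,42.50\n3,unknown\n")
df = pd.read_csv(csv)

# pandas falls back to object dtype silently: no error at load time
print(df["amount"].dtype)  # object, not float64

# The failure only surfaces later, e.g. when a model needs floats
try:
    df["amount"].astype(float)
except ValueError as e:
    print("type mismatch surfaces late:", e)
```

Loading succeeds, transforming mostly succeeds, and the ValueError only appears at the cell that finally forces a numeric conversion.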
The problem
Before vs after
Before

Quality issues discovered late in the notebook

You load a CSV into a DataFrame. You run transformations. Three cells later you get a KeyError, or a model that refuses to train, or a chart that makes no sense. The data was bad from the start.

load CSV → transform → model.fit() → ❌
[ type error on row 1,847 ]
[ scroll back through 40 cells to find root cause ]
After

Bad data never reaches your model

DataScreenIQ screens the DataFrame the moment you load it. A BLOCK stops the notebook instantly, so you fix the data up front instead of tracing a failure back through 40 cells.

load CSV → screen(df) → PASS → transform → model.fit() ✓
                        → WARN → flag + continue
                        → BLOCK → fix before going further
Quick start

Add this after loading your data

python
import datascreeniq as dsiq

client = dsiq.DemoClient()  # no API key needed to start
report = client.screen_dataframe(df, source="my_dataset")
print(report.summary())
# 🚨 BLOCK | Health: 34.0% | Type mismatch: amount | Null rate: email=67% | (7ms)
Two lines. You now know exactly what's wrong before you spend an hour debugging downstream.
Setup guide
Get running in minutes

Install the SDK, drop in the integration, get PASS / WARN / BLOCK on every run.

01

Install in the first cell

One line: !pip install datascreeniq. No dependencies beyond requests, which Colab already has.

02

Use DemoClient — no API key needed

dsiq.DemoClient() lets you try the screening engine immediately. To screen your own data against a persistent baseline, get a free API key.

03

Screen your DataFrame

Call client.screen_dataframe(df, source="my_data"). The result is a ScreenReport with status, health score, and a full issues breakdown.

04

Check the verdict before proceeding

Use report.raise_on_block() to stop the notebook on critical quality failures, or inspect report.null_rates and report.type_mismatches directly.
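The guard pattern in step 04 can be sketched in plain Python. This is an illustration of the behavior described above (BLOCK raises, WARN and PASS continue), not the SDK's actual implementation; the fields shown are limited to those documented on this page.

```python
# Illustrative sketch of the raise-on-block guard, not SDK source.
class DataQualityError(Exception):
    """Raised when screening returns a BLOCK verdict."""

class ScreenReport:
    def __init__(self, status: str, health_pct: str):
        self.status = status          # "PASS" | "WARN" | "BLOCK"
        self.health_pct = health_pct  # e.g. "34.0%"

    def raise_on_block(self) -> None:
        # Halt the notebook only on critical failures; WARN still proceeds
        if self.status == "BLOCK":
            raise DataQualityError(f"data blocked (health {self.health_pct})")

report = ScreenReport("WARN", "81.0%")
report.raise_on_block()  # WARN: no exception, notebook continues
```

Putting the guard right after the screen call means a critical failure stops execution at cell 2, not cell 40.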

Cell 1 — install
!pip install datascreeniq -q
Cell 2 — screen with DemoClient (no API key)
import datascreeniq as dsiq
import pandas as pd

# DemoClient — try instantly, no API key required
client = dsiq.DemoClient()

# Load your data
df = pd.read_csv("your_data.csv")

# Screen it
report = client.screen_dataframe(df, source="my_dataset")
print(report.summary())
# 🚨 BLOCK | Health: 34.0% | Rows: 1,240 | Type mismatch: amount | (7ms)
Cell 3 — inspect the results
# What's wrong?
print("Status:", report.status)                # PASS | WARN | BLOCK
print("Health:", report.health_pct)            # "34.0%"
print("Type issues:", report.type_mismatches)  # ["amount", "price"]
print("Null rates:", report.null_rates)        # {"email": 0.67}
print("Drift events:", report.drift_count)     # 2

# Stop the notebook if data is critically bad
report.raise_on_block()  # raises DataQualityError if BLOCK
Cell 4 — use your own API key for persistent baselines
import os

os.environ["DATASCREENIQ_API_KEY"] = "dsiq_live_..."  # or use Colab Secrets

client = dsiq.Client()  # now uses your account + persists baselines
report = client.screen_dataframe(df, source="my_dataset")
Colab Secrets: Store your API key in Colab's built-in secrets manager (🔑 icon in the left sidebar) rather than hardcoding it. Access it with from google.colab import userdata; key = userdata.get("DATASCREENIQ_API_KEY").
How it works
Every batch returns a verdict

DataScreenIQ runs 18 quality checks in a single pass — null rates, type mismatches, schema drift, outliers, duplicate rates, and more. The result is one of three verdicts.
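Two of those checks, null rates and type mismatches, can be approximated in plain pandas to see what the engine is looking for. This is a minimal sketch, not the DataScreenIQ implementation; the sample columns and the "numeric-looking object column" heuristic are illustrative.

```python
import io

import pandas as pd

# Toy data: email has missing values, amount has a stray string
csv = io.StringIO("email,amount\na@x.com,10.5\n,oops\n,3.2\n")
df = pd.read_csv(csv)

# Null rate per column (fraction of missing values)
null_rates = df.isna().mean().to_dict()

# Object-dtype columns that look like they should be numeric
type_mismatches = [
    c for c in df.columns
    if df[c].dtype == object
    and pd.to_numeric(df[c], errors="coerce").notna().any()
]

print(null_rates)       # email ≈ 0.67, amount 0.0
print(type_mismatches)  # ['amount']
```

A screening engine runs many such checks in one pass and folds the results into a single health score and verdict.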

PASS

Data is clean

All checks within thresholds. Pipeline proceeds to load. No action needed.

WARN

Issues detected

Quality degraded but above BLOCK threshold. Load proceeds, issue flagged for review.

BLOCK

Pipeline stopped

Critical quality issue detected. Row load prevented. Dead-letter queue or alert triggered.
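The three-verdict policy above can be sketched as a threshold function over the health score. The cutoffs below (90% and 50%) are hypothetical, chosen only to illustrate the PASS / WARN / BLOCK split; the actual DataScreenIQ thresholds are not documented here.

```python
def verdict(health_pct: float,
            warn_below: float = 90.0,   # hypothetical threshold
            block_below: float = 50.0   # hypothetical threshold
            ) -> str:
    """Map a 0-100 health score to PASS / WARN / BLOCK."""
    if health_pct < block_below:
        return "BLOCK"  # critical: stop before load
    if health_pct < warn_below:
        return "WARN"   # degraded: proceed, but flag for review
    return "PASS"       # clean: no action needed

print(verdict(97.5))  # PASS
print(verdict(81.0))  # WARN
print(verdict(34.0))  # BLOCK
```

Exposing a single score plus verdict keeps the calling code to one branch, rather than 18 separate checks to interpret per run.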

More integrations
Works with your whole stack

DataScreenIQ drops into any pipeline that can make an HTTP call.

Start screening in minutes

Free tier: 500K rows/month. No credit card. API key in 30 seconds.

Get a free API key →