Google Colab

Your notebook ran without errors.
But your DataFrame was broken from the start.

A type mismatch on row 847. A null spike in a key column. Your model trains on garbage. Screen your data before you process it.

Free tier · 500K rows/month · no credit card required

No API key needed to start. Works with any pandas DataFrame. Result in under 10ms.

Common notebook data failures

  • model.fit() fails 40 cells in — the CSV had a type mismatch from the start
  • Null values in a key column silently corrupt feature engineering
  • A DataFrame loaded from an API has schema drift nobody caught
  • Training results are wrong — the data was bad, not the model
  • KeyError three cells in — a column was renamed upstream
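The first failure mode above is easy to reproduce: when a numeric CSV column contains one stray string, pandas silently loads the whole column as object dtype, and nothing fails until much later. A minimal pandas sketch (the column names and stray value are made up):

```python
import io

import pandas as pd

# A numeric column with one stray string on the last row
csv = io.StringIO("user_id,amount\n1,19.99\n2,42.50\n3,unknown\n")
df = pd.read_csv(csv)

# pandas falls back to object dtype silently: no error at load time
print(df["amount"].dtype)  # object, not float64

# The failure only surfaces later, e.g. when a model needs floats
try:
    df["amount"].astype(float)
except ValueError as e:
    print("type mismatch surfaces late:", e)
```

Loading succeeds, transforming mostly succeeds, and the ValueError only appears at the cell that finally forces a numeric conversion.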
The problem
Before vs after
Before

Quality issues discovered late in the notebook

You load a CSV into a DataFrame. You run transformations. Three cells later you get a KeyError, or a model that refuses to train, or a chart that makes no sense. The data was bad from the start.

load CSV → transform → model.fit() → ❌
[ type error on row 1,847 ]
[ scroll back through 40 cells to find root cause ]
After

Bad data never reaches your model

DataScreenIQ screens the DataFrame the moment you load it. A BLOCK stops the notebook instantly, so you fix the data up front instead of tracing a failure back through 40 cells.

load CSV → screen(df) → PASS → transform → model.fit() ✓
                        → WARN → flag + continue
                        → BLOCK → fix before going further
Quick start

Add this after loading your data

python
import datascreeniq as dsiq

client = dsiq.DemoClient()  # no API key needed to start
report = client.screen_dataframe(df, source="my_dataset")
print(report.summary())
# 🚨 BLOCK | Health: 34.0% | Type mismatch: amount | Null rate: email=67% | (7ms)
Two lines. You now know exactly what's wrong before you spend an hour debugging downstream.
Setup guide
Get running in minutes

Install the SDK, drop in the integration, get PASS / WARN / BLOCK on every run.

01

Install in the first cell

One line: !pip install datascreeniq. No dependencies beyond requests, which Colab already has.

02

Use DemoClient — no API key needed

dsiq.DemoClient() lets you try the screening engine immediately. To screen your own data against a persistent baseline, get a free API key.

03

Screen your DataFrame

Call client.screen_dataframe(df, source="my_data"). The result is a ScreenReport with status, health score, and a full issues breakdown.

04

Check the verdict before proceeding

Use report.raise_on_block() to stop the notebook on critical quality failures, or inspect report.null_rates and report.type_mismatches directly.
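The guard pattern in step 04 can be sketched in plain Python. This is an illustration of the behavior described above (BLOCK raises, WARN and PASS continue), not the SDK's actual implementation; the fields shown are limited to those documented on this page.

```python
# Illustrative sketch of the raise-on-block guard, not SDK source.
class DataQualityError(Exception):
    """Raised when screening returns a BLOCK verdict."""

class ScreenReport:
    def __init__(self, status: str, health_pct: str):
        self.status = status          # "PASS" | "WARN" | "BLOCK"
        self.health_pct = health_pct  # e.g. "34.0%"

    def raise_on_block(self) -> None:
        # Halt the notebook only on critical failures; WARN still proceeds
        if self.status == "BLOCK":
            raise DataQualityError(f"data blocked (health {self.health_pct})")

report = ScreenReport("WARN", "81.0%")
report.raise_on_block()  # WARN: no exception, notebook continues
```

Putting the guard right after the screen call means a critical failure stops execution at cell 2, not cell 40.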

Cell 1 — install
!pip install datascreeniq -q
Cell 2 — screen with DemoClient (no API key)
import datascreeniq as dsiq
import pandas as pd

# DemoClient — try instantly, no API key required
client = dsiq.DemoClient()

# Load your data
df = pd.read_csv("your_data.csv")

# Screen it
report = client.screen_dataframe(df, source="my_dataset")
print(report.summary())
# 🚨 BLOCK | Health: 34.0% | Rows: 1,240 | Type mismatch: amount | (7ms)
Cell 3 — inspect the results
# What's wrong?
print("Status:", report.status)                # PASS | WARN | BLOCK
print("Health:", report.health_pct)            # "34.0%"
print("Type issues:", report.type_mismatches)  # ["amount", "price"]
print("Null rates:", report.null_rates)        # {"email": 0.67}
print("Drift events:", report.drift_count)     # 2

# Stop the notebook if data is critically bad
report.raise_on_block()  # raises DataQualityError if BLOCK
Cell 4 — use your own API key for persistent baselines
import os

os.environ["DATASCREENIQ_API_KEY"] = "dsiq_live_..."  # or use Colab Secrets

client = dsiq.Client()  # now uses your account + persists baselines
report = client.screen_dataframe(df, source="my_dataset")
Colab Secrets: Store your API key in Colab's built-in secrets manager (🔑 icon in the left sidebar) rather than hardcoding it. Access it with from google.colab import userdata; key = userdata.get("DATASCREENIQ_API_KEY").
How it works
Every batch returns a verdict

DataScreenIQ runs 18 quality checks in a single pass — null rates, type mismatches, schema drift, outliers, duplicate rates, and more. The result is one of three verdicts.
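Two of those checks, null rates and type mismatches, can be approximated in plain pandas to see what the engine is looking for. This is a minimal sketch, not the DataScreenIQ implementation; the sample columns and the "numeric-looking object column" heuristic are illustrative.

```python
import io

import pandas as pd

# Toy data: email has missing values, amount has a stray string
csv = io.StringIO("email,amount\na@x.com,10.5\n,oops\n,3.2\n")
df = pd.read_csv(csv)

# Null rate per column (fraction of missing values)
null_rates = df.isna().mean().to_dict()

# Object-dtype columns that look like they should be numeric
type_mismatches = [
    c for c in df.columns
    if df[c].dtype == object
    and pd.to_numeric(df[c], errors="coerce").notna().any()
]

print(null_rates)       # email ≈ 0.67, amount 0.0
print(type_mismatches)  # ['amount']
```

A screening engine runs many such checks in one pass and folds the results into a single health score and verdict.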

PASS

Data is clean

All checks within thresholds. Pipeline proceeds to load. No action needed.

WARN

Issues detected

Quality degraded but above BLOCK threshold. Load proceeds, issue flagged for review.

BLOCK

Pipeline stopped

Critical quality issue detected. Row load prevented. Dead-letter queue or alert triggered.
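The three-verdict policy above can be sketched as a threshold function over the health score. The cutoffs below (90% and 50%) are hypothetical, chosen only to illustrate the PASS / WARN / BLOCK split; the actual DataScreenIQ thresholds are not documented here.

```python
def verdict(health_pct: float,
            warn_below: float = 90.0,   # hypothetical threshold
            block_below: float = 50.0   # hypothetical threshold
            ) -> str:
    """Map a 0-100 health score to PASS / WARN / BLOCK."""
    if health_pct < block_below:
        return "BLOCK"  # critical: stop before load
    if health_pct < warn_below:
        return "WARN"   # degraded: proceed, but flag for review
    return "PASS"       # clean: no action needed

print(verdict(97.5))  # PASS
print(verdict(81.0))  # WARN
print(verdict(34.0))  # BLOCK
```

Exposing a single score plus verdict keeps the calling code to one branch, rather than 18 separate checks to interpret per run.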

More integrations
Works with your whole stack

DataScreenIQ drops into any pipeline that can make an HTTP call.

Start screening in minutes

Free tier: 500K rows/month. No credit card. API key in 30 seconds.

Get a free API key →