A type mismatch on row 847. A null spike in a key column. Your model trains on garbage. Screen your data before you process it.
No API key needed to start. Works with any pandas DataFrame. Results in under 10 ms.
You load a CSV into a DataFrame. You run transformations. Three cells later you get a KeyError, or a model that refuses to train, or a chart that makes no sense. The data was bad from the start.
DataScreenIQ screens the payload before storage. A BLOCK stops the pipeline instantly — bad rows go to a dead-letter queue, not your database.
import datascreeniq as dsiq
client = dsiq.DemoClient() # no API key needed to start
report = client.screen_dataframe(df, source="my_dataset")
print(report.summary())
# 🚨 BLOCK | Health: 34.0% | Type mismatch: amount | Null rate: email=67% | (7ms)
Install the SDK, drop in the integration, get PASS / WARN / BLOCK on every run.
One line: !pip install datascreeniq. No dependencies beyond requests, which Colab already has.
dsiq.DemoClient() lets you try the screening engine immediately. To screen your own data against a persistent baseline, get a free API key.
Call client.screen_dataframe(df, source="my_dataset"). The result is a ScreenReport with a status, a health score, and a full breakdown of issues.
Use report.raise_on_block() to stop the notebook on critical quality failures, or inspect report.null_rates and report.type_mismatches directly.
!pip install datascreeniq -q
import datascreeniq as dsiq
import pandas as pd
# DemoClient — try instantly, no API key required
client = dsiq.DemoClient()
# Load your data
df = pd.read_csv("your_data.csv")
# Screen it
report = client.screen_dataframe(df, source="my_dataset")
print(report.summary())
# 🚨 BLOCK | Health: 34.0% | Rows: 1,240 | Type mismatch: amount | (7ms)
# What's wrong?
print("Status:", report.status) # PASS | WARN | BLOCK
print("Health:", report.health_pct) # "34.0%"
print("Type issues:", report.type_mismatches) # ["amount", "price"]
print("Null rates:", report.null_rates) # {"email": 0.67}
print("Drift events:", report.drift_count) # 2
# Stop the notebook if data is critically bad
report.raise_on_block() # raises DataQualityError if BLOCK
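If you would rather route bad data than stop the notebook, the same report fields can drive a simple gate. The following is a minimal sketch, not part of the SDK: `StubReport`, `gate_load`, and the `max_null_rate` cutoff are hypothetical names introduced for illustration, and the stub exposes only the `status` and `null_rates` attributes shown above so the example runs without an API call.

```python
from dataclasses import dataclass, field


@dataclass
class StubReport:
    """Hypothetical stand-in exposing only `status` and `null_rates`."""
    status: str                       # "PASS" | "WARN" | "BLOCK"
    null_rates: dict = field(default_factory=dict)


def gate_load(report, rows, load, dead_letter, max_null_rate=0.5):
    """Send rows to `dead_letter` on BLOCK, otherwise to `load`.

    Returns the columns whose null rate exceeds `max_null_rate`
    (an illustrative cutoff, not a DataScreenIQ default).
    """
    noisy = sorted(
        col for col, rate in report.null_rates.items() if rate > max_null_rate
    )
    if report.status == "BLOCK":
        dead_letter(rows)   # quarantine instead of loading
    else:
        load(rows)          # PASS and WARN both proceed
    return noisy


# Two in-memory lists stand in for the database and the dead-letter queue.
loaded, quarantined = [], []
rows = [
    {"email": None, "amount": "12.5"},
    {"email": "a@example.com", "amount": 3},
]
report = StubReport(status="BLOCK", null_rates={"email": 0.67, "amount": 0.0})

noisy = gate_load(report, rows, loaded.extend, quarantined.extend)
print(noisy)             # ['email']
print(len(quarantined))  # 2
```

In a real notebook you would pass the ScreenReport returned by client.screen_dataframe in place of the stub, assuming it exposes the same two attributes as in the snippet above.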
import os
os.environ["DATASCREENIQ_API_KEY"] = "dsiq_live_..." # or use Colab Secrets
client = dsiq.Client() # now uses your account + persists baselines
report = client.screen_dataframe(df, source="my_dataset")
from google.colab import userdata
key = userdata.get("DATASCREENIQ_API_KEY")
DataScreenIQ runs 18 quality checks in a single pass: null rates, type mismatches, schema drift, outliers, duplicate rates, and more. The result is one of three verdicts.
PASS: All checks are within thresholds. The pipeline proceeds to load. No action needed.
WARN: Quality has degraded but remains above the BLOCK threshold. The load proceeds and the issue is flagged for review.
BLOCK: A critical quality issue was detected. The row load is prevented and a dead-letter queue or alert is triggered.
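To make the threshold idea concrete, here is a minimal sketch of one such check, the null rate, mapped to a verdict. It is an illustration under stated assumptions, not DataScreenIQ's implementation: the function names and the `warn_at` / `block_at` cutoffs are hypothetical, and the real engine combines many more checks.

```python
def null_rates(rows):
    """Fraction of missing (None) values per column in a list of dict rows."""
    if not rows:
        return {}
    columns = sorted({key for row in rows for key in row})
    total = len(rows)
    return {
        col: sum(1 for row in rows if row.get(col) is None) / total
        for col in columns
    }


def verdict(rates, warn_at=0.05, block_at=0.5):
    """Map the worst null rate to PASS / WARN / BLOCK (hypothetical cutoffs)."""
    worst = max(rates.values(), default=0.0)
    if worst >= block_at:
        return "BLOCK"
    if worst >= warn_at:
        return "WARN"
    return "PASS"


rows = [
    {"email": None, "amount": 10},
    {"email": None, "amount": 20},
    {"email": "a@example.com", "amount": None},
]
rates = null_rates(rows)
print(verdict(rates))  # BLOCK (email is missing in 2 of 3 rows)
```

Taking the worst column rather than an average keeps a single badly broken key column from being hidden by many healthy ones.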
DataScreenIQ drops into any pipeline that can make an HTTP call.
Quality gate between extract and load.
Block merges when data quality fails.
Catch schema drift in transformed data.
Quality gate flow with alerting on BLOCK.
Free tier: 500K rows/month. No credit card. API key in 30 seconds.
Get a free API key →