Data Quality Gates: How We Catch Bad Data Before It Hits the Dashboard
The dbt test suite and Great Expectations checks we run on every pipeline. Includes the actual YAML configs, custom tests, and the alerting setup that pages us when something breaks.
Last year, a retail client's daily revenue dashboard showed a 300% spike on a Tuesday. The CEO sent a celebratory Slack message. Then the data team realized a payment processor had started sending amounts in cents instead of dollars. By the time they caught it, the board had already seen the wrong number.
That engagement is why we now run data quality gates on every pipeline, in every environment, with zero exceptions.
Gate 1: Source Freshness
Before transforming anything, verify the source data actually arrived. Stale source data produces stale results that look correct — and that is worse than an outright failure.
Run dbt source freshness before every dbt build. If a source is stale beyond the error threshold, the pipeline stops. No silent propagation of old data.
# In the Airflow DAG or CI script
dbt source freshness --select source:raw_stripe
# Exit code 1 if any source exceeds error_after threshold
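The error_after threshold lives in the source definition itself. A sketch of the corresponding sources YAML, assuming a Stripe source loaded with a `_loaded_at` timestamp column (database, schema, table, and column names here are illustrative):

```yaml
# models/staging/stripe/_stripe__sources.yml -- illustrative names
version: 2
sources:
  - name: raw_stripe
    database: raw
    schema: stripe
    loaded_at_field: _loaded_at
    freshness:
      warn_after: {count: 6, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: charges
      - name: invoices
```

With `error_after` set, `dbt source freshness` exits non-zero when the newest `_loaded_at` is older than 24 hours, which is what stops the pipeline.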
Gate 2: Schema and Row-Level Tests
This is the core of our testing strategy. Every staging and mart model gets a set of dbt tests in its YAML definition.
The Minimum Test Suite
Every model gets a small set of tests at minimum. No exceptions, even for "temporary" models (those become permanent faster than anyone plans).
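A sketch of what that minimum can look like, using dbt's four built-in generic tests (unique, not_null, accepted_values, relationships); the model, column, and status values here are illustrative:

```yaml
# models/marts/_core__models.yml -- illustrative names
version: 2
models:
  - name: fct_orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: customer_id
        tests:
          - relationships:
              to: ref('dim_customers')
              field: customer_id
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'returned']
```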
Custom, Domain-Specific Tests
Generic tests catch generic problems. The bugs that actually hurt are domain-specific. Here are three custom tests from real engagements:
Test 1: Revenue should not swing more than 50% day-over-day.
This is the test that would have caught the cents-vs-dollars bug.
-- tests/assert_revenue_daily_swing.sql
-- Fails if daily revenue changes by more than 50% from the previous day
with daily_revenue as (
    select
        order_date,
        sum(net_revenue_usd) as revenue
    from {{ ref('fct_revenue') }}
    where order_date >= dateadd(day, -7, current_date())
    group by order_date
),

with_prev as (
    select
        order_date,
        revenue,
        lag(revenue) over (order by order_date) as prev_revenue
    from daily_revenue
),

pct as (
    select
        order_date,
        revenue,
        prev_revenue,
        case
            when prev_revenue = 0 then null
            else abs(revenue - prev_revenue) / prev_revenue
        end as pct_change
    from with_prev
)

-- Return rows that violate the threshold (dbt fails if any rows are returned)
select *
from pct
where pct_change > 0.50
    and prev_revenue > 100  -- Ignore low-volume days
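For intuition, the same threshold logic in plain Python, as a standalone sketch (the SQL test above is what actually runs in the pipeline):

```python
def daily_swings(revenues, threshold=0.50, min_prev=100):
    """Return (day_index, pct_change) pairs whose day-over-day swing
    exceeds the threshold, skipping low-volume previous days."""
    violations = []
    for i in range(1, len(revenues)):
        prev, cur = revenues[i - 1], revenues[i]
        if prev <= min_prev:  # mirrors the prev_revenue > 100 guard in the SQL
            continue
        pct_change = abs(cur - prev) / prev
        if pct_change > threshold:
            violations.append((i, pct_change))
    return violations

# A cents-vs-dollars bug turns 1,050 into 105,000: a 99x swing,
# flagged on the very first day it appears.
```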
Test 2: Every order should have a matching customer.
-- tests/assert_orders_have_customers.sql
select
    o.order_id,
    o.customer_id
from {{ ref('fct_orders') }} o
left join {{ ref('dim_customers') }} c
    on o.customer_id = c.customer_id
where c.customer_id is null
    and o.order_date >= dateadd(day, -30, current_date())
Test 3: Row counts should not drop unexpectedly.
-- tests/assert_staging_row_count.sql
-- stg_orders should never have materially fewer rows today than yesterday
with counts as (
    select
        _loaded_at::date as load_date,
        count(*) as row_count
    from {{ ref('stg_shopify__orders') }}
    where _loaded_at::date >= dateadd(day, -3, current_date())
    group by 1
),

with_prev as (
    select
        load_date,
        row_count,
        lag(row_count) over (order by load_date) as prev_count
    from counts
)

-- A window function cannot be filtered in the same select, so filter
-- in an outer query (dbt fails if any rows are returned)
select
    load_date,
    row_count,
    prev_count
from with_prev
where row_count < prev_count * 0.8  -- 20% drop threshold
Gate 3: Anomaly Detection on Marts
For tier-1 models (the ones that feed executive dashboards and financial reports), we add statistical anomaly checks using Great Expectations or dbt-expectations:
# models/marts/finance/_finance__models.yml
version: 2
models:
  - name: fct_revenue
    meta:
      tier: 1
      owner: "finance-analytics@client.com"
    tests:
      # Volume check: yesterday's partition must not be empty
      - dbt_expectations.expect_table_row_count_to_be_between:
          min_value: 1
          row_condition: "order_date = current_date() - 1"
      # Metric check: median daily revenue stays inside its expected band
      - dbt_expectations.expect_column_quantile_values_to_be_between:
          column_name: net_revenue_usd
          quantile: 0.5
          min_value: 500
          max_value: 50000
The Alerting Layer
Tests are useless if nobody sees the failures. We wire dbt test results into Slack using a simple post-run hook:
# In Airflow: after the dbt build task
@task()
def notify_test_failures():
    """Parse dbt's run_results.json and alert on failures."""
    import json
    import os
    from pathlib import Path

    from slack_sdk import WebClient

    results_path = Path("target/run_results.json")
    results = json.loads(results_path.read_text())
    # Each entry in the run_results.json artifact carries a unique_id
    # (e.g. "test.analytics.assert_revenue_daily_swing"), not a nested node
    failures = [
        r for r in results["results"]
        if r["status"] == "fail" and r["unique_id"].startswith("test.")
    ]
    if not failures:
        return

    client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])
    failure_list = "\n".join(
        f"• `{f['unique_id']}` — {(f.get('message') or '')[:100]}"
        for f in failures[:10]
    )
    client.chat_postMessage(
        channel="#data-alerts",
        text=f":rotating_light: *{len(failures)} dbt test(s) failed*\n\n{failure_list}",
    )
Severity Levels: Not Everything Is a Blocker
A critical insight we picked up after a few engagements: not every test failure should stop the pipeline.
In dbt terms, a test's severity decides what happens on failure: error stops the pipeline, while warn logs the failure and alerts but lets the run continue. We reserve warn for anomalies that need investigation but are not necessarily wrong, such as a revenue swing above 50% or unexpected nulls in optional fields.
The first time we set everything to error, a warn-worthy anomaly in a non-critical table blocked a morning pipeline for three hours while the team investigated what turned out to be a legitimate business event (a flash sale). Now we are deliberate about what blocks and what alerts.
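Downgrading a blocking test to an alert is a one-line dbt config; severity defaults to error. A sketch for the swing test from earlier:

```sql
-- At the top of tests/assert_revenue_daily_swing.sql:
-- 'warn' surfaces failures in the run summary (and our Slack hook)
-- without stopping the build
{{ config(severity = 'warn') }}
```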
Making It Stick
The biggest challenge with data quality is not technical — it is cultural. Tests rot when the team stops writing them for new models. We enforce it with CI:
# .github/workflows/dbt-ci.yml (simplified)
# Fails the PR if any model lacks a unique + not_null test on its primary key
dbt ls --resource-type model --output json | \
python scripts/check_test_coverage.py --min-tests 2
Every pull request that adds a new model must include at least a unique and not_null test on the primary key. No exceptions. The CI check enforces it so humans do not have to nag each other.
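The contents of check_test_coverage.py are not shown here; one way such a check can work is to read dbt's target/manifest.json and count the tests attached to each model, rather than parsing the `dbt ls` stream. A sketch, with names and the CLI shape assumed:

```python
# scripts/check_test_coverage.py -- illustrative sketch, not the real script
import json
import sys
from collections import Counter

def undertested_models(manifest, min_tests=2):
    """Return unique_ids of models with fewer than min_tests tests attached."""
    models = {
        uid for uid, node in manifest["nodes"].items()
        if node["resource_type"] == "model"
    }
    counts = Counter()
    for node in manifest["nodes"].values():
        if node["resource_type"] == "test":
            # A test node lists the models it touches in depends_on.nodes
            for dep in node.get("depends_on", {}).get("nodes", []):
                if dep in models:
                    counts[dep] += 1
    return sorted(m for m in models if counts[m] < min_tests)

if __name__ == "__main__":
    manifest = json.loads(open("target/manifest.json").read())
    failing = undertested_models(manifest, min_tests=2)
    for uid in failing:
        print(f"FAIL {uid}: fewer than 2 tests on this model")
    sys.exit(1 if failing else 0)
```

Exiting non-zero when any model is under-tested is what makes the GitHub Actions job fail the PR.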
Data quality is not a project with a finish line. It is a practice. The gates we described here — freshness checks, schema tests, anomaly detection — take about two days to set up on a new project. The bugs they catch in the first month alone make that investment pay for itself several times over.