Airflow vs Dagster: A Practical Comparison After Running Both in Production
We have operated Airflow for three years and migrated two clients to Dagster. Here is when we pick each, with real config examples from production deployments.
We get asked this question on almost every engagement: "Should we use Airflow or Dagster?"
The honest answer is that both work. We have run Airflow in production for three years across eight clients. We have migrated two clients to Dagster and started two greenfield projects on it. Neither tool is broken. The question is which one fits your team, your existing stack, and how you think about data pipelines.
This is not a feature matrix copied from docs. This is what we actually experienced running both.
Airflow wins when the team already knows it and the pipelines are mostly task-based workflows — "run this script, then that script, then load this file."
A mid-size company with 30-50 DAGs, a team of 3-5 data engineers, and pipelines that call external APIs, run Python scripts, and trigger dbt. They have been running Airflow for a year. It works. They want us to clean it up, add monitoring, and make it production-grade.
Here is what a well-structured Airflow DAG looks like in our projects:
```python
# dags/finance__stripe__daily_sync.py
from datetime import timedelta

from airflow.decorators import dag, task
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator
from pendulum import datetime


@dag(
    dag_id="finance__stripe__daily_sync",
    schedule="0 6 * * *",
    start_date=datetime(2026, 1, 1),
    catchup=False,
    tags=["finance", "stripe", "tier-1"],
    default_args={
        "owner": "data-eng",
        "retries": 2,
        "retry_delay": timedelta(minutes=5),
    },
)
def finance_stripe_daily():
    @task()
    def extract_charges():
        """Pull yesterday's charges from the Stripe API."""
        from stripe_client import get_charges

        charges = get_charges(days_back=1)
        return charges  # passed downstream via XCom serialization

    @task()
    def load_to_raw(charges: list):
        """Load raw charge records into raw_stripe.charges."""
        from loaders import bulk_insert

        bulk_insert("raw_stripe.charges", charges)

    trigger_dbt = SQLExecuteQueryOperator(
        task_id="trigger_dbt_run",
        conn_id="snowflake_default",
        sql="CALL staging.run_dbt_tag('stripe')",
    )

    charges = extract_charges()
    load_to_raw(charges) >> trigger_dbt


finance_stripe_daily()
```
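One caveat worth flagging: returning raw records from `extract_charges` means the whole payload round-trips through XCom, which lives in the Airflow metadata database. For extracts beyond a few megabytes, a common pattern is to stage the records to a file or object store and pass only a reference downstream. A minimal sketch (the helper names here are illustrative, not part of the DAG above):

```python
import json
from pathlib import Path


def stage_records(records: list, staging_dir: str) -> str:
    """Write records to a staging file and return its path.

    Only the path goes through XCom, so the metadata database
    stays small no matter how large the extract is.
    """
    path = Path(staging_dir) / "charges.json"
    path.write_text(json.dumps(records))
    return str(path)


def load_staged(path: str) -> list:
    """Read staged records back in the downstream task."""
    return json.loads(Path(path).read_text())
```

In the DAG above, `extract_charges` would return the staged path and `load_to_raw` would read from it; in production the staging location is usually a bucket prefix rather than local disk.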
The naming convention {domain}__{source}__{pipeline_name} matters more than you think. When you have 80 DAGs and something fails at 3 AM, finding the right one fast is the difference between a 5-minute fix and a 40-minute scramble.
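A cheap way to keep the convention honest is a unit test over the `dags/` folder. Here is a sketch that assumes a three-part snake_case rule; adjust the pattern to whatever your team actually enforces:

```python
import re

# Each part of {domain}__{source}__{pipeline_name}: snake_case,
# starting with a letter
PART = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_dag_id(dag_id: str) -> bool:
    """Check a DAG id against the {domain}__{source}__{pipeline_name} convention."""
    parts = dag_id.split("__")
    return len(parts) == 3 and all(PART.fullmatch(p) for p in parts)
```

Run it in CI against every DAG file so a misnamed pipeline never reaches the scheduler.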
Dagster wins on greenfield projects and teams that think in terms of data assets rather than tasks. The mental model shift is real: instead of "run this pipeline at 6 AM," you define "I want this table to be fresh every day" and Dagster figures out what to run.
A company building a new data platform from scratch, or a team whose Airflow setup has become unmanageable (200+ DAGs, no testing, broken backfills). They want something that makes local development, testing, and iteration faster.
Here is the same pipeline in Dagster:
```python
# assets/finance/stripe.py
from pathlib import Path

from dagster import asset, AssetExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets

# Path to the dbt project's compiled manifest; adjust to your layout
dbt_manifest_path = Path("dbt_project/target/manifest.json")


@asset(
    group_name="finance",
    kinds={"python", "snowflake"},
    description="Raw charge records from Stripe API",
)
def raw_stripe_charges(context: AssetExecutionContext):
    """Pull charges from Stripe and load to raw_stripe.charges."""
    from stripe_client import get_charges
    from loaders import bulk_insert

    charges = get_charges(days_back=1)
    bulk_insert("raw_stripe.charges", charges)
    context.log.info(f"Loaded {len(charges)} charges")


@dbt_assets(manifest=dbt_manifest_path)
def dbt_models(context: AssetExecutionContext, dbt: DbtCliResource):
    yield from dbt.cli(["build"], context=context).stream()
```
The difference is subtle in a small example, but at scale it changes how the team works: assets can be materialized and tested individually, the lineage graph shows what feeds what, and a stale table points you straight at the asset that produces it.
Honestly, this is where Dagster pulls ahead the most. With the dagster-dbt integration, every dbt model becomes a Dagster asset. You get per-model lineage in the asset graph, per-model run status and logs, and the ability to materialize a single model from the UI.
In Airflow, dbt is a black box. You trigger dbt run and hope for the best. You can use Cosmos to break it into individual tasks, but it is a workaround, not a native integration.
Two clients asked us to migrate from Airflow to Dagster. Both times we did it incrementally, moving pipelines over in stages rather than attempting a big-bang rewrite.
The migration itself is not technically difficult. The hard part is retraining the team's mental model from tasks to assets. Budget time for that.
Pick Airflow when the team already knows it, the pipelines are mostly task-based "run this script, then that script" workflows, and the existing setup works well enough to be worth hardening rather than replacing.
Pick Dagster when you are starting greenfield, the stack is dbt-centric, or the team thinks in terms of data assets and wants fast local development, testing, and iteration.
We have no allegiance to either tool. We pick whatever makes the client's team more productive. But if you asked us what we would choose for a new engagement with a modern dbt-centric stack — it would be Dagster. The developer experience gap is real.