Your Pipeline Works. Your Data Doesn't. That's a Data Contracts Problem.

The failure mode most data teams don't see coming looks like this: a pipeline runs cleanly, all green, no errors. A dashboard updates. An executive makes a decision based on it. The decision is wrong, because the revenue figures were stale by 36 hours due to an upstream schema change nobody documented, and the pipeline had no idea anything was different.

Nothing broke. Everything lied.

This is what data teams mean when they talk about reliability, and it's why data contracts and observability have moved from a nice-to-have to a genuine engineering discipline over the past two years. According to Gartner, poor data quality costs the average organisation $12.9 million a year. Most of that loss doesn't come from obvious failures. It comes from the quiet kind.

What a data contract actually is

The concept is simpler than the tooling ecosystem around it suggests. A data contract is a formal, machine-readable agreement between a data producer and a data consumer. It specifies what the data looks like — schema, column types, nullability constraints — what quality standards it meets, what SLA applies to its freshness, and who owns it.

Think of it as treating your data interfaces the same way a good software engineer treats API contracts. An API contract says: if you call this endpoint with these parameters, you'll get this response in this format. A data contract says: if you consume this table, it will contain these columns, updated within this window, with these quality expectations enforced.
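To make the idea tangible, here is a minimal sketch of a contract expressed as plain Python, together with a check that a batch of rows conforms to it. Everything named here is an assumption for illustration: the `Column` and `DataContract` classes, the `check_batch` function, the table `analytics.daily_revenue`, and the owner address are hypothetical and not taken from any particular contract tool or specification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Tuple


@dataclass(frozen=True)
class Column:
    name: str
    dtype: type
    nullable: bool = False


@dataclass(frozen=True)
class DataContract:
    table: str                 # hypothetical table name
    owner: str                 # who to contact when the contract is violated
    freshness_sla: timedelta   # maximum allowed age of the data
    columns: Tuple[Column, ...]


def check_batch(contract, rows, last_updated, now):
    """Return a list of violations; an empty list means the batch conforms."""
    violations = []

    # Freshness: the data must have been updated within the SLA window.
    age = now - last_updated
    if age > contract.freshness_sla:
        violations.append(f"stale: data is {age} old, SLA is {contract.freshness_sla}")

    expected = {col.name: col for col in contract.columns}
    for i, row in enumerate(rows):
        # Schema: every contracted column must be present, and nothing else.
        for name in sorted(expected.keys() - row.keys()):
            violations.append(f"row {i}: missing column {name!r}")
        for name in sorted(row.keys() - expected.keys()):
            violations.append(f"row {i}: unexpected column {name!r}")

        # Types and nullability for the columns that are present.
        for name, col in expected.items():
            if name not in row:
                continue
            value = row[name]
            if value is None:
                if not col.nullable:
                    violations.append(f"row {i}: {name!r} is null but the contract forbids nulls")
            elif not isinstance(value, col.dtype):
                violations.append(
                    f"row {i}: {name!r} expected {col.dtype.__name__}, got {type(value).__name__}"
                )
    return violations


# Illustrative contract and data (all values are made up).
revenue_contract = DataContract(
    table="analytics.daily_revenue",
    owner="finance-data@example.com",
    freshness_sla=timedelta(hours=24),
    columns=(
        Column("order_id", str),
        Column("amount", float),
        Column("coupon_code", str, nullable=True),
    ),
)

now = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)
stale_update = now - timedelta(hours=36)  # mirrors the 36-hour staleness scenario
rows = [
    {"order_id": "A-1", "amount": 19.99, "coupon_code": None},
    {"order_id": "A-2", "amount": None, "coupon_code": "SPRING"},
]

for violation in check_batch(revenue_contract, rows, stale_update, now):
    print(violation)
```

With the sample data, the check reports two problems: the batch is older than its 24-hour SLA, and the second row has a null `amount`. In practice a contract like this would more likely live in a declarative file under version control, with a tool enforcing it, but the shape of the agreement is the same.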

The practical implication is the shift-left principle: problems are caught at the point of production, before bad data ever enters the pipeline, not discovered three dashboards downstream when someone notices a number looks wrong.

Glassdoor made this concrete — if a developer commits a schema change that violates a downstream contract, the build fails before the code merges. It's CI/CD discipline applied to data.
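A build-time gate of this kind can be sketched in a few lines. This is not Glassdoor's implementation, which the article does not describe in detail; it is a hypothetical check under simple assumptions: schemas are plain `{column: type_name}` dictionaries, removing or retyping a contracted column counts as breaking, and adding a column does not. The names `breaking_changes` and `ci_check` are invented for this example.

```python
import sys


def breaking_changes(contract_schema, proposed_schema):
    """Compare two {column: type_name} mappings and list consumer-breaking changes."""
    problems = []
    for column, expected_type in contract_schema.items():
        if column not in proposed_schema:
            problems.append(f"column {column!r} was removed")
        elif proposed_schema[column] != expected_type:
            problems.append(
                f"column {column!r} changed type from {expected_type} to {proposed_schema[column]}"
            )
    # Columns that exist only in proposed_schema are additions; this sketch
    # assumes additions never break existing consumers.
    return problems


def ci_check(contract_schema, proposed_schema):
    """Return a process exit code: 0 if the change is safe, 1 if it breaks the contract."""
    problems = breaking_changes(contract_schema, proposed_schema)
    for problem in problems:
        print(f"CONTRACT VIOLATION: {problem}", file=sys.stderr)
    return 1 if problems else 0


# Illustrative schemas (made up for this example).
contract_schema = {"order_id": "string", "amount": "decimal", "created_at": "timestamp"}
proposed_schema = {"order_id": "string", "amount": "string", "channel": "string"}

exit_code = ci_check(contract_schema, proposed_schema)
print(f"exit code: {exit_code}")
```

A CI job would pass the returned code to `sys.exit`, so any non-zero result blocks the merge. Here the proposed change retypes `amount` and drops `created_at`, so the check fails.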

Where observability comes in

Contracts define expectations. Observability detects when reality diverges from them.

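The five dimensions that observability platforms monitor are freshness (how recently the data was updated), volume (whether the expected amount of data arrived), schema (whether the structure changed), distribution (whether values fall within expected ranges), and lineage (how assets depend on one another). That last one, lineage, is what makes observability genuinely powerful rather than just another alerting layer. When something breaks, lineage tells you which upstream tables fed the problem, which downstream assets it affects, and who owns each node in that chain.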
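The lineage questions above reduce to graph traversal. The sketch below assumes lineage is available as a simple adjacency mapping from each asset to the assets that read from it. The asset names, the owner names, and the helpers `reachable` and `invert` are all hypothetical, and real platforms derive this graph from query logs or orchestration metadata rather than a hand-written dictionary.

```python
from collections import deque

# Hypothetical lineage graph: each asset maps to the assets that consume it.
DOWNSTREAM = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["analytics.daily_revenue", "analytics.customer_ltv"],
    "analytics.daily_revenue": ["dashboard.exec_revenue"],
    "analytics.customer_ltv": [],
    "dashboard.exec_revenue": [],
}

# Hypothetical ownership metadata for each node.
OWNERS = {
    "raw.orders": "platform-team",
    "staging.orders": "data-eng",
    "analytics.daily_revenue": "finance-analytics",
    "analytics.customer_ltv": "growth-analytics",
    "dashboard.exec_revenue": "bi-team",
}


def reachable(graph, start):
    """Breadth-first walk: every node reachable from start, in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


def invert(graph):
    """Flip edge direction so the same walk answers the upstream question."""
    upstream = {node: [] for node in graph}
    for source, targets in graph.items():
        for target in targets:
            upstream.setdefault(target, []).append(source)
    return upstream


broken = "staging.orders"
affected = reachable(DOWNSTREAM, broken)        # what the incident touches
sources = reachable(invert(DOWNSTREAM), broken)  # where the problem may have come from

print("downstream impact:", [(asset, OWNERS[asset]) for asset in affected])
print("upstream suspects:", [(asset, OWNERS[asset]) for asset in sources])
```

For an incident on `staging.orders`, the downstream walk reaches both analytics tables and the executive dashboard, and the upstream walk points back to `raw.orders`, each paired with the team to notify.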

Monte Carlo built its market position around this idea of "data downtime" — treating pipeline reliability the way DevOps teams treat application uptime. It monitors anomalies across those five dimensions and traces root cause automatically. Atlan operates at a different layer, aggregating signals from Monte Carlo, Soda, and other tools into a unified control plane with business context attached. When an alert fires, the right person gets notified with the full picture — what broke, what it affects, who owns it.

The distinction matters for choosing tooling. Observability platforms detect and diagnose. Governance platforms route and contextualise. Most production environments need both.

Why this matters even more as AI gets embedded in pipelines

An AI model that ingests stale or corrupted data doesn't crash; it produces confidently wrong outputs. A RAG pipeline fed schema-drifted documents retrieves irrelevant context, and the LLM hallucinates coherently. Agentic systems that receive bad data through MCP (Model Context Protocol) connections make bad decisions several steps downstream before anyone notices.

Only 26% of Chief Data Officers say they're confident their data can reliably support AI-driven revenue streams. That's not a model problem. That's a pipeline discipline problem. Data contracts and observability are what bridge that gap — not by adding complexity, but by making existing complexity visible and enforceable.

Start with a contract on your most-consumed table. Define the schema, the freshness SLA, the owner. Put it in version control. That one contract will surface more about your data health than six months of ad-hoc monitoring.