Industry · 8 min read

Data Mesh in Practice: Lessons for Financial Institutions

By Cupel Team
Tags: data-mesh, domain-ownership, governance

The Promise and the Tension

Data mesh, as articulated by Zhamak Dehghani, rests on four foundational principles: domain-driven data ownership, data as a product, self-serve data infrastructure, and federated computational governance. Each principle addresses a real failure mode of centralized data architectures -- bottlenecked central teams, poor data quality from producers who never see their own output, and governance policies that are either too rigid to be practical or too loose to be effective.

For financial institutions, the promise of data mesh is compelling. Large banks operate dozens of data domains -- risk, trading, operations, compliance, customer, product -- each with deep subject matter expertise that a central data team can never fully replicate. Pushing data ownership to the domains that understand the data best should, in theory, produce higher-quality data products and faster iteration.

But financial institutions also operate under constraints that make a naive implementation of data mesh dangerous. Regulatory requirements like BCBS 239, MiFID II, and PCI-DSS demand centralized oversight of data quality, lineage, and access controls. The tension between decentralized ownership and centralized compliance is the core challenge of implementing data mesh in regulated environments.

Domain-Driven Ownership in Financial Services

Where It Works

The first principle of data mesh -- domain-driven ownership -- aligns naturally with how financial institutions are already organized. The risk team understands risk data better than anyone else. The trading desk knows its position data intimately. The compliance team is the authoritative source for regulatory reporting logic.

When these domain teams own their data products, several benefits emerge immediately. Data quality improves because the people closest to the data are responsible for its accuracy. Response times to data issues shorten because the team receiving the alert is also the team that understands the business context. And innovation accelerates because domain teams can iterate on their data products without waiting for a centralized team to prioritize their requests.

Where It Breaks Down

The challenge arises at domain boundaries. When the risk team consumes data from the trading desk, who is responsible for data quality? The trading team owns the source data, but the risk team has specific requirements for accuracy, completeness, and timeliness that may differ from the trading team's own needs. In a regulatory context, the answer matters: if a risk report contains inaccurate data, the regulator does not accept "the other team's data was wrong" as a defense.

In regulated financial services, domain ownership does not mean domain isolation. Every data product that feeds into regulatory reporting must meet organization-wide quality and compliance standards, regardless of which team produced it.

Cross-domain data products also raise governance questions. A Customer 360 view that combines data from onboarding, transactions, complaints, and compliance requires coordination across multiple domain teams. Without a governance framework that spans domains, these cross-cutting data products become organizational nightmares -- each contributing domain applying different standards, different update frequencies, and different access policies.

Data as a Product: The SLA Imperative

The second principle -- treating data as a product -- requires each domain team to publish well-defined, discoverable, and trustworthy data products with clear quality contracts. In financial services, this principle intersects directly with regulatory expectations.

Defining Data Products for Finance

A data product in a financial institution is not merely a dataset with a schema. It must include:

  • Schema definition and versioning so consumers can detect and adapt to changes
  • Quality SLAs specifying accuracy thresholds, freshness guarantees, and completeness requirements
  • Lineage metadata tracing every column back to its source, through every transformation
  • Access policies defining who can consume the data and under what conditions
  • Ownership identifying the responsible team and escalation path for quality issues
  • Regulatory classification marking which compliance frameworks apply (BCBS 239, PCI-DSS, GDPR)

A well-defined data product in financial services should answer three questions without requiring human consultation: What does this data mean? How fresh is it? Who is responsible when something goes wrong?

The Catalog Problem

Data products are only valuable if they are discoverable. A financial institution with 50 domain teams, each publishing multiple data products, needs a centralized catalog that provides search, lineage exploration, and impact analysis. The catalog must be automatically populated from pipeline metadata -- manual catalog entries become stale within weeks.

Self-Serve Infrastructure: The Platform Team's Role

The third principle -- self-serve data infrastructure -- is where many financial institutions struggle most. The goal is to give domain teams the autonomy to build, deploy, and operate their own data pipelines without depending on a central engineering team for every change.

What Self-Serve Means in Practice

Self-serve does not mean every domain team builds its own infrastructure from scratch. It means providing a platform that abstracts the complexity of pipeline creation, quality monitoring, and deployment. Domain teams should be able to:

  • Build pipelines using pre-approved components (sources, transforms, quality gates, compliance steps) without writing infrastructure code
  • Deploy to production through automated CI/CD without filing tickets
  • Monitor pipeline health and data quality through dashboards tailored to their domain
  • Publish data products to the organization catalog with a defined process

The platform team's role shifts from building pipelines for domain teams to building the platform that domain teams use to build their own pipelines. This is a fundamentally different operating model that requires investment in tooling, documentation, and developer experience.

The Component Library Model

A practical implementation of self-serve infrastructure uses a shared component library. The platform team maintains a curated set of approved pipeline components -- connectors, transformation templates, quality check suites, compliance steps -- that domain teams assemble into pipelines. Components are individually versioned, so teams can pin to a specific version and upgrade on their own schedule. New components can be contributed by domain teams, subject to a review and approval process.

This approach provides autonomy within guardrails. Domain teams build what they need using approved building blocks. The platform team ensures that every building block meets security, compliance, and quality standards.

Federated Governance: The Hard Part

The fourth principle -- federated computational governance -- is the most critical for regulated financial institutions and the one most often implemented poorly.

The "More Restrictive Only" Constraint

Federated governance does not mean that each domain sets its own rules independently. It means establishing a governance hierarchy where:

  1. Organization-level policies define the baseline -- minimum data classification requirements, mandatory quality checks, approved connector lists, encryption standards, audit logging requirements
  2. Team-level policies can add restrictions beyond the baseline -- additional quality thresholds, stricter access controls, domain-specific compliance checks -- but can never loosen organization-level requirements
  3. Pipeline-level policies can further tighten controls for specific use cases

This "more restrictive only" constraint is essential in regulated environments. It ensures that no domain team can, intentionally or accidentally, weaken security or compliance controls that apply organization-wide.

The most effective governance model for regulated data mesh implementations is one where policies are enforced computationally, not procedurally. When the platform itself prevents a team from deploying a pipeline that violates organization-level policies, compliance becomes structural rather than dependent on human vigilance.

Policy Enforcement at Runtime

Governance policies must be enforced at pipeline execution time, not just at design time. A policy resolver should merge organization, team, and pipeline policies at runtime, applying the most restrictive setting at every decision point. This includes data classification (automatically applying masking when PII is detected), quality thresholds (rejecting data that falls below minimum standards), and access controls (enforcing column-level and row-level security based on the consumer's attributes).

The Audit Trail Requirement

Federated governance in financial services requires a comprehensive audit trail that spans all domains. When a regulator asks to see the lineage of a specific risk metric, the answer must trace through multiple domain teams' data products without gaps. This demands a centralized metadata layer that collects lineage, quality metrics, and access logs from all domain teams, even as those teams operate independently.

Making Data Mesh Work in Regulated Finance

The institutions succeeding with data mesh in financial services share a common pattern: they invest heavily in the platform layer that makes decentralized ownership practical without sacrificing centralized oversight. Domain teams get autonomy in how they build and operate their data products. The platform enforces non-negotiable standards for quality, security, and compliance.

This is not a contradiction. It is the recognition that data mesh's four principles are complementary, not independent. Domain ownership without governance produces chaos. Governance without self-serve infrastructure produces bottlenecks. Data products without discoverability produce silos. And self-serve infrastructure without domain expertise produces low-quality data.

Cupel implements this balance through its multi-team workspace model with hierarchical governance. Each team operates in its own workspace with full autonomy to build pipelines using the visual builder and shared component library. Organization-level policies set compliance baselines that teams cannot weaken. Data products are published to a centralized catalog with automated lineage and quality metadata. And the platform's policy resolver enforces governance computationally at every pipeline execution -- ensuring that decentralized ownership never comes at the cost of regulatory compliance.
