Skip to content

Own Your Domain, Share Your Foundation

A tension kills AI initiatives: domain teams want autonomy to ship fast, platform teams want consistency to maintain sanity. Let both run unchecked and you get chaos. Over-centralize and you get bottlenecks that strangle innovation.

The companies that actually scale AI have figured out a simple principle: own your domain, share your foundation. The pattern: domain teams own what requires business context, platform teams own what requires infrastructure expertise.

What Domain Teams Own

If it requires understanding the customer, the business context, or the domain nuance, it belongs with the domain team:

  • Agent behavior and automation logic. Your pricing team knows pricing. Your support team knows customer pain points. They define how AI agents behave in their domain.
  • Model selection within guardrails. Domain teams pick which models fit their use cases. Platform provides approved options and governance. Domain chooses within it.
  • Prompt engineering and evaluation. The people closest to the work understand what "good" looks like. They build prompt libraries, define success metrics, iterate based on real feedback.
  • Domain-specific metrics. Central dashboards show platform health. Domain teams own the metrics that matter for their outcomes.

A 2024-2025 survey found that 89% of practitioners use AI daily, while 40% now own AI platform responsibilities directly4. Domain expertise and AI capability are converging, not separating.

What Platform Teams Own

If it requires infrastructure expertise, security knowledge, or cross-organizational consistency, it belongs with the platform team:

  • AI gateway and routing. One place to manage model access, track usage, enforce rate limits. When every team builds their own integration, you get ungovernable sprawl5.
  • Observability and monitoring. Real-time token consumption, cost accumulation per application, visibility into which department called which model, how often, at what cost.
  • Security and compliance infrastructure. Audit trails for SOC2, HIPAA, GDPR. Platform builds it once, domain teams inherit automatically. Only 6% of organizations have advanced AI security strategies6—the ones who do treat it as platform responsibility.
  • Core infrastructure and authentication. Model access, credential management, rate limiting. The substrate every domain team builds on.

OpenAI's internal financial engineering team articulated this philosophy: "decentralized experimentation on a centralized substrate"7. The platform provides reusable infrastructure so product teams move quickly without compromising trust.

Making It Work at Scale

Uber's Michelangelo demonstrates this at scale. The central platform owns infrastructure—data processing, training pipelines, deployment. Domain teams make critical decisions through a domain-specific language for selecting and combining features. Teams add their own functions without platform approval1. The result: approximately 400 active ML projects, over 20,000 training jobs monthly, 5,000+ models in production2. That's not a bottleneck. That's leverage.

Netflix built Metaflow with the same philosophy. Hundreds of domain teams maintain independent ML projects. The platform's job isn't to approve—it's to reduce friction so teams move from experimentation to production "with minimal overhead"3.

The debate between centralized and decentralized AI teams has a predictable answer—neither extreme works. Completely centralized teams become bottlenecks. Completely decentralized teams create tool sprawl and skyrocketing spend8.

GitLab learned this the hard way. Justin Farris, VP of Monetization, described years of "circular debate" until the reporting line changed. When he began reporting directly to the CEO, "decisions became faster and clearer"9. The lesson: unclear ownership creates friction that compounds.

This principle isn't about drawing org chart lines. It's about answering one question clearly for every AI capability—who can change this, and what can they change without asking permission? Get that answer right, and domain teams move fast while platform teams maintain sanity.


References


← Previous: The 4 Team Models for AI-First Operations | Chapter Overview | Next: Hiring for AI-First →


  1. Planet Cassandra. Meet Michelangelo: Uber's Machine Learning Platform — planetcassandra.org 

  2. Product in Data. Uber's Michelangelo: Revolutionizing ML Scalability — productindata.com 

  3. InfoQ. Netflix Metaflow — infoq.com 

  4. Vultr. Platform Engineering State of AI 2025 Report — discover.vultr.com 

  5. Jimmy Song. AI Gateway In Depth — jimmysong.io 

  6. Mint MCP. Enterprise AI Infrastructure Statistics 2025 — mintmcp.com 

  7. Metronome. Monetization Engineering for the AI Era — metronome.com 

  8. Devineni, P. AI Leadership Team Topologies — linkedin.com 

  9. Metronome. Monetization Engineering for the AI Era — metronome.com