The Platform Engineering Pivot
AI agents don't manage themselves. The teams that figure that out first will define what AI-native organizations actually look like.
The New Stack: Building AI-Native Organizations — Article 3
Platform engineering has had exactly one job for the last decade: make it faster and safer for developers to ship software. Build the internal developer platform. Manage the Kubernetes clusters. Own the CI/CD pipelines. Abstract away the infrastructure so developers can focus on code.
That job description just got a second chapter.
GitHub’s Agent HQ — introduced in late 2025 as a dedicated management experience for AI coding agents across enterprise organizations — is now maturing with key governance upgrades. On March 25, 2026, GitHub updated Copilot usage metrics to identify which users have active coding agent activity, giving platform and engineering leadership a usage signal they didn’t have before.
GitHub built this because their enterprise customers demanded it. Platform teams told them: we need to see what agents are doing, control what they can access, and measure what they cost. The same conversation is happening at every enterprise that has moved past experimentation and into deployment.
The platform engineering pivot is underway. The question is whether your team is ahead of it or behind it.
What Platform Engineering Was Built For
The modern platform engineering function emerged to solve a specific problem: developer cognitive overhead. As infrastructure grew more complex — multi-cloud, microservices, containerization, observability toolchains — the cost of context-switching between writing code and managing infrastructure became unacceptable. Platform engineering abstracted that complexity into an internal developer platform (IDP), giving developers paved roads to production.
The canonical platform engineering mandate covers five domains: provisioning (get developers environments fast), deployment (CI/CD pipelines that are reliable and governable), observability (know what’s running and what’s failing), security (shift-left controls baked into the platform), and cost management (attribution and optimization at the team and project level).
This mandate was designed around a human developer. The developer writes code. The platform handles everything else.
AI agents broke that assumption.
What Agents Changed
AI coding agents — Cursor’s self-hosted workers, GitHub Copilot coding agents, GitLab Duo agents, custom internal agents built on Claude or GPT-5 — don’t just use the platform. They operate on the platform. They clone repos, write code, commit changes, open pull requests, trigger pipelines, and invoke tools autonomously. They work at 2 AM. They work in parallel across dozens of projects simultaneously. They scale to thousands of concurrent sessions.
They are, in every operational sense, a new class of software worker. And like any worker operating on your infrastructure, they need to be provisioned, monitored, governed, and managed.
The platform engineering function wasn’t built for this. Most internal developer platforms have no concept of agent identity — they were designed for human users who authenticate once and work within understood patterns. Agents authenticate per session, operate across boundaries, invoke external tools via MCP, and generate costs at a rate no human developer matches.
The gap is significant, and it’s widening as agent adoption accelerates.
The New Platform Engineering Mandate
The platform teams that are ahead of this curve have reorganized around a new mandate. It covers the same five domains as the original — provisioning, deployment, observability, security, cost management — plus governance as a cross-cutting sixth, and the requirements are fundamentally different when the primary users are AI agents.
| Domain | Human Developer Era | Agentic Era |
|---|---|---|
| Provisioning | Self-service environments | Governed, auto-scaling agent worker fleets + RBAC |
| Deployment | Human-triggered CI/CD pipelines | Agent-opened PRs and pipeline runs behind policy gates |
| Observability | Pipeline & app telemetry | Session-level agent traces + tool-call cost tracking |
| Security | Broad access + human judgment | Least-privilege per session, MCP gateway inspection |
| Cost Management | Cloud spend attribution | Per-agent, per-project token & inference costs |
| Governance | Human policies & audits | Policy-as-code at scale + automated compliance evidence |
Provisioning for agents, not just developers
Standing up a Cursor self-hosted worker fleet isn’t something developers should be doing themselves. It requires Kubernetes operators, RBAC policies, network isolation between agent sessions, and secrets management at scale. Platform engineering owns this. The Helm chart Cursor ships for enterprise-scale deployments is a starting point, not a finished product. The finished product is a governed, monitored, auto-scaling agent worker fleet that developers can request access to without thinking about the infrastructure.
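To make the provisioning requirement concrete, here is a minimal sketch of what "governed by default" could mean at the Kubernetes layer: every agent session gets its own namespace, a resource quota, and a Role scoped to that namespace only. The function, team, and session names are hypothetical illustrations, not any vendor's actual schema.

```python
# Hypothetical sketch: render least-privilege Kubernetes manifests (as plain
# dicts) for a single agent worker session. Scopes and limits are illustrative.

def agent_session_manifests(team: str, session_id: str, max_pods: int = 2):
    """Return namespace, quota, and Role dicts scoping one agent session."""
    ns = f"agents-{team}-{session_id}"
    namespace = {
        "apiVersion": "v1", "kind": "Namespace",
        "metadata": {"name": ns, "labels": {"agent-session": session_id}},
    }
    quota = {  # cap what one runaway session can consume
        "apiVersion": "v1", "kind": "ResourceQuota",
        "metadata": {"name": "session-quota", "namespace": ns},
        "spec": {"hard": {"pods": str(max_pods), "requests.cpu": "2",
                          "requests.memory": "4Gi"}},
    }
    role = {  # no cluster-wide verbs: the session sees only its own namespace
        "apiVersion": "rbac.authorization.k8s.io/v1", "kind": "Role",
        "metadata": {"name": "agent-worker", "namespace": ns},
        "rules": [{"apiGroups": [""], "resources": ["pods", "pods/log"],
                   "verbs": ["get", "list", "create", "delete"]}],
    }
    return [namespace, quota, role]

manifests = agent_session_manifests("payments", "s-001")
```

The point of the sketch is the shape, not the specific limits: provisioning becomes a templated, policy-carrying operation rather than a developer handing an agent their own kubeconfig.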
This is where Forge plays its most direct role in the agentic stack. Forge is the AI-Native DevOps Platform — managed provisioning, monitoring, and governance of your entire toolchain on infrastructure you control.
Observability that covers agents
Your existing observability stack almost certainly doesn’t track agent sessions as first-class citizens. It might log pipeline runs. It doesn’t log which agent opened which PR, what tools it invoked, what data it accessed, and how much that session cost. Platform engineering now needs to extend observability to cover agent activity — session-level telemetry, tool call traces, cost per agent per project, anomaly detection when an agent behaves unexpectedly.
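What "agent sessions as first-class citizens" might look like in practice: a session record where every tool call is logged with its token usage, so cost per session falls directly out of the trace. The field names and pricing below are assumptions for illustration, not any vendor's telemetry schema.

```python
# Illustrative sketch of session-level agent telemetry. Every tool call is
# recorded with token usage; per-session cost is derived from the trace.
from dataclasses import dataclass, field

PRICE_PER_1K_TOKENS = 0.01  # assumed blended rate, for illustration only

@dataclass
class AgentSession:
    agent: str
    project: str
    tool_calls: list = field(default_factory=list)

    def record(self, tool: str, tokens: int) -> None:
        """Append one tool-call trace entry to the session."""
        self.tool_calls.append({"tool": tool, "tokens": tokens})

    def cost(self) -> float:
        """Session cost = total tokens across tool calls at the blended rate."""
        total = sum(c["tokens"] for c in self.tool_calls)
        return total / 1000 * PRICE_PER_1K_TOKENS

session = AgentSession(agent="copilot-1", project="checkout")
session.record("git.clone", tokens=1200)
session.record("jira.search", tokens=800)
```

Once this trace exists, anomaly detection is a query over it — a session whose tool-call count or cost departs from the project's baseline is the signal to investigate.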
GitHub’s addition of coding agent activity to Copilot usage metrics is a step in this direction. It’s not sufficient for enterprise governance, but it signals that the industry understands the gap.
Security that operates at the agent boundary
Human developers get broad access because they’re expected to exercise judgment. AI agents should get scoped access because they aren’t. The security model for agent infrastructure is fundamentally different: least privilege per session, no persistent credentials, every external interaction logged, tool call inspection before execution.
Platform engineering needs to own the agent security boundary. This means containerized MCP server gateways, scoped per-session credentials, rate limiting on tool calls, and a clear policy for which systems agents can reach and which they cannot.
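Two of those controls can be sketched in a few lines: minting a scoped, expiring credential per session, and a fixed-window rate limit a gateway could apply to tool calls. The TTL, scope names, and limit are assumed values for illustration.

```python
# Sketch of the agent security boundary: short-lived per-session credentials
# plus a per-session tool-call rate limit. All names and limits hypothetical.
import secrets
import time

SESSION_TTL_SECONDS = 900    # credential dies with the session
MAX_CALLS_PER_MINUTE = 30    # assumed gateway limit

def mint_session_credential(session_id: str, scopes: list) -> dict:
    """Return a scoped, expiring credential — never a persistent token."""
    return {
        "session": session_id,
        "token": secrets.token_urlsafe(32),
        "scopes": scopes,  # least privilege: an explicit allow-list
        "expires_at": time.time() + SESSION_TTL_SECONDS,
    }

class ToolCallLimiter:
    """Fixed-window rate limiter an MCP gateway could apply per session."""
    def __init__(self, limit: int = MAX_CALLS_PER_MINUTE):
        self.limit = limit
        self.window_start = time.time()
        self.count = 0

    def allow(self) -> bool:
        now = time.time()
        if now - self.window_start >= 60:      # new window: reset the counter
            self.window_start, self.count = now, 0
        self.count += 1
        return self.count <= self.limit

cred = mint_session_credential("s-001", scopes=["repo:read", "jira:search"])
limiter = ToolCallLimiter(limit=2)
```

The design choice worth noting: the credential carries its scopes with it, so the gateway can reject any tool call outside the allow-list without consulting a central policy store on the hot path.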
Cost management for AI workers
A developer who runs inefficient code costs you in engineering time. An AI agent that runs inefficient inference at scale costs you in real dollars, immediately. Token spend, compute costs for self-hosted workers, API call costs for tool invocations — this is a new cost category that most engineering organizations aren’t attributing to teams, projects, or use cases yet.
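A rough sketch of what that attribution looks like: raw agent usage events rolled up to cost per team and project. The unit rates and event shape are invented for illustration — the substance is that every event must carry team and project labels, or attribution is impossible after the fact.

```python
# Hedged sketch: roll raw agent usage events up to cost per (team, project).
# Rates and event fields are illustrative assumptions, not real pricing.
from collections import defaultdict

RATES = {"tokens": 0.01 / 1000, "tool_call": 0.002}  # assumed unit costs

def attribute_costs(events):
    """Sum event costs keyed by (team, project); events must carry labels."""
    totals = defaultdict(float)
    for e in events:
        cost = (e["tokens"] * RATES["tokens"]
                + e["tool_calls"] * RATES["tool_call"])
        totals[(e["team"], e["project"])] += cost
    return dict(totals)

events = [
    {"team": "payments", "project": "checkout", "tokens": 50_000, "tool_calls": 40},
    {"team": "payments", "project": "checkout", "tokens": 20_000, "tool_calls": 10},
    {"team": "platform", "project": "idp", "tokens": 10_000, "tool_calls": 5},
]
costs = attribute_costs(events)
```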
Governance that scales with agent count
The hardest part of the new mandate isn’t any single capability. It’s that all of these requirements must scale. One agent session is manageable. One thousand concurrent agent sessions across dozens of teams — each with different tool access, different cost budgets, different compliance requirements — is a platform engineering problem of a different order.
The organizations building AI governance infrastructure now — policy-as-code for agent behavior, unified audit trails, automated compliance evidence — are the ones who will scale agent deployment without accumulating governance debt. This is the governance layer Reign provides. The organizations that wait are building a problem that gets harder to solve with every agent they add.
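A minimal policy-as-code sketch shows why the pattern scales where manual review cannot: declarative rules evaluated against every agent action, with each decision appended to an audit trail that doubles as compliance evidence. The rule format below is invented for illustration and is not Reign's actual policy language.

```python
# Minimal policy-as-code sketch: each agent action is checked against
# declarative deny rules, and every decision is logged for audit evidence.

POLICIES = [
    {"id": "no-prod-writes",
     "deny_if": lambda a: a["target"] == "prod" and a["verb"] == "write"},
    {"id": "budget-cap",
     "deny_if": lambda a: a.get("session_cost", 0) > 5.0},
]

audit_log = []

def evaluate(action: dict) -> bool:
    """Return True if the action is allowed; log every decision either way."""
    for policy in POLICIES:
        if policy["deny_if"](action):
            audit_log.append({"action": action, "decision": "deny",
                              "policy": policy["id"]})
            return False
    audit_log.append({"action": action, "decision": "allow", "policy": None})
    return True

ok = evaluate({"agent": "cursor-7", "verb": "write", "target": "prod"})
```

Because the rules are data, adding the thousandth agent costs nothing extra: the same policy set evaluates every session, and the audit log accumulates the evidence regulators will ask for.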
What This Means for Your Team
If you lead a platform engineering function, three things are true right now:
Your developers are already using AI agents. Some through approved channels, many through tools you haven’t sanctioned. The agents are in your environment whether your platform supports them or not. The only question is whether they’re governed.
The vendors are not going to solve this for you. GitHub, Cursor, GitLab — they’re each solving their own agent management problem. None of them are solving your cross-toolchain governance problem. When a Cursor agent invokes a Jira MCP connector that a Copilot session also has access to, who’s tracking that? Not your agent vendors.
The window for getting ahead of this is closing. Major EU AI Act enforcement phases take effect in August 2026. Gartner projects 40% of enterprise software will have embedded AI agents by end of 2026. The platform engineering teams that pivot now — extending their mandate to cover agent provisioning, observability, security, and cost governance — become the function that makes AI-native development possible at scale. The teams that don't will become the bottleneck everyone routes around.
The Stack That Makes It Work
Enterprises that have already made this pivot are using a layered control plane approach: governed infrastructure underneath and unified governance on top.
The agents run on governed infrastructure — provisioned, monitored, and managed by a platform function with the tooling to handle scale. In our architecture, Forge provides this layer: managed Kubernetes infrastructure, toolchain governance, and the platform engineering capacity most enterprises don't have the headcount to build in-house.
The agents operate under a governance control plane — every API call, tool invocation, and agent action inspected, logged, and policy-enforced in real time. Reign provides this layer: cross-toolchain visibility, unified audit trails, policy-as-code enforcement, cost attribution, and regulatory evidence across every AI tool in your pipeline.
Infrastructure governance. Runtime governance. Together, they’re what transforms “we have AI agents” into “we govern AI agents” — which is the only version that scales.
The platform engineering pivot isn’t optional. It’s the unlock for everything else in the AI-native stack.
Audit your agent infrastructure coverage
Map every AI agent in use against the five platform domains: provisioning, deployment, observability, security, cost management. For each domain, answer: do we have governed coverage for agent workloads? The gaps are your roadmap.
Extend observability to agent sessions
Stand up session-level telemetry for your highest-volume agent workloads. You need tool call traces, cost per session, and anomaly detection before you can govern at scale. This is your first governed agent infrastructure milestone.
Implement scoped credentials for agent sessions
Move from developer-style broad access to least-privilege per session. Containerized MCP server gateways, per-session secrets, and rate limits on tool calls. The agent security boundary is yours to own.
Deploy policy-as-code for agent governance at scale
One thousand concurrent agent sessions across dozens of teams is a platform engineering problem. Policy-as-code for agent behavior, unified audit trails, automated compliance evidence — build this now, before agent count makes it exponentially harder.
Paul Goldman is the CEO of iTmethods, where his team helps enterprises build and govern AI-native developer platforms. This is Article 3 in “The New Stack” series on building AI-native organizations.
Previously: The AI-Native Stack: What It Actually Looks Like · Self-Hosted AI Agents Are Here. The Governance Isn’t.
