Azure API Management's control plane problem — and what regulated teams run on Azure instead

Teams in regulated industries often reach for Azure API Management because it integrates cleanly with Entra ID and sits inside procurement catalogues that took years to build. Banking teams know it. Healthcare IT departments have it pre-approved. Government agencies have it on framework contracts. That familiarity is real, and the integration story with the broader Azure ecosystem is genuine.

The problem is not whether Azure API Management works. It does. The problem is where the configuration and audit data actually lives. Frameworks like DORA, GDPR, HIPAA, and NIS2 require configuration data and audit logs to stay inside a defined infrastructure boundary that the regulated entity operates. Azure API Management's architecture makes that boundary difficult to enforce in a way that satisfies a technical auditor.

This post is not an argument against Azure. Zerq runs on Azure. The argument is about which layer of the stack you control, where configuration persists, and where audit records write. For regulated industries, those questions have specific answers that Azure API Management cannot provide.

Why Azure API Management's architecture creates a compliance gap

Azure API Management's configuration state lives in Microsoft's Azure infrastructure. That is not a flaw in the product — it is the product's design. Managed services manage things on your behalf, and the management plane is one of those things.

The self-hosted gateway option partially addresses this: you deploy a gateway container in your own environment and runtime traffic proxies through your infrastructure. But the self-hosted gateway polls Azure's management endpoints periodically to receive policy updates. API definitions, rate limit policies, product configurations, and subscription keys all persist in Azure's management plane. The gateway in your environment is a runtime execution engine, not a configuration store. Outbound connectivity to Azure is a hard requirement.

This has two consequences. First, your gateway's rules are stored in infrastructure you do not operate. A technical auditor reviewing data flows will trace API configuration writes to Azure's infrastructure, outside your defined perimeter. Second, the self-hosted gateway cannot function in an air-gapped environment. If your security architecture requires no outbound internet connectivity from the application layer, the self-hosted gateway does not fit.

The audit trail problem is separate but compounds the first. Azure APIM sends logs to Azure Monitor and Application Insights by default. Both are Microsoft's infrastructure. Routing logs to your own SIEM requires building an ingestion pipeline: Event Hubs for streaming, a consumer that reads from Event Hubs, transformation logic, and a write path into your SIEM. That pipeline adds latency, adds operational complexity, and introduces another failure mode. During the streaming window, raw audit records exist in Azure Monitor, outside your defined perimeter. A compliance team drawing a data flow boundary has to account for that transit.

How Zerq solves it

Zerq runs as a single Go binary with no external control plane dependency. You deploy it inside your Azure subscription on AKS. Configuration writes to your MongoDB. Audit logs write to your MongoDB. Runtime state for rate limiting and caching writes to your Redis. Nothing in the critical path requires a connection outside your subscription boundary.

The platform has three services: the gateway backend (Go, handles all proxying, auth enforcement, rate limiting, audit writing), the management UI (Next.js, admin console), and the developer portal (Next.js, consumer-facing API catalog). MongoDB stores all persistent state. Redis or KeyDB handles distributed cache and coordinates multi-pod rate limit state via pub/sub. An OIDC provider handles management authentication; in your Azure deployment, that is your Entra ID tenant. The three services, MongoDB, and Redis all run inside your subscription, and Entra ID is the tenant you already administer. See Zerq's architecture for a full breakdown.

Deploy Zerq on AKS: the core configuration

Every environment variable resolves to infrastructure inside your Azure subscription:

# Required before startup
MONGODB_URI=mongodb://mongo.zerq.svc.cluster.local:27017/zerq
DB_NAME=zerq
ENCRYPTION_KEY=<32-byte-random-key>
JWT_SECRET=<random-secret>

# Management authentication via Entra ID OIDC
OIDC_ENABLED=true
OIDC_ISSUER_URL=https://login.microsoftonline.com/<tenant-id>/v2.0
OIDC_AUDIENCE=<your-app-registration-client-id>
OIDC_ROLE_CLAIM_PATH=roles
OIDC_VIEWER_ROLES=api-viewer
OIDC_MODIFIER_ROLES=api-operator
OIDC_AUDITOR_ROLES=api-auditor
OIDC_ADMIN_ROLES=api-admin

# Rate limit and caching state
CACHE_TYPE=redis
REDIS_URL=redis://redis.zerq.svc.cluster.local:6379

# Public URL for MCP and portal callback flows
BACKEND_PUBLIC_URL=https://api.yourdomain.com
MCP_PATH=/mcp
MCP_MANAGEMENT_PATH=/api/v1/mcp

MONGODB_URI points to your MongoDB deployment: a pod in the same cluster, or Azure Cosmos DB for MongoDB if you want a managed data tier that still stays in your subscription. OIDC_ISSUER_URL is the v2.0 issuer endpoint for your Entra ID tenant, identified by your tenant ID. The role names in the OIDC_*_ROLES variables map directly to Entra ID app roles you define in your app registration. Apart from the Entra ID issuer, which is your own tenant, no configuration value points outside your Azure subscription.
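
One way to make that claim checkable before a rollout is to lint the environment file against an allowlist of hosts. The following is a minimal sketch, not a Zerq feature: the allowlisted domains and the choice of URL-valued keys are assumptions taken from the sample configuration above, and you would substitute your own.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: domains treated as inside the deployment boundary.
# These values are assumptions based on the sample configuration.
ALLOWED_DOMAINS = ("svc.cluster.local", "login.microsoftonline.com", "yourdomain.com")

# URL-valued settings from the sample configuration.
URL_KEYS = ("MONGODB_URI", "REDIS_URL", "OIDC_ISSUER_URL", "BACKEND_PUBLIC_URL")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")  # split on the first "=" only
        env[key.strip()] = value.strip()
    return env


def inside_boundary(host: str) -> bool:
    """True if host equals an allowlisted domain or is a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


def hosts_outside_boundary(env: dict[str, str]) -> dict[str, str]:
    """Return {key: host} for URL-valued settings whose host is not allowlisted."""
    offenders = {}
    for key in URL_KEYS:
        if key in env:
            host = urlparse(env[key]).hostname or ""
            if not inside_boundary(host):
                offenders[key] = host
    return offenders
```

Running hosts_outside_boundary(parse_env(open(".env").read())) in CI and failing on a non-empty result would catch a connection string that drifts to an external host. The check only inspects hostnames; it does not prove where a hostname resolves.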

Standing up the platform: from zero to serving traffic

1. Apply backend secrets. Create a Kubernetes Secret containing MONGODB_URI, ENCRYPTION_KEY, and JWT_SECRET. Reference it via envFrom in the backend Deployment spec so secret values never appear in pod specs or ConfigMaps.
2. Apply the backend Deployment pointing to the Zerq backend image. Pull envFrom from the secret for sensitive values and from a ConfigMap for non-sensitive variables like CACHE_TYPE, OIDC_ENABLED, and BACKEND_PUBLIC_URL.
3. Verify backend health with kubectl rollout status deployment/zerq-backend -n zerq. Once the rollout completes, confirm the /health endpoint returns 200 before proceeding.
4. Apply the management UI Deployment with NEXT_PUBLIC_API_BASE_URL pointing to the backend Service URL inside the cluster.
5. Apply the developer portal Deployment with NEXT_PUBLIC_API_URL and BACKEND_API_URL set to the backend Service URL.
6. Apply Ingress rules. The gateway backend handles runtime traffic at api.yourdomain.com/* and admin API traffic at api.yourdomain.com/api/v1/*. The management UI and developer portal get separate hostnames with their own TLS certificates under your control.
7. Confirm all pods are ready with kubectl get pods -n zerq. Confirm ingress resolves with kubectl get ingress -n zerq.
8. Log in to the management UI with an Entra ID account that carries the api-admin role. Create your first collection and proxy.

After step 8, the gateway is serving traffic and writing audit logs to your MongoDB, inside your AKS cluster, inside your Azure subscription.
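
The health confirmation in the rollout can be scripted so a pipeline blocks until the backend answers. This is a rough sketch using only the Python standard library; the retry count, the delay, and any URL you pass in are assumptions, and only the /health path and the 200 status come from the steps above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status of a GET, or 0 if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered with a non-2xx status
    except OSError:
        return 0  # connection refused, DNS failure, or timeout


def wait_for_health(check: Callable[[], int], attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll check() until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if check() == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

With the backend port-forwarded locally, a call such as wait_for_health(lambda: http_status("http://localhost:8080/health")) would gate the remaining steps; the port here is a placeholder, not a documented default.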

Access control that produces auditable per-consumer identity

Zerq's access control model has three layers. A client is a named API consumer assigned to one or more collections and optionally a policy. A profile is an authentication configuration tied to a client: the auth type (token, JWT, OIDC, mTLS, or none), allowed HTTP methods, an IP allowlist, and an active or inactive toggle. A policy defines rate limits at 1-minute, 5-minute, and 1-hour intervals, and quotas at 1-day, 7-day, and 30-day intervals.
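
To make the three layers concrete, here is an illustrative data model. It is a sketch of the relationships described above, not Zerq's actual schema: the class and field names are hypothetical, and only the auth types and the interval sets come from the text.

```python
from dataclasses import dataclass, field
from typing import Optional

AUTH_TYPES = {"token", "jwt", "oidc", "mtls", "none"}
RATE_WINDOWS = {"1m", "5m", "1h"}    # rate limit intervals named in the text
QUOTA_WINDOWS = {"1d", "7d", "30d"}  # quota intervals named in the text


@dataclass
class Policy:
    rate_limits: dict[str, int] = field(default_factory=dict)  # window -> max requests
    quotas: dict[str, int] = field(default_factory=dict)       # window -> max requests

    def __post_init__(self):
        bad = (set(self.rate_limits) - RATE_WINDOWS) | (set(self.quotas) - QUOTA_WINDOWS)
        if bad:
            raise ValueError(f"unsupported interval(s): {sorted(bad)}")


@dataclass
class Profile:
    profile_id: str
    auth_type: str
    allowed_methods: set[str]
    ip_allowlist: list[str] = field(default_factory=list)
    active: bool = True

    def __post_init__(self):
        if self.auth_type not in AUTH_TYPES:
            raise ValueError(f"unsupported auth type: {self.auth_type}")


@dataclass
class Client:
    client_id: str
    collections: set[str]
    profiles: list[Profile] = field(default_factory=list)
    policy: Optional[Policy] = None  # a policy is optional for a client
```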

Every request that reaches the gateway carries X-Client-ID and X-Profile-ID headers. The gateway enforces collection scope, authentication, IP restrictions, method restrictions, rate limits, and quotas in sequence. Every audit log entry shows exactly which consumer made each call, under which profile, against which collection.

Collection-level scoping means a client assigned to account-read-v2 cannot call payment-initiation. The gateway returns 403 with {"error": "forbidden"}. This scope enforcement happens at the data layer, not through convention, which is what technical auditors want to see.
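
The order of checks matters for what a rejected caller learns and for what the audit log records. The sketch below walks the sequence described above in plain Python. Only the 403 {"error": "forbidden"} scope response comes from the text; every other status code and error body, along with the Consumer type and the over_rate and over_quota flags, is an assumption made for illustration.

```python
import hmac
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Consumer:
    collections: set
    token: str
    allowed_methods: set
    ip_allowlist: list = field(default_factory=list)  # CIDR strings; empty means unrestricted
    active: bool = True


def enforce(consumer, collection, method, source_ip, token,
            over_rate=False, over_quota=False):
    """Apply the checks in the order the text lists them; return (status, body)."""
    # 1. Collection scope: the response documented in the text.
    if collection not in consumer.collections:
        return 403, {"error": "forbidden"}
    # 2. Authentication: an inactive profile cannot authenticate (assumed 401).
    if not consumer.active or not hmac.compare_digest(token, consumer.token):
        return 401, {"error": "unauthorized"}
    # 3. IP restriction (assumed response).
    if consumer.ip_allowlist and not any(
        ipaddress.ip_address(source_ip) in ipaddress.ip_network(cidr)
        for cidr in consumer.ip_allowlist
    ):
        return 403, {"error": "ip_not_allowed"}
    # 4. Method restriction (assumed response).
    if method not in consumer.allowed_methods:
        return 405, {"error": "method_not_allowed"}
    # 5 and 6. Rate limit, then quota (assumed responses).
    if over_rate:
        return 429, {"error": "rate_limited"}
    if over_quota:
        return 429, {"error": "quota_exceeded"}
    return 200, {}
```

Checking scope first means a client probing a collection it is not assigned to gets the same 403 whether or not its credentials are valid.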

A banking deployment illustrates how the model composes:

TPP clients: mTLS profile, 1,000 requests per minute
Internal services: token profile, no quota
AI agents: OIDC profile, 100 requests per minute, IP allowlist
Sandbox partners: token profile, 200 requests per minute, 50,000-request quota per 30 days

Each AI agent gets its own client record. Revoking one agent does not affect others. The management team can toggle a single profile inactive, which immediately stops that consumer from authenticating without touching any other configuration. See access control for the full model.
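
Per-client limits across several windows can be pictured as independent counters that must all have headroom. The following is a single-process sketch using fixed windows; it is not how Zerq implements limiting (the text says multi-pod state is coordinated through Redis), and the class name, the window keys, and the injectable clock are all hypothetical.

```python
import time

WINDOW_SECONDS = {"1m": 60, "5m": 300, "1h": 3600}


class FixedWindowLimiter:
    """In-memory fixed-window counters, one per (client, window, bucket)."""

    def __init__(self, limits, clock=time.time):
        self.limits = limits  # e.g. {"1m": 1000} for the TPP example
        self.clock = clock    # injectable so tests can control time
        self.counts = {}      # old buckets are never pruned in this sketch

    def allow(self, client_id):
        """Count the request and return True only if every window has headroom."""
        now = self.clock()
        keys = [(client_id, w, int(now // WINDOW_SECONDS[w])) for w in self.limits]
        if any(self.counts.get(k, 0) >= self.limits[k[1]] for k in keys):
            return False  # rejected requests are not counted
        for k in keys:
            self.counts[k] = self.counts.get(k, 0) + 1
        return True
```

Because counters are keyed by client, exhausting one agent's limit leaves every other client untouched, which mirrors the isolation described above.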

What the audit trail actually contains

Every request produces a structured record that writes directly to MongoDB:

{
  "timestamp": "2026-06-11T09:14:32.781Z",
  "client_id": "tpp-acmecorp-prod",
  "profile_id": "tpp-acmecorp-mtls",
  "collection": "account-read-v2",
  "method": "GET",
  "path": "/accounts/GB29NWBK601613319428/balance",
  "status": 200,
  "latency_ms": 47,
  "upstream": "core-banking-service",
  "ip": "203.0.113.14"
}

The auditor RBAC role gives compliance teams read-only access to these logs through the management UI or management API, without configuration modification rights. The four roles in the system are viewer, modifier, auditor, and admin. Auditors can query logs and export data. They cannot change API configurations or client assignments. That separation satisfies the least-privilege requirement that most compliance frameworks impose on audit log access.

Because the logs write directly to MongoDB in your cluster, a compliance team can also query MongoDB directly, export to a SIEM using a direct database read, or, if you use Cosmos DB for the data tier, stream new records through MongoDB change streams, which Azure Cosmos DB for MongoDB backs with its change feed. There is no transit window where audit data exists in external infrastructure. See observability capabilities for the full log schema and query options.
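
A direct query for one consumer over a date range might be built like this. It is a sketch under two assumptions: that timestamps are stored as UTC ISO-8601 strings in the same format as the sample record (so string comparison orders them correctly), and that the field names match that record. If timestamps are stored as BSON dates, the bounds would be datetime objects instead.

```python
from datetime import datetime, timedelta, timezone


def iso_millis(dt: datetime) -> str:
    """Format a timezone-aware datetime as UTC ISO-8601 with milliseconds."""
    dt = dt.astimezone(timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"


def audit_filter(client_id: str, end: datetime, days: int = 30) -> dict:
    """Build a MongoDB filter for one client's records in the `days` before `end`."""
    start = end - timedelta(days=days)
    return {
        "client_id": client_id,
        # Half-open range: start inclusive, end exclusive.
        "timestamp": {"$gte": iso_millis(start), "$lt": iso_millis(end)},
    }
```

With a MongoDB driver such as pymongo, the returned dictionary can be passed as the filter argument to a collection's find method; the name of the collection that holds audit records is not given in this post, so check the log schema before relying on it.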

Questions to ask any API gateway vendor before signing

Before committing to any API gateway for a regulated environment, ask these eight questions and require written answers:

1. Where does gateway configuration persist at rest? The answer must be: inside my infrastructure.
2. Does the runtime gateway require a connection to the vendor's infrastructure during normal operation?
3. Where do request audit logs write by default? Can they write exclusively to infrastructure I control?
4. Is there an air-gapped deployment option where no outbound internet connection is required?
5. What happens to the gateway if the vendor's control plane is unavailable?
6. Where is the developer portal hosted? Who controls the TLS certificate and DNS?
7. Does the gateway support per-consumer audit identity, not just a shared API key pool?
8. Can each API consumer have independent rate limits and collection scope?

Azure API Management answers questions 1, 2, and 5 in ways that are incompatible with strict data residency requirements. Configuration persists in Azure's management plane (question 1). The self-hosted gateway requires outbound connectivity to Azure (question 2). If Azure's management plane is unavailable, policy updates cannot reach the gateway (question 5). Zerq's answers to all eight keep configuration, audit logs, and runtime operation inside infrastructure you control.

What this looks like in practice

A banking team on Azure starts with Azure API Management: 12 partners, straightforward rate limiting, one or two product tiers. The initial deployment takes a week. Everything works. Two years later the team has 50 TPP connections operating under PSD2. The compliance team reviews the data flow and flags that request logs are transiting Azure Monitor before reaching the SIEM. The security team flags a separate issue: Azure APIM groups consumers by subscription key, so the audit log identifies the subscription but not the individual TPP. Twenty-three TPPs share one subscription key tier. There is no per-TPP audit identity.

The team replaces Azure APIM with Zerq on the existing AKS cluster. MongoDB runs on Azure Cosmos DB for MongoDB, which keeps the data tier managed while keeping data inside the subscription. The SIEM reads directly from the Cosmos DB change feed. No intermediate pipeline exists, which means no intermediate pipeline can fail. Each of the 50 TPPs gets a dedicated client record with an mTLS profile, a per-TPP rate limit, and collection scope restricted to the PSD2 APIs that TPP is authorized to call.

When the regulator asks for a 30-day access audit for a specific TPP, the compliance team queries by client_id and date range, exports the result, and delivers it. The data has never left the Azure subscription. There is no phone call to a vendor asking for log exports, no dependency on external infrastructure availability, and no gap in the audit chain. See banking and open banking use cases for more context on PSD2 and open banking deployments, and Gateway MCP for AI agents if you are also routing AI agent traffic through the same gateway.

Closing

Azure API Management is capable for teams that do not have strict data residency requirements. For regulated industries — banking, healthcare, government — the control plane problem is architectural and cannot be configured away. Configuration and audit data need to stay in infrastructure you operate. Zerq runs inside your Azure subscription: config writes to your MongoDB, audit logs write to your MongoDB, the developer portal runs in your cluster, each consumer has an auditable identity, and the gateway has no dependency on any external control plane.

Zerq is an enterprise API gateway built for regulated industries — one platform for API management, AI agent access, compliance audit, and developer portal, running entirely in your own infrastructure. See how it works or request a demo to walk through your specific requirements.