Deploy a Multi-Container Application on Azure Kubernetes Service (AKS)


Deploying a multi-container application on Azure Kubernetes Service (AKS) is less about “getting pods running” and more about creating a repeatable path from container images to a secure, observable, scalable service. AKS removes much of the control plane management burden, but the core Kubernetes responsibilities remain: how you package workloads, expose traffic, manage configuration and secrets, and operate upgrades over time.

This how-to is written for IT administrators and system engineers who want a practical, production-minded deployment flow. It assumes you already understand basic Kubernetes concepts (pod, deployment, service) but may not have assembled them end-to-end on AKS with Azure-native building blocks such as Azure Container Registry (ACR), managed identities, and Azure networking.

The example application is intentionally realistic: a web API plus a sidecar container (for logging/metrics shipping or reverse proxying) and supporting dependencies. The goal isn’t a toy “Hello World,” but a pattern you can adapt to internal line-of-business apps.

What “multi-container” means on Kubernetes (and when to use it)

In Kubernetes, “multi-container application” can mean either multiple containers in one Pod or multiple Pods that together form an application. Both are common, but they solve different problems.

A Pod is the smallest deployable unit in Kubernetes: a set of one or more containers that share the same network namespace (one IP) and can share storage volumes. Putting multiple containers in one Pod is appropriate when they must be co-scheduled, scale together, and tightly coordinate. Classic examples are a main application container plus a sidecar that handles log shipping, proxying (for mTLS), or file synchronization.

If components can scale independently or should be upgraded independently, they usually belong in separate Deployments and Pods (for example, a web frontend Deployment and a backend API Deployment). In this guide you’ll deploy both patterns: a Pod with multiple containers (app + sidecar) and additional Kubernetes objects that represent the application as a whole.

A useful rule: prefer separate Pods unless you specifically need shared fate (same lifecycle and scheduling). Sidecars are the primary exception.

Architecture you will deploy on AKS

To keep the steps concrete, you will deploy:

  1. An AKS cluster with a system node pool and a user node pool.
  2. An Azure Container Registry to store images.
  3. A namespace to isolate application resources.
  4. A Deployment that runs two containers in one Pod:
     • api: the main web API container
     • sidecar: a lightweight log shipper or proxy container that reads shared files
  5. A ClusterIP Service that fronts the Deployment.
  6. An Ingress controller (NGINX Ingress is used here for portability) and an Ingress resource to publish HTTP(S).
  7. Configuration and secrets using ConfigMaps and Secrets, with guidance on using external secret managers.
  8. Health probes, resource requests/limits, horizontal pod autoscaling, and basic operational patterns.

Along the way, you’ll see three real-world scenarios that commonly affect deployments:

  • A shared-infrastructure cluster where multiple teams deploy apps and need isolation and predictable resource behavior.
  • A regulated environment where private networking and image governance matter.
  • A bursty API workload that needs autoscaling and safe rollouts.

Prerequisites and planning decisions

Before you run commands, decide a few foundational items. These choices affect everything that follows, especially networking and identity.

You will need:

  • An Azure subscription with permission to create resource groups, AKS, ACR, and networking resources.
  • A workstation with Azure CLI installed and authenticated.
  • kubectl installed.
  • Optional but recommended: helm for installing the Ingress controller.

Pick a region and resource naming strategy

AKS and ACR should typically live in the same region for latency and egress cost reasons. Pick a region close to users and dependent services. Use a naming scheme that makes it obvious which environment a resource belongs to (dev/test/prod), and keep names DNS-friendly.

In examples below, you’ll use:

  • Resource group: rg-aks-multi-prod
  • AKS cluster: aks-multi-prod
  • ACR: acrmultiprod<unique> (ACR name must be globally unique)
  • Namespace: app-multi

Networking: basic vs private vs existing VNet

AKS can be created with basic settings (Azure will create a VNet) or attached to an existing VNet (advanced networking control). For production, attaching to a pre-planned VNet is common because it allows:

  • Controlled IP addressing
  • Integration with existing firewalls/NVAs
  • Private endpoints to Azure PaaS services

This guide uses a straightforward managed VNet approach to stay focused on Kubernetes deployment, but the application-layer steps remain the same if you bring your own VNet.

Identity for pulling images and accessing Azure services

There are two separate identity concerns:

  1. Image pulls from ACR: AKS needs permission to pull images from your registry. The simplest approach is to attach ACR to AKS so the cluster can pull without extra imagePullSecrets.
  2. Workload access to Azure services: if your pods need to access Azure resources (Key Vault, Storage, etc.), use modern Kubernetes-to-Azure identity patterns (for example, Azure Workload Identity) rather than embedding secrets.

You’ll configure the ACR integration in the cluster setup. For workload identity, you’ll set the stage and show how it fits into the deployment model.

Create Azure resources (resource group, ACR, AKS)

Start by creating a resource group and an ACR. Then create AKS and attach it to ACR.

Create a resource group and ACR

Use Azure CLI:


bash
# Set variables
LOCATION="eastus"
RG="rg-aks-multi-prod"
ACR_NAME="acrmultiprod$RANDOM"   # ensure uniqueness; adjust for your org naming

az group create --name "$RG" --location "$LOCATION"

az acr create \
  --resource-group "$RG" \
  --name "$ACR_NAME" \
  --sku Standard \
  --admin-enabled false

Disabling the admin user is a good default for governance. You can still authenticate via Azure AD for CI/CD and use AKS-to-ACR attachment for runtime pulls.
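
As a quick sanity check that Azure AD-based authentication works with the admin user disabled, you can log in to the registry with your own identity. This is a sketch, not part of the deployment flow; the first form assumes Docker is installed locally, and --expose-token avoids that requirement.

bash
# Log in to ACR using your Azure AD identity (requires a local Docker daemon)
az acr login --name "$ACR_NAME"

# Or obtain an access token without Docker installed
az acr login --name "$ACR_NAME" --expose-token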

Create AKS with a separate user node pool

AKS uses a system node pool for cluster-critical components. For most production clusters, create at least one user node pool where your workloads run, so system components aren’t competing with application pods.

bash
AKS_NAME="aks-multi-prod"

az aks create \
  --resource-group "$RG" \
  --name "$AKS_NAME" \
  --location "$LOCATION" \
  --node-count 3 \
  --enable-managed-identity \
  --generate-ssh-keys \
  --network-plugin azure \
  --node-vm-size "Standard_DS3_v2" \
  --enable-oidc-issuer \
  --enable-workload-identity

Notes on the flags:

  • --enable-managed-identity uses a managed identity for the cluster rather than service principals.
  • --enable-oidc-issuer and --enable-workload-identity prepare the cluster for Azure Workload Identity, which is useful when your pods need Azure access without storing long-lived secrets.
  • --network-plugin azure uses Azure CNI. Many enterprises prefer it because pod IPs are allocated from the VNet space; it can simplify routing and firewalling. Make sure your subnet sizing is appropriate.
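
One way to confirm the workload identity prerequisites took effect is to read back the cluster's OIDC issuer URL; you will need this value later if you create federated credentials for a managed identity.

bash
az aks show \
  --resource-group "$RG" \
  --name "$AKS_NAME" \
  --query "oidcIssuerProfile.issuerUrl" -o tsv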

Now add a user node pool:

bash
az aks nodepool add \
  --resource-group "$RG" \
  --cluster-name "$AKS_NAME" \
  --name "userpool1" \
  --node-count 3 \
  --mode User \
  --node-vm-size "Standard_DS3_v2"

Attach ACR to AKS

Attaching ACR grants the AKS cluster identity the AcrPull role on the registry, so your pods can pull images without an imagePullSecret.

bash
az aks update \
  --resource-group "$RG" \
  --name "$AKS_NAME" \
  --attach-acr "$ACR_NAME"

Get credentials for kubectl

bash
az aks get-credentials --resource-group "$RG" --name "$AKS_NAME"

kubectl get nodes

At this point you have a working cluster and a registry. Next, you’ll build and push images so Kubernetes has something real to run.

Build and push multi-container images to ACR

A multi-container Pod typically means multiple images. You can build locally and push, or use az acr build (server-side build). For many teams, server-side builds are convenient because they avoid Docker installation on build agents and keep artifacts in Azure.

For demonstration, assume you have two Dockerfiles:

  • ./api/Dockerfile
  • ./sidecar/Dockerfile

Build images with ACR Tasks (server-side)

bash
ACR_LOGIN_SERVER=$(az acr show -n "$ACR_NAME" -g "$RG" --query loginServer -o tsv)

# Build API image

az acr build \
  --registry "$ACR_NAME" \
  --image "multi/api:1.0.0" \
  --file ./api/Dockerfile \
  ./api

# Build sidecar image

az acr build \
  --registry "$ACR_NAME" \
  --image "multi/sidecar:1.0.0" \
  --file ./sidecar/Dockerfile \
  ./sidecar

az acr repository list --name "$ACR_NAME" -o table

In production CI/CD, you’ll likely tag with a git SHA and also push an environment tag (for example :prod). Keep in mind that “latest” is convenient but undermines traceability.
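
A minimal sketch of that tagging approach in CI (the tag names are illustrative; adapt them to your pipeline conventions):

bash
# Tag each build with the short git SHA for traceability, plus a release tag
GIT_SHA=$(git rev-parse --short HEAD)

az acr build \
  --registry "$ACR_NAME" \
  --image "multi/api:$GIT_SHA" \
  --image "multi/api:1.0.0" \
  --file ./api/Dockerfile \
  ./api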

Real-world scenario: regulated environments and image governance

In regulated environments, the image pipeline is often as important as the cluster. You may need to prove that images are scanned, signed, and that only approved registries are used. Even if you don’t implement signing on day one, you should establish a policy baseline: images come from ACR, tags are immutable from a release perspective, and deployments reference exact tags (or digests) to make rollbacks deterministic.

AKS supports admission control and policy enforcement with Azure Policy for Kubernetes (and Kubernetes-native policy engines in general). The deployment steps in this guide work regardless of policy choice, but keep governance in mind because it influences how you structure namespaces, service accounts, and CI permissions.
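
For the "exact tags or digests" point, you can resolve a tag to its digest and have the Deployment reference the image by digest, which is immune to tag reuse. A hedged sketch:

bash
# Resolve the digest behind a tag (output is a sha256:... value)
az acr repository show \
  --name "$ACR_NAME" \
  --image "multi/api:1.0.0" \
  --query digest -o tsv

The Deployment then references <ACR_LOGIN_SERVER>/multi/api@sha256:<digest> instead of a tag, which makes rollbacks and audits deterministic.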

Create a namespace and baseline Kubernetes objects

Namespaces provide scoping for names and a unit of isolation for RBAC, network policies, and quotas. Even in a single-application cluster, using namespaces is a healthy habit.

bash
kubectl create namespace app-multi

kubectl config set-context --current --namespace=app-multi

With the namespace set, subsequent kubectl apply commands will default to app-multi, reducing the chance of accidentally deploying into default.

In shared clusters, quotas prevent a team from exhausting cluster resources accidentally. A ResourceQuota caps total resource consumption; a LimitRange sets default requests/limits so pods don’t end up with “unbounded” CPU/memory.

Create quota.yaml:

yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: app-multi-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "30"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: app-multi-limits
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 100m
      memory: 128Mi
    default:
      cpu: 500m
      memory: 512Mi

Apply it:

bash
kubectl apply -f quota.yaml

Real-world scenario: shared clusters and predictable behavior

In a platform team model, you might host many teams on one AKS cluster. The failure mode you want to avoid is “one deployment without requests/limits consumes the node,” causing noisy-neighbor outages. Quotas and limit ranges don’t solve every contention issue, but they establish predictable scheduling and make capacity planning possible.

The remainder of the guide assumes you have at least resource requests set on containers, because autoscaling and stability depend on them.

Create ConfigMaps and Secrets for configuration

Most applications need runtime configuration: environment name, feature flags, connection strings, API keys, and so on. Kubernetes provides ConfigMaps for non-sensitive configuration and Secrets for sensitive data.

A key operational point: Kubernetes Secrets are only base64-encoded, and they are not encrypted at rest in etcd unless you enable encryption for Secret data. Treat them as sensitive and prefer external secret stores for high-security environments.

ConfigMap for non-sensitive settings

Create configmap.yaml:

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
data:
  APP_ENV: "prod"
  LOG_LEVEL: "info"
  HTTP_PORT: "8080"

Apply it:

bash
kubectl apply -f configmap.yaml

Secret for sensitive settings (example)

Create secret.yaml:

yaml
apiVersion: v1
kind: Secret
metadata:
  name: api-secrets
type: Opaque
data:
  # echo -n 'supersecret' | base64
  DB_PASSWORD: c3VwZXJzZWNyZXQ=

Apply it:

bash
kubectl apply -f secret.yaml

In production, avoid committing secret.yaml to source control with real values. Use your CI/CD system to generate or inject secrets, or integrate with an external secrets operator.
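
One way to keep real values out of source control is to create or update the Secret imperatively from your pipeline, for example:

bash
# Create or update the secret without a committed manifest (value is a placeholder)
kubectl create secret generic api-secrets \
  --from-literal=DB_PASSWORD='supersecret' \
  --dry-run=client -o yaml | kubectl apply -f -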

Workload identity as a longer-term secret strategy

If your API needs to access Azure Key Vault, Storage, or Service Bus, the most robust approach is to use Azure Workload Identity (federated identity) so the pod can get Azure access tokens without embedding a client secret. The cluster creation earlier enabled the OIDC issuer and workload identity to support this. The exact wiring depends on your app and Azure resource, but the deployment patterns below (service accounts, labels, and environment injection) are designed to accommodate it.
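
At a high level, the wiring looks like the sketch below: the application's service account is annotated with the client ID of a user-assigned managed identity (created and federated to this service account separately), and the pod template opts into token injection with a label. The client ID is a placeholder.

yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: multi-api-sa
  annotations:
    # Client ID of the user-assigned managed identity federated with this service account
    azure.workload.identity/client-id: "<MANAGED_IDENTITY_CLIENT_ID>"

The Deployment's pod template then sets serviceAccountName: multi-api-sa and the label azure.workload.identity/use: "true" so the workload identity webhook injects a federated token the pod can exchange for Azure access tokens. The same service account you create later in the hardening section can carry this annotation.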

Deploy the multi-container Pod using a Deployment

Now you’ll create the core workload: a Deployment with two containers in each Pod. The containers share a volume so the sidecar can process logs produced by the main container.

This is a common pattern when you need to ship logs to a destination but don’t want to bake log-forwarding logic into the application image. Many organizations now prefer node-level logging agents, but sidecars are still relevant when logs are produced in a specific format, need per-app routing, or you need an in-pod proxy.

Define the Deployment

Create deployment.yaml and replace <ACR_LOGIN_SERVER> with your registry’s login server (for example, acrmultiprod12345.azurecr.io).

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: multi-api
  labels:
    app: multi-api
spec:
  replicas: 3
  revisionHistoryLimit: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: multi-api
  template:
    metadata:
      labels:
        app: multi-api
    spec:
      containers:
      - name: api
        image: <ACR_LOGIN_SERVER>/multi/api:1.0.0
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 8080
        envFrom:
        - configMapRef:
            name: api-config
        - secretRef:
            name: api-secrets
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: "1"
            memory: 1Gi
        readinessProbe:
          httpGet:
            path: /health/ready
            port: http
          initialDelaySeconds: 5
          periodSeconds: 10
          timeoutSeconds: 2
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /health/live
            port: http
          initialDelaySeconds: 20
          periodSeconds: 20
          timeoutSeconds: 2
          failureThreshold: 3
        volumeMounts:
        - name: shared-logs
          mountPath: /var/log/app
      - name: sidecar
        image: <ACR_LOGIN_SERVER>/multi/sidecar:1.0.0
        imagePullPolicy: IfNotPresent
        env:
        - name: LOG_DIR
          value: /var/log/app
        resources:
          requests:
            cpu: 50m
            memory: 64Mi
          limits:
            cpu: 200m
            memory: 256Mi
        volumeMounts:
        - name: shared-logs
          mountPath: /var/log/app
      volumes:
      - name: shared-logs
        emptyDir: {}

Apply it:

bash

# Get your ACR login server value

ACR_LOGIN_SERVER=$(az acr show -n "$ACR_NAME" -g "$RG" --query loginServer -o tsv)

# Replace placeholder (example using sed on macOS/Linux)

sed "s|<ACR_LOGIN_SERVER>|$ACR_LOGIN_SERVER|g" deployment.yaml | kubectl apply -f -

kubectl rollout status deploy/multi-api
kubectl get pods -l app=multi-api -o wide

A few important details are embedded here:

  • RollingUpdate with maxUnavailable: 0 ensures no downtime during rolling upgrades if you have enough capacity.
  • Readiness and liveness probes protect service stability. Readiness gates traffic; liveness restarts hung containers.
  • Resource requests/limits make scheduling deterministic and enable autoscaling decisions.
  • emptyDir volume is ephemeral and tied to the Pod lifecycle. That’s appropriate for transient log files or socket sharing, not for durable data.

Validate that both containers run

bash
kubectl describe pod -l app=multi-api | sed -n '/Containers:/,/Conditions:/p'

kubectl logs -l app=multi-api -c api --tail=50
kubectl logs -l app=multi-api -c sidecar --tail=50

If the sidecar depends on files created by the API, make sure the API writes logs to /var/log/app inside the container, and that file permissions allow the sidecar to read them.
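
A quick way to verify the shared volume from the sidecar's point of view (this assumes the sidecar image includes a shell and ls):

bash
kubectl exec deploy/multi-api -c sidecar -- ls -l /var/log/app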

Expose the application internally with a Service

In Kubernetes, a Service provides a stable virtual IP and DNS name for a set of Pods selected by labels. For internal traffic inside the cluster, a ClusterIP Service is the default.

Create service.yaml:

yaml
apiVersion: v1
kind: Service
metadata:
  name: multi-api-svc
  labels:
    app: multi-api
spec:
  type: ClusterIP
  selector:
    app: multi-api
  ports:
  - name: http
    port: 80
    targetPort: http

Apply it:

bash
kubectl apply -f service.yaml
kubectl get svc multi-api-svc

At this point, anything inside the cluster can reach your API at http://multi-api-svc (port 80) from within the app-multi namespace; other namespaces use the fully qualified name multi-api-svc.app-multi.svc.cluster.local. Next you’ll publish it outside the cluster using Ingress.

Install an Ingress controller and publish HTTP(S)

An Ingress is a Kubernetes resource that defines how external HTTP(S) traffic routes to Services. It requires an Ingress controller, which is the component that actually implements the routing (for example NGINX Ingress, Azure Application Gateway Ingress Controller, Traefik).

This guide uses NGINX Ingress because it’s widely understood and works consistently across environments. In some enterprises, Application Gateway is preferred for WAF integration and centralized TLS policies. The application-layer resources you create (Service, readiness probes) remain useful regardless of controller choice.

Install NGINX Ingress with Helm

First, add the Helm repo and install into a dedicated namespace:

bash
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

kubectl create namespace ingress-nginx

helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --set controller.replicaCount=2

Wait for the controller service to get an external IP:

bash
kubectl get svc -n ingress-nginx ingress-nginx-controller -w

On AKS, this typically provisions an Azure Load Balancer. Capture the external IP; you’ll use it for DNS.
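
Once the address appears, you can capture it into a variable for your DNS records:

bash
INGRESS_IP=$(kubectl get svc -n ingress-nginx ingress-nginx-controller \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "$INGRESS_IP"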

Create an Ingress resource

Create ingress.yaml:

yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: multi-api-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$1
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /(.*)
        pathType: ImplementationSpecific
        backend:
          service:
            name: multi-api-svc
            port:
              number: 80

Apply it:

bash
kubectl apply -f ingress.yaml
kubectl describe ingress multi-api-ingress

Update DNS so api.example.com points to the external IP of the ingress controller. When DNS propagates, you should be able to hit the API through the Ingress.

For TLS, you can either use a pre-provisioned certificate (stored as a Kubernetes TLS secret) or automate certificate issuance with cert-manager and Let’s Encrypt. In production, many organizations terminate TLS at a centralized gateway, but cluster-level TLS is still common.

If you already have a certificate and private key:

bash
kubectl create secret tls multi-api-tls \
  --cert=./tls.crt \
  --key=./tls.key

Then reference it in ingress.yaml:

yaml
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - api.example.com
    secretName: multi-api-tls
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /(.*)
        pathType: ImplementationSpecific
        backend:
          service:
            name: multi-api-svc
            port:
              number: 80

Re-apply the Ingress.

Real-world scenario: private clusters and controlled ingress

In environments where the cluster API server and workloads must not be reachable from the public internet, you might run a private AKS cluster and publish applications through private load balancers or through an enterprise edge (Azure Front Door, Application Gateway, or a third-party appliance). The Kubernetes resources you created—Service and Ingress—still apply, but the underlying load balancer can be internal-only, and DNS will resolve to private IPs.

The operational takeaway is that you should design your Ingress approach before you standardize manifests across teams. Switching from public NGINX to internal Application Gateway later is possible, but it’s less disruptive if you keep your app’s assumptions minimal (HTTP, proper headers, health endpoints) and isolate controller-specific annotations.

Implement health endpoints and probes correctly

The Deployment manifest used readiness and liveness probes. Those probes are only as good as the endpoints they call.

A readiness endpoint should validate dependencies required to serve traffic. For an API, that might include:

  • Ability to read configuration and start up correctly
  • A basic check that the HTTP server loop is running
  • Optional: connectivity to the database or downstream services (be careful to avoid cascading failure)

A liveness endpoint should be simpler: it should return success if the process is alive and not deadlocked. If liveness checks include heavyweight dependencies, you can end up in restart loops during transient outages.

When you deploy multi-container Pods, remember that each container can have its own probes. A sidecar that occasionally restarts might be acceptable, but if the sidecar is essential to the Pod’s function (for example, it proxies all traffic), you should define probes and resource limits for it too.
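
A hedged sketch of sidecar probes, assuming the sidecar exposes a local health port (2020 is purely illustrative; use whatever your sidecar actually listens on):

yaml
      - name: sidecar
        # ...image, env, resources as before...
        readinessProbe:
          tcpSocket:
            port: 2020
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          tcpSocket:
            port: 2020
          initialDelaySeconds: 15
          periodSeconds: 20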

Add autoscaling: HPA for pods and node pool scaling

Multi-container deployments often fail under load because they scale too late or not at all. Kubernetes autoscaling has layers:

  • Horizontal Pod Autoscaler (HPA) scales replicas based on metrics (commonly CPU/memory; can also use custom metrics).
  • Cluster Autoscaler scales node pools when pods can’t be scheduled due to lack of resources.

AKS supports cluster autoscaler for node pools, and Kubernetes supports HPA. You need both if you want real elasticity.

Enable cluster autoscaler for the user node pool

Enable autoscaler on your user pool:

bash
az aks nodepool update \
  --resource-group "$RG" \
  --cluster-name "$AKS_NAME" \
  --name "userpool1" \
  --enable-cluster-autoscaler \
  --min-count 3 \
  --max-count 10

Create an HPA for the Deployment

HPA requires a metrics source. On AKS, the Metrics Server is installed by default, but validate that it’s present and healthy:

bash
kubectl get deployment -n kube-system metrics-server

If it exists and is healthy, you can create an HPA:

Create hpa.yaml:

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: multi-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: multi-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60

Apply it:

bash
kubectl apply -f hpa.yaml
kubectl get hpa multi-api-hpa

Real-world scenario: bursty APIs and safe scaling

A common production pattern is an API that is quiet most of the day but spikes during business events (batch windows, payroll runs, product launches). Without HPA, you either overprovision replicas or accept degraded performance during spikes. With HPA but without cluster autoscaler, replicas increase until you run out of node capacity, and pending pods appear.

By configuring both HPA and node pool autoscaling, you create a chain: increased CPU utilization scales pods; if pods can’t schedule, the cluster adds nodes. This is why resource requests are non-negotiable: without them, Kubernetes can’t make good scheduling decisions, and cluster autoscaler won’t know how much capacity to add.

Use rolling deployments and rollback controls

The Deployment strategy chosen earlier (maxUnavailable: 0) aims for zero downtime, but it assumes capacity exists for the surge pod and that readiness probes are correct. Rolling deployments are safer when you also control:

  • how long Kubernetes waits for readiness
  • how quickly it proceeds
  • whether it pauses on failure

You can tune these with progressDeadlineSeconds and the probes. For example, if your app takes 90 seconds to warm up caches, a 5-second readiness initial delay is too short.
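
For example, for an API that takes around 90 seconds to warm caches, you might raise the probe delays and extend the progress deadline so slow (but healthy) rollouts are not marked as failed. The numbers below are illustrative:

yaml
spec:
  progressDeadlineSeconds: 900   # allow slow warm-up before the rollout is declared failed
  template:
    spec:
      containers:
      - name: api
        readinessProbe:
          httpGet:
            path: /health/ready
            port: http
          initialDelaySeconds: 30
          periodSeconds: 10
          failureThreshold: 12   # tolerate up to ~2 minutes of warm-up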

To deploy a new version, update the image tag:

bash
kubectl set image deployment/multi-api \
  api="$ACR_LOGIN_SERVER/multi/api:1.0.1" \
  sidecar="$ACR_LOGIN_SERVER/multi/sidecar:1.0.1"

kubectl rollout status deploy/multi-api

If you need to roll back:

bash
kubectl rollout undo deploy/multi-api
kubectl rollout status deploy/multi-api

A disciplined image tagging strategy (unique tags per build) makes rollback reliable. If you reuse tags, you may “roll back” to an image that no longer exists or has changed.

Strengthen security posture: RBAC, pod security, and network controls

After the first successful deployment, security hardening is where production clusters diverge from test clusters. The core idea is to reduce blast radius: if one pod is compromised, it should not easily access secrets, other namespaces, or the node.

Use service accounts deliberately

By default, pods run as the namespace’s default service account, which may be more permissive than you want. Create a dedicated service account for the app and bind minimal permissions.

Create serviceaccount.yaml:

yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: multi-api-sa

Update the Deployment to use it:

yaml
spec:
  template:
    spec:
      serviceAccountName: multi-api-sa

If your app doesn’t need to call the Kubernetes API, you can also disable automounting API tokens:

yaml
spec:
  template:
    spec:
      automountServiceAccountToken: false
      serviceAccountName: multi-api-sa

This reduces the risk of token theft from within the container.

Run containers as non-root where possible

Many base images can run as non-root, which reduces the impact of container escapes and filesystem tampering.

Add a security context per container (adjust UID/GID to match your image):

yaml
securityContext:
  runAsNonRoot: true
  runAsUser: 10001
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true

Read-only root filesystems are excellent for reducing persistence, but they require you to write temporary files to mounted volumes (like emptyDir). If your API writes to /tmp, mount an emptyDir there.
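
A sketch of that adjustment for the api container (the /tmp path is an assumption about where your app writes temporary files):

yaml
        volumeMounts:
        - name: shared-logs
          mountPath: /var/log/app
        - name: tmp
          mountPath: /tmp
      volumes:
      - name: shared-logs
        emptyDir: {}
      - name: tmp
        emptyDir: {}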

Network policies (where supported)

A NetworkPolicy restricts pod-to-pod traffic. In many organizations, the desired baseline is “deny by default and allow only what’s needed,” but implementing that requires planning because it can break service discovery if applied indiscriminately.

AKS network policy support depends on your networking mode and configuration. If your cluster supports it, you can start by limiting inbound traffic to only the Ingress controller namespace and allow egress only to required destinations.

The key operational point is that multi-container Pods do not change network policy behavior; policies are applied at the pod level. So if you allow traffic to the API pod, the sidecar shares that access.
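
If your cluster has network policy enabled, a starting point that allows only the Ingress controller namespace to reach the API pods might look like the following sketch (it relies on the automatic kubernetes.io/metadata.name namespace label):

yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: multi-api-allow-ingress
spec:
  podSelector:
    matchLabels:
      app: multi-api
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx
    ports:
    - protocol: TCP
      port: 8080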

Observability: logs, metrics, and basic tracing patterns

Once the app is reachable, the most valuable improvement you can make is visibility into what it’s doing. On AKS, many teams use Azure Monitor (Container insights) for cluster and workload telemetry. Regardless of tooling, your Kubernetes resources should be friendly to observability.

Structured logs and sidecar log handling

If you use a sidecar to ship logs, ensure the contract is clear:

  • Where logs are written (for example, /var/log/app/app.log)
  • Rotation strategy (avoid unbounded file growth)
  • What happens if the shipper is down (buffering vs dropping)

In a multi-container Pod, an emptyDir volume is frequently used as a shared buffer. This is not durable storage; it disappears when the pod is rescheduled. That’s acceptable for log shipping as long as you accept that some logs may be lost during restarts. If you need durable delivery guarantees, ship logs over stdout/stderr to node agents or use a queue.

Metrics that align with autoscaling

HPA based on CPU is a reasonable starting point, but many APIs scale better on request rate or latency. If you later adopt custom metrics, keep readiness probes and resource requests tuned so that scaling changes correlate with real load.

A practical pattern is:

  • Use CPU-based HPA initially.
  • Ensure the API exports application metrics (Prometheus format or similar).
  • Introduce latency SLOs and custom metrics once you trust the baseline deployment.

Storage considerations for multi-container Pods

The example used emptyDir, which is fast and simple. In real applications, you may need persistent storage for uploads, caches, or state that must survive pod restarts.

On AKS, persistent storage is typically provided via Kubernetes PersistentVolumes backed by Azure Disk or Azure Files.

  • Azure Disk is generally used for single-writer workloads (ReadWriteOnce) such as databases (though running databases on Kubernetes has additional considerations).
  • Azure Files supports ReadWriteMany and is common for shared file shares.

If your multi-container Pod needs persistent shared storage between its containers, a persistent volume can replace emptyDir. Be mindful that shared storage can become a bottleneck and may introduce latency.

Manage configuration changes safely

ConfigMaps and Secrets can be updated without redeploying images, but pods don’t always pick up changes automatically depending on how they consume them.

  • If you mount a ConfigMap as a volume, Kubernetes updates the mounted files (with some delay).
  • If you use environment variables via envFrom, you typically need to restart pods to see the new values.

A common operational pattern is to trigger a rolling restart when config changes:

bash
kubectl rollout restart deployment/multi-api
kubectl rollout status deployment/multi-api

For controlled environments, config changes should be versioned and promoted similarly to application code. If your organization uses GitOps, treat config as code and let the controller manage restarts.

Day-2 operations on AKS that affect deployments

Your manifests may be correct, but AKS is a managed service with lifecycle events: node image upgrades, Kubernetes version upgrades, and scaling events. The way your workload behaves during those events depends on your Kubernetes primitives.

PodDisruptionBudgets (PDBs) for availability during maintenance

A PodDisruptionBudget limits voluntary disruptions (for example, node drain during upgrade) so too many replicas are not evicted at once.

Create pdb.yaml:

yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: multi-api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: multi-api

Apply it:

bash
kubectl apply -f pdb.yaml

This works with your rolling update strategy to maintain capacity during upgrades.

Spread replicas across nodes and zones

If your region and cluster support availability zones, consider enabling zonal node pools so replicas can survive a zone outage. Kubernetes can also spread pods across nodes using topology spread constraints.

Even without zones, spreading replicas reduces the chance that a single node failure takes out all replicas.

A topology spread constraint example (add under spec.template.spec):

yaml
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app: multi-api

This encourages even distribution across nodes.

Separate node pools for specialized workloads

As your application grows, you may add workloads with different needs: CPU-heavy jobs, memory-heavy caches, or GPU inference. AKS node pools let you isolate these workloads and apply taints/tolerations.

Even for a simple API, separating system and user node pools is a baseline. From there, you can add pools for ingress, batch, or sensitive workloads.
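
For example, a batch-oriented pool can be created with a taint so only workloads that explicitly tolerate it are scheduled there (the pool name and taint key are assumptions):

bash
az aks nodepool add \
  --resource-group "$RG" \
  --cluster-name "$AKS_NAME" \
  --name "batchpool1" \
  --node-count 2 \
  --mode User \
  --node-taints "workload=batch:NoSchedule"

Pods intended for that pool then add a matching toleration (key workload, value batch, effect NoSchedule) under spec.template.spec.tolerations.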

Validate the deployment end-to-end

With the Deployment, Service, and Ingress applied, validate from multiple angles.

Start with Kubernetes state:

bash
kubectl get deploy,rs,pods -l app=multi-api
kubectl get svc multi-api-svc
kubectl get ingress multi-api-ingress

Then validate service routing inside the cluster. Use a temporary pod to curl the service:

bash
kubectl run tmp --image=curlimages/curl:8.5.0 -it --rm --restart=Never -- \
  curl -sS http://multi-api-svc/health/ready

Finally validate externally through DNS and Ingress:

bash
curl -i https://api.example.com/health/ready

If your API returns redirects or requires headers, configure the Ingress accordingly. Many production APIs expect X-Forwarded-Proto and X-Forwarded-For headers, which most ingress controllers provide by default.

Putting it together: three deployment patterns you can reuse

At this stage, you have a working multi-container application deployment on AKS. To make it reusable across environments and teams, it helps to recognize the patterns embedded in the steps.

Pattern 1: Team-shared cluster with quotas and safe defaults

If you run a shared AKS cluster, start your application onboarding with namespace creation, quotas, and default resource requests/limits. Teams can still override requests/limits for specific services, but the baseline prevents uncontrolled consumption.

The multi-container Pod pattern works well here when the sidecar is part of the application contract (for example, an internal proxy that enforces mTLS to downstream services). It keeps the complexity within the team’s namespace rather than in cluster-wide agents.

Pattern 2: Private networking and controlled egress

When the cluster is private and egress is restricted, you will discover dependencies you didn’t know you had: OS package repos during builds, public container registries, external identity endpoints, and telemetry exporters. The safe approach is to:

  • keep runtime image pulls restricted to ACR
  • use private endpoints where possible
  • minimize outbound dependencies in the application image

In that model, the steps in this guide—ACR integration, explicit image references, and deliberate ingress design—become essential rather than optional.

Pattern 3: Bursty workloads with HPA + cluster autoscaler

For unpredictable traffic, the combination of HPA and node pool autoscaling is the difference between “Kubernetes is stable” and “Kubernetes is constantly under pressure.” Your readiness probes, requests/limits, and rollout strategy all feed into this.

A practical rollout for such workloads is to:

  • ship a new version with a conservative surge
  • confirm readiness behavior under load
  • then tune HPA targets and maxReplicas

Multi-container Pods add a capacity consideration: you scale the unit of the pod, so the sidecar’s resource overhead is multiplied by replicas. Make sure sidecar requests are right-sized and that it doesn’t become the hidden driver of scaling cost.

Optional: package the deployment with Helm for environment reuse

As you move from “one-off apply” to repeatable deployments, you’ll likely want templating. Helm is a common choice because it can parameterize:

  • image tags
  • replica counts
  • hostnames and TLS secret names
  • resource requests/limits

Even if your organization prefers Kustomize or GitOps tools, the underlying manifest structure you built remains valid.

A minimal Helm approach is to template the Deployment and Ingress hostnames while keeping ConfigMaps and Secrets managed externally. This reduces the chance of templating secrets into release artifacts.
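
A minimal sketch of that parameterization; the chart layout and value names here are assumptions, not a prescribed structure:

yaml
# values.yaml
image:
  repository: acrmultiprod12345.azurecr.io/multi/api
  tag: "1.0.0"
replicaCount: 3
ingress:
  host: api.example.com
  tlsSecretName: multi-api-tls

In the Deployment template, the image line becomes image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}" and the replica count becomes replicas: {{ .Values.replicaCount }}, so each environment supplies its own values file.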

Optional: integrate with CI/CD without weakening security

A common operational goal is: build images, push to ACR, update Kubernetes manifests, and deploy to AKS automatically. The secure version of that pipeline typically includes:

  • Azure AD-based authentication to ACR (no admin user)
  • least-privilege permissions for the deployment identity
  • artifact-based promotion (promote the same image from dev to prod)

For AKS access, avoid using personal credentials in CI. Use a workload identity or service principal with scoped permissions, and limit it to the namespaces and actions required.
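
As one example of least privilege on the registry side, the CI identity can be granted only AcrPush on the specific registry (the assignee object ID is a placeholder):

bash
ACR_ID=$(az acr show --name "$ACR_NAME" --resource-group "$RG" --query id -o tsv)

az role assignment create \
  --assignee "<CI_IDENTITY_OBJECT_ID>" \
  --role AcrPush \
  --scope "$ACR_ID"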

If you adopt GitOps, the CD system runs in-cluster with access to the Kubernetes API and pulls manifests from a repo. In that model, your “pipeline” builds and pushes images, then updates Git; the cluster reconciler applies changes.

Clean up (if you used a temporary environment)

If you created these resources for a lab environment and want to remove them, deleting the resource group will remove AKS, ACR, and associated resources.

bash
az group delete --name "$RG" --yes --no-wait

For production environments, keep the resource group and manage lifecycle with infrastructure-as-code so upgrades and changes are controlled and auditable.