Why Kubernetes matters in federal
Federal workloads trend toward Kubernetes for three reasons. First, portability: the same container runs in AWS GovCloud, Azure Government, on-premise OpenShift, and a disconnected classified enclave with minimal changes. For an agency that may move workloads across those boundaries over a 5-year mission lifecycle, that portability matters more than any single optimization. Second, a consistent authorization story: Kubernetes platforms provide well-defined control inheritance (pod security, network policy, admission control, audit logging) that maps cleanly to NIST 800-53. Third, ecosystem maturity: the DoD has invested heavily in Platform One and Iron Bank, creating a reference pattern that many agencies now adopt.
The federal Kubernetes stack
- Managed control planes: EKS (AWS GovCloud, FedRAMP High, IL5), AKS (Azure Government, FedRAMP High, IL5), GKE (FedRAMP High). Managed services shift a substantial share of the cluster-level control burden (control plane hosts, etcd, physical infrastructure) to the CSP.
- OpenShift: Red Hat OpenShift Dedicated (FedRAMP High managed), OpenShift Container Platform (self-managed on-premise), ROSA (OpenShift on AWS GovCloud). Strong fit when a federal agency requires commercial support plus opinionated defaults.
- Vanilla Kubernetes: kubeadm, Rancher RKE2, K3s for edge. Maximum flexibility, maximum operational burden.
- DoD Platform One: Iron Bank hardened images, Big Bang Helm chart for full hardened platform, Party Bus as a DevSecOps-as-a-service tier.
Hardening for federal compliance
A default Kubernetes install is not federally authorizable. Hardening steps we apply on every federal cluster:
- etcd encryption at rest with KMS envelope encryption (AWS KMS, Azure Key Vault, or HSM-backed on-premise).
- Pod Security Standards set to Restricted for workload namespaces. Baseline only where legacy workloads require it, with documented exception and remediation plan.
- Network policies default-deny at the namespace level. Cilium or Calico for policy enforcement. Istio or Linkerd for mTLS between services.
- Admission control via OPA Gatekeeper or Kyverno. Policies block privileged containers, host path mounts, root containers, and images pulled from outside the approved registry list.
- Image supply chain anchored in a signed image registry (Quay, Harbor, Artifactory, or DoD Iron Bank). Cosign signature verification at admission time. SBOM generation on every build.
- Runtime security with Falco or Sysdig detecting syscall-level anomalies.
- Audit logging at the control plane set to full verbosity, shipped to the agency SIEM (Splunk, Elastic, Chronicle).
- RBAC principle of least privilege, tied to agency identity provider via OIDC.
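The etcd encryption step above is CSP-managed on EKS/AKS/GKE, but on a self-managed cluster it is configured through the API server's `--encryption-provider-config` flag. A hedged sketch of what that file can look like with a KMS v2 envelope-encryption provider (the plugin socket path and provider name are illustrative, not a specific product's values):

```yaml
# /etc/kubernetes/encryption-config.yaml (self-managed clusters only)
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      # kms v2 performs envelope encryption: the data encryption key is
      # wrapped by an external KMS reached via a local gRPC plugin socket.
      - kms:
          apiVersion: v2
          name: agency-kms                          # illustrative name
          endpoint: unix:///var/run/kmsplugin/socket.sock  # illustrative path
          timeout: 3s
      # identity must stay last so data written before rotation stays readable.
      - identity: {}
```

After changing providers, existing secrets must be rewritten (for example with `kubectl get secrets -A -o json | kubectl replace -f -`) so they are re-encrypted under the new key.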
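The Pod Security Standards and default-deny network policy items above compose naturally at the namespace level. A minimal sketch, assuming a hypothetical workload namespace named `claims-api`:

```yaml
# Namespace pinned to the Restricted Pod Security Standard via admission labels.
apiVersion: v1
kind: Namespace
metadata:
  name: claims-api             # illustrative name
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
# Default-deny for both directions; workloads then get explicit allow policies.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: claims-api
spec:
  podSelector: {}                     # selects every pod in the namespace
  policyTypes: [Ingress, Egress]      # no rules listed, so both are denied
```

Per-workload allow policies (DNS egress, ingress from the mesh gateway, and so on) are layered on top of this baseline.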
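As one example of the admission-control item, a simplified Kyverno policy that rejects privileged containers (a trimmed-down sketch covering only `containers`, not init or ephemeral containers):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged
spec:
  validationFailureAction: Enforce   # reject, rather than just report
  rules:
    - name: deny-privileged-containers
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "Privileged containers are prohibited."
        pattern:
          spec:
            containers:
              # =() makes the field optional; if securityContext.privileged
              # is present, it must be false.
              - =(securityContext):
                  =(privileged): "false"
```

Equivalent Gatekeeper constraints exist; the choice usually follows whichever engine the agency's platform team already operates.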
Kubernetes STIG implementation
DISA publishes a Kubernetes STIG that codifies hardening requirements for DoD networks. We implement STIG controls via Helm values, GitOps-managed policy, and automated compliance scanning. Every finding in the STIG checklist gets a documented implementation or an approved exception with compensating controls. Platform One's Big Bang ships with STIG-compliant defaults, and we use it as the starting point for IL4/IL5 work.
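Many STIG findings reduce to API server flags, which on a self-managed cluster can be pinned declaratively through kubeadm configuration held in Git. A hedged sketch (the flags shown, such as disabling anonymous auth and enabling control-plane audit logging, correspond to common STIG requirements, but the values are illustrative, not the STIG text):

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    anonymous-auth: "false"                          # no unauthenticated API access
    audit-log-path: /var/log/kubernetes/audit.log    # control-plane audit trail
    audit-log-maxage: "30"                           # retention in days (example)
    audit-policy-file: /etc/kubernetes/audit-policy.yaml  # illustrative path
```

On managed control planes the equivalent settings are expressed through CSP configuration rather than kubeadm, and the STIG checklist documents that inheritance.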
GitOps for federal delivery
GitOps (Argo CD or Flux) is our default delivery pattern for federal Kubernetes. The cluster state is declared in Git, reconciled continuously, auditable to the commit. This gives us: full rollback to any point in history, no out-of-band cluster changes, clean separation between developer intent and platform enforcement, and a clear artifact trail for authorization. For disconnected environments we mirror to an internal Git server and push signed manifests via an approved transfer process.
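In Argo CD terms, the pattern above is an `Application` whose automated sync policy enforces Git as the single source of truth. A minimal sketch with illustrative names and an illustrative internal repository URL:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: platform-baseline            # illustrative name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.internal.example.mil/platform/baseline.git  # illustrative mirror URL
    targetRevision: main
    path: clusters/prod
  destination:
    server: https://kubernetes.default.svc
    namespace: platform
  syncPolicy:
    automated:
      prune: true        # delete resources that were removed from Git
      selfHeal: true     # revert out-of-band changes made directly to the cluster
```

`prune` and `selfHeal` together are what make "no out-of-band cluster changes" an enforced property rather than a policy statement.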
Serving AI and ML workloads
Kubernetes is increasingly the substrate for federal ML training and inference. We integrate: NVIDIA GPU Operator for managed driver and device plugin installation, Kueue for gang-scheduled training jobs, KServe or Seldon Core for model serving, Ray for distributed training, and Volcano for batch scheduling on HPC-style workloads. See federal MLOps for how this ties into our broader ML platform approach.
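For the model-serving piece, a hedged sketch of a KServe `InferenceService` that requests a GPU (model name, storage URI, and bucket are illustrative; the GPU resource is satisfied by nodes the NVIDIA GPU Operator has prepared):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: claims-classifier                 # illustrative name
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn                     # example format; could be onnx, pytorch, etc.
      storageUri: s3://models-bucket/claims-classifier/   # illustrative URI
      resources:
        limits:
          nvidia.com/gpu: "1"             # schedules onto a GPU node
```

In a disconnected enclave the `storageUri` points at in-enclave object storage rather than commercial S3.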
Air-gapped and classified deployments
Classified environments do not get to call out to the internet. Every federal Kubernetes distribution we work with supports disconnected installs: OpenShift via mirror registry, Rancher RKE2 with bundled binaries, K3s with local container runtime. Image mirroring runs through Harbor or Quay deployed inside the enclave. GitOps tooling reconciles from an internal git mirror. Helm charts are pre-packaged and signed.
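One concrete piece of the mirroring setup is telling the container runtime to resolve upstream registry references against the in-enclave mirror. A sketch for containerd, assuming its registry `config_path` feature is enabled and with illustrative hostnames:

```toml
# /etc/containerd/certs.d/registry1.dso.mil/hosts.toml
# Pulls referencing the upstream registry are redirected to the in-enclave
# Harbor mirror; the upstream server line is kept for reference resolution.
server = "https://registry1.dso.mil"

[host."https://harbor.enclave.example.mil"]
  capabilities = ["pull", "resolve"]
```

This keeps image references in manifests identical across connected and disconnected environments, which simplifies promotion between them.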
Multi-cluster and federation
Large federal programs often span multiple clusters (production, staging, regional DR, classified/unclassified splits). Patterns we use: Cluster API for declarative cluster lifecycle, Argo CD ApplicationSets for fleet-wide deployment, service mesh federation for cross-cluster connectivity where network policy permits, and Open Policy Agent policies published centrally.
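The ApplicationSet pattern mentioned above can be sketched with the cluster generator, which stamps out one Argo CD `Application` per registered cluster (repository URL and paths illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: fleet-baseline
  namespace: argocd
spec:
  generators:
    - clusters: {}            # one Application per cluster registered in Argo CD
  template:
    metadata:
      name: '{{name}}-baseline'          # templated on the cluster name
    spec:
      project: default
      source:
        repoURL: https://git.internal.example.mil/platform/baseline.git  # illustrative
        targetRevision: main
        path: 'clusters/{{name}}'        # per-cluster overlay directory
      destination:
        server: '{{server}}'             # templated on the cluster API endpoint
        namespace: platform
```

Per-cluster differences (region, classification level, DR role) live in the per-cluster overlay directories rather than in the ApplicationSet itself.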
Who we build Kubernetes for
- DoD — IL4/IL5 workloads on Platform One and AWS GovCloud.
- HHS — FedRAMP High clusters for health data workloads.
- VA — multi-tenant Kubernetes serving claims and health applications.
- GSA — cloud.gov Pages and FedRAMP-aligned PaaS workloads.
- NASA — science computing platforms with GPU workloads.