Beyond the Default Image: Hardening Keycloak for Enterprise Production

Keymate Team
April 2026
TL;DR

  • The default Keycloak image is a baseline, not a hardened production runtime. Keycloak's own production guides treat it that way too — TLS, hostname, --optimized, and JVM tuning are positioned as deployment-time concerns. The starting baseline ships a general-purpose UBI base, root group membership, no JVM security tuning, and dozens of OS-level CVEs at any given moment.
  • Wolfi OS by Chainguard cuts the OS CVE count to zero. Daily source rebuilds, Sigstore signing, glibc compatibility, and roughly 65 packages in the final runtime.
  • Application-layer CVEs need a separate strategy. A Maven dependencyManagement override patches transitive JARs (Vert.x, Netty, Jackson, JDBC) at build time, and Copa retrofits OS packages post-build when a fresh CVE lands.
  • The runtime is Kubernetes-native. Non-root UID 2000, restricted Pod Security Standard, JGroups DNS clustering, SmallRye health endpoints, and a Quarkus-optimized startup under 10 seconds.
  • The point is operational risk, not CVE math alone. A pre-hardened image collapses patch toil, shortens incident exposure windows, and gives platform teams a predictable supply chain — essential for regulated and air-gapped environments where ad-hoc runtime patching is off the table.

The Problem: Defaults Are Not Production

To be clear up front: the upstream Keycloak container is a competent, well-maintained starting point, and Keycloak's own production deployment documentation treats it as one — --optimized, hostname, TLS, reverse proxy, and JVM tuning are walked through as deployment-time concerns rather than baked into the image. We are not arguing that the upstream image is broken. We are arguing that the gap between that baseline and a regulated production posture is real, specific, and worth closing in one place.

The same reasoning applies to the FIPS-validated variants. FIPS 140 certifies cryptographic modules; it does not certify container attack surface, package count, or runtime user model. A FIPS-compliant Keycloak image still rides the same general-purpose base layer with the same baseline package surface as the standard one. Hardening sits at a different layer of the stack, and that is the layer this post is about.

Keycloak handles authentication tokens, session data, and user credentials — the most sensitive assets in any system. IAM sits at the trust root of every service that delegates authentication to it, which means a compromised Keycloak is never a single-system incident — it hands an attacker the keys to every downstream system that relies on its verdicts.

The default container image was not designed for that threat model. It ships with:

  1. A general-purpose UBI base carrying hundreds of packages and the CVE exposure that comes with them.
  2. Root group membership (GID 0) for the container process, which collapses defense-in-depth the moment a runtime escape happens.
  3. No JVM security hardening. Default entropy source, no OOM behavior tuning, no GC tuning, no flight-recording integration.
  4. Build-time artifacts left in the runtime image — JDK bits, Maven traces, source distribution, anything that helps an attacker reconnoiter and pivot.

The upstream Keycloak image is itself multi-stage built on UBI. None of these issues come from the Keycloak distribution. They come from the general-purpose base layer underneath it. Hardening means swapping that layer out and applying a systematic set of controls on top.

Every package, every permission, and every open port that exists without a clear operational purpose is an attack surface waiting to be exploited.

[Figure: the hardened runtime stage layers the Keycloak distribution onto Wolfi OS, applies CVE patching, creates a non-root user, and produces a Quarkus-optimized image with zero build tools.]

Beyond CVE Count: The Operational Case

A hardened image looks like a security artifact, but its largest impact is operational. The CVE count on a scanner dashboard is a downstream signal; what platform and IAM teams actually feel is the cadence of unplanned work that surrounds it.

On the default image, a critical CVE in the base layer or a transitive Java dependency creates a familiar pattern: a Friday-evening scanner alert, an ad-hoc rebuild, a coordination thread with security, an unplanned change window. Multiply that by the number of times it happens in a year and the cost is on-call attrition, not just engineering hours. A pre-hardened image with a daily-rebuilt base, a versioned Maven override flow, and Copa-style post-build patching folds most of that work into the normal pipeline. The incident exposure window shrinks from weeks to hours, and patching becomes a scheduled event rather than a fire drill.

The compliance side benefits from the same property. SBOM, signed base, OpenVEX-documented exceptions, and a reproducible build digest mean auditor questions — which CVEs are open?, how did you assess this finding?, show me the supply chain — have machine-readable answers that already live alongside the image. The artifact arrives at the audit with its own evidence package.

Air-gapped operations are where these properties stop being convenient and start being mandatory. With no outbound path to the internet, every runtime apk add, every transitive Maven fetch, every fallback to an upstream package mirror is an outage waiting to happen. A self-contained, fully patched image with a complete SBOM and a deterministic update cadence is the only model that survives — and the one that lets a platform team plan around a published image rather than around the day a CVE happens to drop.

Regulatory Context: Why Hardened Images Matter Now

The market for secure, minimal container images did not appear by accident. It is the operational answer to a regulatory wave that has redrawn what "shipping software" means in the last five years.

European Union

The Cyber Resilience Act (Regulation (EU) 2024/2847) entered into force in December 2024, with full applicability in December 2027. Its scope is broad — any "product with digital elements" placed on the EU market — and its essential requirements (Annex I) map directly onto how a container image is built and shipped:

  • No known exploitable vulnerabilities at time of release: addressed by Tier 1 (Wolfi rebuilds), Tier 2 (Maven dependencyManagement overrides for Java CVEs), and Tier 2b (Copa post-build patching).
  • Secure by default configuration: non-root UID 2000, restricted Pod Security Standard, HTTPS-only listener, plaintext HTTP only by explicit opt-in.
  • Limited attack surface: roughly 65 packages, no package manager retained at runtime, no build tools in the runtime image.
  • Vulnerability handling and SBOM (Annex I, Part II): Trivy + Syft SBOM generation, OpenVEX for documented exceptions, Sigstore-signed base.

NIS2 (Directive (EU) 2022/2555) reinforces the same posture for "essential" and "important" entities by making the operator accountable for supply-chain risk, not just the vendor.

United States

The U.S. trajectory is policy-led rather than regulation-led, but the destination is similar:

  • Executive Order 14028 (May 2021) launched the federal supply-chain push and seeded everything that followed.
  • NIST SP 800-218 (SSDF) codifies the secure software development practices federal procurement now expects.
  • The current federal procurement model is risk-based. EO 14306 (June 2025) and OMB M-26-05 (January 2026) frame the active federal posture: each agency validates vendor software security through a comprehensive risk assessment tailored to its mission, with NIST SSDF and SBOM serving as the technical baseline through sector procurement reviews.
  • Sector-specific regimes — FedRAMP Rev 5, HIPAA Security Rule, PCI-DSS v4, NYDFS Part 500 — have all converged on the same primitives: SBOM, attack-surface minimization, vulnerability disclosure cadence, secure default configuration.

Sector-Specific Mandates That Force Isolation

For some operators the requirement is not "secure by default" but "demonstrably isolated." A few of the frameworks our customers run under spell this out explicitly:

  • NIS2 essential entities (Annex I — energy, transport, drinking water, digital infrastructure, public administration) treat network segmentation as a baseline risk-management control. In practice many essential-entity operators implement that segmentation as air-gapped or one-way-gateway deployments for identity-critical systems.
  • IEC 62443 (industrial automation and control systems) defines a zone-and-conduit model in which higher security levels effectively dictate strict isolation between operational technology and corporate networks.
  • SWIFT Customer Security Programme (CSP) mandates segregation of SWIFT-related infrastructure from the general enterprise network; many financial institutions implement this as a fully isolated zone with dedicated identity infrastructure inside it.

In each of these environments an IAM image must be self-contained by construction — no runtime package fetches, no external CDN dependencies, no upstream-managed update channel. The hardening properties described later in this post are not aesthetic improvements there; they are entry requirements.

The shape of the requirements is consistent across both jurisdictions and these sector mandates: prove what is in the image, prove it is patched, prove it is configured securely, and prove it stays that way over time. That is the gap a hardened image is built to close, and it is why this category exists at all.

Why We Built Our Own

There is a healthy market for hardened Keycloak images today. Chainguard ships one, Bitnami has secure variants, and Red Hat's build of Keycloak (RHBK) is maintained by Red Hat, a major contributor to upstream Keycloak. These are real products built by competent teams, and we recommend evaluating them on their merits.

We chose to build our own anyway, for one reason: the layer of the stack we are unwilling to outsource is the one we know best. We operate Keycloak under regulated workloads every day. We patch transitive Java dependencies before upstream Keycloak releases catch up — the Tier 2 Maven dependencyManagement flow described later in this post is something an off-the-shelf image cannot offer, because it requires building Keycloak from source on our schedule, not the upstream's. For a system that holds session tokens and credentials, that level of application-layer ownership is the difference between a product and a platform.

This post is not an argument against the existing market. It is an explanation of why, for an identity team that lives inside Keycloak, building the runtime ourselves is the right call.

Choosing the Base: Why Wolfi OS

The base image sets the ceiling on your container's security posture. Most Keycloak deployments use Ubuntu, Debian, or Alpine. We chose Wolfi OS by Chainguard.

| Criteria | Ubuntu/Debian | Alpine | Wolfi OS |
| --- | --- | --- | --- |
| Package count | 400+ | 50–100 | ~65 |
| Known CVEs (typical) | 20–50+ | 5–15 | 0 |
| Supply chain signing | Partial | Partial | Full (Sigstore) |
| glibc compatibility | Yes | No (musl) | Yes |
| Image size | 200–400 MB | 50–100 MB | ~245 MB |
| Update cadence | Monthly | Weekly | Daily |

Why Not Alpine?

Alpine uses musl libc instead of glibc. The size win is real, but musl introduces subtle compatibility issues with Java around DNS resolution, thread handling, and locale support. Keycloak runs on Quarkus and Vert.x, both of which depend on glibc-specific behavior that musl does not fully replicate. We have hit this in practice; we do not chase it any longer.

Why Wolfi?

Wolfi is purpose-built for containers by Chainguard, whose founders co-created Sigstore. It gives us:

  • Zero known unpatched CVEs at the OS layer at any given moment. Packages are rebuilt from source with patches applied proactively. Newly disclosed CVEs land in Wolfi within hours, not weeks.
  • Full Sigstore supply chain verification. Every package is cryptographically signed and verifiable.
  • glibc compatibility. No musl-related Java quirks.
  • Minimal package surface. Around 65 packages in our final runtime, verifiable with apk info | wc -l.

Wolfi is not a general-purpose Linux distribution. Once apk is stripped from the final runtime stage, the image carries no package manager at all. That is a feature, not a limitation: it prevents unauthorized package installation at runtime.

# Runtime stage — Wolfi OS by Chainguard
FROM cgr.dev/chainguard/wolfi-base:latest
# `:latest` is intentional here, not an anti-pattern.
# Wolfi rebuilds packages from source daily and applies CVE patches within hours
# of disclosure. Reproducibility is preserved at the CI/CD layer via immutable
# calendar-versioned tags (YYYY.M.D). For compliance-mandated environments
# (SOC 2, ISO 27001, PCI-DSS), pin the base image by SHA-256 digest instead.

RUN apk update && apk upgrade && \
    apk add --no-cache \
      openjdk-21-jre \
      bash \
      ca-certificates-bundle && \
    rm -rf /var/cache/apk/*

Three explicitly installed packages. Together with wolfi-base and their transitive dependencies, they make up the entire runtime dependency surface:

  • openjdk-21-jre — Java runtime. Not the full JDK. No compiler, no debugging tools.
  • bash — required by Keycloak startup scripts.
  • ca-certificates-bundle — TLS certificate verification.

curl is used during the build stage only; it is not retained in the runtime image.
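
One way to keep that surface from drifting is to gate the pipeline on the package list itself. The sketch below is a hypothetical helper (audit_packages is an illustrative name, not part of apk or Wolfi). It assumes package names arrive one per line on stdin, for example from apk info in a stage where apk is still available, and it fails when the count exceeds a budget or a build-only tool shows up.

```shell
#!/bin/sh
# audit_packages BUDGET [FORBIDDEN...]
# Reads one package name per line on stdin. Fails when the number of
# packages exceeds BUDGET or when any FORBIDDEN name is present.
# Hypothetical helper for illustration; not part of apk or Wolfi.
audit_packages() {
    budget=$1
    shift
    pkgs=$(cat)                                           # slurp the package list
    count=$(printf '%s\n' "$pkgs" | grep -c . || true)    # count non-empty lines
    if [ "$count" -gt "$budget" ]; then
        echo "package budget exceeded: $count > $budget" >&2
        return 1
    fi
    for name in "$@"; do
        # -x matches whole lines only, -F treats the name literally
        if printf '%s\n' "$pkgs" | grep -qxF "$name"; then
            echo "forbidden package present: $name" >&2
            return 1
        fi
    done
    echo "ok: $count packages"
}

# Usage (in a build stage that still has apk):
#   apk info | audit_packages 80 curl maven
```

The budget of 80 in the usage line is an arbitrary illustration; pick a number slightly above your measured baseline so that growth is noticed rather than silently accepted.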

About latest and Reproducibility

Pinning to a snapshot like wolfi-base:20260115 would freeze your base at that date and silently accumulate unpatched vulnerabilities until you bump the tag. For a security-critical workload like Keycloak — where the entire purpose of the image is to minimize attack surface — the cost of a stale base far exceeds the cost of non-deterministic builds.

Reproducibility lives one layer up:

  • Build reproducibility. Each pipeline run produces an immutable image tag like 2026.4.7 that points to a specific image digest.
  • Audit trail. The built image manifest pins the exact Wolfi base digest used at build time. docker inspect or crane manifest retrieves it.
  • Rollback. If a Wolfi update introduces a regression, deploy the previous immutable pipeline tag while you investigate. The latest tag only affects new builds, not running containers.
  • Compliance. For SOC 2, ISO 27001, or PCI-DSS, replace latest with a SHA-256 digest: cgr.dev/chainguard/wolfi-base@sha256:abc123... This buys byte-level reproducibility while still allowing controlled updates.
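
As a minimal sketch of the digest-pinning step, the helper below builds a pinned image reference and refuses anything that does not look like a full SHA-256 digest. The function names are hypothetical; the digest itself would come from a registry tool such as crane digest.

```shell
#!/bin/sh
# is_sha256_digest DIGEST -> exit 0 when DIGEST looks like sha256:<64 hex chars>
is_sha256_digest() {
    printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

# pin_base_image REPO DIGEST -> prints "REPO@DIGEST", or fails on a malformed digest
# Hypothetical helper for illustration only.
pin_base_image() {
    repo=$1
    digest=$2
    if ! is_sha256_digest "$digest"; then
        echo "refusing to pin: malformed digest '$digest'" >&2
        return 1
    fi
    printf '%s@%s\n' "$repo" "$digest"
}

# Usage:
#   pin_base_image cgr.dev/chainguard/wolfi-base \
#       "$(crane digest cgr.dev/chainguard/wolfi-base:latest)"
```

Rejecting truncated digests up front matters because a shortened value like the abc123... placeholder above is not a valid reference and would fail much later, at pull time.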

The stronger long-term default is digest pinning combined with an automated update bot — Renovate, Dependabot, or an internal equivalent. The Dockerfile references a specific digest, so every build is byte-for-byte reproducible. The bot opens a pull request whenever Chainguard publishes a new digest, which triggers the normal CI pipeline (Trivy scan, tests, manual review) before merge.

{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": ["config:recommended"],
  "pinDigests": true,
  "docker": { "enabled": true },
  "schedule": ["before 6am on monday"],
  "packageRules": [
    {
      "matchPackageNames": ["cgr.dev/chainguard/wolfi-base"],
      "groupName": "wolfi base image",
      "automerge": false
    }
  ]
}

With this pattern in place, :latest no longer appears anywhere in the image graph. Reproducibility comes from the digest, freshness comes from the bot, governance comes from the pull request review.

Non-Root Execution and Least Privilege

Running containers as root is one of the most common — and most dangerous — misconfigurations. If an attacker escapes the container process, root execution hands them host-level privileges.

The hardened image creates a dedicated keycloak user with explicit UID/GID and tightens file permissions:

ARG USER=keycloak
ARG UID=2000
ARG GID=2000

RUN addgroup -g ${GID} ${USER} && \
    adduser -u ${UID} -G ${USER} -s /bin/bash -D ${USER}

COPY --from=builder --chown=${USER}:${USER} /tmp/keycloak-dist/ ${KEYCLOAK_HOME}/

# Principle of least privilege
RUN find ${KEYCLOAK_HOME} -type d -exec chmod 755 {} + && \
    find ${KEYCLOAK_HOME} -type f -exec chmod 644 {} + && \
    chmod 755 ${KEYCLOAK_HOME}/bin/*.sh

USER ${USER}

UID 2000 (rather than 1000) avoids collisions with default user accounts on host systems. That matters in Kubernetes environments where runAsNonRoot and runAsUser security contexts enforce non-root execution from the orchestrator side as well.
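
The permission scheme can be checked mechanically after the chmod step. The following is a sketch under the assumption that nothing under the Keycloak home should be group- or world-writable; the function names are hypothetical.

```shell
#!/bin/sh
# find_writable_by_others DIR
# Prints every non-symlink path under DIR that is group- or world-writable.
# Symlinks are skipped because their own mode bits are always 0777 on Linux.
find_writable_by_others() {
    find "$1" ! -type l \( -perm -020 -o -perm -002 \) -print
}

# assert_least_privilege DIR -> fails if anything under DIR is group/world-writable
assert_least_privilege() {
    offenders=$(find_writable_by_others "$1")
    if [ -n "$offenders" ]; then
        echo "group/world-writable paths found:" >&2
        printf '%s\n' "$offenders" >&2
        return 1
    fi
}

# Usage (for example as a final RUN step or a CI check):
#   assert_least_privilege "${KEYCLOAK_HOME}"
```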

This image is compatible with:

  • Kubernetes Pod Security Standards (restricted profile). The pod manifest must set runAsNonRoot: true, allowPrivilegeEscalation: false, seccompProfile.type: RuntimeDefault (or Localhost), and drop ALL capabilities.
  • CIS Docker Benchmark Section 4.1 (ensure a user has been created for the container).
  • NIST SP 800-190 container security recommendations.

JVM Security and Performance

The entrypoint script configures the JVM with security-first defaults. Each parameter has a specific defensive or operational purpose.

Memory Safety

-XX:+ExitOnOutOfMemoryError
-XX:MaxRAMPercentage=70
-XX:InitialRAMPercentage=50
-XX:MetaspaceSize=96M
-XX:MaxMetaspaceSize=256m
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/tmp/heap.hprof

| Parameter | Purpose |
| --- | --- |
| ExitOnOutOfMemoryError | Terminates the JVM on OOM instead of running in a degraded state. Kubernetes restarts the pod cleanly. A zombie Keycloak that appears alive but cannot allocate memory is worse than a clean restart. |
| MaxRAMPercentage=70 | Caps the JVM heap at 70% of the container memory limit. The remaining 30% absorbs native memory, thread stacks, and metaspace, keeping the OOM killer at bay. |
| MetaspaceSize / MaxMetaspaceSize | MetaspaceSize=96M sets the initial threshold at which metaspace growth triggers a GC; MaxMetaspaceSize=256m caps metaspace so that classloader leaks cannot consume unbounded memory. |
| HeapDumpOnOutOfMemoryError | Writes a .hprof file to /tmp before the JVM exits. With a persistent volume mounted at /tmp, the dump survives the pod restart and is available for post-mortem analysis. |

A flag-semantics note: despite its name, -XX:MinRAMPercentage does not enforce a minimum heap. The JVM uses it to compute the maximum heap in small containers. If you want a fixed heap, set MaxRAMPercentage and InitialRAMPercentage to the same value, or use -Xms/-Xmx explicitly.
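
To make the percentage flags concrete, here is the arithmetic as a small sketch. The helper names are hypothetical, and the JVM's own result can differ slightly because of alignment; treat the numbers as a planning estimate, not as what the JVM will report.

```shell
#!/bin/sh
# heap_budget_mib LIMIT_MIB PERCENT -> approximate max heap in MiB (rounded down)
heap_budget_mib() {
    echo $(( $1 * $2 / 100 ))
}

# non_heap_headroom_mib LIMIT_MIB PERCENT -> what remains for metaspace,
# thread stacks, and other native memory
non_heap_headroom_mib() {
    echo $(( $1 - $1 * $2 / 100 ))
}

# Usage:
#   heap_budget_mib 2048 70         # prints 1433
#   non_heap_headroom_mib 2048 70   # prints 615
#   heap_budget_mib 4096 70         # prints 2867
```

With a 2 GiB limit the headroom is about 615 MiB, of which MaxMetaspaceSize=256m can claim up to 256 MiB, so the non-heap budget is tighter on small pods than the 30% figure suggests.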

Cryptographic Entropy

-Djava.security.egd=file:/dev/urandom

On older kernels, /dev/random blocks when the entropy pool is exhausted. In containerized environments, especially freshly spawned pods, the pool is often insufficient at boot, and Keycloak can hang during startup while generating cryptographic keys for sessions and tokens.

/dev/urandom provides cryptographically secure random numbers without blocking, a position Linux kernel maintainers have long endorsed. Since Linux 5.6, /dev/random also stops blocking once the kernel's random number generator has been initialized, so the two devices behave the same for practical cryptographic purposes. The blocking behavior of /dev/random provides no additional security benefit on modern kernels.

Garbage Collection and Profiling

-XX:+UseZGC
-XX:+ZGenerational
-XX:StartFlightRecording=filename=/var/log/jfr/recording.jfr,settings=profile,maxsize=512m,maxage=1h,dumponexit=true,disk=true

Generational ZGC, available since Java 21, is well suited to latency-sensitive workloads like an IAM system. G1GC targets 200 ms pauses by default; Generational ZGC targets sub-millisecond pauses, which keeps GC out of the authentication latency budget.

Java Flight Recorder is configured for production:

  • Absolute path writes to a persistent volume rather than ephemeral container storage, so a sidecar collector can pick it up.
  • Rolling buffer (maxsize=512m, maxage=1h, disk=true) prevents the JFR file from growing indefinitely.
  • Dump on exit ensures diagnostic data is flushed to disk even on unexpected JVM termination.

JFR is disabled by default and gated behind ENABLE_JFR=true at deployment time. Keep it off in steady-state production and turn it on for targeted diagnostic windows. The default JFR stack depth of 64 frames is sufficient for most scenarios; raising it adds linear CPU and memory overhead per event.
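
The ENABLE_JFR gate can be sketched as a small entrypoint helper. The function name jfr_opts is hypothetical and the real entrypoint may be structured differently; the flag value is the one shown above.

```shell
#!/bin/sh
# jfr_opts -> prints the JFR flag when ENABLE_JFR=true, nothing otherwise.
# Hypothetical helper; the flag string mirrors the one used in this post.
jfr_opts() {
    if [ "${ENABLE_JFR:-false}" = "true" ]; then
        printf '%s' "-XX:StartFlightRecording=filename=/var/log/jfr/recording.jfr,settings=profile,maxsize=512m,maxage=1h,dumponexit=true,disk=true"
    fi
}

# Usage inside the entrypoint:
#   JAVA_OPTS="${JAVA_OPTS} $(jfr_opts)"
```

Defaulting to false when the variable is unset keeps the steady-state behavior safe: a missing or misspelled value never turns recording on by accident.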

Signal Handling

exec "${JAVA_BIN}" ${JAVA_OPTS} \
    -cp "${QUARKUS_CP}" \
    "${MAIN_CLASS}" \
    ${CONFIG_FILE_ARG} \
    ${KC_START_CMD}

The exec command replaces the shell process with the Java process, making it PID 1. Three things follow:

  • SIGTERM reaches the Java process directly — graceful shutdown works.
  • SIGKILL terminates the correct process — no orphaned shells.
  • Kubernetes terminationGracePeriodSeconds works as expected.

Without exec, the shell remains PID 1 and may not forward signals, leading to forceful pod termination and potential session loss.
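
The effect of exec on process identity is easy to observe outside a container. This sketch uses plain sh as a stand-in for both the entrypoint shell and the Java process; the function names are illustrative.

```shell
#!/bin/sh
# pids_with_exec -> prints the PID of a shell, then the PID of the command
# it exec'd. exec replaces the process image, so both lines are the same PID.
pids_with_exec() {
    sh -c 'echo $$; exec sh -c "echo \$\$"'
}

# pids_without_exec -> same, but the child runs as a separate process while
# the parent shell stays alive, so the two PIDs differ.
pids_without_exec() {
    sh -c 'echo $$; sh -c "echo \$\$"; true'
}

# Usage:
#   pids_with_exec      # two identical PIDs
#   pids_without_exec   # two different PIDs
```

In the container the first PID is 1, which is the process that receives SIGTERM from the kubelet.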

Build-Time vs Runtime: The --optimized Flag

Keycloak runs on Quarkus, which separates configuration into build-time (immutable) and runtime (environment-specific) phases. Knowing which knob lives where is critical for both security and startup performance.

[Figure: Quarkus build-time configuration baked into the image (database driver, feature flags, cache configuration, health endpoints) is consumed by kc.sh build to produce an optimized artifact; runtime configuration such as database URL, hostname, admin credentials, and cluster DNS query is injected at start time.]

Build-Time (Immutable)

# Runs during image build, results are cached
RUN bin/kc.sh build

This pre-compiles the Quarkus application with the selected database driver (PostgreSQL), feature flags, and cache configuration. At startup, start --optimized skips compilation entirely.

  • Security benefit: build-time configuration cannot be tampered with at runtime. An attacker who gains environment-variable access cannot switch the database driver or enable preview features.
  • Performance benefit: startup drops from 30–60 seconds to under 10 seconds because Quarkus skips augmentation.

Runtime (Environment Variables)

KC_DB_URL=jdbc:postgresql://db.example.com:5432/keycloak
KC_DB_USERNAME=keycloak_user
KC_DB_PASSWORD=<from-secret-manager>
KC_HOSTNAME=auth.example.com
KC_BOOTSTRAP_ADMIN_USERNAME=admin
KC_BOOTSTRAP_ADMIN_PASSWORD=<from-secret-manager>

Never hardcode credentials in the Dockerfile or entrypoint. Use Kubernetes Secrets, HashiCorp Vault, or your cloud provider's secret manager to inject sensitive values at runtime.
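
Because these values arrive only at runtime, it helps to fail fast when one is missing instead of letting Keycloak start with a partial configuration. A minimal sketch, assuming a hypothetical require_env helper called early in the entrypoint:

```shell
#!/bin/sh
# require_env NAME... -> fails, listing every named variable that is unset
# or empty. Only names are printed, never values, so secrets stay out of logs.
# Hypothetical helper for illustration.
require_env() {
    missing=""
    for name in "$@"; do
        eval "value=\${$name:-}"          # indirect lookup of the variable
        [ -n "$value" ] || missing="$missing $name"
    done
    if [ -n "$missing" ]; then
        echo "missing required configuration:$missing" >&2
        return 1
    fi
}

# Usage:
#   require_env KC_DB_URL KC_DB_USERNAME KC_DB_PASSWORD KC_HOSTNAME || exit 1
```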

A note on HTTP vs HTTPS: the hardened image does not ship KC_HTTP_ENABLED=true. Keycloak's upstream default is HTTPS-only and we keep that default to avoid exposing a plaintext listener by accident. Deployments that terminate TLS at a sidecar or ingress and need plain HTTP on the pod-local interface must opt in explicitly, ideally with mTLS between the ingress and the pod. For direct HTTPS, configure KC_HTTPS_CERTIFICATE_FILE and KC_HTTPS_CERTIFICATE_KEY_FILE and leave HTTP off.

Vulnerability Management: Three Tiers

Achieving zero CVEs is not a one-time event; it is an ongoing process. The hardened image implements a three-tier vulnerability management strategy.

[Figure: three-tier vulnerability management. Tier 1 is source elimination via Wolfi OS, Tier 2 is active patching via Maven dependencyManagement and Copa post-build, and Tier 3 is transparent documentation via OpenVEX with an audit trail.]

Tier 1 — Eliminate at the Source (Wolfi)

By starting from Wolfi, we begin with zero known OS-level vulnerabilities. Chainguard rebuilds packages from source daily and typically ships fixes within hours of disclosure. Result: 0 known OS-level CVEs across roughly 65 packages.

Tier 2 — Patch Application Vulnerabilities at Build Time

When a CVE is discovered in a Keycloak application dependency — a transitive Java library like Vert.x, Netty, Jackson, or a JDBC driver — we patch it before the vulnerable JAR ever reaches the image.

The mechanism is a Maven dependencyManagement override applied to Keycloak's source pom.xml at build time. This replaces an earlier post-build pattern that downloaded a patched JAR with curl, verified its SHA-256, and overwrote the old filename in place. That earlier pattern worked, but SBOM tooling reported the stale filename even though the content was patched. With the override, Maven resolves the patched version natively, so Quarkus packaging produces a correctly named JAR (for example io.vertx.vertx-core-4.5.24.jar), and every downstream tool — Trivy, Syft, Grype, auditor filesystem inspection — agrees on the version.

The override lives as a small, reviewable patch file in the build context (pom-vertx-override.patch):

--- a/pom.xml
+++ b/pom.xml
@@ -312,6 +312,13 @@
     <dependencyManagement>
         <dependencies>
+            <!-- Tier 2 CVE patches (override Quarkus BOM transitive versions) -->
+            <!-- CVE-2026-1002: vertx-core 4.5.23 -> 4.5.24 -->
+            <dependency>
+                <groupId>io.vertx</groupId>
+                <artifactId>vertx-core</artifactId>
+                <version>4.5.24</version>
+            </dependency>
             <dependency>
                 <groupId>org.infinispan</groupId>
                 <artifactId>infinispan-bom</artifactId>

The builder stage applies it before the Maven build runs:

RUN curl -L ... && sha256sum -c - && tar xz ...

# Tier 2 application-layer CVE patches via Maven dependencyManagement override.
# Applied BEFORE the Maven build so packaged JARs carry the correct filename.
COPY pom-vertx-override.patch /tmp/pom-vertx-override.patch
RUN cd /tmp/keycloak && patch -p1 < /tmp/pom-vertx-override.patch

# Maven build resolves io.vertx:vertx-core:4.5.24 natively
RUN mvn clean install -DskipTests -Pdistribution --strict-checksums

Three guarantees fall out of this approach:

  1. Truthful SBOM. The resulting artifact is io.vertx.vertx-core-4.5.24.jar. Scanners that derive the version from the filename and scanners that read pom.properties inside the JAR both report 4.5.24, so there is no divergence between what the image contains and what Trivy, Syft, or Grype report.
  2. Reviewable change. The override is a single unified-diff file under version control. Reviewers see exactly which dependency versions are being forced, with CVE references inline, without digging through Dockerfile RUN layers.
  3. Reproducibility. Maven Central checksum verification is enforced via --strict-checksums, and the resolved version is pinned via dependencyManagement so every build pulls the same JAR.

Downloading artifacts from external repositories during a Docker build introduces a supply chain dependency. Without checksum verification, a compromised CDN or repository mirror could inject malicious code into your image and the build would succeed silently. Always verify SHA-256 checksums for any externally fetched binary.
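
A checksum gate for any externally fetched file can be as small as the sketch below. It assumes sha256sum from coreutils is available in the builder stage; verify_sha256 is a hypothetical helper name, and the expected hash must come from a trusted source pinned in version control, not from the same server as the download.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX -> exit 0 only when FILE hashes to EXPECTED_HEX
verify_sha256() {
    # sha256sum -c expects "<hash>  <path>" (two spaces) per line on stdin
    printf '%s  %s\n' "$2" "$1" | sha256sum -c - >/dev/null 2>&1
}

# Usage (KEYCLOAK_URL and KEYCLOAK_SHA256 are placeholders for pinned values):
#   curl -fsSL -o keycloak.tar.gz "$KEYCLOAK_URL" \
#       && verify_sha256 keycloak.tar.gz "$KEYCLOAK_SHA256" \
#       || exit 1
```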

Tier 2b — Post-Build OS Patching with Copa

The Maven override closes the window for application-layer CVEs at build time. Copa (project-copacetic) handles the other direction: OS-level CVEs disclosed after an image has already been built and deployed. Copa does not rebuild the image. It invokes the distribution's native package manager (apk on Wolfi or Alpine, apt on Debian, dnf on UBI) inside a buildkit session, upgrades only the vulnerable packages named in a Trivy report, and appends the resulting filesystem diff as a thin layer on top of the existing image. A patch run takes seconds rather than the minutes a full rebuild would take.

The two mechanisms are complementary, not alternatives:

  • Maven dependencyManagement override (Tier 2, build time): Java libraries inside the Keycloak distribution. Copa cannot touch these because they are not apk packages — they live inside the distribution tarball.
  • Copa patching (Tier 2b, post-build): OS-level packages (openjdk-21-jre, glibc, openssl, ca-certificates-bundle) and retrofitting already-deployed images when a fresh CVE is disclosed between release and the next scheduled rebuild.

A minimal CI integration that runs Copa as a safety net after the main image build:

# Scan the freshly built image
trivy image --format json --output scan.json keycloak-hardened:${TAG}

# If HIGH/CRITICAL CVEs remain, patch them with Copa
copa patch \
    -i keycloak-hardened:${TAG} \
    -r scan.json \
    -t ${TAG}-patched \
    -a ${BUILDKIT_ADDR} \
    --push

# Rescan and gate the pipeline on the patched image
trivy image --exit-code 1 --severity HIGH,CRITICAL \
    --vex openvex.json \
    keycloak-hardened:${TAG}-patched

Because Wolfi already delivers a near-zero OS CVE baseline, Copa typically has nothing to do on a steady-state run. It becomes load-bearing only when a same-day disclosure lands between the base-image rebuild and the next scheduled pipeline run.
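
To skip the Copa step entirely on steady-state runs, the pipeline can inspect the Trivy report first. The sketch below assumes jq is installed on the CI runner and that the report follows Trivy's JSON layout (a Results array whose entries may carry a Vulnerabilities array); the function names are hypothetical.

```shell
#!/bin/sh
# count_high_critical REPORT -> prints the number of HIGH/CRITICAL findings
# in a Trivy JSON report. Results without a Vulnerabilities key are skipped.
count_high_critical() {
    jq '[.Results[]?.Vulnerabilities[]?
         | select(.Severity == "HIGH" or .Severity == "CRITICAL")] | length' "$1"
}

# needs_patch REPORT -> exit 0 when at least one HIGH/CRITICAL finding remains
needs_patch() {
    [ "$(count_high_critical "$1")" -gt 0 ]
}

# Usage:
#   if needs_patch scan.json; then
#       copa patch -i "keycloak-hardened:${TAG}" -r scan.json \
#           -t "${TAG}-patched" -a "${BUILDKIT_ADDR}" --push
#   fi
```

Note that when the patch step is skipped no -patched tag exists, so the final gate must then scan the original tag instead.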

Tier 3 — Document False Positives with OpenVEX

Not every CVE flag from a scanner represents a real vulnerability. The OpenVEX standard provides a machine-readable format to document these exceptions transparently:

{
  "@context": "https://openvex.dev/ns/v0.2.0",
  "author": "SecurityTeam",
  "statements": [
    {
      "vulnerability": { "name": "CVE-2025-59250" },
      "products": [{ "@id": "pkg:maven/com.microsoft.sqlserver/mssql-jdbc@13.2.1" }],
      "status": "not_affected",
      "justification": "vulnerable_code_not_present",
      "impact_statement": "Version 13.2.1.jre11 is in use but Trivy incorrectly parses the version string. This version is patched and safe."
    }
  ]
}

This approach buys three things at once:

  • Auditors can verify that every CVE exception has a documented justification.
  • CI/CD pipelines can filter known false positives without suppressing real findings.
  • Trivy (or any VEX-compatible scanner) automatically excludes documented non-issues:

trivy image --vex openvex.json keycloak-hardened:latest

| Tier | Strategy | Example | Outcome |
| --- | --- | --- | --- |
| Source | Wolfi OS | 0 OS-level CVEs | Attack surface removed |
| Active patching | Maven override + Copa post-build | CVE-2026-1002 (Vert.x) | Vulnerability fixed, supply chain verified |
| Transparent docs | OpenVEX | CVE-2025-59250 (FP) | Audit trail preserved |

Kubernetes-Ready Configuration

The hardened image is designed to operate as a first-class citizen in Kubernetes.

Clustering with JGroups DNS

-Djgroups.dns.query=keycloak-headless.keycloak-ns.svc.cluster.local

This lets Keycloak instances discover each other via Kubernetes headless services, forming a distributed cache cluster for session replication.

Health Probes

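With --health-enabled=true at build time, Keycloak exposes SmallRye Health endpoints. On current Keycloak releases these are served from the management interface (port 9000 by default), which uses HTTPS when TLS is configured for the main listener:

livenessProbe:
  httpGet:
    path: /health/live
    port: 9000
    scheme: HTTPS
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /health/ready
    port: 9000
    scheme: HTTPS
  initialDelaySeconds: 30
  periodSeconds: 10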

A note on the Dockerfile HEALTHCHECK directive: we intentionally omit it. Kubernetes does not consult HEALTHCHECK; the kubelet runs its own liveness and readiness probes. Under Kubernetes the directive is dead configuration, and it only expands the runtime toolchain by forcing curl or wget into the runtime image. For container runtimes that do consult HEALTHCHECK (Docker, Podman standalone), prefer a tiny statically linked health binary over curl.

Pod Security Context

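spec:
  securityContext:              # pod-level fields
    runAsNonRoot: true
    runAsUser: 2000
    runAsGroup: 2000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: keycloak
      securityContext:          # container-level fields
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL

This satisfies the restricted Pod Security Standard, the strictest level available in Kubernetes. fsGroup and seccompProfile belong to the pod-level securityContext, while readOnlyRootFilesystem, allowPrivilegeEscalation, and capabilities are container-level fields.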

readOnlyRootFilesystem: true requires writable volumes for /tmp (Quarkus class generation, heap dumps), /opt/keycloak/data (cluster state, tenant config), and /var/log/jfr when JFR is enabled. Without these, the pod fails with a read-only filesystem error. A minimal working pair is an emptyDir with medium: Memory for /tmp and an emptyDir for /opt/keycloak/data. A memory-backed /tmp counts against the container memory limit, so use a disk-backed volume there if you rely on heap dumps.
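
A startup check turns a missing mount into a clear error message instead of a failure deep inside Keycloak. This is a sketch with a hypothetical helper name, using the /tmp and /opt/keycloak/data paths from the paragraph above.

```shell
#!/bin/sh
# assert_writable DIR... -> fails, naming every DIR that is missing or that
# the current user cannot write to. Hypothetical helper for illustration.
assert_writable() {
    failed=""
    for dir in "$@"; do
        if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
            failed="$failed $dir"
        fi
    done
    if [ -n "$failed" ]; then
        echo "not writable (is a volume mounted?):$failed" >&2
        return 1
    fi
}

# Usage near the top of the entrypoint:
#   assert_writable /tmp /opt/keycloak/data || exit 1
```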

CI/CD Pipeline: Multi-Architecture Builds

The build pipeline produces native images for both AMD64 and ARM64 architectures, with security scanning gating every promotion.

[Figure: CI/CD pipeline. A branch trigger fans out to parallel AMD64 and ARM64 build jobs, both feeding a multi-arch manifest that is pushed to the registry and then scanned by Trivy with OpenVEX, gating deployment.]

Key pipeline features:

  • Docker BuildKit for efficient layer caching.
  • Calendar versioning: YYYY.M.D (e.g., 2026.4.7).
  • Change detection — pipeline runs only when files in keycloak-base/hardened/ change.
  • Trivy scanning with OpenVEX integration before deployment.
  • Architecture-specific scans. Trivy is run per platform (--platform linux/amd64 and linux/arm64) rather than against the manifest list alone, because architecture-dependent binaries and native libraries can carry CVEs that are invisible from a single-platform scan.
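
The per-platform scan can be sketched as a loop that fails the job if any architecture fails. The function name and the TRIVY override variable are hypothetical (the override exists only so the loop can be dry-run with echo); the trivy flags are the ones already used in this post plus --platform.

```shell
#!/bin/sh
# scan_platforms IMAGE [PLATFORM...] -> scans IMAGE once per platform and
# returns non-zero if any scan fails. Defaults to linux/amd64 and linux/arm64.
# TRIVY may be overridden (for example TRIVY=echo for a dry run).
scan_platforms() {
    image=$1
    shift
    [ "$#" -gt 0 ] || set -- linux/amd64 linux/arm64
    status=0
    for platform in "$@"; do
        "${TRIVY:-trivy}" image --platform "$platform" \
            --exit-code 1 --severity HIGH,CRITICAL "$image" || status=1
    done
    return "$status"
}

# Usage:
#   scan_platforms "keycloak-hardened:${TAG}"
```

Every platform is scanned even after an earlier one fails, so a single pipeline run reports findings for both architectures.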

Results

| Metric | Default Keycloak Image | Hardened Image |
| --- | --- | --- |
| OS-level CVEs | 15–30+ | 0 |
| Total packages | 200–400+ | 65 |
| Runs as root | Yes | No (UID 2000) |
| Build tools in runtime | Yes (JDK, Maven traces) | No |
| JVM security tuning | None | Full |
| OOM behavior | Hang / degrade | Clean exit + restart |
| Startup time (optimized) | 30–60s | < 10s |
| Multi-architecture | Single | AMD64 + ARM64 |
| CVE documentation | None | OpenVEX |

A reasonable starting point in Kubernetes is 1 CPU / 2 GB RAM for development and 2 CPU / 4 GB RAM for production workloads. MaxRAMPercentage=70 automatically allocates 70% of the container memory limit to the JVM heap. Tune from there based on JFR and Prometheus metrics under real traffic.

Licensing and Attribution

Wolfi OS is an open-source project initiated and maintained by Chainguard. The Wolfi build recipes are released under the Apache License 2.0. Individual packages bundled in the image carry their own upstream licenses — Apache-2.0, MIT, GPL-2.0, GPL-3.0, LGPL-2.1, and GCC runtime exceptions among them — and the complete per-package license inventory ships in the image SBOM.

"Wolfi", "Chainguard", and "Sigstore" are trademarks of their respective owners. References here are for attribution only and do not imply endorsement, sponsorship, or affiliation. If you build downstream images on Wolfi, track package-level license obligations through your SBOM tooling rather than relying on a base-image disclaimer to cover them.

Coming Next: A CRA Scorecard for This Image

This post opens a series on running IAM in regulated production. Future installments will cover IAM observability with OpenTelemetry, gateway-level authorization enforcement, multi-tenant operations, and air-gapped identity deployments. The immediate follow-up is narrower: we will grade this image against the Cyber Resilience Act's Annex I essential requirements line by line — what we already meet, what needs additional controls in the deployment layer, and where the open questions are. If you want to read it when it lands, subscribe to the Keymate newsletter or follow us on LinkedIn.

Keymate builds identity infrastructure that goes beyond authentication. If you are running Keycloak in regulated production and want a hardened image that comes with the security posture described above already wired in, we would love to talk.
