Production Hardening

Goal

Apply security hardening practices to your Keymate deployment to meet production security requirements. At the end of this guide, your deployment has hardened identity configuration, encrypted service communication, automated TLS, API gateway protection, network isolation, audit logging, and secure credential management.

Audience

Security engineers, platform engineers, and operators responsible for the security posture of production Keymate deployments.

Prerequisites

A running Keymate platform (Helm-based or GitOps-based)
Deployment Best Practices applied (HA, resource limits, monitoring)
Administrative access to the Kubernetes cluster and platform components

Before You Start

Production hardening builds on top of a properly deployed and operationally sound platform. Complete the Deployment Best Practices guide before applying security hardening. Hardening a misconfigured deployment adds complexity without improving security.

Warning: Security hardening is not a one-time activity. Review these practices after every major upgrade and whenever compliance requirements change.

Steps

1. Harden the identity provider

The identity provider is the authentication entry point for all users. Hardening it reduces the attack surface for credential-based attacks.

Recommended actions:

Area	Action
Admin console	Restrict admin console access to internal networks or VPN. Never expose the admin console publicly
Brute-force protection	Enable brute-force detection with account lockout after repeated failed attempts
Session policies	Set session timeouts: idle timeout (15-30 minutes), max session lifetime (8-12 hours)
Token lifetime	Set access token lifetime based on your security requirements (5-15 minutes for high-security environments)
Password policies	Enforce minimum length (12+ characters), complexity requirements, and password history
Unused flows	Disable authentication flows you do not use (e.g., direct grant if not needed)
Default accounts	Change or disable default administrative accounts after initial setup
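
The thresholds in the table above can be checked mechanically once the identity provider settings are exported. The following is a minimal sketch, not the platform's API: the `IdentityPolicy` class and its field names are hypothetical, and the limits are the upper ends of the ranges recommended above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentityPolicy:
    """Hypothetical snapshot of identity provider settings (field names are illustrative)."""
    idle_timeout_minutes: int
    max_session_hours: int
    access_token_minutes: int
    min_password_length: int
    brute_force_protection: bool
    direct_grant_enabled: bool


def policy_findings(policy: IdentityPolicy) -> list[str]:
    """Return one finding for each setting outside the recommended range."""
    findings = []
    if policy.idle_timeout_minutes > 30:
        findings.append("idle timeout exceeds 30 minutes")
    if policy.max_session_hours > 12:
        findings.append("max session lifetime exceeds 12 hours")
    if policy.access_token_minutes > 15:
        findings.append("access token lifetime exceeds 15 minutes")
    if policy.min_password_length < 12:
        findings.append("minimum password length is below 12")
    if not policy.brute_force_protection:
        findings.append("brute-force detection is disabled")
    if policy.direct_grant_enabled:
        findings.append("direct grant flow is enabled; disable it if unused")
    return findings


if __name__ == "__main__":
    current = IdentityPolicy(
        idle_timeout_minutes=60,
        max_session_hours=10,
        access_token_minutes=5,
        min_password_length=8,
        brute_force_protection=True,
        direct_grant_enabled=False,
    )
    for finding in policy_findings(current):
        print(finding)
```

An empty result means the exported settings fall within the recommended ranges. It does not cover the items that need manual review, such as admin console exposure and default accounts.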

2. Enable service mesh encryption (mTLS)

The service mesh provides mutual TLS between all platform services, encrypting and authenticating all inter-service communication.

Recommended actions:

Area	Action
mTLS mode	Set the service mesh to strict mTLS mode — reject any unencrypted inter-service traffic
Certificate rotation	Verify the service mesh automatically rotates mTLS certificates on a regular schedule
Peer authentication	Confirm that all platform namespaces enforce peer authentication policies

Strict mTLS means that even if an attacker gains access to the cluster network, they cannot intercept or tamper with traffic between Keymate services without valid certificates.
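
To confirm that every platform namespace enforces strict mode, compare the list of namespaces against the mode reported for each one. The sketch below assumes you have already exported that mapping from your service mesh. The mode strings (`STRICT`, `PERMISSIVE`) and the namespace names are assumptions and depend on the mesh you run.

```python
def namespaces_not_strict(platform_namespaces, mtls_mode_by_namespace):
    """Return platform namespaces whose mTLS mode is missing or not STRICT, sorted by name."""
    return sorted(
        namespace
        for namespace in platform_namespaces
        # A namespace with no policy at all is treated as a gap, not as compliant.
        if mtls_mode_by_namespace.get(namespace, "UNSET").upper() != "STRICT"
    )


if __name__ == "__main__":
    # Hypothetical namespace names and exported modes.
    namespaces = ["keymate-identity", "keymate-gateway", "keymate-data"]
    modes = {"keymate-identity": "STRICT", "keymate-gateway": "PERMISSIVE"}
    print(namespaces_not_strict(namespaces, modes))
```

Treating a missing entry as a failure is deliberate: a namespace without an explicit policy may fall back to a mesh-wide default that is not strict.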

3. Automate TLS for external endpoints

Secure all external-facing endpoints (login pages, API gateway, admin interfaces) with TLS and valid certificates.

Recommended actions:

Area	Action
Automated provisioning	Use certificate automation to provision and renew certificates before they expire
Certificate monitoring	Set up alerts for certificates that expire within 14 days
Protocol version	Enforce TLS 1.2 as the minimum; prefer TLS 1.3 where supported
Cipher suites	Disable weak cipher suites (RC4, DES, 3DES, MD5-based)
HSTS	Enable HTTP Strict Transport Security headers on all external endpoints
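
The protocol floor and the 14-day expiry alert from the table above can be spot-checked from a client. This is a minimal sketch using only the Python standard library. The helper names are illustrative, and it complements a dedicated TLS scanner and your certificate monitoring; it does not replace them.

```python
import socket
import ssl
from datetime import datetime, timedelta, timezone


def hardened_client_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def peer_not_after(host: str, port: int = 443) -> str:
    """Connect to an endpoint and return its certificate's notAfter string."""
    context = hardened_client_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def needs_alert(not_after: str, now: datetime, threshold_days: int = 14) -> bool:
    """True if the certificate expires within `threshold_days` of `now`.

    `not_after` uses the format returned by getpeercert(),
    for example "Jan 15 00:00:00 2030 GMT". `now` must be timezone-aware.
    """
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return expires - now <= timedelta(days=threshold_days)
```

For example, `needs_alert(peer_not_after("login.example.com"), datetime.now(timezone.utc))` checks one endpoint, where the hostname is a placeholder. A connection that fails the handshake under this context indicates either an invalid certificate or a server that only offers protocols below TLS 1.2.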

4. Harden the API gateway

The API gateway sits at the network edge of the platform, handling all inbound traffic and enforcing access policies.

Recommended actions:

Area	Action
Rate limiting	Configure rate limits per client, per Tenant, and per endpoint to prevent abuse
Request validation	Enable request size limits and header validation to block malformed requests
IP restrictions	Restrict access to known IP ranges where applicable (admin endpoints, internal APIs)
CORS policies	Configure strict CORS policies — allow only the specific origins that need access
Error handling	Verify error responses do not leak internal details (stack traces, internal hostnames, component versions)
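
To illustrate how rate limiting behaves, the sketch below implements a token bucket keyed by client, tenant, and endpoint. It is an illustration of the concept, not the gateway's implementation: the class name, parameters, and the choice of a token bucket are assumptions, and a real gateway is normally configured declaratively.

```python
import time


class TokenBucketLimiter:
    """Token bucket with one bucket per (client, tenant, endpoint) combination."""

    def __init__(self, rate_per_second: float, burst: int, clock=time.monotonic):
        self.rate_per_second = rate_per_second
        self.burst = float(burst)
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._buckets = {}  # key -> (tokens remaining, time of last update)

    def allow(self, client: str, tenant: str, endpoint: str) -> bool:
        """Consume one token for this key and return True, or return False if none remain."""
        key = (client, tenant, endpoint)
        now = self.clock()
        tokens, last = self._buckets.get(key, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate_per_second)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[key] = (tokens, now)
        return allowed


if __name__ == "__main__":
    limiter = TokenBucketLimiter(rate_per_second=5.0, burst=10)
    results = [limiter.allow("client-a", "tenant-1", "/api/items") for _ in range(12)]
    print(results.count(True), "allowed,", results.count(False), "rejected")
```

This sketch keys on the combination of all three dimensions for brevity. If you need independent limits per client, per tenant, and per endpoint, keep a separate limiter for each dimension and require all of them to allow the request.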

5. Enforce network isolation

Restrict network traffic to only the paths that platform components require.

Recommended actions:

Area	Action
Network policies	Apply Kubernetes NetworkPolicies to restrict ingress and egress per namespace
Data layer isolation	Block direct external access to databases, caches, and message brokers — only application services should reach them
Namespace separation	Ensure each deployment layer runs in its own namespace with scoped access
Egress control	Restrict outbound cluster traffic to only required destinations (DNS, certificate authorities, external integrations)
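
A common way to apply these rules is a default-deny policy per namespace plus narrow allow rules. The sketch below builds two Kubernetes NetworkPolicy manifests as Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts. The namespace names and the port are hypothetical, and the selector relies on the `kubernetes.io/metadata.name` label that Kubernetes sets on namespaces automatically.

```python
import json


def default_deny(namespace: str) -> dict:
    """NetworkPolicy that selects every pod in the namespace and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def allow_ingress_from(namespace: str, source_namespace: str, port: int) -> dict:
    """NetworkPolicy that admits TCP traffic on one port from a single source namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-from-{source_namespace}", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [
                        {
                            "namespaceSelector": {
                                "matchLabels": {"kubernetes.io/metadata.name": source_namespace}
                            }
                        }
                    ],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical layout: application services in one namespace, the database in another.
    policies = [
        default_deny("keymate-data"),
        allow_ingress_from("keymate-data", "keymate-apps", 5432),
    ]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": policies}, indent=2))
```

Because the default-deny policy also blocks egress, pods in that namespace lose DNS resolution until you add an explicit egress rule for it. NetworkPolicies take effect only when the cluster's network plugin enforces them.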

6. Configure audit logging

Capture security-relevant events for compliance, forensics, and anomaly detection.

What to audit:

Event category	Examples
Authentication	Login attempts (success and failure), password changes, account lockouts
Authorization	Access decisions (granted and denied), policy changes, role assignments
Administration	Tenant creation, user provisioning, configuration changes
System	Component restarts, certificate rotations, health check failures

Where audit data goes:

Audit events flow through the OpenTelemetry pipeline alongside other telemetry. They appear in the Observability dashboards, and you can export them to external tools for compliance archival.

Tip: Configure audit log retention based on your compliance requirements. Many regulations require 1-7 years of audit data retention. Export audit logs to long-term storage rather than relying on the in-cluster observability stack for retention.
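
If you emit custom audit events from your own integrations, a consistent record shape makes them easier to filter and archive. The sketch below serializes an event as one JSON line and rejects categories outside the four listed above. The field names are hypothetical and do not represent the platform's audit schema.

```python
import json
from datetime import datetime, timezone

# The four event categories from the "What to audit" table.
AUDIT_CATEGORIES = {"authentication", "authorization", "administration", "system"}


def audit_event(category: str, action: str, actor: str, outcome: str, timestamp=None) -> str:
    """Serialize one audit event as a JSON line with a UTC timestamp."""
    if category not in AUDIT_CATEGORIES:
        raise ValueError(f"unknown audit category: {category!r}")
    when = timestamp or datetime.now(timezone.utc)
    record = {
        "timestamp": when.isoformat(),
        "category": category,
        "action": action,
        "actor": actor,
        "outcome": outcome,
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(audit_event("authentication", "login", "alice@example.com", "failure"))
```

Recording both successful and failed outcomes with the same fields keeps denied access decisions as easy to query as granted ones.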

7. Secure credential management

Protect all credentials used by the platform.

Recommended actions:

Area	Action
Kubernetes Secrets	Store all credentials in Kubernetes Secrets with encryption at rest enabled
External secrets	Use an external secrets operator to sync credentials from a centralized vault
Rotation schedule	Establish a rotation schedule for database passwords, API keys, and service accounts
Least privilege	Grant each component only the permissions it requires — no shared admin credentials
Git exclusion	Never commit plaintext credentials to Git repositories. Commit encrypted credentials only when you use a purpose-built mechanism such as sealed secrets
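
A rotation schedule is easier to keep when overdue credentials are reported automatically. The following sketch assumes you track the last rotation date for each credential. The credential names and maximum ages are hypothetical examples, not recommended values.

```python
from datetime import date, timedelta


def overdue_credentials(
    last_rotated: dict[str, date],
    max_age_days: dict[str, int],
    today: date,
) -> list[str]:
    """Return the names of credentials older than their allowed age, sorted by name."""
    overdue = []
    for name, rotated_on in last_rotated.items():
        if today - rotated_on > timedelta(days=max_age_days[name]):
            overdue.append(name)
    return sorted(overdue)


if __name__ == "__main__":
    # Hypothetical inventory of credentials and rotation limits.
    rotated = {
        "database-password": date(2024, 1, 10),
        "gateway-api-key": date(2024, 5, 1),
    }
    limits = {"database-password": 90, "gateway-api-key": 180}
    print(overdue_credentials(rotated, limits, today=date(2024, 6, 1)))
```

Running a check like this on a schedule and alerting on a non-empty result turns the rotation policy into something you can monitor instead of something you have to remember.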

8. Review and validate

After applying all hardening steps, validate the security posture.

Validation checklist:

Check	How to verify
Admin console not publicly accessible	Attempt to access the admin URL from outside the allowed network and verify the request is blocked
mTLS enforced	Deploy a test pod without a sidecar and verify it cannot reach platform services
TLS valid on all endpoints	Run a TLS scanner against all external endpoints and confirm that certificates are valid and only TLS 1.2 or later is accepted
Rate limiting active	Send requests exceeding the rate limit and verify they are rejected
Network policies enforced	Attempt to connect directly to a database pod from an unauthorized namespace and verify the connection fails
Audit logging operational	Trigger a test event and verify it appears in the audit log
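
The rate limiting check in the list above can be scripted. This is a minimal sketch using the Python standard library. It assumes the gateway signals a rejected request with HTTP status 429 (Too Many Requests), which you should confirm against your gateway configuration.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str) -> int:
    """Send one GET request and return the HTTP status code, including error statuses."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def count_rejections(send: Callable[[], int], attempts: int) -> int:
    """Call `send` repeatedly and count the responses with status 429."""
    return sum(1 for _ in range(attempts) if send() == 429)
```

For example, `count_rejections(lambda: http_status("https://api.example.com/health"), 200)` sends 200 requests to a placeholder URL. A result of zero while exceeding the configured limit suggests that rate limiting is not active on that route. Run this only against environments you are authorized to test.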

Validation Scenario

Scenario

A security engineer reviews a Keymate deployment before go-live to ensure it meets the organization's production security requirements.

Expected Result

No platform service allows unauthenticated access
mTLS encrypts all inter-service traffic
External endpoints use valid TLS certificates with strong cipher suites
Rate limiting prevents API abuse
Network policies block unauthorized traffic paths
Audit logs capture authentication, authorization, and administration events
All credentials reside in Kubernetes Secrets, not in configuration files

How to Verify

Run a port scan against the cluster to identify exposed services
Attempt unauthenticated access to all endpoints
Verify mTLS by inspecting service mesh configuration
Check certificate validity and cipher suites with an SSL testing tool
Review audit log output for completeness

Troubleshooting

Services fail after enabling strict mTLS. Some components may lack sidecar proxies. Verify all platform pods have the service mesh sidecar and that namespace labeling is correct.
TLS certificate renewal fails. Check certificate automation logs. Common causes: DNS challenge failures, rate limiting from the certificate authority, and expired credentials for DNS providers.
Network policies block legitimate traffic. Start with permissive allow-list policies and add restrictions incrementally. Use network policy logging (if available) to identify traffic that would be blocked before you enforce a policy.
Audit log volume is too high. Tune the audit level to capture security events without logging every routine operation. Focus on authentication, authorization decisions, and configuration changes.

Next Steps

Observability Overview — Monitor security events through the telemetry pipeline
Deployment Best Practices — Review operational best practices alongside security hardening
Business Continuity & SLO Guardrails — Plan for failure scenarios and recovery

Related Docs

Deployment Best Practices

Operational readiness before hardening

Observability Overview

Monitor security events and audit logs

Export & Tooling Portability

Export audit logs to compliance tools

Pre-Deployment Checklist

Verify readiness before deployment
