FGA Backend Deployment Model

Summary

The FGA Backend deployment model defines how to deploy, scale, and integrate the fine-grained authorization backend with the Keymate platform. This includes standalone development deployments, production-ready configurations, and integration patterns with the FGA Engine.

Why It Exists

Authorization backends must meet strict availability and latency requirements because every permission check depends on them. A well-designed deployment model ensures:

Low-latency authorization queries
High availability for critical authorization paths
Scalability as relationship data grows
Clear operational boundaries

Where It Fits in Keymate

The FGA Backend runs as a separate service that the FGA Engine connects to. The deployment topology affects authorization latency and system resilience.

Boundaries

What it covers:

Deployment topologies (development, production)
Service endpoints and ports
Storage configuration options
High availability patterns

What it does not cover:

Application-level integration code
Authorization model design
Tuple management workflows

How It Works

Deployment Modes

Mode	Use Case	Characteristics
Development	Local testing	In-memory storage, single instance
Production	Live environments	Persistent storage, multiple instances

Service Endpoints

The FGA Backend exposes these endpoints:

Protocol	Default Port	Purpose
HTTP	8082	REST API for authorization operations
gRPC	8083	High-performance API for production use
Playground	3002	Interactive UI for model testing (development only)

Health Check Endpoints

Endpoint	Method	Purpose
`/healthz`	GET	Liveness probe — returns 200 if the service is running
`/ready`	GET	Readiness probe — returns 200 if the service can accept traffic

Point Kubernetes liveness probes at `/healthz` and readiness probes at `/ready`, and use `/ready` for load balancer health checks.
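
As a minimal sketch, a deployment script could poll both probes before routing traffic to an instance. Only the paths `/healthz` and `/ready` and the HTTP port 8082 come from the tables above; the `probe` helper, its `fetch_status` parameter, and the host name in the usage note are illustrative assumptions.

```python
from typing import Callable, Dict
from urllib import error, request

# Paths and default HTTP port taken from the tables above.
LIVENESS_PATH = "/healthz"
READINESS_PATH = "/ready"
DEFAULT_HTTP_PORT = 8082


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if unreachable."""
    try:
        with request.urlopen(url, timeout=timeout) as response:
            return response.status
    except error.HTTPError as exc:
        return exc.code  # the server answered with an error status
    except OSError:
        return 0  # connection refused, DNS failure, timeout, ...


def probe(host: str, port: int = DEFAULT_HTTP_PORT,
          fetch_status: Callable[[str], int] = http_status) -> Dict[str, bool]:
    """Check both probes and report whether the instance is alive and ready."""
    base = f"http://{host}:{port}"
    return {
        "alive": fetch_status(base + LIVENESS_PATH) == 200,
        "ready": fetch_status(base + READINESS_PATH) == 200,
    }
```

Calling `probe("localhost")` against a local instance returns a dictionary with `alive` and `ready` flags. The `fetch_status` parameter exists so the logic can be exercised without a running backend.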

Storage Options

The backend supports two storage options:

Storage	Use Case	Characteristics
In-memory	Development, testing	Fast, non-persistent, single-node only
Relational Database	Production	Durable, supports clustering, recommended for production

In production, the backend (currently OpenFGA) persists relationship data to a relational database. The database stores:

Authorization models and their versions
Relationship tuples
Store metadata

Note: Configure storage at backend deployment time. See the OpenFGA documentation for supported database versions and connection parameters.
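
The sketch below shows one way a deployment script might derive storage settings from the deployment mode. It assumes the backend reads OpenFGA-style `OPENFGA_DATASTORE_ENGINE` and `OPENFGA_DATASTORE_URI` environment variables; confirm the variable names and engine values against the OpenFGA documentation. The `datastore_env` helper and the example URI are hypothetical.

```python
from typing import Dict, Optional


def datastore_env(mode: str, engine: str = "postgres",
                  database_uri: Optional[str] = None) -> Dict[str, str]:
    """Build datastore environment variables for a deployment mode.

    The variable names are assumed to follow OpenFGA's OPENFGA_DATASTORE_*
    convention; verify them before relying on this.
    """
    if mode == "development":
        # In-memory storage: fast, but all data is lost on restart.
        return {"OPENFGA_DATASTORE_ENGINE": "memory"}
    if mode == "production":
        if engine == "memory":
            raise ValueError("in-memory storage is not durable; use a database")
        if not database_uri:
            raise ValueError("production mode requires a database URI")
        return {
            "OPENFGA_DATASTORE_ENGINE": engine,
            "OPENFGA_DATASTORE_URI": database_uri,
        }
    raise ValueError(f"unknown deployment mode: {mode!r}")
```

Rejecting in-memory storage in production mode turns a common misconfiguration into a deploy-time error rather than data loss after the first restart.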

Development Deployment

For local development and testing:

Single instance with in-memory storage
Playground UI enabled for interactive testing
No authentication required
Suitable for model development and integration testing

Production Deployment

For production environments:

Multiple instances behind a load balancer
Persistent storage with backup strategy
Authentication enabled for API access
Health checks for service discovery
Horizontal scaling based on query volume

Scaling Considerations

Scale the backend based on these metrics:

Metric	Threshold	Action
Query latency (p99)	> 50ms	Add instances or optimize database
CPU utilization	> 70% sustained	Add instances
Memory utilization	> 80%	Add instances or increase instance size
Active connections	Near connection pool limit	Add instances

Capacity planning factors:

Tuple count: Performance remains stable up to millions of tuples per store with proper indexing
Query complexity: Deep relationship chains (> 5 hops) increase latency
Concurrent requests: Each instance handles hundreds of concurrent check requests

Note: These thresholds are general guidance. Exact values depend on your infrastructure, workload patterns, and latency requirements. Monitor query latency percentiles and scale proactively before hitting limits.
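
The threshold table can be expressed as a small check that an alerting job or autoscaler hook might run. This is a sketch: the `BackendMetrics` type and `scaling_actions` function are hypothetical, and the 90% `pool_headroom` value is an assumed reading of "near connection pool limit". The other numbers are the guidance values from the table.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BackendMetrics:
    p99_latency_ms: float
    cpu_utilization: float      # sustained average, as a fraction 0.0-1.0
    memory_utilization: float   # fraction 0.0-1.0
    active_connections: int
    connection_pool_limit: int


def scaling_actions(m: BackendMetrics,
                    pool_headroom: float = 0.9) -> List[Tuple[str, str]]:
    """Return (metric, recommended action) pairs for every breached threshold."""
    findings = []
    if m.p99_latency_ms > 50:
        findings.append(("p99 latency", "add instances or optimize database"))
    if m.cpu_utilization > 0.70:
        findings.append(("cpu", "add instances"))
    if m.memory_utilization > 0.80:
        findings.append(("memory", "add instances or increase instance size"))
    # Assumption: "near the limit" means at or above 90% of the pool size.
    if m.active_connections >= pool_headroom * m.connection_pool_limit:
        findings.append(("connections", "add instances"))
    return findings
```

An empty result means no threshold is breached. Tune the constants to your own latency targets before wiring this into automation.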

Integration with FGA Engine

The FGA Engine connects to the backend using:

Configuration	Description
API URL	Backend service endpoint
Store ID	Target store for operations
Model ID	Authorization model version
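
One way to hold these three settings is a small immutable configuration object that fails fast when a value is missing. This is a sketch only: the `FgaBackendConfig` class and the `FGA_API_URL`, `FGA_STORE_ID`, and `FGA_MODEL_ID` keys are hypothetical names, not the FGA Engine's actual configuration keys.

```python
from dataclasses import dataclass
from typing import Mapping
from urllib.parse import urlparse


@dataclass(frozen=True)
class FgaBackendConfig:
    api_url: str    # backend service endpoint
    store_id: str   # target store for operations
    model_id: str   # authorization model version

    @classmethod
    def from_env(cls, env: Mapping[str, str]) -> "FgaBackendConfig":
        # Hypothetical variable names; the FGA Engine's real keys may differ.
        keys = ("FGA_API_URL", "FGA_STORE_ID", "FGA_MODEL_ID")
        missing = [key for key in keys if not env.get(key)]
        if missing:
            raise ValueError(f"missing settings: {', '.join(missing)}")
        url = urlparse(env["FGA_API_URL"])
        if url.scheme not in ("http", "https") or not url.netloc:
            raise ValueError("FGA_API_URL must be an http(s) URL")
        return cls(env["FGA_API_URL"].rstrip("/"),
                   env["FGA_STORE_ID"], env["FGA_MODEL_ID"])
```

In a real process, `FgaBackendConfig.from_env(os.environ)` would load the values at startup, so a missing store or model ID stops the service before it serves any permission checks.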

Initialization Flow

When deploying the backend:

Start the backend service
Create a store for authorization data
Upload the authorization model
Write initial relationship tuples (if needed)
Configure FGA Engine with store and model IDs
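
Steps 2 through 5 can be sketched as a short script. It assumes the backend exposes the OpenFGA HTTP API (`POST /stores`, `POST /stores/{store_id}/authorization-models`, `POST /stores/{store_id}/write`) and that authentication is disabled or handled elsewhere; the `initialize_backend` and `post_json` helpers are hypothetical. Check the request and response shapes against the OpenFGA API reference for your version.

```python
import json
from typing import Callable, Dict, List, Optional
from urllib import request


def post_json(url: str, body: dict) -> dict:
    """POST a JSON body and decode the JSON response."""
    req = request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=10) as response:
        return json.loads(response.read() or b"{}")


def initialize_backend(api_url: str, store_name: str, model: dict,
                       tuples: Optional[List[dict]] = None,
                       post: Callable[[str, dict], dict] = post_json) -> Dict[str, str]:
    """Create a store, upload a model, write tuples, and return both IDs."""
    # Step 2: create a store for authorization data.
    store_id = post(f"{api_url}/stores", {"name": store_name})["id"]

    # Step 3: upload the authorization model; the backend assigns a version ID.
    reply = post(f"{api_url}/stores/{store_id}/authorization-models", model)
    model_id = reply["authorization_model_id"]

    # Step 4: write initial relationship tuples, if any were supplied.
    if tuples:
        post(f"{api_url}/stores/{store_id}/write", {
            "writes": {"tuple_keys": tuples},
            "authorization_model_id": model_id,
        })

    # Step 5: these are the values the FGA Engine needs in its configuration.
    return {"store_id": store_id, "model_id": model_id}
```

The returned `store_id` and `model_id` are the two identifiers to place in the FGA Engine configuration. Running the script again creates a second store, so record the IDs from the first run.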

Example Scenario

Scenario

An operator deploys the FGA Backend for a new Keymate environment and initializes it with the required authorization model.

Input

Actor: Platform operator
Resource: FGA Backend deployment
Action: Initial deployment and configuration
Context: New production environment

Expected Outcome

Backend service running with persistent storage
Store created with unique identifier
Authorization model uploaded and versioned
FGA Engine configured to connect to the backend
Health checks passing

Common Misunderstandings

"In-memory mode is production-ready": In-memory storage loses all data on restart, so production deployments need persistent storage.
"A single instance is sufficient": A single instance is a single point of failure. For production availability, deploy multiple instances behind a load balancer.

Warning: Always verify store and model IDs after deployment. Misconfigured IDs cause authorization failures that affect all permission checks.
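
A post-deployment check along these lines can catch a wrong ID before the FGA Engine starts serving traffic. The sketch assumes the OpenFGA HTTP API (`GET /stores/{store_id}` and `GET /stores/{store_id}/authorization-models/{id}`) with no authentication; `verify_ids` and `get_json` are hypothetical helpers.

```python
import json
from typing import Callable, List, Optional
from urllib import error, request


def get_json(url: str) -> Optional[dict]:
    """GET a URL and decode JSON; return None when the server answers 4xx."""
    try:
        with request.urlopen(url, timeout=10) as response:
            return json.loads(response.read())
    except error.HTTPError as exc:
        if 400 <= exc.code < 500:
            return None
        raise


def verify_ids(api_url: str, store_id: str, model_id: str,
               get: Callable[[str], Optional[dict]] = get_json) -> List[str]:
    """Return a list of problems; an empty list means both IDs resolve."""
    store = get(f"{api_url}/stores/{store_id}")
    if not store or store.get("id") != store_id:
        # Without a valid store there is no point checking the model.
        return [f"store {store_id!r} could not be read"]

    reply = get(f"{api_url}/stores/{store_id}/authorization-models/{model_id}")
    model = (reply or {}).get("authorization_model", {})
    if model.get("id") != model_id:
        return [f"model {model_id!r} could not be read in store {store_id!r}"]
    return []
```

Running this as the last step of a deployment pipeline, and failing the pipeline on a non-empty result, keeps a mistyped ID from reaching production.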

Design Notes / Best Practices

Use infrastructure-as-code for repeatable deployments
Register the health check endpoints with service discovery and load balancers
Monitor query latency and error rates
Plan storage capacity based on expected tuple volume
Test failover scenarios before going live

Tip: Start with a development deployment to validate your authorization model, then promote the tested model to production.

Related Use Cases

Initial platform deployment with FGA Backend
Scaling authorization capacity for growing workloads
Environment promotion (dev → staging → production)

Related Docs

FGA Backend Overview: Backend abstraction concepts
Current OpenFGA Backend: OpenFGA implementation details
FGA Engine: Platform component for ReBAC
Operations: Platform operational guides
