FGA Backend Deployment Model
Summary
The FGA Backend deployment model defines how to deploy, scale, and integrate the fine-grained authorization backend with the Keymate platform. This includes standalone development deployments, production-ready configurations, and integration patterns with the FGA Engine.
Why It Exists
Authorization backends must meet strict availability and latency requirements because every permission check depends on them. A well-designed deployment model ensures:
- Low-latency authorization queries
- High availability for critical authorization paths
- Scalability as relationship data grows
- Clear operational boundaries
Where It Fits in Keymate
The FGA Backend runs as a separate service that the FGA Engine connects to. The deployment topology affects authorization latency and system resilience.
Boundaries
What it covers:
- Deployment topologies (development, production)
- Service endpoints and ports
- Storage configuration options
- High availability patterns
What it does not cover:
- Application-level integration code
- Authorization model design
- Tuple management workflows
How It Works
Deployment Modes
| Mode | Use Case | Characteristics |
|---|---|---|
| Development | Local testing | In-memory storage, single instance |
| Production | Live environments | Persistent storage, multiple instances |
Service Endpoints
The FGA Backend exposes these endpoints:
| Protocol | Default Port | Purpose |
|---|---|---|
| HTTP | 8082 | REST API for authorization operations |
| gRPC | 8083 | High-performance API for production use |
| Playground | 3002 | Interactive UI for model testing (development only) |
Health Check Endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| /healthz | GET | Liveness probe: returns 200 if the service is running |
| /ready | GET | Readiness probe: returns 200 if the service can accept traffic |
Use these endpoints for Kubernetes probes and load balancer health checks.
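As an illustration of how a deployment script or smoke test might consume these probes, the following is a minimal sketch using only the Python standard library. The base URL (`http://localhost:8082`, taken from the default HTTP port above) and the function names `http_status` and `probe` are assumptions made for this example, not part of the Keymate platform.

```python
import urllib.error
import urllib.request
from typing import Callable

# Assumed base URL: localhost plus the default HTTP port from the endpoint table.
DEFAULT_BASE_URL = "http://localhost:8082"


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, timeout, and similar


def probe(base_url: str = DEFAULT_BASE_URL,
          get_status: Callable[[str], int] = http_status) -> dict:
    """Check the liveness and readiness endpoints and report each as a bool."""
    base = base_url.rstrip("/")
    return {
        "live": get_status(f"{base}/healthz") == 200,
        "ready": get_status(f"{base}/ready") == 200,
    }
```

Calling `probe()` against a running instance returns a dictionary such as `{"live": True, "ready": False}` while the service is up but not yet accepting traffic. The `get_status` parameter exists so the check can be exercised without a network.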
Storage Options
The backend supports two storage options:
| Storage | Use Case | Characteristics |
|---|---|---|
| In-memory | Development, testing | Fast, non-persistent, single-node only |
| Relational Database | Production | Durable, supports clustering, recommended for production |
For production deployments, the backend (OpenFGA) persists relationship data to a relational database. The database stores:
- Authorization models and their versions
- Relationship tuples
- Store metadata
Configure storage when you deploy the backend. See the OpenFGA documentation for supported database versions and connection parameters.
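To make the storage choice concrete, the sketch below builds the environment variables for each deployment mode. It assumes the backend is a stock OpenFGA server, which reads `OPENFGA_DATASTORE_ENGINE` and `OPENFGA_DATASTORE_URI`. The helper name `datastore_env` is hypothetical, and `postgres` is used only as one example of a relational engine. Check the OpenFGA documentation for the engines and versions your release supports.

```python
def datastore_env(mode: str, database_uri: str = "") -> dict:
    """Build environment variables that select the storage backend.

    Assumption: a stock OpenFGA server that honors
    OPENFGA_DATASTORE_ENGINE and OPENFGA_DATASTORE_URI.
    """
    if mode == "development":
        # In-memory storage: fast, but all data is lost on restart.
        return {"OPENFGA_DATASTORE_ENGINE": "memory"}
    if mode == "production":
        if not database_uri:
            raise ValueError("production mode needs a database URI")
        return {
            # "postgres" is an example engine, not a Keymate requirement.
            "OPENFGA_DATASTORE_ENGINE": "postgres",
            "OPENFGA_DATASTORE_URI": database_uri,
        }
    raise ValueError(f"unknown deployment mode: {mode!r}")
```

For example, `datastore_env("production", "postgres://user:password@db:5432/fga")` yields the two variables to pass to the container or process. The URI shown here is a placeholder.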
Development Deployment
For local development and testing:
- Single instance with in-memory storage
- Playground UI enabled for interactive testing
- No authentication required
- Suitable for model development and integration testing
Production Deployment
For production environments:
- Multiple instances behind a load balancer
- Persistent storage with backup strategy
- Authentication enabled for API access
- Health checks for service discovery
- Horizontal scaling based on query volume
Scaling Considerations
Scale the backend based on these metrics:
| Metric | Threshold | Action |
|---|---|---|
| Query latency (p99) | > 50ms | Add instances or optimize database |
| CPU utilization | > 70% sustained | Add instances |
| Memory utilization | > 80% | Add instances or increase instance size |
| Active connections | Near connection pool limit | Add instances |
Capacity planning factors:
- Tuple count: Performance remains stable up to millions of tuples per store with proper indexing
- Query complexity: Deep relationship chains (> 5 hops) increase latency
- Concurrent requests: Each instance handles hundreds of concurrent check requests
These thresholds are general guidance. Exact values depend on your infrastructure, workload patterns, and latency requirements. Monitor query latency percentiles and scale proactively before hitting limits.
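The threshold table above can be expressed as a small decision function, for instance inside an alerting or autoscaling hook. This is a sketch under stated assumptions: the `BackendMetrics` type and `scaling_actions` function are hypothetical names, and "near connection pool limit" is interpreted as 90% of the pool, which is an arbitrary choice for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BackendMetrics:
    p99_latency_ms: float
    cpu_utilization: float     # fraction between 0.0 and 1.0, sustained
    memory_utilization: float  # fraction between 0.0 and 1.0
    active_connections: int
    connection_pool_limit: int


def scaling_actions(m: BackendMetrics, pool_headroom: float = 0.9) -> List[str]:
    """Return the recommended actions from the threshold table, without duplicates."""
    candidates = []
    if m.p99_latency_ms > 50:
        candidates.append("Add instances or optimize database")
    if m.cpu_utilization > 0.70:
        candidates.append("Add instances")
    if m.memory_utilization > 0.80:
        candidates.append("Add instances or increase instance size")
    # Assumption: "near" the pool limit means at or above 90% of it.
    if m.active_connections >= pool_headroom * m.connection_pool_limit:
        candidates.append("Add instances")
    # Drop repeated actions while keeping the original order.
    return list(dict.fromkeys(candidates))
```

Because the thresholds are general guidance, treat the constants as starting values and adjust them to your own latency targets.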
Integration with FGA Engine
The FGA Engine connects to the backend using:
| Configuration | Description |
|---|---|
| API URL | Backend service endpoint |
| Store ID | Target store for operations |
| Model ID | Authorization model version |
Initialization Flow
When deploying the backend, perform these steps in order:
1. Start the backend service
2. Create a store for authorization data
3. Upload the authorization model
4. Write initial relationship tuples (if needed)
5. Configure FGA Engine with store and model IDs
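The initialization steps above can be sketched as a short script. It assumes the backend exposes the stock OpenFGA HTTP API (`POST /stores`, `POST /stores/{store_id}/authorization-models`, and `POST /stores/{store_id}/write`) and that the service is already running. The function names `post_json` and `initialize_backend` are hypothetical. The sketch omits authentication, which a production deployment would need to add to each request.

```python
import json
import urllib.request
from typing import Callable, List, Optional


def post_json(url: str, body: dict) -> dict:
    """POST a JSON body and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read() or b"{}")


def initialize_backend(api_url: str,
                       store_name: str,
                       model: dict,
                       tuples: Optional[List[dict]] = None,
                       post: Callable[[str, dict], dict] = post_json) -> dict:
    """Create a store, upload a model, optionally write tuples.

    Returns the three values the FGA Engine needs: API URL, store ID, model ID.
    """
    base = api_url.rstrip("/")

    # Create a store for authorization data.
    store_id = post(f"{base}/stores", {"name": store_name})["id"]

    # Upload the authorization model; the response carries the new model ID.
    model_id = post(
        f"{base}/stores/{store_id}/authorization-models", model
    )["authorization_model_id"]

    # Write initial relationship tuples, if any were supplied.
    if tuples:
        post(
            f"{base}/stores/{store_id}/write",
            {"writes": {"tuple_keys": tuples},
             "authorization_model_id": model_id},
        )

    return {"api_url": base, "store_id": store_id, "model_id": model_id}
```

The returned dictionary maps directly onto the API URL, Store ID, and Model ID settings listed in the integration table. The `post` parameter allows the flow to be tested against a stub instead of a live backend.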
Diagram
Example Scenario
Scenario
An operator deploys the FGA Backend for a new Keymate environment and initializes it with the required authorization model.
Input
- Actor: Platform operator
- Resource: FGA Backend deployment
- Action: Initial deployment and configuration
- Context: New production environment
Expected Outcome
- Backend service running with persistent storage
- Store created with unique identifier
- Authorization model uploaded and versioned
- FGA Engine configured to connect to the backend
- Health checks passing
Common Misunderstandings
- In-memory mode is production-ready — In-memory storage loses all data on restart. Use persistent storage for production.
- Single instance is sufficient — For production availability, deploy multiple instances with load balancing.
Always verify store and model IDs after deployment. Misconfigured IDs cause authorization failures that affect all permission checks.
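One way to perform this verification is to look up both identifiers before pointing the FGA Engine at them. The sketch below assumes the stock OpenFGA HTTP API (`GET /stores/{store_id}` and `GET /stores/{store_id}/authorization-models/{id}`). The names `get_json` and `verify_ids` are hypothetical, and authentication is again left out.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, List, Optional


def get_json(url: str) -> Optional[dict]:
    """GET a JSON document; return None if the server answers with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return json.loads(response.read())
    except urllib.error.HTTPError:
        return None


def verify_ids(api_url: str,
               store_id: str,
               model_id: str,
               get: Callable[[str], Optional[dict]] = get_json) -> List[str]:
    """Return a list of problems; an empty list means both IDs resolve."""
    base = api_url.rstrip("/")

    store = get(f"{base}/stores/{store_id}")
    if not store or store.get("id") != store_id:
        # The model lookup depends on the store, so stop here.
        return [f"store {store_id} not found"]

    found = get(f"{base}/stores/{store_id}/authorization-models/{model_id}")
    if not found or found.get("authorization_model", {}).get("id") != model_id:
        return [f"authorization model {model_id} not found in store {store_id}"]

    return []
```

Running this check as a post-deployment step surfaces a mistyped ID before any permission check depends on it.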
Design Notes / Best Practices
- Use infrastructure-as-code for repeatable deployments
- Register the health check endpoints with service discovery and load balancers
- Monitor query latency and error rates
- Plan storage capacity based on expected tuple volume
- Test failover scenarios before going live
Start with a development deployment to validate your authorization model, then promote the tested model to production.
Related Use Cases
- Initial platform deployment with FGA backend
- Scaling authorization capacity for growing workloads
- Environment promotion (dev → staging → production)