Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

ADR-007: System Key Manager HA via Raft

Status: Accepted Date: 2026-04-17 Context: I-K12, escalation point 7, B-ADV-3

Decision

The system key manager is a dedicated Raft group (3 or 5 members) running as kiseki-keyserver on dedicated nodes. It stores system master keys (one per epoch) and derives per-chunk DEKs via HKDF at runtime (ADR-003).

Architecture

kiseki-keyserver (3-5 nodes, Raft)
  ├── Stores: system master keys (one per epoch, ~32 bytes each)
  ├── Derives: system DEK = HKDF(master_key, chunk_id) — stateless
  ├── Manages: epoch lifecycle (create, rotate, retain, destroy)
  └── Audits: all key events to audit log

Rationale

  • System key manager is the highest-severity SPOF (P0 if unavailable)
  • Must be at least as available as the Log
  • Raft provides consensus + replication + leader election
  • Separate from shard Raft groups (independent failure domain)
  • Dedicated nodes: key material never co-located with tenant data
  • Master key storage is trivial (epochs × 32 bytes)
  • DEK derivation is stateless and fast (HKDF, ~microseconds)

Deployment

  • 3 nodes for standard deployments, 5 for high-criticality
  • Dedicated hardware (or at minimum, dedicated processes on control-plane nodes)
  • Key material in memory only (mlock’d, guard pages)
  • On-disk: Raft log + snapshot of epoch state (encrypted with node-local key)

Consequences

  • Adds a deployment component (kiseki-keyserver)
  • Key manager must be deployed and healthy before any data operations
  • Cross-site: each site has its own system key manager (federation doesn’t share system keys — only tenant keys cross sites via tenant KMS)