ADR-012: Merge Conflict Grace Period with Journal-Wins Fallback

Context

During network partitions, agents may accumulate local state changes (admin reconfigurations via pact shell, emergency mode, manual intervention). On reconnect, these local changes may conflict with journal state on the same config keys.

The system must balance correctness (don’t silently overwrite admin work) with availability (the node must eventually converge to declared state).

Decision

Implement a three-phase conflict resolution protocol:

Phase 1: Feed-back (CR1)

On reconnect, the agent reports unpromoted local drift to the journal BEFORE accepting the journal’s current state. This ensures no local changes are lost.
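The ordering in CR1 can be illustrated with a minimal sketch. The Journal class, its report_drift and current_state methods, and the reconnect function are hypothetical stand-ins invented for this example, not PACT's actual API; the point is only that drift is handed to the journal before the declared state is read.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Hypothetical in-memory stand-in for the journal service."""
    state: dict = field(default_factory=dict)
    drift_reports: list = field(default_factory=list)
    calls: list = field(default_factory=list)  # records call order for inspection

    def report_drift(self, node_id: str, drift: dict) -> None:
        self.calls.append("report_drift")
        self.drift_reports.append((node_id, dict(drift)))

    def current_state(self) -> dict:
        self.calls.append("current_state")
        return dict(self.state)


def reconnect(node_id: str, local_drift: dict, journal: Journal) -> dict:
    """Feed local drift back (CR1) before reading the journal's state."""
    if local_drift:
        # Step 1: report unpromoted local changes so they are not lost.
        journal.report_drift(node_id, local_drift)
    # Step 2: only now fetch the journal's declared state.
    return journal.current_state()


journal = Journal(state={"ntp.server": "10.0.0.1"})
declared = reconnect("node-a", {"ntp.server": "10.0.0.9"}, journal)
```

After this call, the journal holds a record of the local value even though the declared state it returns still differs from it.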

Phase 2: Pause and flag (CR2)

If local changes conflict with journal state on the same keys, the agent pauses convergence for those keys and flags a merge conflict. Non-conflicting keys sync normally. The node remains operational but not fully converged.
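A rough sketch of the per-key split described in CR2 follows. It assumes that a key conflicts when it appears in both the local drift and the journal state with different values; the ADR does not define the conflict test precisely, so that rule, along with the SyncPlan and plan_sync names, is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SyncPlan:
    apply: dict   # journal values that can be applied immediately
    paused: dict  # key -> (local_value, journal_value), flagged as merge conflicts


def plan_sync(local_drift: dict, journal_state: dict) -> SyncPlan:
    """Split journal state into keys to sync now and keys to pause (CR2)."""
    # Assumed conflict rule: both sides hold the key and the values differ.
    paused = {
        key: (local_value, journal_state[key])
        for key, local_value in local_drift.items()
        if key in journal_state and journal_state[key] != local_value
    }
    # Every other journal key converges normally.
    apply = {k: v for k, v in journal_state.items() if k not in paused}
    return SyncPlan(apply=apply, paused=paused)


plan = plan_sync(
    local_drift={"ntp.server": "10.0.0.9", "motd": "maintenance"},
    journal_state={"ntp.server": "10.0.0.1", "dns.search": "example.org"},
)
```

Here only ntp.server is paused. dns.search syncs normally, and motd, which the journal does not declare, is neither applied nor flagged in this sketch.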

Phase 3: Grace period with fallback (CR3)

The admin has a grace period (default: the commit window duration) to resolve conflicts via pact diff and pact commit. If conflicts remain unresolved when the grace period expires, the system falls back to journal-wins: the journal’s declared state overwrites the conflicting local changes. All overwritten values are logged for audit.
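The timeout and fallback in CR3 might look roughly like the sketch below. The Conflict record, the expire_conflicts function, the logger name, and the 900-second grace period are all hypothetical; the ADR states only that the default equals the commit window duration, without giving a value.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("pact.conflict")  # hypothetical logger name


@dataclass
class Conflict:
    key: str
    local_value: str
    journal_value: str
    flagged_at: float  # seconds, ideally from a monotonic clock


def expire_conflicts(conflicts, local_state, grace_seconds, now):
    """Apply journal-wins to conflicts whose grace period has elapsed (CR3).

    Returns (still_pending, audit_records).
    """
    pending, audit = [], []
    for c in conflicts:
        if now - c.flagged_at < grace_seconds:
            pending.append(c)  # the admin still has time to resolve this key
            continue
        record = {"key": c.key, "overwritten": c.local_value, "applied": c.journal_value}
        # Log the overwritten value before touching local state.
        log.warning("grace period expired, journal wins: %s", record)
        audit.append(record)
        local_state[c.key] = c.journal_value
    return pending, audit


state = {"ntp.server": "10.0.0.9", "motd": "maintenance"}
conflicts = [
    Conflict("ntp.server", "10.0.0.9", "10.0.0.1", flagged_at=0.0),
    Conflict("motd", "maintenance", "welcome", flagged_at=800.0),
]
pending, audit = expire_conflicts(conflicts, state, grace_seconds=900.0, now=1000.0)
```

With these illustrative numbers, the ntp.server conflict has been open for 1000 seconds and is overwritten, while the motd conflict has been open for only 200 seconds and stays pending.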

Consequences

No silent data loss: local changes are always fed back and logged before any overwrite.
Availability preserved: the grace period timeout ensures that convergence eventually happens even without admin intervention.
Admin agency: operators get time to review and decide, rather than merely being informed after the fact.
Complexity: the agent must track per-key conflict state and grace period timers.

Alternatives Considered

Immediate journal-wins: rejected — silently discards admin work done during partition, violates trust in the audit trail.
Require manual resolution always: rejected — node never converges if admin is unavailable (vacation, off-hours).
Last-write-wins by timestamp: rejected — clock skew between agent and journal makes this unreliable; admin-committed changes should take precedence over auto-converge.
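The clock-skew objection to last-write-wins can be made concrete with a small, purely illustrative sketch. The Write record, the two-minute offset, and the last_write_wins function are assumptions chosen for the example and do not describe any PACT component.

```python
from dataclasses import dataclass


@dataclass
class Write:
    source: str
    value: str
    true_time: float     # real order of events (not observable in practice)
    clock_offset: float  # how far the writer's clock is from true time

    @property
    def timestamp(self) -> float:
        # The timestamp a writer records is skewed by its own clock error.
        return self.true_time + self.clock_offset


def last_write_wins(a: Write, b: Write) -> Write:
    """Pick the write with the larger recorded timestamp."""
    return a if a.timestamp >= b.timestamp else b


journal_write = Write("journal", "10.0.0.1", true_time=100.0, clock_offset=0.0)
# The admin's change happens later, but the agent's clock runs two minutes slow.
admin_write = Write("agent", "10.0.0.9", true_time=160.0, clock_offset=-120.0)

winner = last_write_wins(journal_write, admin_write)
```

Although the admin's change is the more recent one, its recorded timestamp (40) is lower than the journal's (100), so timestamp ordering would discard it.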

References

specs/invariants.md (CR1–CR5)
specs/failure-modes.md (F13: Merge conflict on reconnect, F14: Promote conflicts)
specs/assumptions.md (A-Q2: Partition conflict replay, A-C2: Timestamp ordering)
