ADR-007: No SSH — Pact Shell Replaces Remote Access
Status: Accepted
Context
SSH on diskless HPC nodes creates untracked, unaudited, unrestricted root shells. Every SSH session is a potential source of unacknowledged configuration drift. The drift detection system exists largely because SSH enables uncontrolled changes.
Decision
pact shell and pact exec are the sole remote access mechanisms for compute nodes. SSH is not installed on compute node images. BMC/Redfish console (via OpenCHAMI) is the out-of-band fallback for pact-agent failures.
Design
pact exec (single command)
pact exec node042 -- nvidia-smi
pact exec node042 -- dmesg --since "5 minutes ago"
pact exec node042 -- cat /etc/resolv.conf
- Authenticated via OIDC token
- Authorized against caller’s role + target node’s vCluster
- Command checked against whitelist (configurable per vCluster)
- stdout/stderr streamed back to caller
- Full command + output logged to journal
pact shell (interactive session)
pact shell node042
- Opens a restricted bash session on the node (not a custom shell)
- Authenticated + authorized (higher privilege than exec — separate permission)
- Restriction via environment control, not command parsing:
rbash(restricted bash): prevents changing PATH, running/absolute/paths, redirecting output to files- PATH limited to whitelisted commands via session-specific directory
PROMPT_COMMANDhook logs each executed command to pact audit- Optional mount namespace hides sensitive paths
- Session-level cgroup for resource limits
- State changes detected by the existing drift observer (eBPF + inotify + netlink) and trigger commit windows — the shell doesn’t pre-classify commands
- Session start/end recorded in journal
Why restricted bash, not a custom shell
Implementing a shell that interprets pipes, redirects, globbing, quoting,
subshells, environment variables, job control, and signal handling is
reimplementing bash — poorly. And parsing commands before execution to
classify them is a security problem: $(evil), backticks, eval, and
argument injection make pre-execution parsing unreliable.
Instead, pact controls what bash can reach (PATH, namespace, cgroup) and observes what happened (PROMPT_COMMAND audit, drift detection). Bash handles interactive shell semantics — it’s been doing that for 35 years.
Whitelist model
- Implemented as PATH restriction: only whitelisted binaries are symlinked into the session’s bin directory
- Default whitelist: common diagnostics (nvidia-smi, dmesg, lspci, ip, ss, cat, journalctl, mount, df, free, top, ps, lsmod, sysctl -a, etc.)
- Learning mode: “command not found” errors are captured by the agent, which suggests adding the command to the vCluster whitelist
- vCluster-scoped: regulated vClusters may have tighter whitelists
- Platform admins: broader PATH (but still logged via PROMPT_COMMAND)
State-changing command detection
The agent does not classify commands before execution in shell mode. Instead, the existing drift observer (eBPF probes, inotify, netlink) detects actual state changes and triggers commit windows. This is the same mechanism used for any other source of drift — the shell session is not special.
For pact exec (single commands), the agent does classify commands upfront because it controls the full invocation (no shell interpretation involved).
Fallback
When pact-agent is unresponsive:
- Admin uses OpenCHAMI/Manta to access BMC console (Redfish)
- BMC console provides regular bash (unrestricted, not pact-managed)
- Admin diagnoses and restarts pact-agent if needed
- Changes made via BMC appear as unattributed drift once agent recovers
- If the node is unrecoverable, admin triggers re-image via OpenCHAMI
Trade-offs
- Admins lose the flexibility of arbitrary SSH access
- Whitelist maintenance is ongoing operational work (mitigated by learning mode)
- Slightly higher latency than direct SSH for some operations
- Requires trust in pact-agent reliability (mitigated by BMC fallback)
- rbash restrictions can be bypassed by some binaries (e.g. vi, python, less
with
!cmd) — whitelisted commands must be audited for shell escape vectors
Security Benefits
- All remote access is authenticated (OIDC) and authorized (RBAC)
- Every command is logged with authenticated identity
- State changes are tracked and require commitment
- No unrestricted root shell — pact controls the environment via PATH, rbash, and optional mount namespace
- Attack surface reduced: no sshd, no SSH key management, no SSH vulnerabilities