
Getting Started

Overview

Lattice is a distributed workload scheduler for HPC and AI infrastructure. It schedules both batch jobs (training runs, simulations) and long-running services (inference endpoints, monitoring) on shared GPU-accelerated clusters.

If you’re coming from Slurm, most concepts map directly — see the Slurm migration guide for a quick comparison.

Prerequisites

  • A running Lattice cluster (ask your admin for the API endpoint)
  • The lattice CLI installed on your workstation or login node
  • Your tenant credentials (OIDC token or mTLS certificate)

Installing the CLI

# Determine architecture (x86_64 passes through unchanged; aarch64 maps to arm64)
ARCH=$(uname -m | sed 's/aarch64/arm64/')

# Download from GitHub Releases
curl -sSfL "https://github.com/witlox/lattice/releases/latest/download/lattice-${ARCH}.tar.gz" | tar xz
sudo mv lattice /usr/local/bin/

# Or build from source (run inside a clone of the repository)
cargo build --release -p lattice-cli
sudo cp target/release/lattice /usr/local/bin/

Configuration

Create ~/.config/lattice/config.yaml:

endpoint: "lattice-api.example.com:50051"
tenant: "my-team"
# Optional: default vCluster
vcluster: "gpu-batch"
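If you are scripting the setup, the same file can be written from the shell. This is a minimal sketch; the endpoint and tenant values are the placeholders from above, so substitute your own:

```shell
# Write the config file shown above (values are placeholders for your cluster)
mkdir -p ~/.config/lattice
cat > ~/.config/lattice/config.yaml <<'EOF'
endpoint: "lattice-api.example.com:50051"
tenant: "my-team"
EOF
```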

Or use environment variables:

export LATTICE_ENDPOINT="lattice-api.example.com:50051"
export LATTICE_TENANT="my-team"

Your First Job

Submit a batch script

lattice submit train.sh
# Submitted allocation a1b2c3d4
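Here train.sh is an ordinary executable script. A minimal sketch of one, where the python invocation is a placeholder for your actual workload:

```shell
# Create a minimal batch script; the python command is a placeholder workload
cat > train.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
echo "training on $(hostname)"
python train.py --epochs 100
EOF
chmod +x train.sh
bash -n train.sh   # syntax-check the script before submitting
```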

Check status

lattice status
# ID        NAME           STATE    NODES  WALLTIME   ELAPSED    VCLUSTER
# a1b2c3d4  train.sh       Running  4      24:00:00   00:12:34   gpu-batch
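Because the output is a whitespace-separated table, standard text tools can extract fields from it. This sketch operates on the sample row above rather than a live cluster:

```shell
# Extract the STATE column (3rd field) from a status row
row='a1b2c3d4  train.sh       Running  4      24:00:00   00:12:34   gpu-batch'
state=$(echo "$row" | awk '{print $3}')
echo "$state"   # Running
```

Against a live cluster, the equivalent would be piping `lattice status` into the same `awk` extraction.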

View logs

lattice logs a1b2c3d4
# [2026-03-05T10:00:12Z] Epoch 1/100, loss=2.341
# [2026-03-05T10:01:45Z] Epoch 2/100, loss=1.892
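Logs go to stdout, so they pipe cleanly into text tools. For example, pulling the loss value out of a line in the format shown above:

```shell
# Parse the loss value from a log line in the format shown above
line='[2026-03-05T10:01:45Z] Epoch 2/100, loss=1.892'
loss=${line##*loss=}   # strip everything up to and including "loss="
echo "$loss"   # 1.892
```

On a live allocation, `lattice logs a1b2c3d4 | grep loss=` filters for the same lines.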

Cancel a job

lattice cancel a1b2c3d4
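Since `lattice status` prints one allocation per row with the ID in the first column, bulk cancellation can be scripted. This sketch extracts IDs from a captured table; the second allocation row is made up for illustration:

```shell
# Pull allocation IDs (column 1, skipping the header) from a status table.
# The second data row (e5f6a7b8) is a made-up example allocation.
table='ID        NAME      STATE    NODES
a1b2c3d4  train.sh  Running  4
e5f6a7b8  eval.sh   Pending  1'
ids=$(printf '%s\n' "$table" | awk 'NR > 1 {print $1}')
printf '%s\n' "$ids"
```

Piping those IDs through `xargs -n1 lattice cancel` would cancel every listed allocation at once.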

Next Steps