Architecture

JACO is a multi-node container orchestrator built on hashicorp/raft, embedded Caddy, WireGuard, and per-(deployment, network) docker bridges with nftables-enforced isolation. It ships as two binaries:

  • jacod — the long-running daemon, managed by systemd. Listens on a unix socket for local CLI control and on TCP for peer + remote control.
  • jaco — the operator and developer CLI. Talks to a local jacod over the unix socket and to peer jacods over TCP for cross-host control.

This page is the architectural overview.

Verticals

Every node runs the same set of verticals inside one jacod process. Each has its own package doc under internal/.

vertical       | responsibility
control-plane  | raft group, replicated state machine, command admission, watch fan-out, audit log, cluster CA, leader-only voter-set reconciler (Cluster lifecycle → Voter-set policy)
scheduler      | leader-only desired-state reconciler; placement, rolling updates, drain, restart-after-3, pressure-based rebalancer (ADR 0002)
runtime        | per-node docker engine driver; image pull, container lifecycle, healthcheck observation, log tail, cgroup v2 pressure collector
discovery      | per-node bridges, /24 IPAM, WireGuard mesh, nftables isolation, per-bridge DNS
ingress        | embedded Caddy on :80, :443; ACME issuance + renewal via raft-backed CertMagic storage
daemon         | jacod itself: config loading, lifecycle, goroutine orchestration, admission gating
cli            | operator + developer subcommands
packaging      | release pipeline, install, jaco self-upgrade

Data flow at a glance

  1. The CLI submits a write (e.g. Deploy.Apply) to any node.
  2. The admission interceptor resolves the bearer token to an identity (or trusts the unix-socket peer) and attaches it to the request context.
  3. The handler validates the payload, builds a Command{} proto, and submits it to raft. Non-leaders forward to the leader via Internal.Submit.
  4. Raft replicates the command to a majority and applies it on every node's FSM. The FSM mutates the typed entity store and writes an AuditEvent.
  5. The watch broker publishes a typed Event<T> to every subscriber (scheduler, runtime, ingress, discovery, jaco status -w).
  6. Subscribers react: the scheduler diffs ReplicaDesired, the runtime starts/stops containers, ingress rebuilds Caddy config, discovery materializes bridges and DNS responders.
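The apply and fan-out steps (4 to 6) can be sketched as a toy, single-process model. This is an illustration under stated assumptions, not JACO's code: the Command, Event, and FSM types and their fields are hypothetical, and raft replication is replaced by a direct call to Apply.

```go
package main

import "fmt"

// Command is a hypothetical stand-in for the Command{} proto.
type Command struct {
	Actor      string // identity resolved by the admission step
	Kind       string // e.g. "Deploy.Apply"
	Deployment string
}

// Event is a hypothetical stand-in for the typed Event<T> the watch broker publishes.
type Event struct {
	Kind       string
	Deployment string
}

// FSM holds a toy entity store, an audit log, and the watch subscribers.
type FSM struct {
	deployments map[string]bool
	audit       []string
	subscribers []func(Event)
}

// NewFSM returns an empty toy state machine.
func NewFSM() *FSM { return &FSM{deployments: map[string]bool{}} }

// Watch registers a subscriber (scheduler, runtime, ingress, ...).
func (f *FSM) Watch(fn func(Event)) { f.subscribers = append(f.subscribers, fn) }

// Apply models what runs on every node once a command is committed:
// mutate the store, append an audit record, then notify subscribers.
func (f *FSM) Apply(c Command) {
	f.deployments[c.Deployment] = true
	f.audit = append(f.audit, fmt.Sprintf("%s %s %s", c.Actor, c.Kind, c.Deployment))
	ev := Event{Kind: c.Kind, Deployment: c.Deployment}
	for _, fn := range f.subscribers {
		fn(ev)
	}
}

func main() {
	fsm := NewFSM()
	fsm.Watch(func(e Event) { fmt.Println("scheduler saw", e.Kind, e.Deployment) })
	// In the real flow this call happens only after raft has replicated the command.
	fsm.Apply(Command{Actor: "alice", Kind: "Deploy.Apply", Deployment: "web"})
	fmt.Println(fsm.audit)
}
```

The sketch calls subscribers synchronously to stay short; it makes no claim about how the actual watch broker delivers events.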

See Cluster lifecycle, Scheduling, and Status and errors for the moving parts of that flow.

Replicated state

Canonical entities held in the raft FSM (see proto/jaco/v1/entities.proto):

  • ClusterMeta — cluster id, CA cert, CA key (singleton).
  • Node — one per cluster member; hostname, addresses, WG pubkey, status, plus the latest gossiped CPU + memory pressure sample (cpu_pressure, memory_pressure, last_pressure_at) consumed by the rebalancer.
  • Deployment — one per jaco apply; carries the literal jaco.yaml + compose bytes plus the parsed ServiceSpec list.
  • ReplicaDesired — one per <deployment, service, index>; the scheduler's writable view.
  • ReplicaObserved — one per replica; the runtime writes state transitions (pending, pulling, running, …) back through here.
  • Route — HTTP(S) ingress entries.
  • TCPRoute — raw-TCP listeners derived from compose ports:.
  • Cert + CertBlob + ChallengeToken — TLS material for managed domains.
  • Token — operator-token records (identity + hashed secret).
  • JoinToken — single-use cluster-membership tokens.
  • Subnet — per-(deployment, network, host) /24 allocation.
  • RolloutPlan, ReplicaCounter, RestartCounter — scheduler bookkeeping.
  • AuditEvent — typed audit log.

The set is closed: there is no plugin mechanism for new entities in v1.
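To make the Subnet entity concrete, here is a minimal sketch of a per-(deployment, network, host) /24 allocator. It is an assumption-laden illustration: the SubnetKey and Allocator types, the lowest-free-first policy, and the 10.42.0.0/16 pool are all hypothetical and not taken from JACO.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// SubnetKey mirrors the per-(deployment, network, host) key described above.
type SubnetKey struct {
	Deployment, Network, Host string
}

// Allocator hands out /24 prefixes from a wider IPv4 pool (hypothetical type).
type Allocator struct {
	pool  netip.Prefix
	byKey map[SubnetKey]netip.Prefix
	used  map[netip.Prefix]bool
}

// NewAllocator parses the pool, which must be an IPv4 prefix of /24 or wider.
func NewAllocator(pool string) (*Allocator, error) {
	p, err := netip.ParsePrefix(pool)
	if err != nil {
		return nil, err
	}
	if !p.Addr().Is4() || p.Bits() > 24 {
		return nil, errors.New("pool must be an IPv4 prefix of /24 or wider")
	}
	return &Allocator{
		pool:  p.Masked(),
		byKey: map[SubnetKey]netip.Prefix{},
		used:  map[netip.Prefix]bool{},
	}, nil
}

// Allocate returns the /24 already bound to key, or binds the lowest free one.
func (a *Allocator) Allocate(key SubnetKey) (netip.Prefix, error) {
	if p, ok := a.byKey[key]; ok {
		return p, nil // idempotent: the same key always maps to the same subnet
	}
	base := a.pool.Addr().As4()
	start := uint32(base[0])<<16 | uint32(base[1])<<8 | uint32(base[2])
	count := 1 << (24 - a.pool.Bits()) // number of /24s inside the pool
	for i := 0; i < count; i++ {
		n := start + uint32(i)
		addr := netip.AddrFrom4([4]byte{byte(n >> 16), byte(n >> 8), byte(n), 0})
		p := netip.PrefixFrom(addr, 24)
		if !a.used[p] {
			a.used[p] = true
			a.byKey[key] = p
			return p, nil
		}
	}
	return netip.Prefix{}, errors.New("pool exhausted")
}

func main() {
	a, err := NewAllocator("10.42.0.0/16") // assumed pool, for illustration only
	if err != nil {
		panic(err)
	}
	p, _ := a.Allocate(SubnetKey{"web", "default", "node-a"})
	fmt.Println(p)
}
```

Keying on the full triple means the same deployment network gets a distinct /24 on each host, which matches the entity description; how the real allocator picks prefixes is not specified here.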

Per-node gossip

Each daemon ticks pressureHeartbeat (cadence node_status_interval, default 30 s) and gossips a NodeStatusUpdate{IncludePressure: true} carrying its local cgroup v2 + /proc/meminfo utilisation. The FSM patches the heartbeating node's cpu_pressure, memory_pressure, and last_pressure_at fields without touching status (a pressure-only heartbeat leaves status under the membership / firewall reconciler's control). The leader's rebalancer reads the patched fields through a freshness-gated StateBackedSource. See Scheduling → Pressure-based rebalancing.

Project status

Tagged releases run through v0.2.1 and are functional for single-host and multi-host clusters via the two-binary path described above. The earlier open gaps are now implemented:

  • Cross-host gRPC TLS — the listener serves a node certificate signed by the cluster CA; the CLI and peer daemons verify against the CA (cert pinning), with the operator bearer token authenticating the caller on top.
  • Follower → leader forwarding of ReplicaObserved updates.
  • The Caddy v2 ingress reload loop integrated with the rebuild debounce window.
  • Rollout state-machine integration with the scheduler's reconcile.
  • The drain step machine for jaco node remove.
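As an illustration of the CA verification mentioned in the first item, a Go client can trust only the cluster CA by building a dedicated certificate pool. This is a sketch using the standard crypto/tls and crypto/x509 packages; ClientTLSConfig and demoCAPEM are hypothetical helpers, the throwaway CA exists only to make the example runnable, and the bearer token (sent separately with each request) is not shown.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// ClientTLSConfig trusts only the given CA, so a peer certificate that was
// not signed by it fails the handshake. The system roots are not consulted.
func ClientTLSConfig(caPEM []byte, serverName string) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificate found in PEM input")
	}
	return &tls.Config{
		RootCAs:    pool,
		ServerName: serverName,
		MinVersion: tls.VersionTLS12,
	}, nil
}

// demoCAPEM creates a throwaway self-signed CA, standing in for the cluster CA.
func demoCAPEM() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "demo cluster CA"},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		KeyUsage:              x509.KeyUsageCertSign,
		BasicConstraintsValid: true,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	caPEM, err := demoCAPEM()
	if err != nil {
		panic(err)
	}
	cfg, err := ClientTLSConfig(caPEM, "node-a") // "node-a" is a placeholder name
	if err != nil {
		panic(err)
	}
	fmt.Println("verifying peers as", cfg.ServerName)
}
```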

Known remaining items (this is the canonical list; other pages link here instead of repeating it): the raft transport (:7001) is still plaintext TCP, so run it over a private network or overlay you control. In addition, a few bootstrap hops (a node join before it holds the CA, and follower → leader submit/log forwarding) negotiate TLS without verifying the peer.

A handful of CLI subcommands (rollback, delete, token *, node list) currently require --server; the unix-socket path for those RPCs is planned. See the CLI pages for the exact contract today.

See also