This page covers the most frequent problems you may encounter when setting up and running orun nodes, along with the steps to diagnose and fix each one.
Bootstrap opens an SSH connection to the remote host using the key you provided. If it cannot connect, check the following:
  • SSH key path: confirm the private key file exists at the path you entered and that it corresponds to an authorized key on the remote host.
  • Host reachability: verify the IP address or hostname is correct and the host is online.
  • SSH user permissions: the user must have sudo or root access to install packages, copy files, and create systemd units.
Test the connection manually before re-running bootstrap:
ssh -i ~/.ssh/id_ed25519 root@<host>
If this command succeeds, bootstrap should be able to connect. If it fails, fix the SSH configuration first.
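The checks above can be scripted as a small preflight. This is a minimal sketch, not part of orun: the function names `check_key_file` and `check_ssh` are made up for illustration, and the remote command assumes the SSH user is either root or has passwordless sudo.

```shell
# Illustrative preflight before re-running bootstrap (hypothetical helpers, not orun commands).

check_key_file() {
  # Usage: check_key_file <path-to-private-key>
  # Returns 1 if the file is missing, 2 if it exists but cannot be read.
  local key="$1"
  if [ ! -f "$key" ]; then
    echo "missing: $key" >&2
    return 1
  fi
  if [ ! -r "$key" ]; then
    echo "unreadable: $key" >&2
    return 2
  fi
  return 0
}

check_ssh() {
  # Usage: check_ssh <path-to-private-key> <user@host>
  # BatchMode disables password prompts, so a key problem fails immediately
  # instead of hanging. The remote command succeeds only for root or for a
  # user who can run sudo without a password (an assumption of this sketch).
  ssh -i "$1" -o BatchMode=yes -o ConnectTimeout=5 "$2" \
    '[ "$(id -u)" -eq 0 ] || sudo -n true'
}

# Example (replace <host> with your node):
# check_key_file ~/.ssh/id_ed25519 && check_ssh ~/.ssh/id_ed25519 root@<host>
```

A non-zero exit from `check_ssh` covers all three cases in the list: an unauthorized key, an unreachable host, or a user without root privileges.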
orun polls the manifest git repository on a configurable interval (default: 5 seconds). If changes are not appearing, check the following:
  • Deploy key: the node needs read access to the manifest repository. Make sure a deploy key is configured in your git host and the corresponding private key is available on the node.
  • Branch name: the gitBranch field in your node manifest must match the branch you are pushing changes to.
  • systemd service: confirm the orun service is running on the node:
    systemctl status orun
    
  • Logs: inspect the agent logs for git sync errors:
    journalctl -u orun -f
    
With the default 5-second poll interval, a pushed commit should reach the node within a few seconds, provided the fetch succeeds.
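To rule out the deploy key and the branch name in one step, you can ask the remote for the branch directly from the node. The sketch below uses only standard git; the helper name `branch_exists`, the repository URL, and the key path are placeholders, not orun settings.

```shell
# Illustrative check: can this machine read the repository, and does the branch exist?

branch_exists() {
  # Usage: branch_exists <repo-url-or-path> <branch>
  # With --exit-code, git ls-remote exits with status 2 when no matching ref
  # is found, and with another non-zero status when the repository cannot
  # be read at all (for example, a rejected deploy key).
  git ls-remote --exit-code "$1" "refs/heads/$2" >/dev/null 2>&1
}

# Example (placeholder URL, branch, and key path):
# GIT_SSH_COMMAND="ssh -i /path/to/deploy_key -o IdentitiesOnly=yes" \
#   branch_exists git@example.com:org/manifests.git main \
#   && echo "branch is readable" || echo "check deploy key and gitBranch"
```

Run it as the same user the orun service runs as, so that it picks up the same key and SSH configuration.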
When a deployment enters the errored state, the most common causes are:
  • Image not pullable: verify the image name and tag are correct and that the node can reach the registry. Check for typos and confirm the image exists.
  • Port conflict: another process on the host may already be using the host port you configured. Check for conflicts with ss -tlnp.
  • Health probe misconfiguration: if you have a readiness or liveness probe configured, make sure the URL is correct and the container actually exposes that endpoint.
Check the orun agent logs for the Docker error message:
journalctl -u orun -f
You can also query the status API to see the error field on the failing deployment.
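For the port-conflict case, the output of ss can be filtered for a single port. A minimal sketch: `port_listed` is a hypothetical helper that reads `ss -tln` output on standard input, and 8080 stands in for whatever host port your deployment uses.

```shell
# Illustrative filter over `ss -tln` output (hypothetical helper, not an orun command).

port_listed() {
  # Usage: ss -tln | port_listed <port>
  # Skips the header line, then compares the port at the end of the
  # "Local Address:Port" column (field 4) with the requested port.
  # Splitting on ":" and taking the last piece also handles IPv6
  # addresses such as [::]:22.
  awk -v port="$1" '
    NR > 1 { n = split($4, a, ":"); if (a[n] == port) found = 1 }
    END    { exit !found }
  '
}

# Example:
# ss -tln | port_listed 8080 && echo "host port 8080 is already in use"
# To see which process owns it, run as root: ss -tlnp | grep ':8080 '
```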
orun uses the embedded Caddy server to obtain TLS certificates via Let’s Encrypt. Certificate provisioning will fail if:
  • DNS is not pointing to the node: Let’s Encrypt performs an HTTP-01 challenge, which requires that the domain resolves to the node’s public IP. Update your DNS records and wait for propagation before enabling ssl: true.
  • Ports 80 and 443 are blocked: the node’s firewall must allow inbound traffic on both ports. Check your firewall rules and any cloud provider security groups.
  • Domain is misspelled: double-check the ingress.domain value in your Service manifest matches the DNS record exactly.
Let’s Encrypt enforces rate limits on certificate issuance. Repeated failed provisioning attempts can get you temporarily rate-limited, so fix the root cause before retrying.
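Before retrying, you can confirm that the domain resolves to the node without triggering another issuance attempt. The sketch below assumes `dig` is installed; `dns_matches` is a hypothetical helper, and the domain and IP address are example values.

```shell
# Illustrative DNS check (hypothetical helper, not an orun command).

dns_matches() {
  # Usage: dig +short A <domain> | dns_matches <expected-ip>
  # Reads one record per line and succeeds only if some line is exactly
  # the expected address (-F: fixed string, -x: whole line, -q: quiet).
  # CNAME lines that dig may print before the address are ignored.
  grep -Fxq -- "$1"
}

# Example (replace with your ingress.domain and the node's public IP):
# dig +short A app.example.com | dns_matches 203.0.113.10 \
#   && echo "DNS points at the node" || echo "DNS mismatch, do not retry yet"
```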
orun uses SOPS with age encryption for secrets. If encrypted environment variables are not being decrypted, check the following:
  • Key file location: the age private key must exist at /opt/orun/keys/age.key by default. If you store it elsewhere, set the SOPS_AGE_KEY_FILE environment variable or pass --age-key-path to orun start.
  • Correct key: confirm that the key on the node is the one the secrets file was encrypted for. If the file was encrypted for a different key, the decryptor will fail silently on those values.
  • Agent logs: orun logs a warning at startup if the decryptor is unavailable:
    journalctl -u orun | grep -i decrypt
    
If the decryptor is unavailable, orun keeps running. Encrypted environment variables stay in their encrypted form and are passed as-is to the container, so your application cannot use them until the key is available and the agent has been restarted.
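The key checks can be done by hand on the node. A minimal sketch: `key_file_ok` is a hypothetical helper, and the commented commands assume the `age-keygen` tool is installed and that your secrets file is a SOPS YAML file (the file name shown is a placeholder).

```shell
# Illustrative key check (hypothetical helper, not an orun command).

key_file_ok() {
  # Usage: key_file_ok [path]
  # Falls back to SOPS_AGE_KEY_FILE, then to the default path from this page.
  # An age identity file contains a line starting with AGE-SECRET-KEY-.
  local key="${1:-${SOPS_AGE_KEY_FILE:-/opt/orun/keys/age.key}}"
  [ -r "$key" ] && grep -q '^AGE-SECRET-KEY-' "$key"
}

# Example:
# key_file_ok || echo "age key missing or unreadable"
#
# To confirm it is the right key, derive the public key and look for it
# among the recipients recorded in the encrypted file:
# pub="$(age-keygen -y /opt/orun/keys/age.key)"
# grep -F "$pub" secrets.enc.yaml || echo "file was not encrypted for this key"
```

Run the check as the user the agent runs as, since file permissions are a common reason the key is present but unreadable.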
Local builds require both the source code and a Dockerfile to be present on the node in the expected location. Check the following:
  • codeDir path: the --code-dir flag (default: /var/lib/orun/code) must point to the directory containing your source code on the node. Confirm the path exists and is populated.
  • Build context subdirectory: the build.context field in your deployment manifest is a subdirectory relative to codeDir. Make sure that subdirectory exists on the node.
  • Dockerfile: confirm a Dockerfile exists at the expected path within the build context.
Set buildMode: watch in your deployment manifest to trigger automatic rebuilds whenever source files change. With build-once, orun rebuilds only when the manifest itself changes, not when source files are updated.
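The three path checks can be combined into one command to run on the node. This is a sketch under assumptions: `check_build_context` is a hypothetical helper, the Dockerfile is assumed to be named `Dockerfile` and to sit at the root of the build context, and `services/api` is an example value for build.context.

```shell
# Illustrative layout check for local builds (hypothetical helper, not an orun command).

check_build_context() {
  # Usage: check_build_context <code-dir> <context-subdir>
  # Returns 1, 2, or 3 depending on which level of the layout is missing.
  local ctx="$1/$2"
  [ -d "$1" ] || { echo "codeDir not found: $1" >&2; return 1; }
  [ -d "$ctx" ] || { echo "build context not found: $ctx" >&2; return 2; }
  # Assumption: the Dockerfile lives at the top of the build context.
  [ -f "$ctx/Dockerfile" ] || { echo "no Dockerfile in: $ctx" >&2; return 3; }
}

# Example (default --code-dir, example build.context):
# check_build_context /var/lib/orun/code services/api && echo "layout looks fine"
```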
The status API binds to localhost:9100 by default. If you cannot reach it, check the following:
  • Agent is running: confirm the orun systemd service is active:
    systemctl status orun
    
  • Port is not blocked: although the API only listens on localhost, confirm that the configured port (--status-port, default 9100) is not blocked by a local firewall rule such as iptables or nftables.
  • Remote access requires a tunnel: the API does not listen on public interfaces. To query it from your local machine, open an SSH tunnel first:
    ssh -L 9100:localhost:9100 root@<host>
    
    Then query the API locally:
    curl http://localhost:9100/healthz