Setting up Vivaria using Docker Compose
We've tested that this works on Linux and macOS. Windows may require some additional setup.
Prerequisites
Container runtime installation
- macOS: We recommend OrbStack for better filesystem performance and lower memory usage compared to Docker Desktop.
- Linux & Windows: Use the official Docker Installation Guide.
Install script (macOS and Linux only)
curl -fsSL https://raw.githubusercontent.com/METR/vivaria/main/scripts/install.sh | bash -
Manual setup (macOS, Linux and Windows)
- Clone Vivaria: https://github.com/METR/vivaria
- Enter the vivaria directory:
cd vivaria
- Generate
.env.db
and.env.server
- macOS/Linux:
./scripts/setup-docker-compose.sh
- Windows PowerShell:
.\scripts\setup-docker-compose.ps1
- macOS/Linux:
- Add an LLM provider's API key to
.env.server
(make sure to only use one provider) - Start Vivaria:
docker compose up --pull always --detach --wait
(make sure to setVIVARIA_DOCKER_GID
if needed, see here)
Make sure Vivaria is running correctly
Check that the containers are running:
docker compose ps
You should at least have these containers (their names usually end with -1
):
vivaria-server
vivaria-database
vivaria-ui
vivaria-background-process-runner
If you still have vivaria-run-migrations
and you don't yet have vivaria-server
, then you might have to wait 20 seconds, or perhaps look at the logs to see if the migrations are stuck (see this section below).
Visit the UI
Open https://localhost:4000 in your browser.
- Certificate errors are expected since Vivaria generates a self-signed certificate for local use.
- You'll be asked to provide an access token and ID token (get them from
.env.server
)
Install the viv CLI
You can use the viv CLI to start task environments and agent runs.
Create a virtualenv
Make sure you have python3.11 or above used in your shell
If you need a newer python version and you're using Mac or Linux, we recommend using pyenv.
Create virtualenv: macOS/Linux
mkdir -p ~/.venvs && python3 -m venv ~/.venvs/viv && source ~/.venvs/viv/bin/activate
Create virtualenv: Windows PowerShell
mkdir $home\.venvs && python3 -m venv $home\.venvs\viv && & "$home\.venvs\viv\scripts\activate.ps1"
Install the CLI and its dependencies
pip install -e cli
Configure the CLI to use Docker Compose
Optional: backup the previous configuration
If your CLI is already installed and pointing somewhere else, you can back up the current
configuration, which is in ~/.config/viv-cli/config.json
.
Configure the CLI: macOS/Linux
./scripts/configure-cli-for-docker-compose.sh
Configure the CLI: Windows PowerShell
.\scripts\configure-cli-for-docker-compose.ps1
SSH
To have Vivaria give you access to task environments and agent containers via SSH:
viv register-ssh-public-key path/to/ssh/public/key
This will let you run viv ssh
and viv task ssh
to access the containers. Alternatively, you can use docker exec
to access the containers directly.
Next steps
Known issues
Rootless Docker mode in Linux
On Linux, Vivaria expects a Docker socket at /var/run/docker.sock
. If you're running Docker in rootless mode, create a symlink from there to the actual Docker socket location.
Docker GID on macOS/Linux (logs say permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock
)
On macOS/Linux, you may need to make sure VIVARIA_DOCKER_GID
matches your system's number before running docker compose up
. On Linux you can get this using getent group docker
. Once you have the group ID, either export it as an environment variable or run docker like this:
VIVARIA_DOCKER_GID=<number> docker compose up --pull always --detach --wait
macOS Docker Desktop and SSH access
On macOS, Docker Desktop doesn't allow direct access to containers using their IP addresses on Docker networks. Therefore, viv ssh/scp/code
and viv task ssh/scp/code
don't work out of the box. docker-compose.dev.yml
defines a jumphost container on macOS to get around this. For it to work correctly, you need to provide it with a public key for authentication.
- By default it assumes your public key is at
~/.ssh/id_rsa.pub
, but you can override this by settingSSH_PUBLIC_KEY_PATH
in.env
. - Generate an SSH key: You can use the GitHub tutorial. However:
- You don't need to "Add the SSH public key to your account on GitHub".
- You do need
~/.ssh/id_ed25519
to exist and be added to your keychain.
- Add
SSH_PUBLIC_KEY_PATH=~/.ssh/id_ed25519
to.env
- This isn't the default because of legacy reasons.
A script hangs or you get the error The system cannot find the file specified
Make sure the Docker Engine/daemon is running and not paused or in "Resource Saver" mode. (Did you install Docker in the recommended way above?)
The migration container gets an error when it tries to run
Try removing the DB container:
docker container ls --all # get the container name, e.g. vivaria-database-1
docker rm vivaria-database-1 --force
Then try running docker compose up
again.
If that didn't work, you can remove the Docker volumes too, which would also reset the DB:
docker compose down --volumes
Why: If setup-docker-compose.sh
ran after the DB container was created, it might have randomized a new
DB_READONLY_PASSWORD
(or maybe something else randomized for the DB), and if the DB container
wasn't recreated, then it might still be using the old password.
Browser error: Unable to transform response from server
Clear your browser's local storage for https://localhost:4000, then refresh https://localhost:4000 and re-enter the access and ID tokens from .env.server
.
Can't start runs with CLI because x-evals-token is incorrect
If you can access the web interface at https://localhost:4000, copy the evals token using the button in the top right corner. Then set it with the CLI:
viv config set evalsToken <token>
overlay over xfs with 'pquota' mount option
If you get Error response from daemon: --storage-opt is supported only for overlay over xfs with 'pquota' mount option
, this can be fixed by setting the TASK_ENVIRONMENT_STORAGE_GB
environment variable to -1 in .env.server
. See the Agent sandboxing options for details.
agents can only use this model on full_internet tasks if the run is interactive
For full_internet
tasks without human supervision, you'll need to grant the model you're using access by putting it in the NON_INTERVENTION_FULL_INTERNET_MODELS
environment variable, which you can set in .env.server
. See the Agent sandboxing options for details.