Skip to content

How to run an agent on a task

To run an agent on a specific task, use the viv run command.

A simple example

For example, to run the modular-public agent on the count_odds example task:

# Clone the modular-public example agent
cd ..
git clone https://github.com/poking-agents/modular-public
cd vivaria

# Use the `viv run` command to run the agent on count_odds
viv run count_odds/main --task-family-path task-standard/examples/count_odds --agent-path ../modular-public

Running your own agent and task

There are two ways to run agents on tasks, depending on if your Vivaria instance has Git support set up or not.

Upload your task and agent directly to Vivaria

This works whether or not your Vivaria instance has Git support, and is the method used in our example above.

viv run count_odds/main --task-family-path path/to/count_odds --agent-path path/to/my-agent-repo

Vivaria will create two zip files, one containing the task code in the folder path/to/count_odds and another containing the agent in path/to/my-agent-repo. It'll upload both zip files to Vivaria, which will start a task environment based on the task code and run the agent in it.

Push your agent to a Git remote

This only works if your Vivaria instance has Git support.

cd path/to/my-agent-repo
viv run count_odds/main

Vivaria will commit and push any uncommitted agent changes from your computer to your Git hosting service. Then, it'll look up the task count_odds/main in your Vivaria instance's tasks Git repo, start a task environment based on that task, and run your agent in it.

Running the specific version of a task

If the task repository you're running from has version tags enabled, then you can run specific versions of tasks. The current task versioning scheme is per family, and the required tag format is [email protected].

To run a task on a specific version, you can run with

viv run count_odds/main@count_odds/v1.2.3

Other features

Run viv run --help to see a full list of flags for viv run.

Intervention mode

You can use viv run <task> -i (or --intervention) to enable human input on certain agent actions, if the agent code supports this by calling the rate_options or get_input functions from pyhooks.

Run-time agent settings arguments

You can pass arbitrary run-time arguments to your agent in several ways. The following are equivalent:

viv run count_odds/main --agent_settings_override="\"{\"actor\": {\"model\": \"gpt-4o\"}\""
echo "{\"actor\": {\"model\": \"gpt-4o\"}" > settings.json
viv run count_odds/main --agent_settings_override "settings_override.json"

You can also store this information inside a manifest.json file inside the agent (see modular-public/manifest.json for an example)

// manifest.json
{
  "defaultSettingsPack": "my_settings",
  "settingsPacks": {
    "my_settings": {
      "actor": {
        "model": "gpt-4o"
      }
    },
    ...
  }
}

And refer to it like this:

viv run count_odds/main --agent_settings_pack my_settings

Lastly, you can an agent from a previous state. You can copy the state by clicking "Copy agent state json" in the Vivaria UI and then pasting it into some file (state.json in this example). the agent will then reload this state if you use the following argument:

viv run count_odds/main --agent_starting_state_file state.json

If you use multiple of these options, the override takes highest priority, then the settings pack, then the agent state, and lastly the default settings pack.