[Arch] Test EventStreamRuntime to ensure its feature parity with ServerRuntime (#3157)

* Remove global config from memory

* Remove runtime global config

* Remove from storage

* Remove global config

* Fix event stream tests

* Fix sandbox issue

* Change config

* Removed transferred tests

* Add swe env box

* Fixes on testing

* Fixed some tests

* Merge with stashed changes

* Fix typing

* Fix ipython test

* Revive function

* Make temp_dir fixture

* Remove test to avoid circular import

* fix eventstream filestore for test_runtime

* fix parse arg issue that causes integration test to fail

* support swebench pull from custom namespace

* add back simple tests for runtime

* move multi-line bash tests to test_runtime;
support multi-line bash for esruntime;

* add testcase to handle PS2 prompt

* use bashlex for bash parsing to handle multi-line commands;
add testcases for multi-line commands

* revert ghcr runtime change

* Apply stash

* fix run as other user;
make test async;

* fix test runtime for run as od

* add run-as-devin to all the runtime tests

* handle the case when username is root

* move all run-as-devin tests from sandbox;
only test a few cases as a different user to save time;

* move over multi-line echo related tests to test_runtime

* fix user-specific jupyter by fixing the pypoetry virtualenv folder

* make plugin's init async;
chdir at initialization of jupyter plugin;
move ipy simple testcase to test runtime;

* support agentskills import in
move tests for jupyter pwd tests;
overload `add_env_vars` for EventStreamRuntime to update env var also in Jupyter;
make agentskills read env var lazily, in case env var is updated;

* fix ServerRuntime agentskills issue

* move agnostic image test to test_runtime

* merge runtime tests in CI

* fix enable auto lint as env var

* update warning message

* update warning message

* test for different container images

* change parsing output as debug

* add exception handling for update_pwd_decorator

* fix unit test indentation

* add plugins as default input to Runtime class;
remove init_sandbox_plugins;
implement add_env_var (include jupyter) in the base class;

* fix server runtime auto lint

* Revert "add exception handling for update_pwd_decorator"

This reverts commit 2b668b1506e02145cb8f87e321aad62febca3d50.

* tries to print debugging info for agentskills

* explicitly setting uid (try fix permission issue)

* Revert "tries to print debugging info for agentskills"

This reverts commit 8be4c86756f0e3fc62957b327ba2ac4999c419de.

* set sandbox user id during testing to hopefully fix the permission issue

* add browser tools for server runtime

* try to debug for old pwd

* update debug cmd

* only test agnostic runtime when TEST_RUNTIME is Server

* fix temp dir mkdir

* load TEST_RUNTIME at the beginning

* remove ipython tests

* only log to file when DEBUG

* default logging to project root

* temporarily remove log to file

* fix LLM logger dir

* fix logger

* make set pwd an optional aux action

* fix prev pwd

* fix infinite recursion

* simplify

* do not import the whole od library to avoid logger folder by jupyter

* fix browsing

* increase timeout

* attempt to fix agentskills yet again

* clean up in testcases, since CI may be run as non-root

* add _cause attribute for event.id

* remove parent

* add a bunch of debugging statements again for CI :(

* fix temp_dir fixture

* change all temp dir to follow pytest's tmp_path_factory

* remove extra bracket

* clean up error printing a bit

* jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization

* jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization

* add typing for tmp dir fixture

* clear the directory before running the test to avoid weird CI temp dir

* remove agnostic test case for server runtime

* Revert "remove agnostic test case for server runtime"

This reverts commit 30e2181c3fc1410e69596c2dcd06be01f1d016b3.

* disable agnostic tests in CI

* fix test

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Xingyao Wang 2024-07-31 04:30:59 +08:00 committed by GitHub
parent c8fd039173
commit bd68249fba
GPG Key ID: B5690EEEBB952194
26 changed files with 1308 additions and 1091 deletions

View File

@ -1,265 +0,0 @@
name: Build Publish and Test Runtime Image
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
on:
push:
branches:
- main
tags:
- '*'
pull_request:
workflow_dispatch:
inputs:
reason:
description: 'Reason for manual trigger'
required: true
default: ''
jobs:
ghcr_build_runtime:
runs-on: ubuntu-latest
outputs:
tags: ${{ steps.capture-tags.outputs.tags }}
permissions:
contents: read
packages: write
strategy:
matrix:
image: ["od_runtime"]
base_image: ["ubuntu:22.04"]
platform: ["amd64", "arm64"]
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
# this might remove tools that are actually needed,
# if set to "true" but frees about 6 GB
tool-cache: true
# all of these default to true, but feel free to set to
# "false" if necessary for your workflow
android: true
dotnet: true
haskell: true
large-packages: true
docker-images: false
swap-storage: true
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
- name: Install poetry via pipx
run: pipx install poetry
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
cache: "poetry"
- name: Install Python dependencies using Poetry
run: make install-python-dependencies
- name: Create source distribution and Dockerfile
run: poetry run python3 opendevin/runtime/utils/runtime_build.py --base_image ${{ matrix.base_image }} --build_folder containers/runtime
- name: Build and export image
id: build
run: ./containers/build.sh ${{ matrix.image }} ${{ github.repository_owner }} ${{ matrix.platform }}
- name: Capture tags
id: capture-tags
run: |
tags=$(cat tags.txt)
echo "tags=$tags"
echo "tags=$tags" >> $GITHUB_OUTPUT
- name: Upload Docker image as artifact
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.image }}-docker-image-${{ matrix.platform }}
path: /tmp/${{ matrix.image }}_image_${{ matrix.platform }}.tar
test-for-runtime:
name: Test for Runtime
runs-on: ubuntu-latest
needs: ghcr_build_runtime
env:
PERSIST_SANDBOX: "false"
steps:
- uses: actions/checkout@v4
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
# this might remove tools that are actually needed,
# when set to "true" but frees about 6 GB
tool-cache: true
# all of these default to true, but feel free to set to
# "false" if necessary for your workflow
android: true
dotnet: true
haskell: true
large-packages: true
swap-storage: true
- name: Install poetry via pipx
run: pipx install poetry
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
cache: "poetry"
- name: Install Python dependencies using Poetry
run: make install-python-dependencies
- name: Download Runtime Docker image
uses: actions/download-artifact@v4
with:
name: od_runtime-docker-image-amd64
path: /tmp/
- name: Load Runtime image and run runtime tests
run: |
# Load the Docker image and capture the output
output=$(docker load -i /tmp/od_runtime_image_amd64.tar)
# Extract the first image name from the output
image_name=$(echo "$output" | grep -oP 'Loaded image: \K.*' | head -n 1)
# Print the full name of the image
echo "Loaded Docker image: $image_name"
SANDBOX_CONTAINER_IMAGE=$image_name TEST_IN_CI=true poetry run pytest --cov=agenthub --cov=opendevin --cov-report=xml -s ./tests/unit/test_runtime.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
ghcr_push:
runs-on: ubuntu-latest
# don't push if runtime tests fail
needs: [ghcr_build_runtime, test-for-runtime]
if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/')
env:
tags: ${{ needs.ghcr_build_runtime.outputs.tags }}
permissions:
contents: read
packages: write
strategy:
matrix:
image: ["od_runtime"]
platform: ["amd64", "arm64"]
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
tool-cache: true
android: true
dotnet: true
haskell: true
large-packages: true
docker-images: false
swap-storage: true
- name: Login to GHCR
uses: docker/login-action@v2
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Download Docker images
uses: actions/download-artifact@v4
with:
name: ${{ matrix.image }}-docker-image-${{ matrix.platform }}
path: /tmp/${{ matrix.platform }}
- name: List downloaded files
run: |
ls -la /tmp/${{ matrix.platform }}
file /tmp/${{ matrix.platform }}/*
- name: Load images and push to registry
run: |
mv /tmp/${{ matrix.platform }}/${{ matrix.image }}_image_${{ matrix.platform }}.tar ./${{ matrix.image }}_image_${{ matrix.platform }}.tar
if ! loaded_image=$(docker load -i ${{ matrix.image }}_image_${{ matrix.platform }}.tar | grep "Loaded image:" | head -n 1 | awk '{print $3}'); then
echo "Failed to load Docker image"
exit 1
fi
echo "loaded image = $loaded_image"
tags=$(echo ${tags} | tr ' ' '\n')
image_name=$(echo "ghcr.io/${{ github.repository_owner }}/${{ matrix.image }}" | tr '[:upper:]' '[:lower:]')
echo "image name = $image_name"
for tag in $tags; do
echo "tag = $tag"
if [ -n "$image_name" ]; then
docker tag $loaded_image $image_name:${tag}_${{ matrix.platform }}
docker push $image_name:${tag}_${{ matrix.platform }}
else
echo "Skipping tag and push due to empty image_name"
fi
done
create_manifest:
runs-on: ubuntu-latest
needs: [ghcr_build_runtime, ghcr_push]
if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/')
env:
tags: ${{ needs.ghcr_build_runtime.outputs.tags }}
strategy:
matrix:
image: ["od_runtime"]
permissions:
contents: read
packages: write
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Login to GHCR
uses: docker/login-action@v2
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Create and push multi-platform manifest
run: |
image_name=$(echo "ghcr.io/${{ github.repository_owner }}/${{ matrix.image }}" | tr '[:upper:]' '[:lower:]')
echo "image name = $image_name"
tags=$(echo ${tags} | tr ' ' '\n')
for tag in $tags; do
echo "tag = $tag"
docker buildx imagetools create --tag $image_name:$tag \
$image_name:${tag}_amd64 \
$image_name:${tag}_arm64
done

View File

@ -1,4 +1,4 @@
name: Build Publish and Test Docker Image
name: Build Publish and Test Runtime Image
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
@ -77,14 +77,47 @@ jobs:
name: ${{ matrix.image }}-docker-image-${{ matrix.platform }}
path: /tmp/${{ matrix.image }}_image_${{ matrix.platform }}.tar
test-for-sandbox:
name: Test for Sandbox
ghcr_build_runtime:
runs-on: ubuntu-latest
needs: ghcr_build
env:
PERSIST_SANDBOX: "false"
outputs:
tags: ${{ steps.capture-tags.outputs.tags }}
permissions:
contents: read
packages: write
strategy:
matrix:
image: ["od_runtime"]
base_image: ["ubuntu:22.04"]
platform: ["amd64", "arm64"]
steps:
- uses: actions/checkout@v4
- name: Checkout
uses: actions/checkout@v4
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
# this might remove tools that are actually needed,
# if set to "true" but frees about 6 GB
tool-cache: true
# all of these default to true, but feel free to set to
# "false" if necessary for your workflow
android: true
dotnet: true
haskell: true
large-packages: true
docker-images: false
swap-storage: true
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
- name: Install poetry via pipx
run: pipx install poetry
@ -98,16 +131,89 @@ jobs:
- name: Install Python dependencies using Poetry
run: make install-python-dependencies
- name: Download sandbox Docker image
- name: Create source distribution and Dockerfile
run: poetry run python3 opendevin/runtime/utils/runtime_build.py --base_image ${{ matrix.base_image }} --build_folder containers/runtime
- name: Build and export image
id: build
run: ./containers/build.sh ${{ matrix.image }} ${{ github.repository_owner }} ${{ matrix.platform }}
- name: Capture tags
id: capture-tags
run: |
tags=$(cat tags.txt)
echo "tags=$tags"
echo "tags=$tags" >> $GITHUB_OUTPUT
- name: Upload Docker image as artifact
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.image }}-docker-image-${{ matrix.platform }}
path: /tmp/${{ matrix.image }}_image_${{ matrix.platform }}.tar
test_runtime:
name: Test Runtime
runs-on: ubuntu-latest
needs: [ghcr_build_runtime, ghcr_build]
env:
PERSIST_SANDBOX: "false"
strategy:
matrix:
runtime_type: ["eventstream", "server"]
steps:
- uses: actions/checkout@v4
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
# this might remove tools that are actually needed,
# when set to "true" but frees about 6 GB
tool-cache: true
# all of these default to true, but feel free to set to
# "false" if necessary for your workflow
android: true
dotnet: true
haskell: true
large-packages: true
swap-storage: true
- name: Install poetry via pipx
run: pipx install poetry
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
cache: "poetry"
- name: Install Python dependencies using Poetry
run: make install-python-dependencies
- name: Download Runtime Docker image
if: matrix.runtime_type == 'eventstream'
uses: actions/download-artifact@v4
with:
name: od_runtime-docker-image-amd64
path: /tmp/
- name: Download Sandbox Docker image
if: matrix.runtime_type == 'server'
uses: actions/download-artifact@v4
with:
name: sandbox-docker-image-amd64
path: /tmp/
- name: Load sandbox image and run sandbox tests
- name: Load Runtime image and run runtime tests
run: |
# Load the Docker image and capture the output
output=$(docker load -i /tmp/sandbox_image_amd64.tar)
if [ "${{ matrix.runtime_type }}" == "eventstream" ]; then
output=$(docker load -i /tmp/od_runtime_image_amd64.tar)
else
output=$(docker load -i /tmp/sandbox_image_amd64.tar)
fi
# Extract the first image name from the output
image_name=$(echo "$output" | grep -oP 'Loaded image: \K.*' | head -n 1)
@ -115,14 +221,14 @@ jobs:
# Print the full name of the image
echo "Loaded Docker image: $image_name"
SANDBOX_CONTAINER_IMAGE=$image_name TEST_IN_CI=true poetry run pytest --cov=agenthub --cov=opendevin --cov-report=xml -s ./tests/unit/test_sandbox.py
TEST_RUNTIME=${{ matrix.runtime_type }} SANDBOX_USER_ID=$(id -u) SANDBOX_CONTAINER_IMAGE=$image_name TEST_IN_CI=true poetry run pytest --cov=agenthub --cov=opendevin --cov-report=xml -s ./tests/unit/test_runtime.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
integration-tests-on-linux:
integration_tests_on_linux:
name: Integration Tests on Linux
runs-on: ubuntu-latest
needs: ghcr_build
@ -174,10 +280,11 @@ jobs:
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
ghcr_push:
runs-on: ubuntu-latest
# don't push if integration tests or sandbox tests fail
needs: [ghcr_build, integration-tests-on-linux, test-for-sandbox]
needs: [ghcr_build, test_runtime, integration_tests_on_linux]
if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/')
env:
@ -223,6 +330,78 @@ jobs:
docker push $image_name:${tag}_${{ matrix.platform }}
done
ghcr_push_runtime:
runs-on: ubuntu-latest
# don't push if runtime tests fail
needs: [ghcr_build_runtime, test_runtime, integration_tests_on_linux]
if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/')
env:
tags: ${{ needs.ghcr_build_runtime.outputs.tags }}
permissions:
contents: read
packages: write
strategy:
matrix:
image: ["od_runtime"]
platform: ["amd64", "arm64"]
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
tool-cache: true
android: true
dotnet: true
haskell: true
large-packages: true
docker-images: false
swap-storage: true
- name: Login to GHCR
uses: docker/login-action@v2
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Download Docker images
uses: actions/download-artifact@v4
with:
name: ${{ matrix.image }}-docker-image-${{ matrix.platform }}
path: /tmp/${{ matrix.platform }}
- name: List downloaded files
run: |
ls -la /tmp/${{ matrix.platform }}
file /tmp/${{ matrix.platform }}/*
- name: Load images and push to registry
run: |
mv /tmp/${{ matrix.platform }}/${{ matrix.image }}_image_${{ matrix.platform }}.tar ./${{ matrix.image }}_image_${{ matrix.platform }}.tar
if ! loaded_image=$(docker load -i ${{ matrix.image }}_image_${{ matrix.platform }}.tar | grep "Loaded image:" | head -n 1 | awk '{print $3}'); then
echo "Failed to load Docker image"
exit 1
fi
echo "loaded image = $loaded_image"
tags=$(echo ${tags} | tr ' ' '\n')
image_name=$(echo "ghcr.io/${{ github.repository_owner }}/${{ matrix.image }}" | tr '[:upper:]' '[:lower:]')
echo "image name = $image_name"
for tag in $tags; do
echo "tag = $tag"
if [ -n "$image_name" ]; then
docker tag $loaded_image $image_name:${tag}_${{ matrix.platform }}
docker push $image_name:${tag}_${{ matrix.platform }}
else
echo "Skipping tag and push due to empty image_name"
fi
done
create_manifest:
runs-on: ubuntu-latest
needs: [ghcr_build, ghcr_push]
@ -261,3 +440,42 @@ jobs:
$image_name:${tag}_amd64 \
$image_name:${tag}_arm64
done
create_manifest_runtime:
runs-on: ubuntu-latest
needs: [ghcr_build_runtime, ghcr_push_runtime]
if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/')
env:
tags: ${{ needs.ghcr_build_runtime.outputs.tags }}
strategy:
matrix:
image: ["od_runtime"]
permissions:
contents: read
packages: write
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Login to GHCR
uses: docker/login-action@v2
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Create and push multi-platform manifest
run: |
image_name=$(echo "ghcr.io/${{ github.repository_owner }}/${{ matrix.image }}" | tr '[:upper:]' '[:lower:]')
echo "image name = $image_name"
tags=$(echo ${tags} | tr ' ' '\n')
for tag in $tags; do
echo "tag = $tag"
docker buildx imagetools create --tag $image_name:$tag \
$image_name:${tag}_amd64 \
$image_name:${tag}_arm64
done
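
The test steps in this workflow pick the image name out of the `docker load` output with `grep -oP 'Loaded image: \K.*' | head -n 1`. A minimal Python sketch of the same extraction follows, for illustration only. The helper name `first_loaded_image` and the sample output are hypothetical and are not part of the workflow.

```python
import re
from typing import Optional


def first_loaded_image(docker_load_output: str) -> Optional[str]:
    """Return the first image name reported by `docker load`, or None."""
    # Mirrors: grep -oP 'Loaded image: \K.*' | head -n 1
    # `.` does not match a newline, so the capture stops at the end of the line.
    match = re.search(r'Loaded image: (.*)', docker_load_output)
    return match.group(1).strip() if match else None


# Hypothetical sample output, not captured from a real run.
sample = (
    'Loaded image: ghcr.io/example/od_runtime:tag_amd64\n'
    'Loaded image: ghcr.io/example/od_runtime:other\n'
)
print(first_loaded_image(sample))
```

Returning `None` when nothing matches makes the failure explicit, whereas the shell pipeline would silently leave `image_name` empty.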

View File

@ -123,9 +123,8 @@ def get_console_handler():
return console_handler
def get_file_handler(log_dir=None):
def get_file_handler(log_dir):
"""Returns a file handler for logging."""
log_dir = os.path.join(os.getcwd(), 'logs') if log_dir is None else log_dir
os.makedirs(log_dir, exist_ok=True)
timestamp = datetime.now().strftime('%Y-%m-%d')
file_name = f'opendevin_{timestamp}.log'
@ -159,16 +158,21 @@ sys.excepthook = log_uncaught_exceptions
opendevin_logger = logging.getLogger('opendevin')
opendevin_logger.setLevel(logging.INFO)
LOG_DIR = os.path.join(
# parent dir of opendevin/core (i.e., root of the repo)
os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))),
'logs',
)
if DEBUG:
opendevin_logger.setLevel(logging.DEBUG)
opendevin_logger.addHandler(get_file_handler())
# default log to project root
opendevin_logger.info('DEBUG logging is enabled. Logging to %s', LOG_DIR)
opendevin_logger.addHandler(get_file_handler(LOG_DIR))
opendevin_logger.addHandler(get_console_handler())
opendevin_logger.addFilter(SensitiveDataFilter(opendevin_logger.name))
opendevin_logger.propagate = False
opendevin_logger.debug('Logging initialized')
opendevin_logger.debug(
'Logging to %s', os.path.join(os.getcwd(), 'logs', 'opendevin.log')
)
# Exclude LiteLLM from logging output
logging.getLogger('LiteLLM').disabled = True
@ -194,7 +198,7 @@ class LlmFileHandler(logging.FileHandler):
self.session = datetime.now().strftime('%y-%m-%d_%H-%M')
else:
self.session = 'default'
self.log_directory = os.path.join(os.getcwd(), 'logs', 'llm', self.session)
self.log_directory = os.path.join(LOG_DIR, 'llm', self.session)
os.makedirs(self.log_directory, exist_ok=True)
if not DEBUG:
# Clear the log directory if not in debug mode
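
The `LOG_DIR` change above anchors the log folder to the module's location instead of `os.getcwd()`, so the path no longer depends on where the process was started. A small sketch of that resolution, assuming the module sits two packages below the repository root as the "parent dir of opendevin/core" comment suggests. The helper name `log_dir_for` and the example path are hypothetical.

```python
import os


def log_dir_for(module_file: str) -> str:
    """Resolve '<repo root>/logs' for a module that lives two packages deep."""
    # dirname #1 -> the module's directory (e.g. opendevin/core)
    # dirname #2 -> the parent package (e.g. opendevin)
    # dirname #3 -> the repository root
    root = os.path.dirname(
        os.path.dirname(os.path.dirname(os.path.abspath(module_file)))
    )
    return os.path.join(root, 'logs')


# Hypothetical path chosen to match the comment in the diff.
print(log_dir_for('/repo/opendevin/core/logger.py'))
```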

View File

@ -87,9 +87,13 @@ async def run_agent_controller(
# runtime and tools
runtime_cls = get_runtime_cls(config.runtime)
runtime = runtime_cls(config=config, event_stream=event_stream, sandbox=sandbox)
runtime = runtime_cls(
config=config,
event_stream=event_stream,
sandbox=sandbox,
plugins=controller.agent.sandbox_plugins,
)
await runtime.ainit()
runtime.init_sandbox_plugins(controller.agent.sandbox_plugins)
runtime.init_runtime_tools(
controller.agent.runtime_tools,
is_async=False,

View File

@ -13,6 +13,8 @@ import argparse
import asyncio
import os
import re
import subprocess
from contextlib import asynccontextmanager
from pathlib import Path
import pexpect
@ -35,6 +37,7 @@ from opendevin.events.observation import (
ErrorObservation,
FileReadObservation,
FileWriteObservation,
IPythonRunCellObservation,
Observation,
)
from opendevin.events.serialization import event_from_dict, event_to_dict
@ -48,8 +51,6 @@ from opendevin.runtime.plugins import (
from opendevin.runtime.server.files import insert_lines, read_lines
from opendevin.runtime.utils import split_bash_commands
app = FastAPI()
class ActionRequest(BaseModel):
action: dict
@ -60,19 +61,81 @@ class RuntimeClient:
It is responsible for executing actions received from OpenDevin backend and producing observations.
"""
def __init__(self, plugins_to_load: list[Plugin], work_dir: str) -> None:
self._init_bash_shell(work_dir)
def __init__(
self, plugins_to_load: list[Plugin], work_dir: str, username: str, user_id: int
) -> None:
self.plugins_to_load = plugins_to_load
self.username = username
self.user_id = user_id
self.pwd = work_dir # current PWD
self._init_user(self.username, self.user_id)
self._init_bash_shell(self.pwd, self.username)
self.lock = asyncio.Lock()
self.plugins: dict[str, Plugin] = {}
self.browser = BrowserEnv()
for plugin in plugins_to_load:
plugin.initialize()
async def ainit(self):
for plugin in self.plugins_to_load:
await plugin.initialize(self.username)
self.plugins[plugin.name] = plugin
logger.info(f'Initializing plugin: {plugin.name}')
def _init_bash_shell(self, work_dir: str) -> None:
self.shell = pexpect.spawn('/bin/bash', encoding='utf-8', echo=False)
if isinstance(plugin, JupyterPlugin):
await self.run_ipython(
IPythonRunCellAction(code=f'import os; os.chdir("{self.pwd}")')
)
# This is a temporary workaround
# TODO: refactor AgentSkills to be part of JupyterPlugin
# AFTER ServerRuntime is deprecated
if 'agent_skills' in self.plugins and 'jupyter' in self.plugins:
obs = await self.run_ipython(
IPythonRunCellAction(
code=(
'import sys\n'
'sys.path.insert(0, "/opendevin/code/opendevin/runtime/plugins/agent_skills")\n'
'from agentskills import *'
)
)
)
logger.info(f'AgentSkills initialized: {obs}')
def _init_user(self, username: str, user_id: int) -> None:
"""Create user if not exists."""
# Skip root since it is already created
if username == 'root':
return
# Add sudoer
sudoer_line = r"echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers"
output = subprocess.run(sudoer_line, shell=True, capture_output=True)
if output.returncode != 0:
raise RuntimeError(f'Failed to add sudoer: {output.stderr.decode()}')
logger.debug(f'Added sudoer successfully. Output: [{output.stdout.decode()}]')
# Add user
output = subprocess.run(
(
f'useradd -rm -d /home/{username} -s /bin/bash '
f'-g root -G sudo -u {user_id} {username}'
),
shell=True,
capture_output=True,
)
if output.returncode != 0:
raise RuntimeError(
f'Failed to create user {username}: {output.stderr.decode()}'
)
logger.debug(
f'Added user {username} successfully. Output: [{output.stdout.decode()}]'
)
def _init_bash_shell(self, work_dir: str, username: str) -> None:
self.shell = pexpect.spawn(
f'su - {username}',
encoding='utf-8',
echo=False,
)
self.__bash_PS1 = r'[PEXPECT_BEGIN] \u@\h:\w [PEXPECT_END]'
# This should NOT match "PS1=\u@\h:\w [PEXPECT]$" when `env` is executed
@ -85,8 +148,11 @@ class RuntimeClient:
self.shell.sendline(f'cd {work_dir}')
self.shell.expect(self.__bash_expect_regex)
logger.debug(
f'Bash initialized. Working directory: {work_dir}. Output: {self.shell.before}'
)
def _get_bash_prompt(self):
def _get_bash_prompt_and_update_pwd(self):
ps1 = self.shell.after
# begin at the last occurrence of '[PEXPECT_BEGIN]'.
@ -103,6 +169,8 @@ class RuntimeClient:
matched is not None
), f'Failed to parse bash prompt: {ps1}. This should not happen.'
username, hostname, working_dir = matched.groups()
self._prev_pwd = self.pwd
self.pwd = working_dir
# re-assemble the prompt
prompt = f'{username}@{hostname}:{working_dir} '
@ -112,20 +180,25 @@ class RuntimeClient:
prompt += '$'
return prompt + ' '
def _execute_bash(self, command: str, keep_prompt: bool = True) -> tuple[str, int]:
def _execute_bash(
self,
command: str,
keep_prompt: bool = True,
timeout: int = 300,
) -> tuple[str, int]:
logger.debug(f'Executing command: {command}')
self.shell.sendline(command)
self.shell.expect(self.__bash_expect_regex)
self.shell.expect(self.__bash_expect_regex, timeout=timeout)
output = self.shell.before
if keep_prompt:
output += '\r\n' + self._get_bash_prompt()
output += '\r\n' + self._get_bash_prompt_and_update_pwd()
logger.debug(f'Command output: {output}')
# Get exit code
self.shell.sendline('echo $?')
logger.debug(f'Executing command for exit code: {command}')
self.shell.expect(self.__bash_expect_regex)
self.shell.expect(self.__bash_expect_regex, timeout=timeout)
_exit_code_output = self.shell.before
logger.debug(f'Exit code Output: {_exit_code_output}')
exit_code = int(_exit_code_output.strip().split()[0])
@ -134,7 +207,6 @@ class RuntimeClient:
async def run_action(self, action) -> Observation:
action_type = action.action
observation = await getattr(self, action_type)(action)
observation._parent = action.id
return observation
async def run(self, action: CmdRunAction) -> CmdOutputObservation:
@ -164,7 +236,18 @@ class RuntimeClient:
async def run_ipython(self, action: IPythonRunCellAction) -> Observation:
if 'jupyter' in self.plugins:
_jupyter_plugin: JupyterPlugin = self.plugins['jupyter'] # type: ignore
return await _jupyter_plugin.run(action)
# This is used to make AgentSkills in Jupyter aware of the
# current working directory in Bash
if not hasattr(self, '_prev_pwd') or self.pwd != self._prev_pwd:
reset_jupyter_pwd_code = (
f'import os; os.environ["JUPYTER_PWD"] = "{self.pwd}"\n\n'
)
_aux_action = IPythonRunCellAction(code=reset_jupyter_pwd_code)
_ = await _jupyter_plugin.run(_aux_action)
obs: IPythonRunCellObservation = await _jupyter_plugin.run(action)
return obs
else:
raise RuntimeError(
'JupyterRequirement not found. Unable to run IPython action.'
@ -272,6 +355,10 @@ if __name__ == '__main__':
parser.add_argument('port', type=int, help='Port to listen on')
parser.add_argument('--working-dir', type=str, help='Working directory')
parser.add_argument('--plugins', type=str, help='Plugins to initialize', nargs='+')
parser.add_argument(
'--username', type=str, help='User to run as', default='opendevin'
)
parser.add_argument('--user-id', type=int, help='User ID to run as', default=1000)
# example: python client.py 8000 --working-dir /workspace --plugins JupyterRequirement
args = parser.parse_args()
@ -282,16 +369,34 @@ if __name__ == '__main__':
raise ValueError(f'Plugin {plugin} not found')
plugins_to_load.append(ALL_PLUGINS[plugin]()) # type: ignore
client = RuntimeClient(plugins_to_load, work_dir=args.working_dir)
client: RuntimeClient | None = None
@asynccontextmanager
async def lifespan(app: FastAPI):
global client
client = RuntimeClient(
plugins_to_load,
work_dir=args.working_dir,
username=args.username,
user_id=args.user_id,
)
await client.ainit()
yield
# Clean up & release the resources
client.close()
app = FastAPI(lifespan=lifespan)
@app.middleware('http')
async def one_request_at_a_time(request: Request, call_next):
assert client is not None
async with client.lock:
response = await call_next(request)
return response
@app.post('/execute_action')
async def execute_action(action_request: ActionRequest):
assert client is not None
try:
action = event_from_dict(action_request.action)
if not isinstance(action, Action):
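
`_get_bash_prompt_and_update_pwd` above recovers the user, host, and working directory from the custom PS1 and stores the directory in `self.pwd`. The regex it uses is not visible in these hunks, so the following is only a sketch under assumptions: the pattern `PROMPT_RE` and the helper `parse_prompt` are hypothetical, written to match the `[PEXPECT_BEGIN] \u@\h:\w [PEXPECT_END]` format shown in the diff.

```python
import re

# Assumption: illustrative pattern only; the real regex in client.py is not shown.
PROMPT_RE = re.compile(
    r'\[PEXPECT_BEGIN\] ([^@\s]+)@([^:\s]+):(.+?) \[PEXPECT_END\]'
)


def parse_prompt(raw: str) -> tuple[str, str, str]:
    """Return (username, hostname, working_dir) from the last prompt in raw."""
    # Begin at the last occurrence of the marker, as the diff's comment describes.
    start = raw.rfind('[PEXPECT_BEGIN]')
    if start == -1:
        raise ValueError(f'No prompt marker found in: {raw!r}')
    matched = PROMPT_RE.match(raw[start:])
    if matched is None:
        raise ValueError(f'Failed to parse bash prompt: {raw!r}')
    username, hostname, working_dir = matched.groups()
    return username, hostname, working_dir


# Hypothetical shell output ending in an expanded prompt.
raw_output = 'file.txt\r\n[PEXPECT_BEGIN] opendevin@abc123:/workspace [PEXPECT_END]'
print(parse_prompt(raw_output))
```

Tracking the parsed directory after every command is what lets `run_ipython` detect a change of directory in Bash and refresh `JUPYTER_PWD` only when needed.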

View File

@ -44,10 +44,12 @@ class EventStreamRuntime(Runtime):
config: AppConfig,
event_stream: EventStream,
sid: str = 'default',
container_image: str | None = None,
plugins: list[PluginRequirement] | None = None,
container_image: str | None = None,
):
super().__init__(config, event_stream, sid) # will initialize the event stream
super().__init__(
config, event_stream, sid, plugins
) # will initialize the event stream
self._port = find_available_tcp_port()
self.api_url = f'http://localhost:{self._port}'
self.session: Optional[aiohttp.ClientSession] = None
@ -139,7 +141,9 @@ class EventStreamRuntime(Runtime):
'PYTHONUNBUFFERED=1 poetry run '
f'python -u -m opendevin.runtime.client.client {self._port} '
f'--working-dir {sandbox_workspace_dir} '
f'--plugins {plugin_names}'
f'--plugins {plugin_names} '
f'--username {"opendevin" if self.config.run_as_devin else "root"} '
f'--user-id {self.config.sandbox.user_id}'
),
network_mode=network_mode,
ports=port_mapping,
@ -206,7 +210,7 @@ class EventStreamRuntime(Runtime):
if isinstance(event, Action):
logger.info(event, extra={'msg_type': 'ACTION'})
observation = await self.run_action(event)
# observation._cause = event.id # type: ignore[attr-defined]
observation._cause = event.id # type: ignore[attr-defined]
logger.info(observation, extra={'msg_type': 'OBSERVATION'})
source = event.source if event.source else EventSource.AGENT
await self.event_stream.add_event(observation, source)
@ -248,7 +252,6 @@ class EventStreamRuntime(Runtime):
except Exception as e:
logger.error(f'Error during command execution: {e}')
obs = ErrorObservation(f'Command execution failed: {str(e)}')
obs._parent = action.id # type: ignore[attr-defined]
return obs
async def run(self, action: CmdRunAction) -> Observation:
@ -277,14 +280,3 @@ class EventStreamRuntime(Runtime):
raise NotImplementedError(
'This method is not implemented in the runtime client.'
)
############################################################################
# Initialization work inside sandbox image
############################################################################
# init_runtime_tools directly does what Runtime does
# Do in the od_runtime_client
# Overwrite the init_sandbox_plugins
def init_sandbox_plugins(self, plugins: list[PluginRequirement]) -> None:
pass
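
The runtime now appends `--username` and `--user-id` to the client command line, and the client parses them with `argparse`. The round trip can be sketched as below. This is an illustration rather than the project's code: `build_client_args` is a hypothetical helper, and joining plugin names with spaces is an assumption, since the diff does not show how `plugin_names` is built.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    # Mirrors the arguments declared in the client's __main__ block.
    parser = argparse.ArgumentParser()
    parser.add_argument('port', type=int, help='Port to listen on')
    parser.add_argument('--working-dir', type=str, help='Working directory')
    parser.add_argument('--plugins', type=str, nargs='+', help='Plugins to initialize')
    parser.add_argument('--username', type=str, default='opendevin', help='User to run as')
    parser.add_argument('--user-id', type=int, default=1000, help='User ID to run as')
    return parser


def build_client_args(
    port: int, work_dir: str, plugins: list[str], run_as_devin: bool, user_id: int
) -> str:
    """Hypothetical helper: assemble the argument string the runtime passes."""
    plugin_names = ' '.join(plugins)  # assumption: space-separated names
    username = 'opendevin' if run_as_devin else 'root'
    return (
        f'{port} --working-dir {work_dir} --plugins {plugin_names} '
        f'--username {username} --user-id {user_id}'
    )


cmd = build_client_args(8000, '/workspace', ['jupyter', 'agent_skills'], False, 1001)
args = build_parser().parse_args(shlex.split(cmd))
print(args.username, args.user_id, args.plugins)
```

Because `--plugins` uses `nargs='+'`, it consumes names until the next flag, so `--username` must come after the plugin list, as it does in the diff.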

View File

@ -11,6 +11,7 @@ from opendevin.events.observation import (
)
from opendevin.events.stream import EventStream
from opendevin.runtime import Sandbox
from opendevin.runtime.plugins import PluginRequirement
from opendevin.runtime.server.files import insert_lines, read_lines
from opendevin.runtime.server.runtime import ServerRuntime
@ -24,9 +25,10 @@ class E2BRuntime(ServerRuntime):
config: AppConfig,
event_stream: EventStream,
sid: str = 'default',
plugins: list[PluginRequirement] | None = None,
sandbox: Sandbox | None = None,
):
super().__init__(config, event_stream, sid, sandbox)
super().__init__(config, event_stream, sid, plugins, sandbox)
if not isinstance(self.sandbox, E2BSandbox):
raise ValueError('E2BRuntime requires an E2BSandbox')
self.file_store = E2BFileStore(self.sandbox.filesystem)

View File

@ -41,30 +41,76 @@ CURRENT_LINE = 1
WINDOW = 100
ENABLE_AUTO_LINT = os.getenv('ENABLE_AUTO_LINT', 'false').lower() == 'true'
# This is also used in unit tests!
MSG_FILE_UPDATED = '[File updated (edited at line {line_number}). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]'
# ==================================================================================================
# OPENAI
OPENAI_API_KEY = os.getenv(
'OPENAI_API_KEY', os.getenv('SANDBOX_ENV_OPENAI_API_KEY', '')
)
OPENAI_BASE_URL = os.getenv('OPENAI_BASE_URL', 'https://api.openai.com/v1')
OPENAI_MODEL = os.getenv('OPENAI_MODEL', 'gpt-4o-2024-05-13')
MAX_TOKEN = os.getenv('MAX_TOKEN', 500)
# TODO: Move this to EventStream Actions when EventStreamRuntime is fully implemented
# NOTE: we need to get env vars inside functions because they will be set in IPython
# AFTER the agentskills is imported (the case for EventStreamRuntime)
# ==================================================================================================
def _get_openai_api_key():
return os.getenv('OPENAI_API_KEY', os.getenv('SANDBOX_ENV_OPENAI_API_KEY', ''))
OPENAI_PROXY = f'{OPENAI_BASE_URL}/chat/completions'
client = OpenAI(api_key=OPENAI_API_KEY, base_url=OPENAI_BASE_URL)
def _get_openai_base_url():
return os.getenv('OPENAI_BASE_URL', 'https://api.openai.com/v1')
def _get_openai_model():
return os.getenv('OPENAI_MODEL', 'gpt-4o-2024-05-13')
def _get_max_token():
# os.getenv returns a str when MAX_TOKEN is set, so coerce to int
return int(os.getenv('MAX_TOKEN', 500))
def _get_openai_client():
client = OpenAI(api_key=_get_openai_api_key(), base_url=_get_openai_base_url())
return client
# ==================================================================================================
# Define the decorator using the functionality of UpdatePwd
def update_pwd_decorator(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
old_pwd = os.getcwd()
jupyter_pwd = os.environ.get('JUPYTER_PWD', None)
try:
old_pwd = os.getcwd()
except FileNotFoundError:
import json
import subprocess
print(
f'DEBUGGING Environment variables: {json.dumps(dict(os.environ), indent=2)}'
)
print(f'DEBUGGING User ID: {os.getuid()}, Group ID: {os.getgid()}')
out = subprocess.run(['pwd'], capture_output=True)
old_pwd = out.stdout.decode('utf-8').strip()
os.chdir(old_pwd)
print(f'DEBUGGING Change to working directory: {old_pwd}')
import tempfile
try:
tempfile.TemporaryFile(dir=old_pwd)
print(f'DEBUGGING Directory {old_pwd} is writable')
except Exception as e:
print(f'DEBUGGING Directory {old_pwd} is not writable: {str(e)}')
# ls -alh
out = subprocess.run(['ls', '-alh', old_pwd], capture_output=True)
print(
f'DEBUGGING OLD working directory contents: {out.stdout.decode("utf-8")}'
)
print(f'DEBUGGING Target JUPYTER pwd: {jupyter_pwd}')
if jupyter_pwd:
os.chdir(jupyter_pwd)
try:
@ -506,7 +552,10 @@ def _edit_file_impl(
shutil.move(temp_file_path, src_abs_path)
# Handle linting
if ENABLE_AUTO_LINT:
# NOTE: we need to get env var inside this function
# because the env var will be set AFTER the agentskills is imported
enable_auto_lint = os.getenv('ENABLE_AUTO_LINT', 'false').lower() == 'true'
if enable_auto_lint:
# BACKUP the original file
original_file_backup_path = os.path.join(
os.path.dirname(file_name),
@ -954,7 +1003,9 @@ def parse_audio(file_path: str, model: str = 'whisper-1') -> None:
try:
# TODO: record the COST of the API call
with open(file_path, 'rb') as audio_file:
transcript = client.audio.translations.create(model=model, file=audio_file)
transcript = _get_openai_client().audio.translations.create(
model=model, file=audio_file
)
print(transcript.text)
except Exception as e:
@ -975,10 +1026,10 @@ def parse_image(
# TODO: record the COST of the API call
try:
base64_image = _base64_img(file_path)
response = client.chat.completions.create(
model=OPENAI_MODEL,
response = _get_openai_client().chat.completions.create(
model=_get_openai_model(),
messages=_prepare_image_messages(task, base64_image),
max_tokens=MAX_TOKEN,
max_tokens=_get_max_token(),
)
content = response.choices[0].message.content
print(content)
@ -1021,10 +1072,10 @@ def parse_video(
print(f'Process the {file_path}, current No. {idx * frame_interval} frame...')
# TODO: record the COST of the API call
try:
response = client.chat.completions.create(
model=OPENAI_MODEL,
response = _get_openai_client().chat.completions.create(
model=_get_openai_model(),
messages=_prepare_image_messages(task, base64_frame),
max_tokens=MAX_TOKEN,
max_tokens=_get_max_token(),
)
content = response.choices[0].message.content
@ -1077,7 +1128,9 @@ __all__ = [
'parse_pptx',
]
if OPENAI_API_KEY and OPENAI_BASE_URL:
# This is called from OpenDevin's side
# If SANDBOX_ENV_OPENAI_API_KEY is set, we will be able to use these tools in the sandbox environment
if _get_openai_api_key() and _get_openai_base_url():
__all__ += ['parse_audio', 'parse_video', 'parse_image']
DOCUMENTATION = ''
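
Reviewer note on the agentskills change above: the module-level constants are replaced by getter functions because, per the NOTE comment, the env vars are set in IPython only after agentskills is imported. A minimal sketch of the difference follows, assuming a hypothetical variable name `DEMO_ENABLE_AUTO_LINT` and hypothetical helpers `read_flag` and `demo` that are not part of this commit:

```python
import os


def read_flag(name: str) -> bool:
    """Read a boolean-like env var at call time (the lazy pattern)."""
    return os.getenv(name, 'false').lower() == 'true'


def demo() -> tuple[bool, bool]:
    # Hypothetical variable name, used only for this illustration.
    name = 'DEMO_ENABLE_AUTO_LINT'
    os.environ.pop(name, None)

    # Snapshot taken "at import time": the variable is not set yet.
    eager = read_flag(name)

    # The runtime sets the variable later, after the module was imported.
    os.environ[name] = 'True'

    # A lazy read sees the update; the earlier snapshot does not.
    lazy = read_flag(name)

    os.environ.pop(name, None)
    return eager, lazy
```

The same reasoning explains why the tests in this commit switch from patching a module attribute to `patch.dict(os.environ, ...)`: once the flag is read inside the function, the environment is the only thing left to patch.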

@ -3,8 +3,9 @@ import subprocess
import time
from dataclasses import dataclass
from opendevin.core.logger import opendevin_logger as logger
from opendevin.events.action import Action, IPythonRunCellAction
from opendevin.events.observation import IPythonRunCellObservation, Observation
from opendevin.events.observation import IPythonRunCellObservation
from opendevin.runtime.plugins.requirement import Plugin, PluginRequirement
from opendevin.runtime.utils import find_available_tcp_port
@ -29,33 +30,38 @@ class JupyterRequirement(PluginRequirement):
class JupyterPlugin(Plugin):
name: str = 'jupyter'
def initialize(self, kernel_id: str = 'opendevin-default'):
async def initialize(self, username: str, kernel_id: str = 'opendevin-default'):
self.kernel_gateway_port = find_available_tcp_port()
self.kernel_id = kernel_id
self.gateway_process = subprocess.Popen(
[
'/opendevin/miniforge3/bin/mamba',
'run',
'-n',
'base',
'poetry',
'run',
'jupyter',
'kernelgateway',
'--KernelGatewayApp.ip=0.0.0.0',
f'--KernelGatewayApp.port={self.kernel_gateway_port}',
],
(
f"su - {username} -s /bin/bash << 'EOF'\n"
'cd /opendevin/code\n'
'export POETRY_VIRTUALENVS_PATH=/opendevin/poetry;\n'
'/opendevin/miniforge3/bin/mamba run -n base '
'poetry run jupyter kernelgateway '
'--KernelGatewayApp.ip=0.0.0.0 '
f'--KernelGatewayApp.port={self.kernel_gateway_port}\n'
'EOF'
),
stderr=subprocess.STDOUT,
shell=True,
)
# read stdout until the kernel gateway is ready
output = ''
while self.gateway_process.stdout is not None:
line = self.gateway_process.stdout.readline().decode('utf-8')
output += line
if 'at' in line:
break
time.sleep(1)
print('Waiting for jupyter kernel gateway to start...')
logger.debug('Waiting for jupyter kernel gateway to start...')
async def run(self, action: Action) -> Observation:
logger.info(
f'Jupyter kernel gateway started at port {self.kernel_gateway_port}. Output: {output}'
)
async def run(self, action: Action) -> IPythonRunCellObservation:
if not isinstance(action, IPythonRunCellAction):
raise ValueError(
f'Jupyter plugin only supports IPythonRunCellAction, but got {action}'

@ -73,8 +73,8 @@ class JupyterKernel:
if os.path.exists('/opendevin/plugins/agent_skills/agentskills.py'):
self.tools_to_run.append('from agentskills import *')
for tool in self.tools_to_run:
# logging.info(f'Tool initialized:\n{tool}')
await self.execute(tool)
res = await self.execute(tool)
logging.info(f'Tool [{tool}] initialized:\n{res}')
self.initialized = True
async def _send_heartbeat(self):

@ -82,13 +82,14 @@ class PluginMixin:
raise RuntimeError(
f'Failed to initialize plugin {requirement.name} with exit code {_exit_code} and output: {total_output.strip()}'
)
logger.info(f'Plugin {requirement.name} initialized successfully')
logger.debug(f'Output: {total_output.strip()}')
else:
if exit_code != 0:
raise RuntimeError(
f'Failed to initialize plugin {requirement.name} with exit code {exit_code} and output: {output}'
)
logger.info(f'Plugin {requirement.name} initialized successfully.')
logger.debug(f'Output: {output}')
logger.info(f'Plugin {requirement.name} initialized successfully')
else:
logger.info('Skipping plugin initialization in the sandbox')

@ -14,7 +14,7 @@ class Plugin:
name: str
@abstractmethod
def initialize(self):
async def initialize(self, username: str):
"""Initialize the plugin."""
pass

@ -28,7 +28,7 @@ from opendevin.events.observation import (
RejectObservation,
)
from opendevin.events.serialization.action import ACTION_TYPE_TO_CLASS
from opendevin.runtime.plugins import PluginRequirement
from opendevin.runtime.plugins import JupyterRequirement, PluginRequirement
from opendevin.runtime.tools import RuntimeTool
from opendevin.storage import FileStore
@ -60,10 +60,13 @@ class Runtime:
config: AppConfig,
event_stream: EventStream,
sid: str = 'default',
plugins: list[PluginRequirement] | None = None,
):
self.sid = sid
self.event_stream = event_stream
self.event_stream.subscribe(EventStreamSubscriber.RUNTIME, self.on_event)
self.plugins = plugins if plugins is not None else []
self.config = copy.deepcopy(config)
self.DEFAULT_ENV_VARS = _default_env_vars(config.sandbox)
atexit.register(self.close_sync)
@ -101,10 +104,6 @@ class Runtime:
# Methods we plan to deprecate when we move to new EventStreamRuntime
# ====================================================================
def init_sandbox_plugins(self, plugins: list[PluginRequirement]) -> None:
# TODO: deprecate this method when we move to the new EventStreamRuntime
raise NotImplementedError('This method is not implemented in the base class.')
def init_runtime_tools(
self,
runtime_tools: list[RuntimeTool],
@ -117,6 +116,17 @@ class Runtime:
# ====================================================================
async def add_env_vars(self, env_vars: dict[str, str]) -> None:
# Add env vars to the IPython shell (if Jupyter is used)
if any(isinstance(plugin, JupyterRequirement) for plugin in self.plugins):
code = 'import os\n'
for key, value in env_vars.items():
# Note: json.dumps gives us nice escaping for free
code += f'os.environ["{key}"] = {json.dumps(value)}\n'
code += '\n'
obs = await self.run_ipython(IPythonRunCellAction(code))
logger.info(f'Added env vars to IPython: code={code}, obs={obs}')
# Add env vars to the Bash shell
cmd = ''
for key, value in env_vars.items():
# Note: json.dumps gives us nice escaping for free
@ -125,7 +135,7 @@ class Runtime:
return
cmd = cmd.strip()
logger.debug(f'Adding env var: {cmd}')
obs: Observation = await self.run(CmdRunAction(cmd))
obs = await self.run(CmdRunAction(cmd))
if not isinstance(obs, CmdOutputObservation) or obs.exit_code != 0:
raise RuntimeError(
f'Failed to add env vars [{env_vars}] to environment: {obs.content}'
@ -164,7 +174,6 @@ class Runtime:
'Action has been rejected by the user! Waiting for further user input.'
)
observation = await getattr(self, action_type)(action)
observation._parent = action.id # type: ignore[attr-defined]
return observation
# ====================================================================
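
Reviewer note on `add_env_vars` above: the IPython branch builds a small Python snippet and relies on `json.dumps` to quote each value. A minimal sketch of that string-building step in isolation, assuming a hypothetical helper name `build_env_code` (the commit inlines this logic rather than factoring it out):

```python
import json


def build_env_code(env_vars: dict[str, str]) -> str:
    """Build Python source that sets each env var via os.environ."""
    code = 'import os\n'
    for key, value in env_vars.items():
        # json.dumps emits a double-quoted literal with quotes, backslashes
        # and newlines escaped, which Python also accepts as a str literal.
        code += f'os.environ["{key}"] = {json.dumps(value)}\n'
    return code
```

As in the diff, only the value is escaped; the key is interpolated as-is, so this sketch assumes keys are plain identifiers without quotes.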

@ -25,7 +25,7 @@ from opendevin.runtime import (
Sandbox,
)
from opendevin.runtime.browser.browser_env import BrowserEnv
from opendevin.runtime.plugins import PluginRequirement
from opendevin.runtime.plugins import JupyterRequirement, PluginRequirement
from opendevin.runtime.runtime import Runtime
from opendevin.runtime.tools import RuntimeTool
from opendevin.storage.local import LocalFileStore
@ -40,9 +40,10 @@ class ServerRuntime(Runtime):
config: AppConfig,
event_stream: EventStream,
sid: str = 'default',
plugins: list[PluginRequirement] | None = None,
sandbox: Sandbox | None = None,
):
super().__init__(config, event_stream, sid)
super().__init__(config, event_stream, sid, plugins)
self.file_store = LocalFileStore(config.workspace_base)
if sandbox is None:
self.sandbox = self.create_sandbox(sid, config.sandbox.box_type)
@ -79,19 +80,29 @@ class ServerRuntime(Runtime):
raise ValueError(f'Invalid sandbox type: {box_type}')
async def ainit(self, env_vars: dict[str, str] | None = None):
# init sandbox plugins
self.sandbox.init_plugins(self.plugins)
# MUST call super().ainit() to initialize both default env vars
# AND the ones in env vars!
await super().ainit(env_vars)
if any(isinstance(plugin, JupyterRequirement) for plugin in self.plugins):
obs = await self.run_ipython(
IPythonRunCellAction(
code=f'import os; os.chdir("{self.config.workspace_mount_path_in_sandbox}")'
)
)
logger.info(
f'Switch to working directory {self.config.workspace_mount_path_in_sandbox} in IPython. Output: {obs.content}'
)
async def close(self):
if hasattr(self, '_is_external_sandbox') and not self._is_external_sandbox:
self.sandbox.close()
if hasattr(self, 'browser') and self.browser is not None:
self.browser.close()
def init_sandbox_plugins(self, plugins: list[PluginRequirement]) -> None:
self.sandbox.init_plugins(plugins)
def init_runtime_tools(
self,
runtime_tools: list[RuntimeTool],

@ -7,8 +7,11 @@ def split_bash_commands(commands):
try:
parsed = bashlex.parse(commands)
except bashlex.errors.ParsingError as e:
logger.error(
f'Failed to parse bash commands\n[input]: {commands}\n[error]: {e}'
logger.debug(
f'Failed to parse bash commands\n'
f'[input]: {commands}\n'
f'[warning]: {e}\n'
f'The original command will be returned as is.'
)
# If parsing fails, return the original commands
return [commands]
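
Reviewer note on the parse fallback above: a parse failure is now logged at debug level and the input is returned unchanged as a single-element list. The sketch below shows the same "parse or pass through" shape using the standard library's `shlex` as a stand-in for `bashlex`. That swap is an assumption made so the example runs without third-party packages, and `shlex.split` tokenizes words rather than splitting commands. The helper name `split_or_passthrough` is hypothetical:

```python
import logging
import shlex

logger = logging.getLogger(__name__)


def split_or_passthrough(command: str) -> list[str]:
    """Tokenize a command; on a parse error, return it untouched."""
    try:
        return shlex.split(command)
    except ValueError as e:
        # A failed parse is not fatal here, so log quietly and fall back.
        logger.debug(
            f'Failed to parse command\n'
            f'[input]: {command}\n'
            f'[warning]: {e}\n'
            f'The original command will be returned as is.'
        )
        return [command]
```

Logging at debug level rather than error fits a case where the caller can still proceed with the unparsed input.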

@ -5,6 +5,7 @@ FROM {{ base_image }}
# START: Build Runtime Image from Scratch
# ================================================================
FROM {{ base_image }}
{% if 'ubuntu' in base_image and (base_image.endswith(':latest') or base_image.endswith(':24.04')) %}
{% set LIBGL_MESA = 'libgl1' %}
{% else %}
@ -20,8 +21,10 @@ RUN apt-get update && \
# Create necessary directories
RUN mkdir -p /opendevin && \
mkdir -p /opendevin/logs && \
chmod 777 /opendevin/logs && \
echo "" > /opendevin/bash.bashrc
mkdir -p /opendevin/poetry && \
chmod 777 -R /opendevin
ENV POETRY_VIRTUALENVS_PATH=/opendevin/poetry
RUN if [ ! -d /opendevin/miniforge3 ]; then \
wget --progress=bar:force -O Miniforge3.sh "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh" && \

@ -51,7 +51,7 @@ class AgentSession:
raise Exception(
'Session already started. You need to close this session and start a new one.'
)
await self._create_runtime(runtime_name, config)
await self._create_runtime(runtime_name, config, agent)
await self._create_controller(
agent,
confirmation_mode,
@ -71,7 +71,7 @@ class AgentSession:
await self.runtime.close()
self._closed = True
async def _create_runtime(self, runtime_name: str, config: AppConfig):
async def _create_runtime(self, runtime_name: str, config: AppConfig, agent: Agent):
"""Creates a runtime instance."""
if self.runtime is not None:
raise Exception('Runtime already created')
@ -79,7 +79,10 @@ class AgentSession:
logger.info(f'Using runtime: {runtime_name}')
runtime_cls = get_runtime_cls(runtime_name)
self.runtime = runtime_cls(
config=config, event_stream=self.event_stream, sid=self.sid
config=config,
event_stream=self.event_stream,
sid=self.sid,
plugins=agent.sandbox_plugins,
)
await self.runtime.ainit()
@ -107,7 +110,6 @@ class AgentSession:
'CodeActAgent requires DockerSSHBox as sandbox! Using other sandbox that are not stateful'
' LocalBox will not work properly.'
)
self.runtime.init_sandbox_plugins(agent.sandbox_plugins)
self.runtime.init_runtime_tools(agent.runtime_tools)
self.controller = AgentController(

@ -2,6 +2,7 @@ import contextlib
import io
import os
import sys
from unittest.mock import patch
import docx
import pytest
@ -488,13 +489,9 @@ def test_open_file_large_line_number_consecutive_diff_window(tmp_path):
assert result == expected
def test_edit_file_by_replace_window(tmp_path, monkeypatch):
# Set environment variable via monkeypatch does NOT work!
monkeypatch.setattr(
'opendevin.runtime.plugins.agent_skills.agentskills.ENABLE_AUTO_LINT', True
)
content = """def any_int(a, b, c):
def test_edit_file_by_replace_window(tmp_path):
with patch.dict(os.environ, {'ENABLE_AUTO_LINT': 'True'}):
content = """def any_int(a, b, c):
return isinstance(a, int) and isinstance(b, int) and isinstance(c, int)
def test_any_int():
@ -528,83 +525,83 @@ def check(any_int):
check(any_int)"""
temp_file_path = tmp_path / 'error-test.py'
temp_file_path.write_text(content)
temp_file_path = tmp_path / 'error-test.py'
temp_file_path.write_text(content)
open_file(str(temp_file_path))
open_file(str(temp_file_path))
with io.StringIO() as buf:
with contextlib.redirect_stdout(buf):
edit_file_by_replace(
str(temp_file_path),
to_replace=' assert any_int(1.0, 2, 3) == False',
new_content=' assert any_int(1.0, 2, 3) == False',
with io.StringIO() as buf:
with contextlib.redirect_stdout(buf):
edit_file_by_replace(
str(temp_file_path),
to_replace=' assert any_int(1.0, 2, 3) == False',
new_content=' assert any_int(1.0, 2, 3) == False',
)
result = buf.getvalue()
expected = (
'[Your proposed edit has introduced new syntax error(s). Please understand the errors and retry your edit command.]\n'
'ERRORS:\n'
+ str(temp_file_path)
+ ':9:9: '
+ 'E999 IndentationError: unexpected indent\n'
'[This is how your edit would have looked if applied]\n'
'-------------------------------------------------\n'
'(this is the beginning of the file)\n'
'1|def any_int(a, b, c):\n'
'2| return isinstance(a, int) and isinstance(b, int) and isinstance(c, int)\n'
'3|\n'
'4|def test_any_int():\n'
'5| assert any_int(1, 2, 3) == True\n'
'6| assert any_int(1.5, 2, 3) == False\n'
'7| assert any_int(1, 2.5, 3) == False\n'
'8| assert any_int(1, 2, 3.5) == False\n'
'9| assert any_int(1.0, 2, 3) == False\n'
'10| assert any_int(1, 2.0, 3) == False\n'
'11| assert any_int(1, 2, 3.0) == False\n'
'12| assert any_int(0, 0, 0) == True\n'
'13| assert any_int(-1, -2, -3) == True\n'
'14| assert any_int(1, -2, 3) == True\n'
'15| assert any_int(1.5, -2, 3) == False\n'
'16| assert any_int(1, -2.5, 3) == False\n'
'17|\n'
'18|def check(any_int):\n'
'19| # Check some simple cases\n'
'20| assert any_int(2, 3, 1)==True, "This prints if this assert fails 1 (good for debugging!)"\n'
'21| assert any_int(2.5, 2, 3)==False, "This prints if this assert fails 2 (good for debugging!)"\n'
'(12 more lines below)\n'
'-------------------------------------------------\n'
'\n'
'[This is the original code before your edit]\n'
'-------------------------------------------------\n'
'(this is the beginning of the file)\n'
'1|def any_int(a, b, c):\n'
'2| return isinstance(a, int) and isinstance(b, int) and isinstance(c, int)\n'
'3|\n'
'4|def test_any_int():\n'
'5| assert any_int(1, 2, 3) == True\n'
'6| assert any_int(1.5, 2, 3) == False\n'
'7| assert any_int(1, 2.5, 3) == False\n'
'8| assert any_int(1, 2, 3.5) == False\n'
'9| assert any_int(1.0, 2, 3) == False\n'
'10| assert any_int(1, 2.0, 3) == False\n'
'11| assert any_int(1, 2, 3.0) == False\n'
'12| assert any_int(0, 0, 0) == True\n'
'13| assert any_int(-1, -2, -3) == True\n'
'14| assert any_int(1, -2, 3) == True\n'
'15| assert any_int(1.5, -2, 3) == False\n'
'16| assert any_int(1, -2.5, 3) == False\n'
'17|\n'
'18|def check(any_int):\n'
'19| # Check some simple cases\n'
'20| assert any_int(2, 3, 1)==True, "This prints if this assert fails 1 (good for debugging!)"\n'
'21| assert any_int(2.5, 2, 3)==False, "This prints if this assert fails 2 (good for debugging!)"\n'
'(12 more lines below)\n'
'-------------------------------------------------\n'
'Your changes have NOT been applied. Please fix your edit command and try again.\n'
'You either need to 1) Specify the correct start/end line arguments or 2) Correct your edit code.\n'
'DO NOT re-run the same failed edit command. Running it again will lead to the same error.\n'
)
result = buf.getvalue()
expected = (
'[Your proposed edit has introduced new syntax error(s). Please understand the errors and retry your edit command.]\n'
'ERRORS:\n'
+ str(temp_file_path)
+ ':9:9: '
+ 'E999 IndentationError: unexpected indent\n'
'[This is how your edit would have looked if applied]\n'
'-------------------------------------------------\n'
'(this is the beginning of the file)\n'
'1|def any_int(a, b, c):\n'
'2| return isinstance(a, int) and isinstance(b, int) and isinstance(c, int)\n'
'3|\n'
'4|def test_any_int():\n'
'5| assert any_int(1, 2, 3) == True\n'
'6| assert any_int(1.5, 2, 3) == False\n'
'7| assert any_int(1, 2.5, 3) == False\n'
'8| assert any_int(1, 2, 3.5) == False\n'
'9| assert any_int(1.0, 2, 3) == False\n'
'10| assert any_int(1, 2.0, 3) == False\n'
'11| assert any_int(1, 2, 3.0) == False\n'
'12| assert any_int(0, 0, 0) == True\n'
'13| assert any_int(-1, -2, -3) == True\n'
'14| assert any_int(1, -2, 3) == True\n'
'15| assert any_int(1.5, -2, 3) == False\n'
'16| assert any_int(1, -2.5, 3) == False\n'
'17|\n'
'18|def check(any_int):\n'
'19| # Check some simple cases\n'
'20| assert any_int(2, 3, 1)==True, "This prints if this assert fails 1 (good for debugging!)"\n'
'21| assert any_int(2.5, 2, 3)==False, "This prints if this assert fails 2 (good for debugging!)"\n'
'(12 more lines below)\n'
'-------------------------------------------------\n'
'\n'
'[This is the original code before your edit]\n'
'-------------------------------------------------\n'
'(this is the beginning of the file)\n'
'1|def any_int(a, b, c):\n'
'2| return isinstance(a, int) and isinstance(b, int) and isinstance(c, int)\n'
'3|\n'
'4|def test_any_int():\n'
'5| assert any_int(1, 2, 3) == True\n'
'6| assert any_int(1.5, 2, 3) == False\n'
'7| assert any_int(1, 2.5, 3) == False\n'
'8| assert any_int(1, 2, 3.5) == False\n'
'9| assert any_int(1.0, 2, 3) == False\n'
'10| assert any_int(1, 2.0, 3) == False\n'
'11| assert any_int(1, 2, 3.0) == False\n'
'12| assert any_int(0, 0, 0) == True\n'
'13| assert any_int(-1, -2, -3) == True\n'
'14| assert any_int(1, -2, 3) == True\n'
'15| assert any_int(1.5, -2, 3) == False\n'
'16| assert any_int(1, -2.5, 3) == False\n'
'17|\n'
'18|def check(any_int):\n'
'19| # Check some simple cases\n'
'20| assert any_int(2, 3, 1)==True, "This prints if this assert fails 1 (good for debugging!)"\n'
'21| assert any_int(2.5, 2, 3)==False, "This prints if this assert fails 2 (good for debugging!)"\n'
'(12 more lines below)\n'
'-------------------------------------------------\n'
'Your changes have NOT been applied. Please fix your edit command and try again.\n'
'You either need to 1) Specify the correct start/end line arguments or 2) Correct your edit code.\n'
'DO NOT re-run the same failed edit command. Running it again will lead to the same error.\n'
)
assert result == expected
assert result == expected
# ================================
@ -1253,153 +1250,140 @@ def test_find_file_not_exist_file_specific_path(tmp_path):
assert result.split('\n') == expected.split('\n')
def test_edit_lint_file_pass(tmp_path, monkeypatch):
def test_edit_lint_file_pass(tmp_path):
# Enable linting
monkeypatch.setattr(
'opendevin.runtime.plugins.agent_skills.agentskills.ENABLE_AUTO_LINT', True
)
with patch.dict(os.environ, {'ENABLE_AUTO_LINT': 'True'}):
file_path = _generate_test_file_with_lines(tmp_path, 1)
file_path = _generate_test_file_with_lines(tmp_path, 1)
# Test linting functionality
with io.StringIO() as buf:
with contextlib.redirect_stdout(buf):
open_file(str(file_path))
insert_content_at_line(str(file_path), 1, "print('hello')\n")
result = buf.getvalue()
assert result is not None
expected = (
f'[File: {file_path} (1 lines total)]\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
f'[File: {file_path} (1 lines total after edit)]\n'
'(this is the beginning of the file)\n'
"1|print('hello')\n"
'(this is the end of the file)\n'
+ MSG_FILE_UPDATED.format(line_number=1)
+ '\n'
)
assert result.split('\n') == expected.split('\n')
# Test linting functionality
with io.StringIO() as buf:
with contextlib.redirect_stdout(buf):
open_file(str(file_path))
insert_content_at_line(str(file_path), 1, "print('hello')\n")
result = buf.getvalue()
assert result is not None
expected = (
f'[File: {file_path} (1 lines total)]\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
f'[File: {file_path} (1 lines total after edit)]\n'
'(this is the beginning of the file)\n'
"1|print('hello')\n"
'(this is the end of the file)\n'
+ MSG_FILE_UPDATED.format(line_number=1)
+ '\n'
)
assert result.split('\n') == expected.split('\n')
def test_lint_file_fail_undefined_name(tmp_path, monkeypatch, capsys):
# Enable linting
monkeypatch.setattr(
'opendevin.runtime.plugins.agent_skills.agentskills.ENABLE_AUTO_LINT', True
)
def test_lint_file_fail_undefined_name(tmp_path, capsys):
with patch.dict(os.environ, {'ENABLE_AUTO_LINT': 'True'}):
current_line = 1
current_line = 1
file_path = _generate_test_file_with_lines(tmp_path, 1)
file_path = _generate_test_file_with_lines(tmp_path, 1)
open_file(str(file_path), current_line)
insert_content_at_line(str(file_path), 1, 'undefined_name()\n')
open_file(str(file_path), current_line)
insert_content_at_line(str(file_path), 1, 'undefined_name()\n')
result = capsys.readouterr().out
assert result is not None
result = capsys.readouterr().out
assert result is not None
expected = (
f'[File: {file_path} (1 lines total)]\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
'[Your proposed edit has introduced new syntax error(s). Please understand the errors and retry your edit command.]\n'
'ERRORS:\n'
f"{file_path}:1:1: F821 undefined name 'undefined_name'\n"
'[This is how your edit would have looked if applied]\n'
'-------------------------------------------------\n'
'(this is the beginning of the file)\n'
'1|undefined_name()\n'
'(this is the end of the file)\n'
'-------------------------------------------------\n\n'
'[This is the original code before your edit]\n'
'-------------------------------------------------\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
'-------------------------------------------------\n'
'Your changes have NOT been applied. Please fix your edit command and try again.\n'
'You either need to 1) Specify the correct start/end line arguments or 2) Correct your edit code.\n'
'DO NOT re-run the same failed edit command. Running it again will lead to the same error.\n'
)
assert result.split('\n') == expected.split('\n')
expected = (
f'[File: {file_path} (1 lines total)]\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
'[Your proposed edit has introduced new syntax error(s). Please understand the errors and retry your edit command.]\n'
'ERRORS:\n'
f"{file_path}:1:1: F821 undefined name 'undefined_name'\n"
'[This is how your edit would have looked if applied]\n'
'-------------------------------------------------\n'
'(this is the beginning of the file)\n'
'1|undefined_name()\n'
'(this is the end of the file)\n'
'-------------------------------------------------\n\n'
'[This is the original code before your edit]\n'
'-------------------------------------------------\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
'-------------------------------------------------\n'
'Your changes have NOT been applied. Please fix your edit command and try again.\n'
'You either need to 1) Specify the correct start/end line arguments or 2) Correct your edit code.\n'
'DO NOT re-run the same failed edit command. Running it again will lead to the same error.\n'
)
assert result.split('\n') == expected.split('\n')
def test_lint_file_fail_undefined_name_long(tmp_path, monkeypatch, capsys):
# Enable linting
monkeypatch.setattr(
'opendevin.runtime.plugins.agent_skills.agentskills.ENABLE_AUTO_LINT', True
)
def test_lint_file_fail_undefined_name_long(tmp_path, capsys):
with patch.dict(os.environ, {'ENABLE_AUTO_LINT': 'True'}):
num_lines = 1000
error_line = 500
num_lines = 1000
error_line = 500
file_path = _generate_test_file_with_lines(tmp_path, num_lines)
file_path = _generate_test_file_with_lines(tmp_path, num_lines)
error_message = (
f"{file_path}:{error_line}:1: F821 undefined name 'undefined_name'"
)
error_message = f"{file_path}:{error_line}:1: F821 undefined name 'undefined_name'"
open_file(str(file_path))
insert_content_at_line(str(file_path), error_line, 'undefined_name()\n')
open_file(str(file_path))
insert_content_at_line(str(file_path), error_line, 'undefined_name()\n')
result = capsys.readouterr().out
assert result is not None
result = capsys.readouterr().out
assert result is not None
open_lines = '\n'.join([f'{i}|' for i in range(1, WINDOW + 1)])
expected = (
f'[File: {file_path} ({num_lines} lines total)]\n'
'(this is the beginning of the file)\n'
f'{open_lines}\n'
f'({num_lines - WINDOW} more lines below)\n'
'[Your proposed edit has introduced new syntax error(s). Please understand the errors and retry your edit command.]\n'
f'ERRORS:\n{error_message}\n'
'[This is how your edit would have looked if applied]\n'
'-------------------------------------------------\n'
'(489 more lines above)\n'
+ _numbered_test_lines(error_line - 10, error_line - 1)
+ '500|undefined_name()\n'
+ _numbered_test_lines(error_line + 1, error_line + 10)
+ '(491 more lines below)\n'
+ '-------------------------------------------------\n\n'
'[This is the original code before your edit]\n'
'-------------------------------------------------\n'
'(489 more lines above)\n'
+ _numbered_test_lines(error_line - 10, error_line + 10)
+ '(490 more lines below)\n'
+ '-------------------------------------------------\n'
'Your changes have NOT been applied. Please fix your edit command and try again.\n'
'You either need to 1) Specify the correct start/end line arguments or 2) Correct your edit code.\n'
'DO NOT re-run the same failed edit command. Running it again will lead to the same error.\n'
)
assert result.split('\n') == expected.split('\n')
open_lines = '\n'.join([f'{i}|' for i in range(1, WINDOW + 1)])
expected = (
f'[File: {file_path} ({num_lines} lines total)]\n'
'(this is the beginning of the file)\n'
f'{open_lines}\n'
f'({num_lines - WINDOW} more lines below)\n'
'[Your proposed edit has introduced new syntax error(s). Please understand the errors and retry your edit command.]\n'
f'ERRORS:\n{error_message}\n'
'[This is how your edit would have looked if applied]\n'
'-------------------------------------------------\n'
'(489 more lines above)\n'
+ _numbered_test_lines(error_line - 10, error_line - 1)
+ '500|undefined_name()\n'
+ _numbered_test_lines(error_line + 1, error_line + 10)
+ '(491 more lines below)\n'
+ '-------------------------------------------------\n\n'
'[This is the original code before your edit]\n'
'-------------------------------------------------\n'
'(489 more lines above)\n'
+ _numbered_test_lines(error_line - 10, error_line + 10)
+ '(490 more lines below)\n'
+ '-------------------------------------------------\n'
'Your changes have NOT been applied. Please fix your edit command and try again.\n'
'You either need to 1) Specify the correct start/end line arguments or 2) Correct your edit code.\n'
'DO NOT re-run the same failed edit command. Running it again will lead to the same error.\n'
)
assert result.split('\n') == expected.split('\n')
def test_lint_file_disabled_undefined_name(tmp_path, monkeypatch, capsys):
# Disable linting
monkeypatch.setattr(
'opendevin.runtime.plugins.agent_skills.agentskills.ENABLE_AUTO_LINT', False
)
def test_lint_file_disabled_undefined_name(tmp_path, capsys):
with patch.dict(os.environ, {'ENABLE_AUTO_LINT': 'False'}):
file_path = _generate_test_file_with_lines(tmp_path, 1)
file_path = _generate_test_file_with_lines(tmp_path, 1)
open_file(str(file_path))
insert_content_at_line(str(file_path), 1, 'undefined_name()\n')
open_file(str(file_path))
insert_content_at_line(str(file_path), 1, 'undefined_name()\n')
result = capsys.readouterr().out
assert result is not None
expected = (
f'[File: {file_path} (1 lines total)]\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
f'[File: {file_path} (1 lines total after edit)]\n'
'(this is the beginning of the file)\n'
'1|undefined_name()\n'
'(this is the end of the file)\n'
+ MSG_FILE_UPDATED.format(line_number=1)
+ '\n'
)
assert result.split('\n') == expected.split('\n')
result = capsys.readouterr().out
assert result is not None
expected = (
f'[File: {file_path} (1 lines total)]\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
f'[File: {file_path} (1 lines total after edit)]\n'
'(this is the beginning of the file)\n'
'1|undefined_name()\n'
'(this is the end of the file)\n'
+ MSG_FILE_UPDATED.format(line_number=1)
+ '\n'
)
assert result.split('\n') == expected.split('\n')
def test_parse_docx(tmp_path):
@ -1521,44 +1505,40 @@ def test_parse_pptx(tmp_path):
assert output == expected_output, f'Expected output does not match. Got: {output}'
def test_lint_file_fail_non_python(tmp_path, monkeypatch, capsys):
monkeypatch.setattr(
'opendevin.runtime.plugins.agent_skills.agentskills.ENABLE_AUTO_LINT', True
)
def test_lint_file_fail_non_python(tmp_path, capsys):
with patch.dict(os.environ, {'ENABLE_AUTO_LINT': 'True'}):
current_line = 1
file_path = _generate_ruby_test_file_with_lines(tmp_path, 1)
current_line = 1
file_path = _generate_ruby_test_file_with_lines(tmp_path, 1)
open_file(str(file_path), current_line)
insert_content_at_line(
str(file_path), 1, "def print_hello_world()\n puts 'Hello World'"
)
result = capsys.readouterr().out
assert result is not None
expected = (
f'[File: {file_path} (1 lines total)]\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
'[Your proposed edit has introduced new syntax error(s). Please understand the errors and retry your edit command.]\n'
'ERRORS:\n'
f'{file_path}:1\n'
'[This is how your edit would have looked if applied]\n'
'-------------------------------------------------\n'
'(this is the beginning of the file)\n'
'1|def print_hello_world()\n'
"2| puts 'Hello World'\n"
'(this is the end of the file)\n'
'-------------------------------------------------\n\n'
'[This is the original code before your edit]\n'
'-------------------------------------------------\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
'-------------------------------------------------\n'
'Your changes have NOT been applied. Please fix your edit command and try again.\n'
'You either need to 1) Specify the correct start/end line arguments or 2) Correct your edit code.\n'
'DO NOT re-run the same failed edit command. Running it again will lead to the same error.\n'
)
assert result.split('\n') == expected.split('\n')
open_file(str(file_path), current_line)
insert_content_at_line(
str(file_path), 1, "def print_hello_world()\n puts 'Hello World'"
)
result = capsys.readouterr().out
assert result is not None
expected = (
f'[File: {file_path} (1 lines total)]\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
'[Your proposed edit has introduced new syntax error(s). Please understand the errors and retry your edit command.]\n'
'ERRORS:\n'
f'{file_path}:1\n'
'[This is how your edit would have looked if applied]\n'
'-------------------------------------------------\n'
'(this is the beginning of the file)\n'
'1|def print_hello_world()\n'
"2| puts 'Hello World'\n"
'(this is the end of the file)\n'
'-------------------------------------------------\n\n'
'[This is the original code before your edit]\n'
'-------------------------------------------------\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
'-------------------------------------------------\n'
'Your changes have NOT been applied. Please fix your edit command and try again.\n'
'You either need to 1) Specify the correct start/end line arguments or 2) Correct your edit code.\n'
'DO NOT re-run the same failed edit command. Running it again will lead to the same error.\n'
)
assert result.split('\n') == expected.split('\n')
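
The lint tests above switch from monkeypatching a module-level `ENABLE_AUTO_LINT` constant to patching `os.environ`, which only works if the flag is read when it is needed rather than at import time. A minimal sketch of that lazy lookup follows; the helper name `is_auto_lint_enabled` and the default of `'false'` are assumptions for illustration, not the project's actual code.

```python
import os
from unittest.mock import patch


def is_auto_lint_enabled() -> bool:
    """Hypothetical helper: read ENABLE_AUTO_LINT at call time, not at import time."""
    return os.environ.get('ENABLE_AUTO_LINT', 'false').lower() == 'true'


# Because the value is looked up on every call, patching os.environ is enough;
# no module-level constant has to be monkeypatched.
with patch.dict(os.environ, {'ENABLE_AUTO_LINT': 'True'}):
    enabled_inside = is_auto_lint_enabled()

with patch.dict(os.environ, {'ENABLE_AUTO_LINT': 'False'}):
    disabled_inside = is_auto_lint_enabled()
```

`patch.dict` restores the original environment when the `with` block exits, so each test leaves no state behind for the next one.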

@ -114,7 +114,7 @@ def test_jupyter_heredoc():
print('Hello, `World`!
')
EOL
[error]: here-document at line 0 delimited by end-of-file (wanted "'EOL'") (position 75)
[warning]: here-document at line 0 delimited by end-of-file (wanted "'EOL'") (position 75)
TODO: remove these tests after the deprecation of ServerRuntime
"""

@ -1,8 +1,7 @@
import json
import pathlib
import tempfile
import pytest
from pytest import TempPathFactory
from opendevin.events import EventSource, EventStream
from opendevin.events.action import (
@ -13,11 +12,8 @@ from opendevin.storage import get_file_store
@pytest.fixture
def temp_dir(monkeypatch):
# get a temporary directory
with tempfile.TemporaryDirectory() as temp_dir:
pathlib.Path(temp_dir).mkdir(parents=True, exist_ok=True)
yield temp_dir
def temp_dir(tmp_path_factory: TempPathFactory) -> str:
return str(tmp_path_factory.mktemp('test_event_stream'))
def collect_events(stream):

@ -1,77 +0,0 @@
import pathlib
import tempfile
from unittest.mock import MagicMock, call, patch
import pytest
from opendevin.core.config import AppConfig, SandboxConfig
from opendevin.events.action import IPythonRunCellAction
from opendevin.events.observation import IPythonRunCellObservation
from opendevin.runtime.server.runtime import ServerRuntime
@pytest.fixture
def temp_dir(monkeypatch):
# get a temporary directory
with tempfile.TemporaryDirectory() as temp_dir:
pathlib.Path(temp_dir).mkdir(parents=True, exist_ok=True)
yield temp_dir
@pytest.mark.asyncio
async def test_run_python_backticks():
# Create a mock event_stream
mock_event_stream = MagicMock()
test_code = "print('Hello, `World`!\n')"
# Mock the asynchronous sandbox execute method
mock_sandbox_execute = MagicMock()
mock_sandbox_execute.side_effect = [
(0, ''), # Initial call during DockerSSHBox initialization
(0, ''), # Initial call during DockerSSHBox initialization
(0, ''), # Initial call during DockerSSHBox initialization
(0, ''), # Write command
(0, test_code), # Execute command
]
# Set up the patches for the runtime and sandbox
with patch(
'opendevin.runtime.docker.ssh_box.DockerSSHBox.execute',
new=mock_sandbox_execute,
):
# Initialize the runtime with the mock event_stream
runtime = ServerRuntime(
config=AppConfig(
persist_sandbox=False, sandbox=SandboxConfig(box_type='ssh')
),
event_stream=mock_event_stream,
)
# Define the test action with a simple IPython command
action = IPythonRunCellAction(code=test_code)
# Call the run_ipython method with the test action
result = await runtime.run_action(action)
# Assert that the result is an instance of IPythonRunCellObservation
assert isinstance(result, IPythonRunCellObservation)
# Assert that the execute method was called with the correct commands
expected_write_command = (
"cat > /tmp/opendevin_jupyter_temp.py <<'EOL'\n" f'{test_code}\n' 'EOL'
)
expected_execute_command = 'cat /tmp/opendevin_jupyter_temp.py | execute_cli'
mock_sandbox_execute.assert_has_calls(
[
call('mkdir -p /tmp'),
call('git config --global user.name "OpenDevin"'),
call('git config --global user.email "opendevin@all-hands.dev"'),
call(expected_write_command),
call(expected_execute_command),
]
)
assert (
test_code == result.content
), f'The output should contain the expected print output, got: {result.content}'
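
The removed test above asserts that the code is written to a temporary file through a here-document with a quoted delimiter. The sketch below rebuilds that command string; `build_heredoc_write_command` is a hypothetical helper introduced only for illustration, while the path and delimiter are taken from the test.

```python
def build_heredoc_write_command(path: str, code: str, delimiter: str = 'EOL') -> str:
    """Hypothetical helper mirroring the command string asserted in the test."""
    # Quoting the delimiter (<<'EOL') tells the shell not to expand $vars or
    # `backticks` inside the body, so the code is written verbatim.
    return f"cat > {path} <<'{delimiter}'\n{code}\n{delimiter}"


test_code = "print('Hello, `World`!\n')"
command = build_heredoc_write_command('/tmp/opendevin_jupyter_temp.py', test_code)
```

The quoted delimiter is what makes the backticks in `test_code` safe: with an unquoted `EOL`, the shell would try to run `World` as a command substitution.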

@ -1,8 +1,8 @@
import logging
import tempfile
from unittest.mock import Mock, patch
import pytest
from pytest import TempPathFactory
from opendevin.controller.agent_controller import AgentController
from opendevin.controller.state.state import State
@ -29,14 +29,17 @@ logging.basicConfig(level=logging.DEBUG)
@pytest.fixture
def event_stream():
with tempfile.TemporaryDirectory() as temp_dir:
file_store = get_file_store('local', temp_dir)
event_stream = EventStream('asdf', file_store)
yield event_stream
def temp_dir(tmp_path_factory: TempPathFactory) -> str:
return str(tmp_path_factory.mktemp('test_is_stuck'))
# clear after each test
event_stream.clear()
@pytest.fixture
def event_stream(temp_dir):
file_store = get_file_store('local', temp_dir)
event_stream = EventStream('asdf', file_store)
yield event_stream
# clear after each test
event_stream.clear()
class TestStuckDetector:

@ -1,10 +1,10 @@
import json
import os
import tempfile
from unittest.mock import MagicMock
import pytest
import yaml
from pytest import TempPathFactory
from agenthub.micro.registry import all_microagents
from opendevin.controller.agent import Agent
@ -17,14 +17,18 @@ from opendevin.storage import get_file_store
@pytest.fixture
def event_stream():
with tempfile.TemporaryDirectory() as temp_dir:
file_store = get_file_store('local', temp_dir)
event_stream = EventStream('asdf', file_store)
yield event_stream
def temp_dir(tmp_path_factory: TempPathFactory) -> str:
return str(tmp_path_factory.mktemp('test_micro_agents'))
# clear after each test
event_stream.clear()
@pytest.fixture
def event_stream(temp_dir):
file_store = get_file_store('local', temp_dir)
event_stream = EventStream('asdf', file_store)
yield event_stream
# clear after each test
event_stream.clear()
def test_all_agents_are_loaded():

@ -2,12 +2,11 @@
import asyncio
import os
import pathlib
import tempfile
import time
from unittest.mock import patch
import pytest
from pytest import TempPathFactory
from opendevin.core.config import AppConfig, SandboxConfig, load_from_env
from opendevin.core.logger import opendevin_logger as logger
@ -41,62 +40,106 @@ def print_method_name(request):
@pytest.fixture
def temp_dir(monkeypatch):
# get a temporary directory
with tempfile.TemporaryDirectory() as temp_dir:
pathlib.Path(temp_dir).mkdir(parents=True, exist_ok=True)
yield temp_dir
def temp_dir(tmp_path_factory: TempPathFactory) -> str:
return str(tmp_path_factory.mktemp('test_runtime'))
TEST_RUNTIME = os.getenv('TEST_RUNTIME', 'both')
PY3_FOR_TESTING = '/opendevin/miniforge3/bin/mamba run -n base python3'
# This ensures that all tests run together for each runtime, not alternating between them,
# which caused them to fail previously.
@pytest.fixture(scope='module', params=[EventStreamRuntime, ServerRuntime])
@pytest.fixture(scope='module')
def box_class(request):
time.sleep(1)
runtime = TEST_RUNTIME
if runtime.lower() == 'eventstream':
return EventStreamRuntime
elif runtime.lower() == 'server':
return ServerRuntime
else:
return pytest.param([EventStreamRuntime, ServerRuntime])
# TODO: We will change this to `run_as_user` when `ServerRuntime` is deprecated,
# since `EventStreamRuntime` supports running as an arbitrary user.
@pytest.fixture(scope='module', params=[True, False])
def run_as_devin(request):
time.sleep(1)
return request.param
async def _load_runtime(temp_dir, box_class):
@pytest.fixture(scope='module', params=[True, False])
def enable_auto_lint(request):
time.sleep(1)
return request.param
@pytest.fixture(scope='module', params=['ubuntu:22.04', 'debian:11'])
def container_image(request):
time.sleep(1)
return request.param
async def _load_runtime(
temp_dir,
box_class,
run_as_devin: bool = True,
enable_auto_lint: bool = False,
container_image: str | None = None,
):
sid = 'test'
cli_session = 'main_test'
plugins = [JupyterRequirement(), AgentSkillsRequirement()]
# AgentSkills need to be initialized **before** Jupyter
# otherwise Jupyter will not access the proper dependencies installed by AgentSkills
plugins = [AgentSkillsRequirement(), JupyterRequirement()]
config = AppConfig(
workspace_base=temp_dir,
workspace_mount_path=temp_dir,
sandbox=SandboxConfig(
use_host_network=True,
),
sandbox=SandboxConfig(use_host_network=True),
)
load_from_env(config, os.environ)
config.run_as_devin = run_as_devin
config.sandbox.enable_auto_lint = enable_auto_lint
file_store = get_file_store(config.file_store, config.file_store_path)
event_stream = EventStream(cli_session, file_store)
container_image = config.sandbox.container_image
# NOTE: we will use the default container image specified in the config.sandbox
# if it is an official od_runtime image.
if 'od_runtime' not in container_image:
container_image = 'ubuntu:22.04'
logger.warning(
f'`{config.sandbox.container_image}` is not an od_runtime image. Will use `{container_image}` as the container image for testing.'
)
if container_image is not None:
config.sandbox.container_image = container_image
if box_class == EventStreamRuntime:
# NOTE: we will use the default container image specified in the config.sandbox
# if it is an official od_runtime image.
cur_container_image = config.sandbox.container_image
if 'od_runtime' not in cur_container_image:
cur_container_image = 'ubuntu:22.04'
logger.warning(
f'`{config.sandbox.container_image}` is not an od_runtime image. Will use `{cur_container_image}` as the container image for testing.'
)
runtime = EventStreamRuntime(
config=config,
event_stream=event_stream,
sid=sid,
plugins=plugins,
# NOTE: we probably don't have a default container image `/sandbox` for the event stream runtime
# Instead, we will pre-build a suite of container images with OD-runtime-cli installed.
container_image=container_image,
plugins=plugins,
container_image=cur_container_image,
)
await runtime.ainit()
elif box_class == ServerRuntime:
runtime = ServerRuntime(config=config, event_stream=event_stream, sid=sid)
runtime = ServerRuntime(
config=config, event_stream=event_stream, sid=sid, plugins=plugins
)
await runtime.ainit()
runtime.init_sandbox_plugins(plugins)
from opendevin.runtime.tools import (
RuntimeTool, # deprecate this after ServerRuntime is deprecated
)
runtime.init_runtime_tools(
[],
[RuntimeTool.BROWSER],
is_async=False,
runtime_tools_config={},
)
@ -107,9 +150,9 @@ async def _load_runtime(temp_dir, box_class):
@pytest.mark.asyncio
async def test_env_vars_os_environ(temp_dir, box_class):
async def test_env_vars_os_environ(temp_dir, box_class, run_as_devin):
with patch.dict(os.environ, {'SANDBOX_ENV_FOOBAR': 'BAZ'}):
runtime = await _load_runtime(temp_dir, box_class)
runtime = await _load_runtime(temp_dir, box_class, run_as_devin)
obs: CmdOutputObservation = await runtime.run_action(
CmdRunAction(command='env')
@ -206,8 +249,8 @@ async def test_env_vars_runtime_add_env_vars_overwrite(temp_dir, box_class):
@pytest.mark.asyncio
async def test_bash_command_pexcept(temp_dir, box_class):
runtime = await _load_runtime(temp_dir, box_class)
async def test_bash_command_pexcept(temp_dir, box_class, run_as_devin):
runtime = await _load_runtime(temp_dir, box_class, run_as_devin)
# We set env var PS1="\u@\h:\w $"
# and construct the PEXCEPT prompt based on it.
@ -237,8 +280,8 @@ async def test_bash_command_pexcept(temp_dir, box_class):
@pytest.mark.asyncio
async def test_simple_cmd_ipython_and_fileop(temp_dir, box_class):
runtime = await _load_runtime(temp_dir, box_class)
async def test_simple_cmd_ipython_and_fileop(temp_dir, box_class, run_as_devin):
runtime = await _load_runtime(temp_dir, box_class, run_as_devin)
# Test run command
action_cmd = CmdRunAction(command='ls -l')
@ -297,16 +340,25 @@ async def test_simple_cmd_ipython_and_fileop(temp_dir, box_class):
else:
assert obs.path == '/workspace/hello.sh'
# clean up
action = CmdRunAction(command='rm -rf hello.sh')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert obs.exit_code == 0
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_simple_browse(temp_dir, box_class):
runtime = await _load_runtime(temp_dir, box_class)
async def test_simple_browse(temp_dir, box_class, run_as_devin):
runtime = await _load_runtime(temp_dir, box_class, run_as_devin)
# Test browse
action_cmd = CmdRunAction(command='python -m http.server 8000 > server.log 2>&1 &')
action_cmd = CmdRunAction(
command=f'{PY3_FOR_TESTING} -m http.server 8000 > server.log 2>&1 &'
)
logger.info(action_cmd, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action_cmd)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
@ -315,6 +367,12 @@ async def test_simple_browse(temp_dir, box_class):
assert obs.exit_code == 0
assert '[1]' in obs.content
action_cmd = CmdRunAction(command='sleep 5 && cat server.log')
logger.info(action_cmd, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action_cmd)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert obs.exit_code == 0
action_browse = BrowseURLAction(url='http://localhost:8000')
logger.info(action_browse, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action_browse)
@ -331,11 +389,65 @@ async def test_simple_browse(temp_dir, box_class):
assert 'Directory listing for /' in obs.content
assert 'server.log' in obs.content
# clean up
action = CmdRunAction(command='rm -rf server.log')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert obs.exit_code == 0
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_multiline_commands(temp_dir, box_class):
async def test_single_multiline_command(temp_dir, box_class):
runtime = await _load_runtime(temp_dir, box_class)
action = CmdRunAction(command='echo \\\n -e "foo"')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert obs.exit_code == 0, 'The exit code should be 0.'
assert 'foo' in obs.content
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_multiline_echo(temp_dir, box_class):
runtime = await _load_runtime(temp_dir, box_class)
action = CmdRunAction(command='echo -e "hello\nworld"')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert obs.exit_code == 0, 'The exit code should be 0.'
assert 'hello\r\nworld' in obs.content
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_runtime_whitespace(temp_dir, box_class):
runtime = await _load_runtime(temp_dir, box_class)
action = CmdRunAction(command='echo -e "\\n\\n\\n"')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert obs.exit_code == 0, 'The exit code should be 0.'
assert '\r\n\r\n\r\n' in obs.content
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_multiple_multiline_commands(temp_dir, box_class, run_as_devin):
cmds = [
'ls -l',
'echo -e "hello\nworld"',
@ -365,7 +477,7 @@ world "
]
joined_cmds = '\n'.join(cmds)
runtime = await _load_runtime(temp_dir, box_class)
runtime = await _load_runtime(temp_dir, box_class, run_as_devin)
action = CmdRunAction(command=joined_cmds)
logger.info(action, extra={'msg_type': 'ACTION'})
@ -388,9 +500,9 @@ world "
@pytest.mark.asyncio
async def test_no_ps2_in_output(temp_dir, box_class):
async def test_no_ps2_in_output(temp_dir, box_class, run_as_devin):
"""Test that the PS2 sign is not added to the output of a multiline command."""
runtime = await _load_runtime(temp_dir, box_class)
runtime = await _load_runtime(temp_dir, box_class, run_as_devin)
action = CmdRunAction(command='echo -e "hello\nworld"')
logger.info(action, extra={'msg_type': 'ACTION'})
@ -406,6 +518,9 @@ async def test_no_ps2_in_output(temp_dir, box_class):
assert 'hello\r\nworld' in obs.content
assert '>' not in obs.content
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_multiline_command_loop(temp_dir, box_class):
@ -449,3 +564,369 @@ echo "success"
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_cmd_run(temp_dir, box_class, run_as_devin):
runtime = await _load_runtime(temp_dir, box_class, run_as_devin)
action = CmdRunAction(command='ls -l')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0
assert 'total 0' in obs.content
action = CmdRunAction(command='mkdir test')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0
action = CmdRunAction(command='ls -l')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0
if run_as_devin:
assert 'opendevin' in obs.content
else:
assert 'root' in obs.content
assert 'test' in obs.content
action = CmdRunAction(command='touch test/foo.txt')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0
action = CmdRunAction(command='ls -l test')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0
assert 'foo.txt' in obs.content
# clean up: this is needed, since CI will not be
# run as root, and this test may leave a file
# owned by root
action = CmdRunAction(command='rm -rf test')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_run_as_user_correct_home_dir(temp_dir, box_class, run_as_devin):
runtime = await _load_runtime(temp_dir, box_class, run_as_devin)
action = CmdRunAction(command='cd ~ && pwd')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0
if run_as_devin:
assert '/home/opendevin' in obs.content
else:
assert '/root' in obs.content
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_multi_cmd_run_in_single_line(temp_dir, box_class):
runtime = await _load_runtime(temp_dir, box_class)
action = CmdRunAction(command='pwd && ls -l')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0
assert '/workspace' in obs.content
assert 'total 0' in obs.content
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_stateful_cmd(temp_dir, box_class):
runtime = await _load_runtime(temp_dir, box_class)
action = CmdRunAction(command='mkdir test')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0, 'The exit code should be 0.'
action = CmdRunAction(command='cd test')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0, 'The exit code should be 0.'
action = CmdRunAction(command='pwd')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0, 'The exit code should be 0.'
assert '/workspace/test' in obs.content
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_failed_cmd(temp_dir, box_class):
runtime = await _load_runtime(temp_dir, box_class)
action = CmdRunAction(command='non_existing_command')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code != 0, 'The exit code should not be 0 for a failed command.'
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_ipython_multi_user(temp_dir, box_class, run_as_devin):
runtime = await _load_runtime(temp_dir, box_class, run_as_devin)
# Test run ipython
# get username
test_code = "import os; print(os.environ['USER'])"
action_ipython = IPythonRunCellAction(code=test_code)
logger.info(action_ipython, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action_ipython)
assert isinstance(obs, IPythonRunCellObservation)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
if run_as_devin:
assert 'opendevin' in obs.content
else:
assert 'root' in obs.content
# print pwd
test_code = 'import os; print(os.getcwd())'
action_ipython = IPythonRunCellAction(code=test_code)
logger.info(action_ipython, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action_ipython)
assert isinstance(obs, IPythonRunCellObservation)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert obs.content.strip() == '/workspace'
# write a file
test_code = "with open('test.txt', 'w') as f: f.write('Hello, world!')"
action_ipython = IPythonRunCellAction(code=test_code)
logger.info(action_ipython, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action_ipython)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, IPythonRunCellObservation)
assert obs.content.strip() == '[Code executed successfully with no output]'
# check file owner via bash
action = CmdRunAction(command='ls -alh test.txt')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert obs.exit_code == 0
if run_as_devin:
# -rw-r--r-- 1 opendevin root 13 Jul 28 03:53 test.txt
assert 'opendevin' in obs.content.split('\r\n')[0]
assert 'root' in obs.content.split('\r\n')[0]
else:
# -rw-r--r-- 1 root root 13 Jul 28 03:53 test.txt
assert 'root' in obs.content.split('\r\n')[0]
# clean up
action = CmdRunAction(command='rm -rf test.txt')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert obs.exit_code == 0
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.asyncio
async def test_ipython_simple(temp_dir, box_class):
runtime = await _load_runtime(temp_dir, box_class)
# Test run ipython
# print a simple value
test_code = 'print(1)'
action_ipython = IPythonRunCellAction(code=test_code)
logger.info(action_ipython, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action_ipython)
assert isinstance(obs, IPythonRunCellObservation)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert obs.content.strip() == '1'
async def _test_ipython_agentskills_fileop_pwd_impl(
runtime: ServerRuntime | EventStreamRuntime, enable_auto_lint: bool
):
# remove everything in /workspace
action = CmdRunAction(command='rm -rf /workspace/*')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert obs.exit_code == 0
action = CmdRunAction(command='mkdir test')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0
action = IPythonRunCellAction(code="create_file('hello.py')")
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, IPythonRunCellObservation)
assert obs.content.replace('\r\n', '\n').strip().split('\n') == (
'[File: /workspace/hello.py (1 lines total)]\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
'[File hello.py created.]\n'
).strip().split('\n')
action = CmdRunAction(command='cd test')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, CmdOutputObservation)
assert obs.exit_code == 0
# This should create a file in the current working directory
# i.e., /workspace/test/hello.py instead of /workspace/hello.py
action = IPythonRunCellAction(code="create_file('hello.py')")
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, IPythonRunCellObservation)
assert obs.content.replace('\r\n', '\n').strip().split('\n') == (
'[File: /workspace/test/hello.py (1 lines total)]\n'
'(this is the beginning of the file)\n'
'1|\n'
'(this is the end of the file)\n'
'[File hello.py created.]\n'
).strip().split('\n')
if enable_auto_lint:
# edit file, but make a mistake in indentation
action = IPythonRunCellAction(
code="insert_content_at_line('hello.py', 1, ' print(\"hello world\")')"
)
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, IPythonRunCellObservation)
assert obs.content.replace('\r\n', '\n').strip().split('\n') == (
"""
[Your proposed edit has introduced new syntax error(s). Please understand the errors and retry your edit command.]
ERRORS:
/workspace/test/hello.py:1:3: E999 IndentationError: unexpected indent
[This is how your edit would have looked if applied]
-------------------------------------------------
(this is the beginning of the file)
1| print("hello world")
(this is the end of the file)
-------------------------------------------------
[This is the original code before your edit]
-------------------------------------------------
(this is the beginning of the file)
1|
(this is the end of the file)
-------------------------------------------------
Your changes have NOT been applied. Please fix your edit command and try again.
You either need to 1) Specify the correct start/end line arguments or 2) Correct your edit code.
DO NOT re-run the same failed edit command. Running it again will lead to the same error.
"""
).strip().split('\n')
# edit file with correct indentation
action = IPythonRunCellAction(
code="insert_content_at_line('hello.py', 1, 'print(\"hello world\")')"
)
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert isinstance(obs, IPythonRunCellObservation)
assert obs.content.replace('\r\n', '\n').strip().split('\n') == (
"""
[File: /workspace/test/hello.py (1 lines total after edit)]
(this is the beginning of the file)
1|print("hello world")
(this is the end of the file)
[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
"""
).strip().split('\n')
action = CmdRunAction(command='rm -rf /workspace/*')
logger.info(action, extra={'msg_type': 'ACTION'})
obs = await runtime.run_action(action)
logger.info(obs, extra={'msg_type': 'OBSERVATION'})
assert obs.exit_code == 0
@pytest.mark.asyncio
async def test_ipython_agentskills_fileop_pwd(temp_dir, box_class, enable_auto_lint):
"""Make sure that cd in bash also updates the current working directory in ipython."""
runtime = await _load_runtime(
temp_dir, box_class, enable_auto_lint=enable_auto_lint
)
await _test_ipython_agentskills_fileop_pwd_impl(runtime, enable_auto_lint)
await runtime.close()
await asyncio.sleep(1)
@pytest.mark.skipif(
TEST_RUNTIME.lower() == 'eventstream',
reason='Skip this if we want to test EventStreamRuntime',
)
@pytest.mark.skipif(
os.environ.get('TEST_IN_CI', 'false').lower() == 'true',
# FIXME: There's some weird issue with the CI environment.
reason='Skip this if in CI.',
)
@pytest.mark.asyncio
async def test_ipython_agentskills_fileop_pwd_agnostic_sandbox(
temp_dir, enable_auto_lint, container_image
):
"""Make sure that cd in bash also updates the current working directory in ipython."""
runtime = await _load_runtime(
temp_dir,
# NOTE: we only test for ServerRuntime, since EventStreamRuntime is image agnostic by design.
ServerRuntime,
enable_auto_lint=enable_auto_lint,
container_image=container_image,
)
await _test_ipython_agentskills_fileop_pwd_impl(runtime, enable_auto_lint)
await runtime.close()
await asyncio.sleep(1)
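
The `box_class` fixture in this file picks the runtime from the `TEST_RUNTIME` environment variable. pytest parametrizes a fixture through the `params` argument of the decorator, and whatever the fixture body returns is handed to the test unchanged, so one possible arrangement is to resolve the list of classes up front. The sketch below assumes that arrangement; `resolve_box_classes` and the two stand-in classes are hypothetical and exist only to keep the example self-contained.

```python
import os


class EventStreamRuntime:  # stand-in for the real class, for illustration only
    pass


class ServerRuntime:  # stand-in for the real class, for illustration only
    pass


def resolve_box_classes(test_runtime: str) -> list[type]:
    """Hypothetical helper: map TEST_RUNTIME to the runtime classes to test."""
    choice = test_runtime.lower()
    if choice == 'eventstream':
        return [EventStreamRuntime]
    if choice == 'server':
        return [ServerRuntime]
    # Any other value (including the default 'both') selects both runtimes.
    return [EventStreamRuntime, ServerRuntime]


BOX_CLASSES = resolve_box_classes(os.getenv('TEST_RUNTIME', 'both'))
# The list could then be passed as
# `@pytest.fixture(scope='module', params=BOX_CLASSES)`, with the fixture
# body returning `request.param`.
```

With `scope='module'` and `params=BOX_CLASSES`, pytest would group the tests by runtime instead of alternating between them, which matches the intent stated in the comment above the fixture.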

@ -1,11 +1,11 @@
import os
import tarfile
import tempfile
from importlib.metadata import version
from unittest.mock import MagicMock, patch
import pytest
import toml
from pytest import TempPathFactory
from opendevin.runtime.utils.runtime_build import (
_generate_dockerfile,
@ -20,9 +20,8 @@ RUNTIME_IMAGE_PREFIX = 'od_runtime'
@pytest.fixture
-def temp_dir():
-    with tempfile.TemporaryDirectory() as temp_dir:
-        yield temp_dir
+def temp_dir(tmp_path_factory: TempPathFactory) -> str:
+    return str(tmp_path_factory.mktemp('test_runtime_build'))
def test_put_source_code_to_dir(temp_dir):
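
The `temp_dir` fixture in this hunk moves from `tempfile.TemporaryDirectory()` to pytest's `tmp_path_factory.mktemp(...)`. One practical difference is lifetime: a `TemporaryDirectory` context deletes the directory on exit, whereas a directory made by `mktemp` is not removed at test teardown. The sketch below illustrates that lifetime difference with the standard library only, using `tempfile.mkdtemp` as a stand-in for the pytest factory (an assumption made so the example runs without pytest); the variable names are hypothetical.

```python
import os
import shutil
import tempfile

# Context-manager style: the directory is deleted when the block exits.
with tempfile.TemporaryDirectory() as scoped_dir:
    scoped_exists_inside = os.path.isdir(scoped_dir)
scoped_exists_after = os.path.isdir(scoped_dir)

# mkdtemp style (stand-in for tmp_path_factory.mktemp): the directory
# outlives the call, so cleanup is someone else's responsibility.
persistent_dir = tempfile.mkdtemp(prefix='test_runtime_build')
persistent_exists = os.path.isdir(persistent_dir)

# Clean up explicitly here; under pytest the framework manages retention.
shutil.rmtree(persistent_dir)
```

A directory that survives teardown can be useful when a container or other external process still has it mounted after the test body returns.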


@ -1,317 +0,0 @@
import os
import pathlib
import tempfile
import pytest
from opendevin.core.config import AppConfig, SandboxConfig
from opendevin.runtime.docker.ssh_box import DockerSSHBox
from opendevin.runtime.plugins import AgentSkillsRequirement, JupyterRequirement
def create_docker_box_from_app_config(
path: str, config: AppConfig | None = None
) -> DockerSSHBox:
if config is None:
config = AppConfig(
sandbox=SandboxConfig(
box_type='ssh',
),
persist_sandbox=False,
)
return DockerSSHBox(
config=config.sandbox,
persist_sandbox=config.persist_sandbox,
workspace_mount_path=path,
sandbox_workspace_dir=config.workspace_mount_path_in_sandbox,
cache_dir=config.cache_dir,
run_as_devin=True,
ssh_hostname=config.ssh_hostname,
ssh_password=config.ssh_password,
ssh_port=config.ssh_port,
)
@pytest.fixture(autouse=True)
def print_method_name(request):
print('\n########################################################################')
print(f'Running test: {request.node.name}')
print('########################################################################')
@pytest.fixture
def temp_dir(monkeypatch):
# get a temporary directory
with tempfile.TemporaryDirectory() as temp_dir:
pathlib.Path(temp_dir).mkdir(parents=True, exist_ok=True)
yield temp_dir
def test_ssh_box_run_as_devin(temp_dir):
# get a temporary directory
for box in [
create_docker_box_from_app_config(temp_dir),
]: # FIXME: permission error on mkdir test for exec box
exit_code, output = box.execute('ls -l')
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
assert output.strip() == 'total 0'
assert box.workspace_mount_path == temp_dir
exit_code, output = box.execute('ls -l')
assert exit_code == 0, 'The exit code should be 0.'
assert output.strip() == 'total 0'
exit_code, output = box.execute('mkdir test')
assert exit_code == 0, 'The exit code should be 0.'
assert output.strip() == ''
exit_code, output = box.execute('ls -l')
assert exit_code == 0, 'The exit code should be 0.'
assert 'opendevin' in output, "The output should contain username 'opendevin'"
assert 'test' in output, 'The output should contain the test directory'
exit_code, output = box.execute('touch test/foo.txt')
assert exit_code == 0, 'The exit code should be 0.'
assert output.strip() == ''
exit_code, output = box.execute('ls -l test')
assert exit_code == 0, 'The exit code should be 0.'
assert 'foo.txt' in output, 'The output should contain the foo.txt file'
box.close()
def test_ssh_box_multi_line_cmd_run_as_devin(temp_dir):
box = create_docker_box_from_app_config(temp_dir)
exit_code, output = box.execute('pwd && ls -l')
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
expected_lines = ['/workspace', 'total 0']
line_sep = '\r\n' if isinstance(box, DockerSSHBox) else '\n'
assert output == line_sep.join(expected_lines), (
'The output should be the same as the input for ' + box.__class__.__name__
)
box.close()
def test_ssh_box_stateful_cmd_run_as_devin(temp_dir):
box = create_docker_box_from_app_config(temp_dir)
exit_code, output = box.execute('mkdir test')
assert exit_code == 0, 'The exit code should be 0.'
assert output.strip() == ''
exit_code, output = box.execute('cd test')
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
assert output.strip() == '', (
'The output should be empty for ' + box.__class__.__name__
)
exit_code, output = box.execute('pwd')
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
assert output.strip() == '/workspace/test', (
'The output should be /workspace/test for ' + box.__class__.__name__
)
box.close()
def test_ssh_box_failed_cmd_run_as_devin(temp_dir):
box = create_docker_box_from_app_config(temp_dir)
exit_code, output = box.execute('non_existing_command')
assert exit_code != 0, (
'The exit code should not be 0 for a failed command for '
+ box.__class__.__name__
)
box.close()
def test_single_multiline_command(temp_dir):
box = create_docker_box_from_app_config(temp_dir)
exit_code, output = box.execute('echo \\\n -e "foo"')
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
# FIXME: why is there a `>` in the output? Probably PS2?
assert output == '> foo', (
'The output should be the same as the input for ' + box.__class__.__name__
)
box.close()
def test_multiline_echo(temp_dir):
box = create_docker_box_from_app_config(temp_dir)
exit_code, output = box.execute('echo -e "hello\nworld"')
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
# FIXME: why is there a `>` in the output?
assert output == '> hello\r\nworld', (
'The output should be the same as the input for ' + box.__class__.__name__
)
box.close()
def test_sandbox_whitespace(temp_dir):
box = create_docker_box_from_app_config(temp_dir)
exit_code, output = box.execute('echo -e "\\n\\n\\n"')
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
assert output == '\r\n\r\n\r\n', (
'The output should be the same as the input for ' + box.__class__.__name__
)
box.close()
def test_sandbox_jupyter_plugin(temp_dir):
box = create_docker_box_from_app_config(temp_dir)
box.init_plugins([JupyterRequirement])
exit_code, output = box.execute('echo "print(1)" | execute_cli')
print(output)
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
assert output == '1\r\n', (
'The output should be the same as the input for ' + box.__class__.__name__
)
box.close()
def _test_sandbox_jupyter_agentskills_fileop_pwd_impl(box, config: AppConfig):
box.init_plugins([AgentSkillsRequirement, JupyterRequirement])
exit_code, output = box.execute('mkdir test')
print(output)
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
exit_code, output = box.execute('echo "create_file(\'hello.py\')" | execute_cli')
print(output)
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
assert output.strip().split('\r\n') == (
'[File: /workspace/hello.py (1 lines total)]\r\n'
'(this is the beginning of the file)\r\n'
'1|\r\n'
'(this is the end of the file)\r\n'
'[File hello.py created.]\r\n'
).strip().split('\r\n')
exit_code, output = box.execute('cd test')
print(output)
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
exit_code, output = box.execute('echo "create_file(\'hello.py\')" | execute_cli')
print(output)
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
assert output.strip().split('\r\n') == (
'[File: /workspace/test/hello.py (1 lines total)]\r\n'
'(this is the beginning of the file)\r\n'
'1|\r\n'
'(this is the end of the file)\r\n'
'[File hello.py created.]\r\n'
).strip().split('\r\n')
if config.sandbox.enable_auto_lint:
# edit file, but make a mistake in indentation
exit_code, output = box.execute(
'echo "insert_content_at_line(\'hello.py\', 1, \' print(\\"hello world\\")\')" | execute_cli'
)
print(output)
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
assert output.strip().split('\r\n') == (
"""
[Your proposed edit has introduced new syntax error(s). Please understand the errors and retry your edit command.]
ERRORS:
/workspace/test/hello.py:1:3: E999 IndentationError: unexpected indent
[This is how your edit would have looked if applied]
-------------------------------------------------
(this is the beginning of the file)
1| print("hello world")
(this is the end of the file)
-------------------------------------------------
[This is the original code before your edit]
-------------------------------------------------
(this is the beginning of the file)
1|
(this is the end of the file)
-------------------------------------------------
Your changes have NOT been applied. Please fix your edit command and try again.
You either need to 1) Specify the correct start/end line arguments or 2) Correct your edit code.
DO NOT re-run the same failed edit command. Running it again will lead to the same error.
"""
).strip().split('\n')
# edit file with correct indentation
exit_code, output = box.execute(
'echo "insert_content_at_line(\'hello.py\', 1, \'print(\\"hello world\\")\')" | execute_cli'
)
print(output)
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
assert output.strip().split('\r\n') == (
"""
[File: /workspace/test/hello.py (1 lines total after edit)]
(this is the beginning of the file)
1|print("hello world")
(this is the end of the file)
[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
"""
).strip().split('\n')
exit_code, output = box.execute('rm -rf /workspace/*')
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
box.close()
def test_sandbox_jupyter_agentskills_fileop_pwd(temp_dir):
# get a temporary directory
config = AppConfig(
sandbox=SandboxConfig(
box_type='ssh',
enable_auto_lint=False,
),
persist_sandbox=False,
)
assert not config.sandbox.enable_auto_lint
box = create_docker_box_from_app_config(temp_dir, config)
_test_sandbox_jupyter_agentskills_fileop_pwd_impl(box, config)
@pytest.mark.skipif(
os.getenv('TEST_IN_CI') != 'true',
reason='The unit test needs to download an image, so it only runs in CI',
)
def test_agnostic_sandbox_jupyter_agentskills_fileop_pwd(temp_dir):
for base_sandbox_image in ['ubuntu:22.04', 'debian:11']:
config = AppConfig(
sandbox=SandboxConfig(
box_type='ssh',
container_image=base_sandbox_image,
enable_auto_lint=False,
),
persist_sandbox=False,
)
assert not config.sandbox.enable_auto_lint
box = create_docker_box_from_app_config(temp_dir, config)
_test_sandbox_jupyter_agentskills_fileop_pwd_impl(box, config)
def test_sandbox_jupyter_plugin_backticks(temp_dir):
config = AppConfig(
sandbox=SandboxConfig(
box_type='ssh',
),
persist_sandbox=False,
)
box = DockerSSHBox(
config=config.sandbox,
persist_sandbox=config.persist_sandbox,
workspace_mount_path=temp_dir,
sandbox_workspace_dir=config.workspace_mount_path_in_sandbox,
cache_dir=config.cache_dir,
run_as_devin=True,
ssh_hostname=config.ssh_hostname,
ssh_password=config.ssh_password,
ssh_port=config.ssh_port,
)
box.init_plugins([JupyterRequirement])
test_code = "print('Hello, `World`!')"
expected_write_command = (
"cat > /tmp/opendevin_jupyter_temp.py <<'EOL'\n" f'{test_code}\n' 'EOL'
)
expected_execute_command = 'cat /tmp/opendevin_jupyter_temp.py | execute_cli'
exit_code, output = box.execute(expected_write_command)
exit_code, output = box.execute(expected_execute_command)
print(output)
assert exit_code == 0, 'The exit code should be 0 for ' + box.__class__.__name__
assert output.strip() == 'Hello, `World`!', (
'The output should be the same as the input for ' + box.__class__.__name__
)
box.close()