56 Commits

Author SHA1 Message Date
Boxuan Li
feaae0b7ac
Fix persist_sandbox in Makefile (#2171) 2024-06-01 12:50:31 +08:00
மனோஜ்குமார் பழனிச்சாமி
bf24a0b5c0
Fixed makefile (#2168) 2024-06-01 03:35:43 +05:30
மனோஜ்குமார் பழனிச்சாமி
f3f5768b4f
Install chromium only once (#2100)
* install chromium only once

* Update Makefile

* Update Makefile
2024-05-31 15:39:10 -04:00
Graham Neubig
7a2122ebc2
Default to gpt-4o (#2158)
* Default to gpt-4o

* Fix default
2024-05-31 14:44:07 +00:00
மனோஜ்குமார் பழனிச்சாமி
961c96a2a1
Added ssh_password to config setup (#2139)
Co-authored-by: Aleksandar <isavitaisa@gmail.com>
2024-05-31 07:26:16 +05:30
DaxServer
b118df606f
build: Add poetry command to use Python 3.11 for environment setup (#1972) 2024-05-23 12:05:19 -04:00
Xingyao Wang
2406b901df
feat(SWE-Bench environment) integrate SWE-Bench sandbox (#1468)
* add draft dockerfile for build all

* add rsync for build

* add all-in-one docker

* update prepare scripts

* Update swe_env_box.py

* Add swe_entry.sh (buggy now)

* Parse the test command in swe_entry.sh

* Update README for instance eval in sandbox

* revert specialized config

* replace run_as_devin as an init arg

* set container & run_as_root via args

* update swe entry script

* update env

* remove mounting

* allow error after swe_entry

* update swe_env_box

* move file

* update gitignore

* get swe_env_box a working demo

* support faking user response & provide sandbox ahead of time;
also return state for controller

* tweak main to support adding controller kwargs

* add module

* initialize plugin for provided sandbox

* add pip cache to plugin & fix jupyter kernel waiting

* better print Observation output

* add run infer scripts

* update readme

* add utility for getting diff patch

* use get_diff_patch in infer

* update readme

* support cost tracking for codeact

* add swe agent edit hack

* disable color in git diff

* fix git diff cmd

* fix state return

* support limit eval

* increase timeout and export pip cache

* add eval limit config

* return state when hit turn limit

* save log to file; allow agent to give up

* run eval with max 50 turns

* add outputs to gitignore

* save swe_instance & instruction

* add uuid to swebench

* add streamlit dep

* fix save series

* fix the issue where session id might be duplicated

* allow setting temperature for llm (use 0 for eval)

* Get report from agent running log

* support evaluating task success right after inference.

* remove extra log

* comment out prompt for baseline

* add visualizer for eval

* use plaintext for instruction

* reduce timeout for all; only increase timeout for init

* reduce timeout for all; only increase timeout for init

* ignore sid for swe env

* close sandbox in each eval loop

* update visualizer instruction

* increase max chars

* add finish action to history too

* show test result in metrics

* add sidebars for visualizer

* also visualize swe_instance

* cleanup browser when agent controller finishes running

* do not mount workspace for swe-eval to avoid accidentally overwrite files

* Revert "do not mount workspace for swe-eval to avoid accidentally overwrite files"

This reverts commit 8ef77390543e562e6f0a5a9992418014d8b3010c.

* Revert "Revert "do not mount workspace for swe-eval to avoid accidentally overwrite files""

This reverts commit 016cfbb9f0475f32bacbad5822996b4eaff24a5e.

* run jupyter command via copy to, instead of cp to mount

* only print mixin output when failed

* change ssh box logging

* add visualizer for pass rate

* add instance id to sandbox name

* only remove container we created

* use opendevin logger in main

* support multi-processing infer

* add back metadata, support keyboard interrupt

* remove container with startswith

* make pbar behave correctly

* update instruction w/ multi-processing

* show resolved rate by repo

* rename tmp dir name

* attempt to fix racing for copy to ssh_box

* fix script

* bump swe-bench-all version

* fix ipython with self-contained commands

* add jupyter demo to swe_env_box

* make resolved count two column

* increase height

* do not add glob to url params

* analyze obs length

* print instance id prior to removal handler

* add gold patch in visualizer

* fix interactive git by adding a git --no-pager as alias

* increase max_char to 10k to cover 98% of swe-bench obs cases

* allow parsing note

* prompt v2

* add iteration reminder

* adjust user response

* adjust order

* fix return eval

* fix typo

* add reminder before logging

* remove other resolve rate

* re adjust to new folder structure

* support adding eval note

* fix eval note path

* make sure first log of each instance is printed

* add eval note

* fix the display for visualizer

* tweak visualizer for better git patch reading

* exclude empty patch

* add retry mechanism for swe_env_box start

* fix ssh timeout issue

* add stat field for apply test patch success

* add visualization for fine-grained report

* attempt to support monologue agent by constraining it to single thread

* also log error msg when stopped

* save error as well

* override WORKSPACE_MOUNT_PATH and WORKSPACE_BASE for monologue to work in mp

* add retry mechanism for sshbox

* remove retry for swe env box

* try to handle loop state stopped

* Add get report scripts

* Add script to convert agent output to swe-bench format

* Merge fine grained report for visualizer

* Update eval readme

* Update README.md

* Add CodeAct gpt4-1106 output and eval logs on swe-bench-lite

* Update the script to get model report

* Update get_model_report.sh

* Update get_agent_report.sh

* Update report merge script

* Add agent output conversion script

* Update swe_lite_env_setup.sh

* Add example swe-bench output files

* Update eval readme

* Remove redundant scripts

* set iteration count down to false by default

* fix: Issue where CodeAct agent was trying to log cost on local llm and throwing Undefined Model exception out of litellm (#1666)

* fix: Issue where CodeAct agent was trying to log cost on local llm and throwing Undefined Model exception out of litellm

* Review Feedback

* Missing None Check

* Review feedback and improved error handling

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>

* fix prepare_swe_util scripts

* update builder images

* update setup script

* remove swe-bench build workflow

* update lock

* remove experiments since they are moved to hf

* remove visualizer (since it is moved to hf repo)

* simplify jupyter execution via heredoc

* update ssh_box

* add initial docker readme

* add pkg-config as dependency

* add script for swe_bench all-in-one docker

* add rsync to builder

* rename var

* update commit

* update readme

* update lock

* support specify timeout for long running tasks

* fix path

* separate building of all deps and files

* support returning states at the end of controller

* remove return None

* support specify timeout for long running tasks

* add timeout for all existing sandbox impl

* fix swe_env_box for new codebase

* update llm config in config.py

* support pass sandbox in

* remove force set

* update eval script

* fix issue of overriding final state

* change default eval output to hf demo

* change default eval output to hf demo

* fix config

* only close it when it is NOT external sandbox

* add scripts

* tweak config

* only put in history when state has history attr

* fix agent controller on the case of run out interaction budget

* always assume state is always not none

* remove print of final state

* catch all exception when cannot compute completion cost

* Update README.md

* save source into json

* fix path

* update docker path

* return the final state on close

* merge AgentState with State

* fix integration test

* merge AgentState with State

* fix integration test

* add ChangeAgentStateAction to history in attempt to fix integration

* add back set agent state

* update tests

* update tests

* move scripts for setup

* update script and readme for infer

* do not reset logger when n processes == 1

* update eval_infer scripts and readme

* simplify readme

* copy over dir after eval

* copy over dir after eval

* directly return get state

* update lock

* fix output saving of infer

* replace print with logger

* update eval_infer script

* add back the missing .close

* increase timeout

* copy all swe_bench_format file

* attempt to fix output parsing

* log git commit id as metadata

* fix eval script

* update lock

* update unit tests

* fix argparser unit test

* fix lock

* the deps are now lightweight enough to be included in make build

* add spaces for tests

* add eval outputs to gitignore

* remove git submodule

* readme

* tweak git email

* update upload instruction

* bump codeact version for eval

---------

Co-authored-by: Bowen Li <libowen.ne@gmail.com>
Co-authored-by: huybery <huybery@gmail.com>
Co-authored-by: Bart Shappee <bshappee@gmail.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-15 16:15:55 +00:00
Aleksandar
657b177b4e
Default to less expensive gpt-3.5-turbo model (#1675) 2024-05-09 19:11:27 -04:00
Engel Nyst
446eaec1e6
Refactor config to dataclasses (#1552)
* mypy is invaluable

* fix config, add test

* Add new-style toml support

* add singleton, small doc fixes

* fix some cases of loading toml, clean up, try to make it clearer

* Add defaults_dict for UI

* allow config to be mutable
error handling
fix toml parsing

* remove debug stuff

* Adapt Makefile

* Add defaults for temperature and top_p

* update to CodeActAgent

* comments

* fix unit tests

* implement groups of llm settings (CLI)

* fix merge issue

* small fix sandboxes, small refactoring

* adapt LLM init to accept overrides at runtime

* reading config is enough

* Encapsulate minimally embeddings initialization

* agent bug fix; fix tests

* fix sandboxes tests

* refactor globals in sandboxes to properties
2024-05-09 22:48:29 +02:00
Arno.Edwards
06aae67fed
feat(makefile): add capability to skip Docker image pull (#1664) 2024-05-09 09:06:26 -04:00
tahussle
04676d17a8
Updated Makefile to support Manjaro / Arch linux hosts (#1642) 2024-05-08 12:06:41 +00:00
Leo
6013faeec5
Add frontend tests to pre-commit and Makefile. (#1549)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-05-03 16:15:22 -04:00
Leo
95e4ca490f
Feat: add lint frontend and lint all to Makefile. (#1354)
* Feat: add lint frontend and lint all to Makefile.

* style codes.

* Remove redundant target.

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-02 11:53:57 +00:00
Frank Xu
836864fa88
[feat] Integrate BrowserGym (#1452)
* add a single-threaded server serving browsergym

* update poetry

* update browser page content

* add import to make sure browsergym environments are registered properly

* remove flask server, use multiprocess impl and Pipe

* fix

* refactor BrowserEnv

* update browser action and obs to include more complete info

* fix screenshot

* update poetry lock

* add playwright install to workflow

* update

* add better html to text conversion

* update for better text conversion to maintain parity with the current handling of browseurlaction

* update

* update poetry

* update multiprocessing mp

* fix multiprocessing

* update

* update github workflow

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-02 19:52:53 +08:00
Jirka Borovec
1b810cfbf0
ci/lint: fix calling Ruff's format (#1457)
* ci/lint: fix calling Ruff's format

* Transition for ruff lint. Only checking the modified files.

---------

Co-authored-by: ifuryst <ifuryst@gmail.com>
2024-05-01 22:19:54 -04:00
Robert Brennan
c50319138e
Revert "feat(makefile): add capability to skip Docker image pull (#1463)" (#1489)
This reverts commit 442ab7371c66cd97158ea9b08de5aa37ce1d31f3.
2024-05-01 11:00:06 -04:00
Arno.Edwards
442ab7371c
feat(makefile): add capability to skip Docker image pull (#1463)
* feat(makefile): add capability to skip Docker image pull

* ci(github-actions): add conditional Docker installation based on ENV variable
2024-05-01 09:01:11 -04:00
Aleksandar
5989853f7a
Fix duplicate LLM_BASE_URL entry in config.toml and enable different ollama embeddings (#1437)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-30 02:26:02 +02:00
Graham Neubig
567e2c2b35
Add poetry install to makefile (#1436) 2024-04-28 22:40:37 +01:00
Engel Nyst
ca73ecb499
message in setup-config about model (#1398) 2024-04-27 08:19:31 -04:00
Jirka Borovec
e32d95cb1a
lint: simplify hooks already covered by Ruff (#1204)
* lint: simplify hooks already covered by Ruff

* prune dev dependency

* setting E, W, F

* poetry?

* autopep8

* quote-style = "single"

* double-quote-string-fixer

* --all-files

* apply

* Q

* drop double-quote-string-fixer

* --all-files

* apply pre-commit

* python3.11 -m poetry lock --no-update

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-27 11:32:14 +00:00
Alex Bäuerle
e9a9186717
build(backend): change reload to ignore workspace correctly (#1393)
Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-26 18:19:18 +00:00
Alex Bäuerle
dc73954cce
build: when running in dev mode, reload the poetry server whenever a … (#1323)
* build: when running in dev mode, reload the poetry server whenever a file changes

* only reload for specific directories

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-24 09:34:40 -07:00
Xingyao Wang
fc5e075ea0
feat(sandbox): Implementation of Sandbox Plugin to Support Jupyter (#1255)
* initialize plugin definition

* initialize plugin definition

* simplify mixin

* further improve plugin mixin

* add cache dir for pip

* support clean up cache

* add script for setup jupyter and execution server

* integrate JupyterRequirement to ssh_box

* source bashrc at the end of plugin load

* add execute_cli that accept code via stdin

* make JUPYTER_EXEC_SERVER_PORT configurable via env var

* increase background cmd sleep time

* Update opendevin/sandbox/plugins/mixin.py

Co-authored-by: Robert Brennan <accounts@rbren.io>

* add mixin to base class

* make jupyter requirement a dataclass

* source plugins only when >0 requirements

* add `sandbox_plugins` for each agent & have controller take care of it

* update build.sh to make logs available in /opendevin/logs

* switch to use config for lib and cache dir

* fix permission issue with /workspace

* use python to implement execute_cli to avoid stdin escape issue

* wait until jupyter is available

* support plugin via copying instead of mounting

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-23 08:45:53 +08:00
Beichen Ma
2242702cf9
feat add prerequisites validation (#943)
* feat add prerequisites validation

* fix error

* Update Makefile

Co-authored-by: Robert Brennan <accounts@rbren.io>

* replace awk with sed

* fix typo

* fix error

* fix linux error

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-22 03:40:56 +00:00
Yoni
5a4913224a
Use /bin/bash as the shell for Makefile (#1264) 2024-04-21 15:17:35 -04:00
Leo
0e572c3e41
feat: support tls. #1234 (#1248)
* feat: support tls.

* update the frontend README.

* Update frontend/README.md

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-04-21 14:50:48 -04:00
மனோஜ்குமார் பழனிச்சாமி
0356f6ec89
Azure LLM fix (#1227)
* azure embedding fix

* corrected embedding config

* fixed doc
2024-04-20 01:05:14 +02:00
மனோஜ்குமார் பழனிச்சாமி
18348911c2
Use python3.11 as default (#885)
* use python3.11

* use python3.11

* removed 3.12

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-17 17:28:16 +00:00
Boxuan Li
aed82704a9
Fix python linter inconsistent behaviour with quotes (#1112)
* This has been a headache for a long time, and we had #1071 and #1100 in the hope of fixing the inconsistent behaviour across linters and environments. However, we recently found out that the double-quote-string-fixer plugin in pre-commit-hooks behaves inconsistently on Python 3.11 and 3.12. See discussion here. This is sad because while this plugin enforces single quotes with 3.11, it doesn't always do so with 3.12. Specifically, with f-string syntax, this plugin allows both single quotes and double quotes with Python 3.12.

The problem is that some developers have the black linter installed or integrated with their IDE, which is probably the most popular linter in the Python world (ranked by GitHub stars). This linter insists on always using double quotes. Now black and double-quote-string-fixer fight each other (iff the developer uses Python 3.12) over some quotes (f-string syntax).

After a lot of research, I couldn't find a way to enforce single quote behaviour without introducing a new dependency, flake8, together with a plugin for it to enforce quote behaviour. I believe we are better off introducing the more popular black if we have to introduce a new linter. Since black and autopep8 sometimes fight each other, and they mostly overlap, I also remove autopep8.

The unfortunate consequence of this PR is that I had to revert all single quotes back to double quotes. This might cause some inconvenience to existing PRs as they have to resolve conflicts, but I believe the headache will be gone soon. That being said, I am open to abandoning this PR if anyone has a better idea for solving the headache.

* Remove black

* Prevent black from changing quotes

* Use flake8 to enforce single quotes

* Fix quotes in config.py

* Add back autopep8

* Add make lint to run linters
2024-04-17 00:12:52 -04:00
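
Note on #1112: the commit message does not say why the quote fixer treats f-strings differently on Python 3.11 and 3.12. One plausible explanation, which is an assumption and not something the commit confirms, is that Python 3.12 changed how the standard tokenizer reports f-strings, so a tool that only inspects `STRING` tokens would no longer see them. The sketch below uses only the stdlib `tokenize` module to show which token types are emitted; the helper name `token_names` is hypothetical.

```python
import io
import sys
import tokenize


def token_names(source: str) -> list[str]:
    """Return the token type names the stdlib tokenizer emits for `source`."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


# A plain string literal is reported as a single STRING token on every version.
plain = token_names("x = 'a'\n")

# An f-string is one STRING token before 3.12; from 3.12 on it is split into
# FSTRING_START ... FSTRING_END tokens instead.
fstr = token_names("x = f'{y}'\n")

print(sys.version_info[:2])
print("plain:", plain)
print("fstr: ", fstr)
```

Running this under both interpreters shows the difference directly. Whether that difference is what caused the behaviour described in the commit is not established here.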
Alex Bäuerle
71edee17be
build: fix workspace variable name in dev setup (#1138) 2024-04-15 16:33:49 -04:00
RaGe
de672029ef
Add build-frontend to build (#1137) 2024-04-15 13:00:25 -07:00
Engel Nyst
2f9bf606c7
Don't save backend file (#870)
* Don't save backend file

* Update Makefile

Co-authored-by: Anas DORBANI <95044293+dorbanianas@users.noreply.github.com>

---------

Co-authored-by: Anas DORBANI <95044293+dorbanianas@users.noreply.github.com>
2024-04-07 19:39:32 +02:00
Anas DORBANI
d3770f1db6
add github action for project build on macos and linux (#838)
* update github action to build and run tests  on macos and linux

* fix docker installation

* Fix poetry installation on macos

* Fix docker installation

* Fix docker installation - start docker daemon

* Change docker installation macos

* Update docker buildx version

* new docker installation

* Add new start docker structure

* Add new start docker structure 2

* update github action to build and run tests  on macos and linux

* Update makefile to fix chroma-hnswlib issue with macos

* fix macos build

* Fix macos issue

* Fix macos

* Reformat Makefile

* updates
2024-04-07 03:54:52 -04:00
Engel Nyst
99a8dc4ff9
Fallback to less expensive model (#475) 2024-04-07 05:45:37 +02:00
Anas DORBANI
9c98b67002
Fix awk error for the WSL users (#837) 2024-04-07 00:56:15 +00:00
Graham Neubig
f40fe6ac28
Remove pnpm (#823)
* Remove pnpm

* Remove from ci

* Remove pnpm

* Remove cache from lint
2024-04-06 12:39:02 -04:00
Graham Neubig
8f097f8643
Make poetry install manual and provide user with install instructions (#818)
* Add install instructions for poetry

* Update ci

* Move poetry before docker pull

* Added link
2024-04-06 12:38:12 -04:00
Anas DORBANI
d38113cead
Ad/fix pre commit (#807)
* Fix pre-commit config and some mypy issues

* remove hook of requirements.txt
2024-04-06 03:48:30 +00:00
Anas DORBANI
66cd9f9bd9
Check python npm pnpm docker (#800) 2024-04-06 01:20:47 +00:00
Anas DORBANI
7cc58b28a5
Fix pre-commit unset when using make build (#796) 2024-04-06 00:11:12 +00:00
Davide Guidotti
3da56d3b6f
Add the ability to set LLM_BASE_URL with make setup-config (#616)
* Add the ability to set LLM_BASE_URL with make setup-config

* Add hint for LLM_BASE_URL in make setup-config

* Adjust indentation after merge

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-05 12:57:56 -05:00
Anas DORBANI
8c2b4f5fde
Update and fix Makefile (#743) 2024-04-05 02:08:23 +00:00
Anas DORBANI
5ec0e5b7ec
Switch to Poetry (#378)
* create the pyproject file

* Fix the pyproject.toml file

* Update Makefile

* adapt makefile

* fix some execution issues

* Untrack lock files and wait for the backend to get start before frontend

* Remove LangChain dependencies

* Add github action for pytest

* add missing dependency

* rebase and fix the versions adding lock file

* add torch and pymupdfb deps

* some conflicts fixes

* Add dependencies evaluation group

* add poetry.lock

* Fix unexpected operator

---------

Co-authored-by: Robert Brennan <contact@rbren.io>
2024-04-05 00:27:29 +00:00
Vincent
a0928ae590
Updating MakeFile and fixing monologue memory parameter issue (#692)
* Updating memory for monologue agent to fix base_url being used with OpenAIEmbedding by accident, added default text-embedding-ada-002 to monologue agent memory for OpenAIEmbeddings, added more environment variable configurations to setup-config in Makefile

* adding indent
2024-04-04 17:47:31 -05:00
mashiro
0534c14279
feat: i18n (#723)
* feat: i18n

* fix: ci lint error

* fix: pnpm run pre script not trigger

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-04 16:38:19 -04:00
mashiro
0fdc401f91
chore: use pnpm to manage frontend deps (#659)
* chore: use pnpm to manage frontend deps

* typo: change content of tips

* docs: update node.js require version & replacement installation link

* feat: detect the Node.js version to ensure corepack is supported

* typo: remove chinese comment

* fix: lint CI config, use pnpm to install dependencies

* fix: nextui style error when using pnpm

* fix: ci setup pnpm cwd

* fix: frontend lint ci install deps crash

* fix: ci lint frontend missing package.json path

* fix: frontend lint ci add cache-dependency-path prop
2024-04-04 14:55:03 +08:00
xcodebuild
3adcb7ea56
fix: make run on Windows (#665)
* fix: make run on Windows

* Update Makefile

Co-authored-by: Graham Neubig <neubig@gmail.com>

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-04-03 23:15:31 -04:00
Robert Brennan
310cd7017d
Add link to LiteLLM to make-setup (#614)
* Update Makefile

* fix tab

* add note to readme
2024-04-03 10:22:08 -04:00
xcodebuild
1c6f046c84
Fix make vars (#641)
* fix: let BACKEND_HOST and FRONTEND_PORT work in Makefile

* feat: let vite do not clear terminal to keep backend log
2024-04-03 08:58:01 -04:00