Xingyao Wang b2fdb963b6
Add detailed tutorial for adding new evaluation benchmarks (#1827)
* Add detailed tutorial for adding new evaluation benchmarks

* update tutorial, fix typo, and log observation to the cmdline

* fix url

* Update evaluation/TUTORIAL.md

* Update evaluation/TUTORIAL.md

* Update evaluation/TUTORIAL.md

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* simplify readme and add comments to the actual code

* Fix typo in evaluation/TUTORIAL.md

* Fix typo in evaluation/swe_bench/run_infer.py

* Fix another typo in evaluation/swe_bench/run_infer.py

* Update TUTORIAL.md

* Set host net work to false for SWEBench

* Update evaluation/TUTORIAL.md

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update evaluation/TUTORIAL.md

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update evaluation/TUTORIAL.md

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update evaluation/TUTORIAL.md

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-18 13:40:53 -04:00
..
2024-05-11 17:15:19 +02:00

OpenDevin Shared Abstraction and Components

This is a Python package that contains all the shared abstraction (e.g., Agent) and components (e.g., sandbox, web browser, search API, selenium).

See the main README for instructions on how to run OpenDevin from the command line.

Sandbox Image

docker build -f opendevin/sandbox/docker/Dockerfile -t opendevin/sandbox:v0.1 .

Sandbox Runner

Run the docker-based interactive sandbox:

mkdir workspace
python3 opendevin/sandbox/docker/sandbox.py -d workspace

It will map ./workspace into the docker container with the folder permission correctly adjusted for current user.

Example screenshot:

image