866 Commits

Author SHA1 Message Date
Xingyao Wang
a4af937dc4 stop printing 2024-05-28 23:01:35 +08:00
Xingyao Wang
95eb048672 allow specifying max iter in cmdline script 2024-05-28 22:42:30 +08:00
Xingyao Wang
832a82867f fix issue for CodeActSWEAgent 2024-05-28 22:26:49 +08:00
Xingyao Wang
3eaa6fbcbb add codeact swe agent 2024-05-28 22:25:56 +08:00
Xingyao Wang
e699f21f19 update max iter 2024-05-28 22:20:12 +08:00
Xingyao Wang
368f0b9434 update README 2024-05-28 22:00:41 +08:00
Xingyao Wang
a27b0bb748 revert instructions for run infer 2024-05-28 21:57:47 +08:00
Xingyao Wang
a9dc3ce6f3 revert instructions for run infer 2024-05-28 21:56:06 +08:00
Xingyao Wang
a98f15ae95 revert changes to codeact 2024-05-28 20:36:32 +08:00
Xingyao Wang
fa97e57360 revert changes from codeact agent and create new CodeActSWEAgent 2024-05-28 20:33:22 +08:00
Xingyao Wang
cb23bdbf62 default to 50 turns 2024-05-28 12:51:29 +08:00
Xingyao Wang
a36f6f5d33 update hint string 2024-05-28 12:45:05 +08:00
Xingyao Wang
6e2736f46b improve git get patch 2024-05-28 11:42:28 +08:00
Xingyao Wang
851df736b9 update prompt 2024-05-28 10:51:18 +08:00
Xingyao Wang
604c8d9888 update edit error message 2024-05-28 01:43:32 +08:00
Xingyao Wang
c2a284fde2 change cwd for jupyter if needed 2024-05-28 01:36:21 +08:00
Xingyao Wang
7783c10f82 update error message to include current file info 2024-05-28 01:13:40 +08:00
Xingyao Wang
deef10b43e change prompt to abs path 2024-05-28 01:09:25 +08:00
Xingyao Wang
2a1cc9a089 remove extra print 2024-05-28 01:07:21 +08:00
Xingyao Wang
4f853e79cf also log in_context_example to run infer 2024-05-28 00:53:33 +08:00
Xingyao Wang
4aeb002901 add icl for swebench 2024-05-28 00:48:24 +08:00
Xingyao Wang
80c0a33c6b fix cwd 2024-05-28 00:48:17 +08:00
Xingyao Wang
1e58a12dbf update infer prompt 2024-05-28 00:45:27 +08:00
Xingyao Wang
8ec58d2618 upgrade agentskills and update testcases 2024-05-28 00:43:17 +08:00
Xingyao Wang
e9d788959d update swe_bench prompt;
use minimal prompt for codeact;
2024-05-27 23:44:25 +08:00
dependabot[bot]
5bccaefc5f Bump litellm from 1.38.2 to 1.38.10 (#2089)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.38.2 to 1.38.10.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.38.2...v1.38.10)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-27 23:28:44 +08:00
Engel Nyst
55fdee31ad Remove unnecessary stuff from the sandboxes tests (#2095) 2024-05-27 20:50:02 +05:30
Yoni
3c5c214d87 Fix: serve UI from fastAPI app (#2086) 2024-05-27 18:56:13 +05:30
Xingyao Wang
ae8cda1495 Support specifying custom cost per token (#2083)
* support specifying custom cost per token

* fix test for new attrs

* add to docs

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-27 19:35:34 +08:00
Yufan Song
3d29ec0418 add version (#2078) 2024-05-26 20:46:39 -07:00
Aaron Xia
b66a915de1 feat: auto clean session with session close called. (#1990)
* feat: auto clean session with session close called.

* fix: lint

* fix: lint

* fix: lint

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-27 11:08:24 +08:00
Aleksandar
18d07bda89 feat: add max_budget_per_task configuration to control task cost (#2070)
* feat: add max_budget_per_task configuration to control task cost

* Fix test_arg_parser.py

* Use the config.max_budget_per_task as default value

* Add max_budget_per_task to core/main.py as well

* Update opendevin/controller/agent_controller.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-27 02:04:31 +08:00
Engel Nyst
783fea62a0 Ignore pid for loop detection (Was: override eq...) (#2045)
* rewrite, implement pid ignore in the controller

* make the helper method private
2024-05-26 19:27:12 +02:00
Xingyao Wang
2c0a2dbc61 fix yet another swe_bench issue (#2069) 2024-05-26 10:01:43 -07:00
Gant
f0271f9f91 need to run as root to use SWEBench container (#2068) 2024-05-26 14:21:33 +00:00
Xingyao Wang
5114230e53 Some SWE-Bench infer fixes and improvements (#2065)
* reset workspace base properly

* support running without hint

* support running without hint

* bump swe-bench eval docker to v1.2 for latest agentskills

* only give hint when use hint text is true

* add swe-agent instructions for validation

* update dockerfile

* pin the python interpreter for execute_cli

* avoid initialize plugins twice

* default to use hint

* save results to swe_bench_lite

* unset gh token and increase max iter to 50

* remove printing of use hint status

* refactor ssh login into one function

* ok drop to 30 turns bc it is so expensive :(

* remove reproduce comments to avoid stuck
2024-05-26 10:02:11 +00:00
Xingyao Wang
a6b3ce866d refactor ssh login into one function (#2066) 2024-05-26 08:56:13 +00:00
Boxuan Li
7d6cb69a51 main.py: Fix redundant ChangeAgentStateAction (#2064) 2024-05-26 15:20:56 +08:00
Chris Mamatas
1891fd88d5 feat(frontend): Added React Router (#2061)
* Added React Router

* Moved router import above ./App

---------

Co-authored-by: Chris Mamatas <chrismamatas1@gmail.com>
2024-05-25 21:57:15 +03:00
Shimada666
be1c2ad60d feat: use retry decorator instead of retrying in a loop (#2058)
* feat: use retry decorator instead of retrying in a loop

* update code logic

* update poetry lock
2024-05-25 16:04:40 +00:00
Yizhe Zhang
0c829cd067 Support Entity-Deduction-Arena (EDA) Benchmark (#1931)
* adding draft evaluation code for EDA, using chatgpt as the temporal agent for now

* Update README.md

* Delete frontend/package.json

* reverse the irrelevant changes

* reverse package.json

* use chatgpt as the codeactagent

* integrate with opendevin

* Update evaluation/EDA/README.md

* Update evaluation/EDA/README.md

* Use poetry to manage packages

* integrate with opendevin

* minor update

* minor update

* update poetry

* update README

* clean-up infer scripts

* add run_infer script and improve readme

* log final success and final message & ground truth

---------

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-25 23:17:04 +08:00
Xingyao Wang
28ab00946b update README for GAIA (#2054)
* update README for GAIA

* Update evaluation/gaia/README.md

* Update evaluation/gaia/README.md

* Update evaluation/gaia/README.md

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-25 15:01:03 +00:00
Xingyao Wang
ec68af5b83 fix the openai_api_key detected by agentskills (#2052) 2024-05-25 22:09:07 +08:00
Xingyao Wang
221035d39a Add retry logic to ssh login (#2053)
* add retry logic to ssh login

* Update opendevin/runtime/docker/ssh_box.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-25 12:16:24 +00:00
Shimada666
b31f7701eb Integrate Multimodal tools to agentskills. (#2016)
* support reading multimodal files

* move file

* update dependency

* remove useless pip install

* add comments

* update the comment

* Apply suggestions from code review

* Add unit test for TXTReader

* pre-commit hook corrupted utf16 test txt

* Revert unnecessary dependency upgrades

* feat: import some readers for agentskill

* add dependencies

* Integrate some multimodal tools

* add shell pip dependency

* update dependencies

* update dependencies

* update print window

* remove __main__

* locally import cv2

* add c library for opencv

* update lock file

* update prompt

* remove unuseful file

* add some unittest

* add unittest & remove excel-related parser

* rollback poetry lock

* remove markdown

* remove requests

* optimize parse_video output

* Fix integration tests for CodeActAgent

* remove test_parse_image unittest

* Add a TODO to containers/sandbox/Dockerfile

* update dependencies

* remove pyproject.toml useless package

* change document via openai key

* Fix prompts after removing some actions

---------

Co-authored-by: Mingchen Zhuge <mczhuge@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Mingchen Zhuge <64179323+mczhuge@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-25 18:58:49 +08:00
Boxuan Li
91f313c914 BrowserEnv: init exception handling (#2050)
* BrowserEnv: init exception handling

* Revert irrelevant changes

* Remove type ignore
2024-05-25 00:17:25 -07:00
மனோஜ்குமார் பழனிச்சாமி
36ff060c1a Added links in docs (#2051) 2024-05-25 11:23:20 +05:30
மனோஜ்குமார் பழனிச்சாமி
cfae6821fa refactored timeout (#2044) 2024-05-24 18:19:14 +02:00
mamoodi
752ce8c4ea Update bug template to include os version (#1982) 2024-05-24 15:58:05 +00:00
dependabot[bot]
cc6895a65c Bump streamlit from 1.34.0 to 1.35.0 (#2037)
Bumps [streamlit](https://github.com/streamlit/streamlit) from 1.34.0 to 1.35.0.
- [Release notes](https://github.com/streamlit/streamlit/releases)
- [Commits](https://github.com/streamlit/streamlit/compare/1.34.0...1.35.0)

---
updated-dependencies:
- dependency-name: streamlit
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 23:00:37 +08:00