OpenHands

mirror of https://github.com/OpenHands/OpenHands.git synced 2026-03-22 13:47:19 +08:00

Author	SHA1	Message	Date
Boxuan Li	ee86d8d25e	Frontend support for delegation and rejection (#2608) 1. Add support for rejection action on frontend 2. Show users the reason for rejection 3. Get rid of weird empty box after delegation 4. On web GUI, show customer when a delegation starts and ends	2024-06-26 00:30:10 -07:00
Boxuan Li	7e78fde48f	Bug fix: add error observation to history (#2610) * Bug fix: add error observation to history * Regenerate to demonstrate format error	2024-06-24 21:24:17 -07:00
Boxuan Li	8bce806dce	Tweak prompts of ManagerAgent and CommitWriterAgent (#2609) * Tweak prompts of ManagerAgent and CommitWriterAgent * Fix prompts	2024-06-24 00:14:28 -07:00
Boxuan Li	01fa52d062	Enforce linter in tests folder (#2557)	2024-06-20 21:50:34 -07:00
மனோஜ்குமார் பழனிச்சாமி	41564c2eac	Use :main instead of :latest (#2539) Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-06-21 03:57:50 +00:00
Shimada666	26fc3c886a	Make plugins sandbox-agnostic (#2101) * tmp * tmp * merge main * feat: auto build image cache * remove plugins * use config file * update mamba setup shell * support agnostic sandbox image autobuild * remove config * Update .gitignore Co-authored-by: Xingyao Wang <xingyao6@illinois.edu> * Update opendevin/runtime/docker/ssh_box.py Co-authored-by: Xingyao Wang <xingyao6@illinois.edu> * update setup.sh * readd sudo * add sudo in dockerfile * remove export * move od-runtime dependencies to sandbox dockerfile * factor out re-build logic into a separate util file * tweak existing plugin to use OD specific sandbox * update testcase * attempt to fix unit test using image built in ghcr * use cache tag * try to fix unit tests * add unittest * add unittest * add some unittests * revert gh workflow changes * feat: optimize sandbox image naming rule * add pull latest image hint * add opendevin python hint and use mamba to install gcc * update docker image naming rule and fix mamba issue * Update opendevin/runtime/docker/ssh_box.py Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> * fix: opendevin user use correct pip * fix lint issue * fix custom sandbox base image * rename test name * add skipif --------- Co-authored-by: Graham Neubig <neubig@gmail.com> Co-authored-by: Xingyao Wang <xingyao6@illinois.edu> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> Co-authored-by: tobitege <tobitege@gmx.de>	2024-06-19 19:58:07 -07:00
Engel Nyst	80fe13f4be	rename our completion as a drop-in replacement of litellm completion (#2509)	2024-06-19 05:25:25 +02:00
Engel Nyst	b2307db010	Document, rename Agent* exceptions to LLM* (#2508) * rename "Agent" exceptions to LLM, document LLMResponseError	2024-06-18 22:30:22 +00:00
Boxuan Li	009f6f9ebc	Integration tests: check agent error and fix test_edits (#2473) * Integration tests: check agent error and fix test_edits * Fix CodeActSWEAgent test_ipython_module prompt logs	2024-06-17 20:39:03 +08:00
tobitege	3142764104	fix: test_ipython being skipped (#2477)	2024-06-17 16:33:37 +05:30
tobitege	d2509a19c8	fix: logger with more masking of sensitive data (#2470) * fix: more logger sensitive masking * fix: test_config.py updated for more sensitive patterns * added one more...	2024-06-16 17:32:26 -04:00
tobitege	823298e0d0	fix: Agentskills enhancements (#2384) * avoid repeat logging of unneeded messages * refactored append/edit_file (tests next) * agentskills and unit test fixes * testing * more changes and test prompts * smaller changes * final test fixes * remove dead code from test_agent.py * reverting unneeded changes * updated tests, more tweaks to skills * refactor (#2442) * chores: fix DelegatorAgent description (#2446) * change * change comments * fix * stopped container to prevent port issues. (#2447) * chore: remove useless browsing code in CodeActSWEAgent (#2438) * remove useless * fix integration test * Regenerate test_ipython_module artifacts for CodeActSWEAgent --------- Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> * Merge remote-tracking branch 'upstream/main' into agent-fileops * unneeded tweak * * fix edit_file to not introduce extra newline * updated docstrings with more details for LLM * fix legacy typo in prompts causing ]] instead of ] * several mock files regenerated * Regen'ed CodeActSWEAgent integration tests * fix _print_window signature; explicit exception type in _is_valid_path * splitlines with named param --------- Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-06-16 15:06:46 -04:00
Yufan Song	426e429b18	chore: remove useless browsing code in CodeActSWEAgent (#2438) * remove useless * fix integration test * Regenerate test_ipython_module artifacts for CodeActSWEAgent --------- Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-06-15 10:53:03 +08:00
Engel Nyst	bb4ea1e6cb	Adjust is-stuck check for the same steps to 3 until it's stopped (#2437)	2024-06-14 19:20:12 +05:30
Boxuan Li	dd1095cf6b	regenerate.sh: Exit upon common known errors (#2385) * Exit regenerate.sh upon common known errors * More fixes * Remove mention of transient issue * Use tmp file instead of tty * Remove redundant cleanup	2024-06-13 23:42:58 -07:00
Engel Nyst	1cc70be616	workspace_mount_path sentinel: an undefined string (#2431)	2024-06-14 10:39:33 +05:30
Yufan Song	90ec0095df	Add integration test for CodeActSWEAgent (#2377) * add test log * remove browsing internet * add test by GPT-4o * fix prompts * change test_agent * fix test * fix nits	2024-06-12 02:46:15 +08:00
tobitege	9605106e72	feat: append_file incl. all tests [agentskills] (#2346) * new skill: append_file incl. all tests * more tests needed caring * file_name for append_file/edit_file; updated tests	2024-06-10 17:18:40 +00:00
tobitege	f1760f3a67	remove some MonologueAgent mentions (#2364)	2024-06-10 11:57:37 +00:00
Boxuan Li	91ddd93756	conftest: Exit without revealing secrets (#2351)	2024-06-10 10:47:31 +08:00
Temo	e925cefeef	Refactored prompt.py to reduce token usage (#1996) * Refactored prompt.py to reduce token usage * Reverted some destructive changes * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * Apply suggestions from code review * Apply suggestions from code review * Update agenthub/codeact_agent/prompt.py * fix integration test * make lint * feat: support ToolQA benchmark (#2263) * Add files via upload * Update README.md * Update run_infer.py * Update utils.py * make lint * Update evaluation/toolqa/run_infer.py --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: yufansong <yufan@risingwave-labs.com> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> * feat: revert hiden special paths change in file action (#2328) * revert change in file action * remove useless code * make lint * Support gpqa benchmark evaluation (#2080) * feat: add gpqa benchmark evaluation * add metrics * reset configs in final block * make lint --------- Co-authored-by: yufansong <yufan@risingwave-labs.com> * fix(frontend): prevent API key from resetting after modal change (#2329) * remove bottom chatbox fade * Modal wider; fix lint error * settings: attempt to not clear api key for same provider * prevent api key from resetting after changing the model * revert other changes and fix post test tear down error --------- Co-authored-by: amanape <83104063+amanape@users.noreply.github.com> * fix: codeact bug [If running a command that never returns, it gets stuck #1895] (#2034) * fix: codeact bug https://github.com/OpenDevin/OpenDevin/issues/1895 * fix: add CmdRunAction timeout hint. * Update agenthub/codeact_agent/prompt.py Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * regenerate integration test --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Graham Neubig <neubig@gmail.com> Co-authored-by: yufansong <yufan@risingwave-labs.com> * Feat: Support Gorilla APIBench (#2081) * removed unused files from gorilla * Update run_infer.py, removed unused imports * Update utils.py * Update ast_eval_hf.py * Update ast_eval_tf.py * Update ast_eval_th.py * Create README.md * Update run_infer.py * make lint * Update run_infer.py * fix lint --------- Co-authored-by: yufansong <yufan@risingwave-labs.com> * remote useless (#2332) * fix integration test * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * fix integration test --------- Co-authored-by: Xingyao Wang <xingyao6@illinois.edu> Co-authored-by: Frank Xu <frankxu2004@gmail.com> Co-authored-by: yufansong <yufan@risingwave-labs.com> Co-authored-by: yueqis <141804823+yueqis@users.noreply.github.com> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> Co-authored-by: Jaskirat Singh <1.jaskiratsingh@gmail.com> Co-authored-by: tobitege <tobitege@gmx.de> Co-authored-by: amanape <83104063+amanape@users.noreply.github.com> Co-authored-by: Aaron Xia <zhhuaxia@gmail.com> Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-06-09 10:19:05 -07:00
Engel Nyst	fab8c9003b	remove deprecated github-token config (#2334) Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>	2024-06-09 09:50:24 +02:00
Boxuan Li	a9a2f10170	Revamp AgentRejectAction and allow ManagerAgent to handle rejection (#1735) * Fix AgentRejectAction handling * Add ManagerAgent to integration tests * Fix regenerate.sh * Fix merge * Update README for micro-agents * Add test reject to regenerate.sh * regenerate.sh: Add support for running a specific test and/or agent * Refine reject schema, and allow ManagerAgent to handle reject * Add test artifacts for test_simple_task_rejection * Fix manager agent tests * Fix README * test_simple_task_rejection: check final agent state * Integration test: exit if mock prompt not found * Update test_simple_task_rejection tests * Fix test_edits test artifacts after prompt update * Fix ManagerAgent test_edits * WIP * Fix tests * update test_edits for ManagerAgent * Skip local sandbox for reject test * Fix test comparison	2024-06-08 23:12:30 -07:00
tobitege	a97d0767e9	fix: Backticks get always escaped by runtime; add Ipython test (#2321) * added tests related to backticks * updated .gitignore * added extra linter test for #2210 * hotfix for integration test * added test_ipython unit test * added test_ipython unit test * remove draft test from test_ipython.py --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-06-08 21:02:27 +00:00
Aaron Xia	b5a17efc45	fix: codeact bug [If running a command that never returns, it gets stuck #1895] (#2034) * fix: codeact bug https://github.com/OpenDevin/OpenDevin/issues/1895 * fix: add CmdRunAction timeout hint. * Update agenthub/codeact_agent/prompt.py Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * regenerate integration test --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Graham Neubig <neubig@gmail.com> Co-authored-by: yufansong <yufan@risingwave-labs.com>	2024-06-08 16:40:23 +00:00
Xingyao Wang	903381f16e	Add back jupyter PWD env var for agentskills (#2327) * add back jupyter pwd env var for agentskills * add unit test for pwd change in execute_cli	2024-06-08 08:51:42 +00:00
tobitege	b431fce938	tests: more Agentskills tests; updated .gitignore (#2307) * added tests related to backticks * updated .gitignore * added extra linter test for #2210 * hotfix for integration test --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-06-07 16:29:03 +00:00
Yufan Song	6aba337416	fix (#2318)	2024-06-07 09:22:29 -07:00
Frank Xu	4455260290	[bugfix] browse actions shouldn't change url and screenshot, only observations (#2311) * browse related actions shouldn't change url and screenshot, only the observations should * fix linting * fix integrat * update integration test --------- Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>	2024-06-08 00:03:32 +08:00
Boxuan Li	45ce09d70e	CodeActAgent: Delegate to BrowsingAgent for browsing tasks (#2103)	2024-06-07 00:53:47 -07:00
tobitege	1fa09e0414	fix: test_sandbox tests didn't close dockers (#2274) * fix test_sandbox tests to close dockers * removed try/finally --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-06-06 03:45:45 +00:00
Frank Xu	48151bdbb0	[feat] WebArena benchmark, MiniWoB++ benchmark and related arch changes (#2170) * add webarena, and revamp messaging for webarena eval * add changes for browsergym * update infer script * fix unit tests * update * add multiple run for miniwob * update instruction, remove personal path * update * add code for getting final reward, fix integration, add results * add avg cost calculation	2024-06-06 09:01:20 +08:00
RainRat	3b0e1361a4	fix typos (#2267) * fix typos no functional change * fix typos * fix typos * fix integration test --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Leo <ifuryst@gmail.com> Co-authored-by: yufansong <yufan@risingwave-labs.com>	2024-06-05 23:06:40 +08:00
tobitege	44bbe5e208	Fix agentskills tests (#2242) * Fix agentskills tests * Improved test_agent_skill --------- Co-authored-by: Leo <ifuryst@gmail.com>	2024-06-04 21:33:32 +00:00
tobitege	0082640ac8	fix test_config to prevent leaks (#2245)	2024-06-04 21:32:46 +02:00
Graham Neubig	7a2122ebc2	Default to gpt-4o (#2158) * Default to gpt-4o * Fix default	2024-05-31 14:44:07 +00:00
மனோஜ்குமார் பழனிச்சாமி	961c96a2a1	Added ssh_password to config setup (#2139) Co-authored-by: Aleksandar <isavitaisa@gmail.com>	2024-05-31 07:26:16 +05:30
Xingyao Wang	01ef90205d	Add CodeActSWEAgent to remove browsing & github + improvements on agentskills (#2105) * update swe_bench prompt; use minimal prompt for codeact; * upgrade agentskills and update testcases * update infer prompt * fix cwd * add icl for swebench * also log in_context_example to run infer * remove extra print * change prompt to abs path * update error message to include current file info * change cwd for jupyter if needed * update edit error message * update prompt * improve git get patch * update hint string * default to 50 turns * revert changes from codeact agent and create new CodeActSWEAgent * revert changes to codeact * revert instructions for run infer * revert instructions for run infer * update README * update max iter * add codeact swe agent * fix issue for CodeActSWEAgent * allow specifying max iter in cmdline script * stop printing * Update agenthub/codeact_swe_agent/README.md Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> * Fix prompt regression in jupyter plugin --------- Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-05-29 21:19:00 -07:00
Rahul Anand	b3cce763a2	fix #2123 (#2125)	2024-05-29 17:56:45 -04:00
Boxuan Li	9b371b1b5f	Refactor agent delegation and tweak micro agents (#1910) This PR fixes #1897. In addition, this PR fixes and tweaks a few micro-agents. For the first time, I am able to use ManagerAgent to complete test_write_simple_script and test_edits tasks in integration tests, so this PR also adds ManagerAgent as part of integration tests. test_write_simple_script involves delegation to CoderAgent while test_edits involves delegation to TypoFixerAgent. Also for the first time, I am able to use DelegateAgent to complete test_write_simple_script and test_edits tasks in integration tests, so this PR also adds DelegateAgent as part of integration tests. It involves delegation to StudyRepoForTaskAgent, CoderAgent and VerifierAgent. This PR is a blocker for #1735 and likely #1945.	2024-05-28 20:01:16 -07:00
Engel Nyst	55fdee31ad	Remove unnecessary stuff from the sandboxes tests (#2095)	2024-05-27 20:50:02 +05:30
Xingyao Wang	ae8cda1495	Support specifying custom cost per token (#2083) * support specifying custom cost per token * fix test for new attrs * add to docs --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-05-27 19:35:34 +08:00
Aleksandar	18d07bda89	feat: add max_budget_per_task configuration to control task cost (#2070) * feat: add max_budget_per_task configuration to control task cost * Fix test_arg_parser.py * Use the config.max_budget_per_task as default value * Add max_budget_per_task to core/main.py as well * Update opendevin/controller/agent_controller.py Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> --------- Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-05-27 02:04:31 +08:00
Engel Nyst	783fea62a0	Ignore pid for loop detection (Was: override eq...) (#2045) * rewrite, implement pid ignore in the controller * make the helper method private	2024-05-26 19:27:12 +02:00
Shimada666	b31f7701eb	Integrate Multimodal tools to `agentskills`. (#2016) * suport reading multimodal files * move file * update dependency * remove useless pip install * add comments * update the comment * Apply suggestions from code review * Add unit test for TXTReader * pre-commit hook corrupted utf16 test txt * Revert unnecessary dependency upgrades * feat: import some readers for agentskill * add dependencies * Integrate some multimodal tools * add shell pip dependency * update dependencies * update dependencies * update print window * remove __main__ * locally import cv2 * add c library for opencv * update lock file * update prompt * remove unuseful file * add some unittest * add unittest & remove excel-related parser * rollback poetry lock * remove markdown * remove requests * optimize parse_video output * Fix integration tests for CodeActAgent * remove test_parse_image unittest * Add a TODO to containers/sandbox/Dockerfile * update dependencies * remove pyproject.toml useless package * change document via openai key * Fix prompts after removing some actions --------- Co-authored-by: Mingchen Zhuge <mczhuge@gmail.com> Co-authored-by: yufansong <yufan@risingwave-labs.com> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> Co-authored-by: Mingchen Zhuge <64179323+mczhuge@users.noreply.github.com> Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>	2024-05-25 18:58:49 +08:00
Boxuan Li	78241d9d43	Add tests for browser agent (#2031) Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-05-24 09:59:40 +00:00
Boxuan Li	c59bcbbffd	Minor docstring & prompt fixes for AgentSkills (#2028) * A few minor fixes to agentskills * Regenerate prompts * Remove redundant comment	2024-05-24 13:30:48 +08:00
Boxuan Li	633ece5f9c	Fix integration tests (#2024)	2024-05-23 20:24:31 -07:00
Robert Brennan	9ca2007201	fix json encoding (#2018) * fix json encoding * add test * add another test * fix integration tests --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-05-23 23:36:15 +00:00
Xingyao Wang	602ffcdffb	Implement `agentskills` for OpenDevin to helpfully improve edit AND including more useful tools/skills (#1941) * add draft for skills * Implement and test agentskills functions: open_file, goto_line, scroll_down, scroll_up, create_file, search_dir, search_file, find_file * Remove new_sample.txt file * add some work from opendevin w/ fixes * Add unit tests for agentskills module * fix some issues and updated tests * add more tests for open * tweak and handle goto_line * add tests for some edge cases * add tests for scrolling * add tests for edit * add tests for search_dir * update tests to use pytest * use pytest --forked to avoid file op unit tests to interfere with each other via global var * update doc based on swe agent tool * update and add tests for find_file and search_file * move agent_skills to plugins * add agentskills as plugin and docs * add agentskill to ssh box and fix sandbox integration * remove extra returns in doc * add agentskills to initial tool for jupyter * support re-init jupyter kernel (for agentskills) after restart * fix print window's issue with indentation and add testcases * add prompt for codeact with the newest edit primitives * modify the way line number is presented (remove leading space) * change prompt to the newest display format * support tracking of costs via metrics * Update opendevin/runtime/plugins/agent_skills/README.md * Update opendevin/runtime/plugins/agent_skills/README.md * implement and add tests for py linting * remove extra text arg for incompatible subprocess ver * remove sample.txt * update test_edits integration tests * fix all integration * Update opendevin/runtime/plugins/agent_skills/README.md * Update opendevin/runtime/plugins/agent_skills/README.md * Update opendevin/runtime/plugins/agent_skills/README.md * Update agenthub/codeact_agent/prompt.py Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> * Update agenthub/codeact_agent/prompt.py Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> * Update agenthub/codeact_agent/prompt.py Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> * Update opendevin/runtime/plugins/agent_skills/agentskills.py Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> * correctly setup plugins for swebench eval * bump swe-bench version and add logging * correctly setup plugins for swebench eval * bump swe-bench version and add logging * Revert "correctly setup plugins for swebench eval" This reverts commit `2bd1055673`. * bump version * remove _AGENT_SKILLS_DOCS * move flake8 to test dep * update poetry.lock * remove extra arg * reduce max iter for eval * update poetry * fix integration tests --------- Co-authored-by: OpenDevin <opendevin@opendevin.ai> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-05-23 16:04:09 +00:00

122 Commits