* clean up sandbox and ssh related stuff
* remove ssh hostname
* remove ssh hostname
* remove ssh password
* update config
* fix typo that breaks the test
* Updated documentation using ruff's autofix feature
* Updated pyproject.toml to include docstring validations
* Updated documentation using ruff's autofix feature
* Updated pyproject.toml to include docstring validations
* Updated docstrings using ruff's autfix feature
* Deleted opendevin/runtime/utils/soource.py, Keeping in sync with main
---------
Co-authored-by: Graham Neubig <neubig@gmail.com>
Currently, OpenDevin uses a global singleton LLM config and a global singleton agent config. This PR allows customers to configure an LLM config for each agent. A hypothetically useful scenario is to use a cheaper LLM for repo exploration / code search, and a more powerful LLM to actually do the problem solving (CodeActAgent).
Partially solves #2075 (web GUI improvement is not the goal of this PR)
* Fix AgentRejectAction handling
* Add ManagerAgent to integration tests
* Fix regenerate.sh
* Fix merge
* Update README for micro-agents
* Add test reject to regenerate.sh
* regenerate.sh: Add support for running a specific test and/or agent
* Refine reject schema, and allow ManagerAgent to handle reject
* Add test artifacts for test_simple_task_rejection
* Fix manager agent tests
* Fix README
* test_simple_task_rejection: check final agent state
* Integration test: exit if mock prompt not found
* Update test_simple_task_rejection tests
* Fix test_edits test artifacts after prompt update
* Fix ManagerAgent test_edits
* WIP
* Fix tests
* update test_edits for ManagerAgent
* Skip local sandbox for reject test
* Fix test comparison
This PR fixes#1897. In addition, this PR fixes and tweaks a few micro-agents.
For the first time, I am able to use ManagerAgent to complete test_write_simple_script and test_edits tasks in integration tests, so this PR also adds ManagerAgent as part of integration tests. test_write_simple_script involves delegation to CoderAgent while test_edits involves delegation to TypoFixerAgent.
Also for the first time, I am able to use DelegateAgent to complete test_write_simple_script and test_edits tasks in integration tests, so this PR also adds DelegateAgent as part of integration tests. It involves delegation to StudyRepoForTaskAgent, CoderAgent and VerifierAgent.
This PR is a blocker for #1735 and likely #1945.
* Add ruff for shared mutable defaults (B)
* Apply B006, B008 on current files, except fast API
* Update agenthub/SWE_agent/prompts.py
Co-authored-by: Graham Neubig <neubig@gmail.com>
* fix unintended behavior change
* this is correct, tell Ruff to leave it alone
---------
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
I was able to run a few benchmark instances from SWE-Bench by myself following the documentation - it was great! In general the experience was smooth, thanks to @xingyaoww, @libowen2121 and the team! I made a few small enhancements and fixes to further improve the developer experience.
Always use poetry run python (using python from poetry's virtual environment) over python or python3 in scripts to make sure the behavior is consistent.
Make AGENT configurable. One can use an argument to control which agent they would like to benchmark. To facilitate this, I removed hardcoded CodeActAgent from run_infer.sh, and also added VERSION attribute to all agents, as the benchmark needs to record the agent version.
Make EVAL_LIMIT configurable. One can use an argument to control how many instances they'd like to benchmark. Useful for debugging & development purposes.
Fix 'eval_output_dir' not defined error in run_infer.py.
Other enhancements to the README file and logs.
I also notice that a lot of code from run_infer.py could be shared by other benchmarks, but since we only have one benchmark now, I think we could avoid over-engineering. A refactor and code dedup would be useful in the future once we have more benchmarks, though.
* feat: make other agents support asking user input in MessageAction.
* Update agenthub/micro/_instructions/actions/message.md
Co-authored-by: Robert Brennan <accounts@rbren.io>
* Update agenthub/micro/_instructions/actions/message.md
Co-authored-by: Robert Brennan <accounts@rbren.io>
* feat: make other agents support asking user input in MessageAction.
* Regenerate test artifacts
---------
Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
* move MemoryCondenser, LongTermMemory, json, out of the monologue
* PlannerAgent and Microagents use the custom json.loads/dumps
* Move short term history out of monologue agent...
* move memory in their package
* add __init__
* Add AgentRejectAction across multiple modules
This commit introduces the AgentRejectAction class and integrates it across various modules and actions. It includes updates to READMEs, action definitions, and agent controllers to handle the new 'reject' action. This functionality will allow agents to properly signal task rejection.
* Fix unit test
* Remove wrong generates attributes from a few micro-agents
* Add TypoFixerAgent micro-agent to fix typos
* Improve parse_response to accurately extract the first complete JSON object
* Add tests for parse_response function handling complex scenarios
* Fix tests and logic to use action_from_dict
* Fix small formatting issues
* remove screenshot in browser observation
* refactor utils
* allow only dict
* fix screenshot not showing up in frontend
---------
Co-authored-by: Robert Brennan <accounts@rbren.io>
* Add new CommitWriterAgent to auto-generate commit messages from staged diffs
This commit introduces the CommitWriterAgent along with its configuration and detailed task description. The agent is designed to analyze git diffs staged for commit and automatically generate succinct and relevant commit messages.
* Remove devnote section from yaml and add README
* Fix micro agents definitions
* Add tests for micro agents
* Add to CI
* Revert "Add to CI"
This reverts commit 94f3b4e7c8408a1b0267f3847cbaefdcd995db05.
* Remove test artifacts for ManagerAgent
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>