* regenerate.sh: Allow testing on a specific agent and/or test
* Check agent finish state
* rengerate.sh: Rerun after fixing the prompts
* Fix SWEAgent test_write_simple_script
* Add more help message
* Add a known issue to README.md
* regenerate.sh: Fix help message typo
* Fix a typo in README
* refactor remind_iterations
* regenerate tests
* concatenate iteration message
* add some helpers to the tests
* regenerate tests
* add to logs
* regenerate tests
* add debug info
* fix exit_on_message
* fix regen script
* regenerate tests
* Revert "Merge branch 'rb/test-regen' of ssh://github.com/opendevin/opendevin into rb/test-regen"
This reverts commit b9cd1acbf2, reversing
changes made to c888285304.
* remove prints
* revert files
* revert more
* revert more
* regenerate for the last time I hope
* add back remind_iter
* regenerate
* add back remind_iter
* regenerate
* fix remind_iter
* regenerate yet again
* regen
* remove comment
* regen again
* Fix edit hack for multiple edits in same command
This PR changes ([\s\S]*) to ([\s\S]*?) to make the capturing
group non-greedy. This change ensures that the regex captures
the smallest set of characters that extends up to the first
end_of_edit it encounters, rather than extending across multiple
edit commands.
Without the fix, a bash command consisting of multiple edits
would be corrupt and lead to unexpected edit results.
* mypy is invaluable
* fix config, add test
* Add new-style toml support
* add singleton, small doc fixes
* fix some cases of loading toml, clean up, try to make it clearer
* Add defaults_dict for UI
* allow config to be mutable
error handling
fix toml parsing
* remove debug stuff
* Adapt Makefile
* Add defaults for temperature and top_p
* update to CodeActAgent
* comments
* fix unit tests
* implement groups of llm settings (CLI)
* fix merge issue
* small fix sandboxes, small refactoring
* adapt LLM init to accept overrides at runtime
* reading config is enough
* Encapsulate minimally embeddings initialization
* agent bug fix; fix tests
* fix sandboxes tests
* refactor globals in sandboxes to properties
* align codeact agent with the slight adjustment on eval branch
* update integration test for new prompt
* Regenerate test artifacts for CodeActAgent
---------
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
* Add AgentRejectAction across multiple modules
This commit introduces the AgentRejectAction class and integrates it across various modules and actions. It includes updates to READMEs, action definitions, and agent controllers to handle the new 'reject' action. This functionality will allow agents to properly signal task rejection.
* Fix unit test
* Remove wrong generates attributes from a few micro-agents
* remove extra actions
* remove message observations
* support null obs
* handle null obs
* fix frontend for changes
* fix the way messages flow to the UI
* change think to message
* add regen script
* regenerate all integration tests
* change task
* remove gh test
* fix messages
* fix tests
* help agent exit after hitting max iter
* Update opendevin/events/observation/success.py
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
* Update agenthub/codeact_agent/codeact_agent.py
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
* Add TypoFixerAgent micro-agent to fix typos
* Improve parse_response to accurately extract the first complete JSON object
* Add tests for parse_response function handling complex scenarios
* Fix tests and logic to use action_from_dict
* Fix small formatting issues
* move towards event stream
* refactor agent state changes
* move agent state logic
* fix callbacks
* break on finish
* closer to working
* change frontend to accomodate new flow
* handle start action
* fix locked stream
* revert message
* logspam
* no async on close
* get rid of agent_task
* fix up closing
* better asyncio handling
* sleep to give back control
* fix key
* logspam
* update frontend agent state actions
* fix pause and cancel
* delint
* fix map
* delint
* wait for agent to finish
* fix unit test
* event stream enums
* fix merge issues
* fix lint
* fix test
* fix test
* add user message action
* add user message action
* fix up user messages
* fix main.py flow
* refactor message waiting
* lint
* fix test
* fix test
* initialize plugin definition
* initialize plugin definition
* simplify mixin
* further improve plugin mixin
* add cache dir for pip
* support clean up cache
* add script for setup jupyter and execution server
* integrate JupyterRequirement to ssh_box
* source bashrc at the end of plugin load
* add execute_cli that accept code via stdin
* make JUPYTER_EXEC_SERVER_PORT configurable via env var
* increase background cmd sleep time
* Update opendevin/sandbox/plugins/mixin.py
Co-authored-by: Robert Brennan <accounts@rbren.io>
* add mixin to base class
* make jupyter requirement a dataclass
* source plugins only when >0 requirements
* add `sandbox_plugins` for each agent & have controller take care of it
* update build.sh to make logs available in /opendevin/logs
* switch to use config for lib and cache dir
* Add SANDBOX_WORKSPACE_DIR into config
* Add SANDBOX_WORKSPACE_DIR into config
* fix occurence of /workspace
* fix permission issue with /workspace
* use python to implement execute_cli to avoid stdin escape issue
* add IPythonRunCellAction and get it working
* wait until jupyter is avaialble
* support plugin via copying instead of mounting
* add agent talk action
* support follow-up user language feedback
* add __str__ for action to be printed better
* only print PLAN at the beginning
* wip: update codeact agent
* get rid the initial messate
* update codeact agent to handle null action;
add thought to bash
* dispatch thought for RUN action as well
* fix weird behavior of pxssh where the output would not flush correctly
* make ssh box can handle exit_code properly as well
* add initial version of swe-agent plugin;
* rename swe cursors
* split setup script into two and create two requirements
* print SWE-agent command documentation
* update swe-agent to default to no custom docs
* add initial version of swe-agent plugin;
* rename swe cursors
* split setup script into two and create two requirements
* print SWE-agent command documentation
* update swe-agent to default to no custom docs
* update dockerfile with dependency from swe-agent
* make env setup a separate script for .bashrc source
* add wip prompt
* fix mount_dir for ssh_box
* update prompt
* fix mount_dir for ssh_box
* default to use host network
* default to use host network
* move prompt to a separate file
* fix swe-tool plugins;
add missing _split_string
* remove hostname from sshbox
* update the prompt with edit functionality
* fix swe-tool plugins;
add missing _split_string
* add awaiting into status bar
* fix the bug of additional send event
* remove some print action
* move logic to config.py
* remove debugging comments
* make host network as default
* make WORKSPACE_MOUNT_PATH as abspath
* implement execute_cli via file cp
* Revert "implement execute_cli via file cp"
This reverts commit 06f0155bc1.
* add codeact dependencies to default container
* add IPythonRunCellObservation
* add back cache dir and default to /tmp
* make USE_HOST_NETWORK a bool
* revert use host network to false
* add temporarily fix for IPython RUN action
* update prompt
* revert USE_HOST_NETWORK to true since it is not affecting anything
* attempt to fix lint
* remove newline
* fix jupyter execution server
* add `thought` to most action class
* fix unit tests for current action abstraction
* support user exit
* update test cases with the latest action format (added 'thought')
* fix integration test for CodeActAGent by mocking stdin
* only mock stdin for tests with user_responses.log
* remove -exec integration test for CodeActAgent since it is not supported
* remove specific stop word
* fix comments
* improve clarity of prompt
* fix py lint
* fix integration tests
* sandbox might failed in chown due to mounting, but it won't be fatal
* update debug instruction for sshbox
* fix typo
* get RUN_AS_DEVIN and network=host working with app sandbox
* get RUN_AS_DEVIN and network=host working with app sandbox
* attempt to fix the workspace base permission
* sandbox might failed in chown due to mounting, but it won't be fatal
* update sshbox instruction
* remove default user id since it will be passed in the instruction
* revert permission fix since it should be resolved by correct SANDBOX_USER_ID
* the permission issue can be fixed by simply provide correct env var
* remove log
* set sandbox user id to getuid by default
* move logging to initializer
* make the uid consistent across host, app container, and sandbox
* remove hostname as it causes sudo issue
* fix permission of entrypoint script
* make the uvicron app run as host user uid for jupyter plugin
* add warning message
* update dev md for instruction of running unit tests
* add back unit tests
* revert back to the original sandbox implementation to fix testcases
* revert use host network
* get docker socket gid and usermod instead of chmod 777
* allow unit test workflow to find docker.sock
* make sandbox test working via patch
* fix arg parser that's broken for some reason
* try to fix app build disk space issue
* fix integration test
* Revert "fix arg parser that's broken for some reason"
This reverts commit 6cc8961133.
* update Development.md
* cleanup intergration tests & add exception for CodeAct+execbox
* fix config
* implement user_message action
* fix doc
* fix event dict error
* fix frontend lint
* revert accidentally changes to integration tests
* revert accidentally changes to integration tests
---------
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Robert Brennan <contact@rbren.io>
* update dev md for instruction of running unit tests
* add back unit tests
* revert back to the original sandbox implementation to fix testcases
* allow unit test workflow to find docker.sock
* make sandbox test working via patch
* fix arg parser that's broken for some reason
* fix integration test
* Revert "fix arg parser that's broken for some reason"
This reverts commit 6cc8961133.
* update Development.md
* Some improvements to prompts, some better exception handling for various file IO errors, added timeout and max return token configurations for the LLM api.
* More monologue prompt improvements
* Dynamically set username provided in prompt.
* Remove absolute paths from llm prompts, fetch working directory from sandbox when resolving paths in fileio operations, add customizable timeout for bash commands, mention said timeout in llm prompt.
* Switched ssh_box to disabling tty echo and removed the logic attempting to delete it from the response afterwards, fixed get_working_directory for ssh_box.
* Update prompts in integration tests to match monologue agent changes.
* Minor tweaks to make merge easier.
* Another minor prompt tweak, better invalid json handling.
* Fix lint error
* More catch-up to fix lint errors introduced by merge.
* Force WORKSPACE_MOUNT_PATH_IN_SANDBOX to match WORKSPACE_MOUNT_PATH in local sandbox mode, combine exception handlers in prompts.py.
---------
Co-authored-by: Jim Su <jimsu@protonmail.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
* Some improvements to prompts, some better exception handling for various file IO errors, added timeout and max return token configurations for the LLM api.
* More monologue prompt improvements
* Dynamically set username provided in prompt.
* Remove absolute paths from llm prompts, fetch working directory from sandbox when resolving paths in fileio operations, add customizable timeout for bash commands, mention said timeout in llm prompt.
* Switched ssh_box to disabling tty echo and removed the logic attempting to delete it from the response afterwards, fixed get_working_directory for ssh_box.
* Update prompts in integration tests to match monologue agent changes.
* Minor tweaks to make merge easier.
* Another minor prompt tweak, better invalid json handling.
* Fix lint error
* More catch-up to fix lint errors introduced by merge.
---------
Co-authored-by: Jim Su <jimsu@protonmail.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
* Add integration test framework with mock llm
* Fix MonologueAgent and PlannerAgent tests
* Remove adhoc logging
* Use existing logs
* Fix SWEAgent and PlannerAgent
* Check-in test log files
* conftest: look up under test name folder only
* Add docstring to conftest
* Finish dev doc
* Avoid non-determinism
* Remove dependency on llm embedding model
* Init embedding model only for MonologueAgent
* Add adhoc fix for sandbox discrepancy
* Test ssh and exec sandboxes
* CI: fix missing sandbox type
* conftest: Remove hack
* Reword comment for TODO
* Merge branch 'main' of https://github.com/JayQuimby/OpenDevin
* Using commands.sh for ACI
* parsing, prompting, and actions modifications
* added start and end index to read and write
* bug fixes and test updates
* Lint code changes to ensure code is proper
* State management, bugs, prompts
* Prompt Engineering
* exception handling
* big fixes
* more bug fixes
* merge conflicts
* Renamed SWEAgent, basic tests, bug fixes
* prompt changes, bug fixes
* merge conflicts
* merge conflict
* start and end line for read and write
* more merge conflicts
* env error
* linter error
* read fixed, prompt change, example added
* added x_line:end read operation
---------
Co-authored-by: Robert Brennan <accounts@rbren.io>