CodeAct-based Agent Framework
This folder implements the CodeAct idea, which relies on the LLM to autonomously perform actions in a Bash shell. This places more demands on the LLM itself: it must be capable enough to complete the whole task autonomously rather than getting stuck in an infinite loop.
NOTE: This agent is still highly experimental and under active development to reach the capability described in the original paper & repo.
Demo of the expected capability - work-in-progress.
mkdir workspace
PYTHONPATH=`pwd`:$PYTHONPATH python3 opendevin/main.py -d ./workspace -c CodeActAgent -t "Please write a flask app that returns 'Hello, World\!' at the root URL, then start the app on port 5000. python3 has already been installed for you."
Example: prompting gpt-4-0125-preview to write a Flask server, install the Flask library, and start the server.
Most of this works as expected, except that at the end the model did not follow the instruction to stop the interaction by outputting <execute> exit </execute>.
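To make the stop condition concrete, here is a minimal sketch of how a model response could be checked for an <execute> block and for the exit command. It is an illustration only, not the agent's actual parsing code: the names EXECUTE_RE, extract_command, and is_exit are hypothetical, and it assumes that only the first <execute> block in a response matters.

```python
import re
from typing import Optional

# Hypothetical pattern (not taken from the agent's source): matches the
# first <execute> ... </execute> block, allowing multi-line commands.
EXECUTE_RE = re.compile(r"<execute>(.*?)</execute>", re.DOTALL)


def extract_command(llm_output: str) -> Optional[str]:
    """Return the command inside the first <execute> block, or None."""
    match = EXECUTE_RE.search(llm_output)
    if match is None:
        return None
    return match.group(1).strip()


def is_exit(llm_output: str) -> bool:
    """True when the model asked to stop via <execute> exit </execute>."""
    return extract_command(llm_output) == "exit"
```

With a check like this, a response that contains no <execute> block, or one that never issues exit, leaves the interaction running, which matches the behavior seen at the end of the demo.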
TODO: This should be fixable by either (1) including a complete in-context example like this, OR (2) collecting some interaction data like this and fine-tuning a model (like this, a more complex route).