OpenHands

mirror of https://github.com/OpenHands/OpenHands.git synced 2025-12-26 05:48:36 +08:00

Author	SHA1	Message	Date
adragos	e0b67ad2f1	feat: add Security Analyzer functionality (#3058 ) * feat: Initial work on security analyzer * feat: Add remote invariant client * chore: improve fault tolerance of client * feat: Add button to enable Invariant Security Analyzer * [feat] confirmation mode for bash actions * feat: Add Invariant Tab with security risk outputs * feat: Add modal setting for Confirmation Mode * fix: frontend tests for confirmation mode switch * fix: add missing CONFIRMATION_MODE value in SettingsModal.test.tsx * fix: update test to integrate new setting * feat: Initial work on security analyzer * feat: Add remote invariant client * chore: improve fault tolerance of client * feat: Add button to enable Invariant Security Analyzer * feat: Add Invariant Tab with security risk outputs * feat: integrate security analyzer with confirmation mode * feat: improve invariant analyzer tab * feat: Implement user confirmation for running bash/python code * fix: don't display rejected actions * fix: make confirmation show only on assistant messages * feat: download traces, update policy, implement settings, auto-approve based on defined risk * Fix: low risk not being shown because it's 0 * fix: duplicate logs in tab * fix: log duplication * chore: prepare for merge, remove logging * Merge confirmation_mode from OpenDevin main * test: update tests to pass * chore: finish merging changes, security analyzer now operational again * feat: document Security Analyzers * refactor: api, monitor * chore: lint, fix risk None, revert policy * fix: check security_risk for None * refactor: rename instances of invariant to security analyzer * feat: add /api/options/security-analyzers endpoint * Move security analyzer from tab to modal * Temporary fix lock when security analyzer is not chosen * feat: don't show lock at all when security analyzer is not enabled * refactor: - Frontend: * change type of SECURITY_ANALYZER from bool to string * add combobox to select SECURITY_ANALYZER, current options 
are "invariant" and "" (no security analyzer) * Security is now a modal, lock in bottom right is visible only if there's a security analyzer selected - Backend: * add close to SecurityAnalyzer * instantiate SecurityAnalyzer based on provided string from frontend * fix: update close to be async, to be consistent with other close on resources * fix: max height of modal (prevent overflow) * feat: add logo * small fixes * update docs for creating a security analyzer module * fix linting * update timeout for http client * fix: move security_analyzer config from agent to session * feat: add security_risk to browser actions * add optional remark on combobox * fix: asdict not called on dataclass, remove invariant dependency * fix: exclude None values when serializing * feat: take default policy from invariant-server instead of being hardcoded * fix: check if policy is None * update image name * test: fix some failing runs * fix: security analyzer tests * refactor: merge confirmation_mode and security_analyzer into SecurityConfig. Change invariant error message for docker * test: add tests for invariant parsing actions / observations * fix: python linting for test_security.py * Apply suggestions from code review Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * use ActionSecurityRisk | None instead of Optional * refactor action parsing * add extra check * lint parser.py * test: add field keep_prompt to test_security * docs: add information about how to enable the analyzer * test: Remove trailing whitespace in README.md text --------- Co-authored-by: Mislav Balunovic <mislav.balunovic@gmail.com> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>	2024-08-13 11:29:41 +00:00
Xingyao Wang	568e6cdb40	feat: change Jupyter cwd along with "bash" (#3331 ) * remove unused plugin mixin * change the entire jupyter PWD with bash; print jupyter pwd in obs as well; * remove unused field * remove unused comments * change the entire jupyter PWD with bash; print jupyter pwd in obs as well; * fix runtime tests for jupyter * update integration tests * fix test again --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-08-13 06:08:31 -04:00
Yufan Song	86d933f1b0	remove useless code (#3355 )	2024-08-12 21:34:17 -04:00
mamoodi	e2b2f74737	Add comments to the runtime_build script (#3333 )	2024-08-12 15:17:12 -04:00
Yufan Song	1424db8309	remove useless process (#3338 )	2024-08-11 18:53:55 -07:00
Yufan Song	28dd882f98	remove useless sandbox (#3336 )	2024-08-11 19:08:30 +00:00
Yufan Song	99ac91f6ad	remove sandbox abstract class (#3337 ) * remove sandbox abstract class * remove cancellable stream	2024-08-11 22:47:28 +08:00
Yufan Song	da5bf6c1bf	remove useless mock test code (#3335 )	2024-08-11 07:33:58 +00:00
Xingyao Wang	e6fa5b5df0	chore: remove unused plugin mixin (#3332 ) * remove unused plugin mixin * remove unused field * remove unused comments	2024-08-09 20:50:49 -04:00
tofarr	b4a7e27bfd	chore Fix architecture diagram (#3141 ) * Fix architecture diagram * Updated Readme with permanent img * Fix path --------- Co-authored-by: Tim O'Farrell <tofarr@Tims-MacBook-Pro-2.local> Co-authored-by: Tim O'Farrell <tofarr@gmai.com> Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>	2024-08-09 20:48:31 +00:00
Xingyao Wang	bdf6df12c3	fix: pip not available in runtime (#3306 ) * try to fix pip unavailable * update test case for pip * force rebuild in CI * remove extra symlink * fix newline * added semi-colon to line 31 * Dockerfile.j2: activate env at the end * Revert "Dockerfile.j2: activate env at the end" This reverts commit cf2f5651021fe80d4ab69a35a85f0a35b29dc3d7. * cleanup Dockerfile * switch default python image * remove image agnostic (no longer used) * fix tests * switch to nikolaik/python-nodejs:python3.11-nodejs22 * fix test * fix test * revert docker * update template --------- Co-authored-by: tobitege <tobitege@gmx.de> Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-08-09 15:04:43 -04:00
Xingyao Wang	2e6b08db4f	fix: workspace folder permission & app container cannot access client API (#3300 ) * also copy over pyproject and poetry lock * add missing readme * remove extra git config init since it is already done in client.py * only chown if the /workspace dir does not exist * Revert "remove extra git config init since it is already done in client.py" This reverts commit e8556cd76dcb1720b33f5e06904c56efda2e7d9f. * remove extra git config init since it is already done in client.py * fix test runtime * print container log while reconnecting * print log in more readable format * print log in more readable format * increase lines * clean up sandbox and ssh related stuff * remove ssh hostname * remove ssh hostname * fix docker app cannot access runtime API issue * remove ssh password * API HOSTNAME should be prefixed with SANDBOX * update config * fix typo that breaks the test	2024-08-08 19:28:34 -04:00
Xingyao Wang	a5195b0e65	chore: clean up sandbox and ssh related configs (#3301 ) * clean up sandbox and ssh related stuff * remove ssh hostname * remove ssh hostname * remove ssh password * update config * fix typo that breaks the test	2024-08-08 22:15:40 +00:00
tofarr	040b9cb75c	Chore Readme updates (#3302 ) * Readme updates Added explicit installation instructions to server and frontend README * Documentation update * WIP * WIP --------- Co-authored-by: Tim O'Farrell <tofarr@Tims-MacBook-Pro-2.local>	2024-08-08 18:06:58 -04:00
Xingyao Wang	db302fd33c	fix: dubious ownership when running `git` (#3282 ) * switch default to eventstream runtime * remove pull docker from makefile * fix unittest * fix file store path * try deprecate server runtime * remove persist sandbox * move file utils * remove server runtime related workflow * remove unused method * attempt to remove the reliance on filestore for BE * fix async for list file * fix list_files to post * fix list files * add suffix to directory * make sure list file returns abs path; make sure other backend endpoints accept abs path * remove server runtime test workflow * set git config in runtime * chown for workspace in client; use INIT_COMMANDS to maintain all commands that need to be run before bash start; * fix client issue; add test case for git; --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-08-08 13:14:45 +00:00
Xingyao Wang	90d0a62469	(arch) Switch default runtime to EventStream Runtime (#3271 ) * switch default to eventstream runtime * remove pull docker from makefile * fix unittest * fix file store path * try deprecate server runtime * remove persist sandbox * move file utils * remove server runtime related workflow * remove unused method * attempt to remove the reliance on filestore for BE * fix async for list file * fix list_files to post * fix list files * add suffix to directory * make sure list file returns abs path; make sure other backend endpoints accept abs path * remove server runtime test workflow * set git config in runtime	2024-08-08 10:11:49 +08:00
Xingyao Wang	b30a2dd87a	completely remove update_source_code (#3280 )	2024-08-07 16:57:11 +00:00
Xingyao Wang	bb66f15ff6	[Arch] Streamline EventStream Runtime Image Building Logic (#3259 ) * remove nocache * simplify runtime build to use hash & always update source * style * try to fix temp folder issue * fix rm tree * create build folder first (to get correct hash), then copy it over to actual build folder * fix assert * fix indentation * fix copy over * add runtime documentation * fix runtime docs * fix typo * Update docs/modules/usage/runtime.md Co-authored-by: Graham Neubig <neubig@gmail.com> * Update docs/modules/usage/runtime.md Co-authored-by: Graham Neubig <neubig@gmail.com> --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-08-07 06:09:38 +08:00
Xingyao Wang	31b244f95e	[Refactor, Evaluation] Refactor and clean up evaluation harness to remove global config and use EventStreamRuntime (#3230 ) * move multi-line bash tests to test_runtime; support multi-line bash for esruntime; * add testcase to handle PS2 prompt * use bashlex for bash parsing to handle multi-line commands; add testcases for multi-line commands * revert ghcr runtime change * Apply stash * fix run as other user; make test async; * fix test runtime for run as od * add run-as-devin to all the runtime tests * handle the case when username is root * move all run-as-devin tests from sandbox; only tests a few cases on different user to save time; * move over multi-line echo related tests to test_runtime * fix user-specific jupyter by fixing the pypoetry virtualenv folder * make plugin's init async; chdir at initialization of jupyter plugin; move ipy simple testcase to test runtime; * support agentskills import in move tests for jupyter pwd tests; overload `add_env_vars` for EventStreamRuntime to update env var also in Jupyter; make agentskills read env var lazily, in case env var is updated; * fix ServerRuntime agentskills issue * move agnostic image test to test_runtime * merge runtime tests in CI * fix enable auto lint as env var * update warning message * update warning message * test for different container images * change parsing output as debug * add exception handling for update_pwd_decorator * fix unit test indentation * add plugins as default input to Runtime class; remove init_sandbox_plugins; implement add_env_var (include jupyter) in the base class; * fix server runtime auto lint * Revert "add exception handling for update_pwd_decorator" This reverts commit 2b668b1506e02145cb8f87e321aad62febca3d50. * tries to print debugging info for agentskills * explictly setting uid (try fix permission issue) * Revert "tries to print debugging info for agentskills" This reverts commit 8be4c86756f0e3fc62957b327ba2ac4999c419de. 
* set sandbox user id during testing to hopefully fix the permission issue * add browser tools for server runtime * try to debug for old pwd * update debug cmd * only test agnostic runtime when TEST_RUNTIME is Server * fix temp dir mkdir * load TEST_RUNTIME at the beginning * remove ipython tests * only log to file when DEBUG * default logging to project root * temporarily remove log to file * fix LLM logger dir * fix logger * make set pwd an optional aux action * fix prev pwd * fix infinity recursion * simplify * do not import the whole od library to avoid logger folder by jupyter * fix browsing * increase timeout * attempt to fix agentskills yet again * clean up in testcases, since CI maybe run as non-root * add _cause attribute for event.id * remove parent * add a bunch of debugging statement again for CI :( * fix temp_dir fixture * change all temp dir to follow pytest's tmp_path_factory * remove extra bracket * clean up error printing a bit * jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization * jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization * add typing for tmp dir fixture * clear the directory before running the test to avoid weird CI temp dir * remove agnostic test case for server runtime * Revert "remove agnostic test case for server runtime" This reverts commit 30e2181c3fc1410e69596c2dcd06be01f1d016b3. 
* disable agnostic tests in CI * fix test * make sure plugin arg is not passed when no plugin is specified; remove redundant on_event function; * move mock prompt * rename runtime * remove extra logging * refactor run_controller's interface; support multiple runtime for integration test; filter out hostname for prompt * uncomment other tests * pass the right runtime to controller * log runtime when start * uncomment tests * improve symbol filters * add intergration test prompts that seemd ok * add integration test workflow * add python3 to default ubuntu image * symlink python and fix permission to jupyter pip * add retry for jupyter execute server * fix jupyter pip install; add post-process for jupyter pip install; simplify init by add agent_skills path to PYTHONPATH; add testcase to tests jupyter pip install; * fix bug * use ubuntu:22.04 for eventstream integration tests * add todo * update testcase * remove redundant code * fix unit test * reduce dependency for runtime * try making llama-index an optional dependency that's not installed by default * remove pip install since it seemd not needed * log ipython execution; await write message since it returns a future * update ipy testcase * do not install llama-index in CI * do not install llama-index in the app docker as well * set sandbox container image in the integration test script * log plugins & env var for runtime * update conftest for sha256 * add git * remove all non-alphanumeric chalracters * add working ipy module tests! 
* default to use host network * remove is_async from browser to make thing a little more reliable; retry loading browser when error; * add sleep to wait a bit for http server * kill http server before regenerate browsing tests * fix browsing * only set sandbox container image if undefined * skip empty config value * update evaluation to use the latest run_controller * revert logger in execute_server to be compatible with server runtime * revert logging level to fix jupyter * set logger level * revert the logging * chmod for workspace to fix permission * support getting timeout from action * update test for server runtime * try to fix file permission * fix test_cmd_run_action_serialization_deserialization test (added timeout) * poetry: pip 24.2, torch 2.2.2 * revert adding pip to pyproject.toml * add build to dependencies in pyproject.toml * forgot poetry lock --no-update * fix a DelegatorAgent prompt_002.log (timeout) * fix a DelegatorAgent prompt_003.log (timeout) * couple more timeout attribs in prompt files * some more prompt files * prompts galore * add clarification comment for timeout * default timeout to config * add assert * update integraton tests for eventstream * update integration tests * fix timeout for action<->dict * remove redundant on_event * default to use instance image * update run_controller interface * add logging for copy * refactor swe_bench for the new design * fix action execution timeout * updatelock * remove build sandbox locally * fix runtime * use plain for-loop for single process * remove extra print * get swebench inference working * print whole `test_result` dict * got swebench patch post-process working * update swe-bench evaluation readme * refactor using shared reset_logger function * move messy swebench prompt to a different file * support the ability to specify whether to keep prompt * support the ability to specify whether to keep prompt * fix dockerfile * fix import and remove unnecessary strip logic * fix action 
serialization * get agentbench running * remove extra ls for agent bench * fix agentbench metric * factor out common documentation for eval * update biocoder doc * remove swe_env_box since it is no longer needed * get biocoder working * add func timeout for bird * fix jupyter pwd with ~ as user name * fix jupyter pwd with ~ as user name * get bird working * get browsing evaluation working * make eda runnable * fix id column * fix eda run_infer * unify eval output using a structured format; make swebench coompatible with that format; update client source code for every swebench run; do not inject testcmd for swebench * standardize existing benchs for the new eval output * set update source code = true * get gaia standardized * fix gaia * gorilla refactored but stuck at language.so to test * refactor and make gpqa work * refactor humanevalfix and get it working * refactor logic reasoning and get it working * refactor browser env so it works with eventstream runtime for eval * add initial version of miniwob refactor * fix browsergym environment * get miniwob working!! 
* allowing injecting additional dependency to OD runtime docker image * allowing injecting additional dependency to OD runtime docker image * support logic reasoning with pre-injected dependency * get mint working * update runtime build * fix mint docker * add test for keep_prompt; add missing await close for some tests * update integration tests for eventstream runtime * fix integration tests for server runtime * refactor ml bench and toolqa * refactor webarena * fix default factory * Update run_infer.py * add APIError to retry * increase timeout for swebench * make sure to hide api key when dump eval output * update the behavior of put source code to put files instead of tarball * add dishash to dependency * sendintr when timeout * fix dockerfile copy * reduce timeout * use dirhash to avoid repeat building for update source * fix runtime_build testcase * add dir_hash to docker build pipeline * revert api error * update poetry lock * add retries for swebench run infer * fix git patch * update poetry lock * adjust config order * fix mount volumns * enforce all eval to use "instance_id" * remove file store from runtime * make file_store public inside eventstream * move the runtime logic inside `main` out * support using async function for process_instance_fn * refactor run_infer with the create_time * fix file store * Update evaluation/toolqa/utils.py Co-authored-by: Graham Neubig <neubig@gmail.com> * fix typo --------- Co-authored-by: tobitege <tobitege@gmx.de> Co-authored-by: super-dainiu <78588128+super-dainiu@users.noreply.github.com> Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-08-06 17:21:45 +00:00
Graham Neubig	789f3504a9	Add init_runtime_tools for event stream runtime (#3256 )	2024-08-06 01:14:31 +08:00
Xingyao Wang	a69120d399	[Arch] Use hash to avoid repeat building `EventStreamRuntime` image (#3243 ) * update the behavior of put source code to put files instead of tarball * add dirhash to dependency * fix dockerfile copy * use dirhash to avoid repeat building for update source * fix runtime_build testcase * add dir_hash to docker build pipeline * add additional tests for source directory * add comment * clear the assertion by explicitly checking existing files * also assert od is a dir	2024-08-05 03:13:32 +00:00
tobitege	abec52abfe	(fix) Revert #3233 ; more logging in runtimes (#3236 ) * ServerRuntime: config copy in init * revert #3233 but more logging * get_box_classes: reset order back to previous version * 3 logging commands switched to debug (were info) * runtimes debug output of config on initialization * removed unneeded logger message from _init_container	2024-08-04 19:13:37 +00:00
Xingyao Wang	6a12a9f83c	[Arch, Eval] Allowing injecting additional dependency to OD runtime docker image (#3237 ) * allowing injecting additional dependency to OD runtime docker image * update runtime build * make `extra_deps` optional str | None	2024-08-04 17:38:56 +00:00
Xingyao Wang	62ce183c2d	[Agent Action] Support the ability to specify whether to keep prompt for CmdRun (#3218 ) * support the ability to specify whether to keep prompt * fix action serialization * fix jupyter pwd with ~ as user name * add test for keep_prompt; add missing await close for some tests * update integration tests for eventstream runtime * fix integration tests for server runtime	2024-08-04 20:30:25 +08:00
Kaushik Deka	415843476c	Feat: Add Vision Input Support for LLM with Vision Capabilities (#2848 ) * add image feature * fix-linting * check model support for images * add comment * Add image support to other models * Add images to chat * fix linting * fix test issues * refactor variable names and import * fix tests * fix chat message tests * fix linting * add pydantic class message * use message * remove redundant comments * remove redundant comments * change Message class * remove unintended change * fix integration tests using regenerate.sh * rename image_bas64 to images_url, fix tests * rename Message.py to message, change reminder append logic, add unit tests * remove comment, fix error to merge * codeact_swe_agent * fix f string * update eventstream integration tests * add missing if check in codeact_swe_agent * update integration tests * Update frontend/src/components/chat/ChatInput.tsx * Update frontend/src/components/chat/ChatInput.tsx * Update frontend/src/components/chat/ChatInput.tsx * Update frontend/src/components/chat/ChatInput.tsx * Update frontend/src/components/chat/ChatMessage.tsx --------- Co-authored-by: tobitege <tobitege@gmx.de> Co-authored-by: Xingyao Wang <xingyao6@illinois.edu> Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>	2024-08-04 02:26:22 +08:00
Xingyao Wang	b7061f4497	[Eval, Browser] Refactor Browser Env so it works with `EventStreamRuntime` for Browsing Evaluation (#3235 ) * refactor browser env so it works with eventstream runtime for eval * fix browsergym environment	2024-08-03 15:06:37 +00:00
tobitege	1166b0e610	client runtime: fix config passing on init; added logging (#3233 )	2024-08-03 10:37:38 +08:00
Xingyao Wang	001195a3ea	reduce the duplication in run_controller (#3217 )	2024-08-02 10:12:34 +08:00
Xingyao Wang	4f0a454ed6	[Arch] Support integration tests using EventStream Runtime (#3184 ) * Remove global config from memory * Remove runtime global config * Remove from storage * Remove global config * Fix event stream tests * Fix sandbox issue * Change config * Removed transferred tests * Add swe env box * Fixes on testing * Fixed some tests * Merge with stashed changes * Fix typing * Fix ipython test * Revive function * Make temp_dir fixture * Remove test to avoid circular import * fix eventstream filestore for test_runtime * fix parse arg issue that cause integration test to fail * support swebench pull from custom namespace * add back simple tests for runtime * move multi-line bash tests to test_runtime; support multi-line bash for esruntime; * add testcase to handle PS2 prompt * use bashlex for bash parsing to handle multi-line commands; add testcases for multi-line commands * revert ghcr runtime change * Apply stash * fix run as other user; make test async; * fix test runtime for run as od * add run-as-devin to all the runtime tests * handle the case when username is root * move all run-as-devin tests from sandbox; only tests a few cases on different user to save time; * move over multi-line echo related tests to test_runtime * fix user-specific jupyter by fixing the pypoetry virtualenv folder * make plugin's init async; chdir at initialization of jupyter plugin; move ipy simple testcase to test runtime; * support agentskills import in move tests for jupyter pwd tests; overload `add_env_vars` for EventStreamRuntime to update env var also in Jupyter; make agentskills read env var lazily, in case env var is updated; * fix ServerRuntime agentskills issue * move agnostic image test to test_runtime * merge runtime tests in CI * fix enable auto lint as env var * update warning message * update warning message * test for different container images * change parsing output as debug * add exception handling for update_pwd_decorator * fix unit test indentation * add 
plugins as default input to Runtime class; remove init_sandbox_plugins; implement add_env_var (include jupyter) in the base class; * fix server runtime auto lint * Revert "add exception handling for update_pwd_decorator" This reverts commit 2b668b1506e02145cb8f87e321aad62febca3d50. * tries to print debugging info for agentskills * explictly setting uid (try fix permission issue) * Revert "tries to print debugging info for agentskills" This reverts commit 8be4c86756f0e3fc62957b327ba2ac4999c419de. * set sandbox user id during testing to hopefully fix the permission issue * add browser tools for server runtime * try to debug for old pwd * update debug cmd * only test agnostic runtime when TEST_RUNTIME is Server * fix temp dir mkdir * load TEST_RUNTIME at the beginning * remove ipython tests * only log to file when DEBUG * default logging to project root * temporarily remove log to file * fix LLM logger dir * fix logger * make set pwd an optional aux action * fix prev pwd * fix infinity recursion * simplify * do not import the whole od library to avoid logger folder by jupyter * fix browsing * increase timeout * attempt to fix agentskills yet again * clean up in testcases, since CI maybe run as non-root * add _cause attribute for event.id * remove parent * add a bunch of debugging statement again for CI :( * fix temp_dir fixture * change all temp dir to follow pytest's tmp_path_factory * remove extra bracket * clean up error printing a bit * jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization * jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization * add typing for tmp dir fixture * clear the directory before running the test to avoid weird CI temp dir * remove agnostic test case for server runtime * Revert "remove agnostic test case for server runtime" This reverts commit 30e2181c3fc1410e69596c2dcd06be01f1d016b3. 
* disable agnostic tests in CI * fix test * make sure plugin arg is not passed when no plugin is specified; remove redundant on_event function; * move mock prompt * rename runtime * remove extra logging * refactor run_controller's interface; support multiple runtime for integration test; filter out hostname for prompt * uncomment other tests * pass the right runtime to controller * log runtime when start * uncomment tests * improve symbol filters * add intergration test prompts that seemd ok * add integration test workflow * add python3 to default ubuntu image * symlink python and fix permission to jupyter pip * add retry for jupyter execute server * fix jupyter pip install; add post-process for jupyter pip install; simplify init by add agent_skills path to PYTHONPATH; add testcase to tests jupyter pip install; * fix bug * use ubuntu:22.04 for eventstream integration tests * add todo * update testcase * remove redundant code * fix unit test * reduce dependency for runtime * try making llama-index an optional dependency that's not installed by default * remove pip install since it seemd not needed * log ipython execution; await write message since it returns a future * update ipy testcase * do not install llama-index in CI * do not install llama-index in the app docker as well * set sandbox container image in the integration test script * log plugins & env var for runtime * update conftest for sha256 * add git * remove all non-alphanumeric chalracters * add working ipy module tests! 
* default to use host network * remove is_async from browser to make thing a little more reliable; retry loading browser when error; * add sleep to wait a bit for http server * kill http server before regenerate browsing tests * fix browsing * only set sandbox container image if undefined * skip empty config value * update evaluation to use the latest run_controller * revert logger in execute_server to be compatible with server runtime * revert logging level to fix jupyter * set logger level * revert the logging * chmod for workspace to fix permission * support getting timeout from action * update test for server runtime * try to fix file permission * fix test_cmd_run_action_serialization_deserialization test (added timeout) * poetry: pip 24.2, torch 2.2.2 * revert adding pip to pyproject.toml * add build to dependencies in pyproject.toml * forgot poetry lock --no-update * fix a DelegatorAgent prompt_002.log (timeout) * fix a DelegatorAgent prompt_003.log (timeout) * couple more timeout attribs in prompt files * some more prompt files * prompts galore * add clarification comment for timeout * default timeout to config * add assert * update integraton tests for eventstream * update integration tests * fix timeout for action<->dict * remove redundant on_event * fix action execution timeout * updatelock --------- Co-authored-by: Graham Neubig <neubig@gmail.com> Co-authored-by: tobitege <tobitege@gmx.de>	2024-08-01 22:07:39 +00:00
tobitege	a4cb880699	(feat) LLM class: added acompletion and streaming + unit test (#3202 ) * LLM class: added acompletion and streaming, unit test test_acompletion.py * LLM: cleanup of self.config defaults and their use * added set_missing_attributes to LLMConfig * move default checker up	2024-08-01 22:41:40 +02:00
Xingyao Wang	286f10053e	[arch] Implement `copy_to` for Runtime (#3211 ) * add copy to * implement for ServerRuntime * implement copyto for runtime (required by eval); add tests for copy to * fix exist file check * unify copy_to_behavior and fix stuff	2024-08-02 02:46:11 +08:00
Xingyao Wang	1d49ef253b	[Runtime] Reduce dependency to speed up CI and reduce image size (#3195 ) * reduce dependency for runtime * try making llama-index an optional dependency that's not installed by default * do not install llama-index in CI * do not install llama-index in the app docker as well	2024-07-31 13:55:09 -04:00
Engel Nyst	d41699c133	rename to UserRejectObservation (#3175 )	2024-07-31 22:44:31 +08:00
Xingyao Wang	bd68249fba	[Arch] Test `EventStreamRuntime` to ensure its feature parity with `ServerRuntime` (#3157 ) * Remove global config from memory * Remove runtime global config * Remove from storage * Remove global config * Fix event stream tests * Fix sandbox issue * Change config * Removed transferred tests * Add swe env box * Fixes on testing * Fixed some tests * Merge with stashed changes * Fix typing * Fix ipython test * Revive function * Make temp_dir fixture * Remove test to avoid circular import * fix eventstream filestore for test_runtime * fix parse arg issue that cause integration test to fail * support swebench pull from custom namespace * add back simple tests for runtime * move multi-line bash tests to test_runtime; support multi-line bash for esruntime; * add testcase to handle PS2 prompt * use bashlex for bash parsing to handle multi-line commands; add testcases for multi-line commands * revert ghcr runtime change * Apply stash * fix run as other user; make test async; * fix test runtime for run as od * add run-as-devin to all the runtime tests * handle the case when username is root * move all run-as-devin tests from sandbox; only tests a few cases on different user to save time; * move over multi-line echo related tests to test_runtime * fix user-specific jupyter by fixing the pypoetry virtualenv folder * make plugin's init async; chdir at initialization of jupyter plugin; move ipy simple testcase to test runtime; * support agentskills import in move tests for jupyter pwd tests; overload `add_env_vars` for EventStreamRuntime to update env var also in Jupyter; make agentskills read env var lazily, in case env var is updated; * fix ServerRuntime agentskills issue * move agnostic image test to test_runtime * merge runtime tests in CI * fix enable auto lint as env var * update warning message * update warning message * test for different container images * change parsing output as debug * add exception handling for update_pwd_decorator * fix unit test indentation * add plugins as default input to Runtime class; remove init_sandbox_plugins; implement add_env_var (include jupyter) in the base class; * fix server runtime auto lint * Revert "add exception handling for update_pwd_decorator" This reverts commit 2b668b1506e02145cb8f87e321aad62febca3d50. * tries to print debugging info for agentskills * explictly setting uid (try fix permission issue) * Revert "tries to print debugging info for agentskills" This reverts commit 8be4c86756f0e3fc62957b327ba2ac4999c419de. * set sandbox user id during testing to hopefully fix the permission issue * add browser tools for server runtime * try to debug for old pwd * update debug cmd * only test agnostic runtime when TEST_RUNTIME is Server * fix temp dir mkdir * load TEST_RUNTIME at the beginning * remove ipython tests * only log to file when DEBUG * default logging to project root * temporarily remove log to file * fix LLM logger dir * fix logger * make set pwd an optional aux action * fix prev pwd * fix infinity recursion * simplify * do not import the whole od library to avoid logger folder by jupyter * fix browsing * increase timeout * attempt to fix agentskills yet again * clean up in testcases, since CI maybe run as non-root * add _cause attribute for event.id * remove parent * add a bunch of debugging statement again for CI :( * fix temp_dir fixture * change all temp dir to follow pytest's tmp_path_factory * remove extra bracket * clean up error printing a bit * jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization * jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization * add typing for tmp dir fixture * clear the directory before running the test to avoid weird CI temp dir * remove agnostic test case for server runtime * Revert "remove agnostic test case for server runtime" This reverts commit 30e2181c3fc1410e69596c2dcd06be01f1d016b3. * disable agnostic tests in CI * fix test --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-07-31 04:30:59 +08:00
மனோஜ்குமார் பழனிச்சாமி	570fd6e483	Fix: Parse exit code correctly for background commands (#3161 ) * Parse exit code correctly for background commands * Update ssh_box.py	2024-07-29 23:57:58 +08:00
tobitege	1eb3bdea95	remove unneeded message about config file not found (#3158 )	2024-07-29 16:27:14 +02:00
Xingyao Wang	b1ea204c5b	Migrate multi-line-bash-related sandbox tests into runtime tests and fix multi-line issue (#3128 ) * Remove global config from memory * Remove runtime global config * Remove from storage * Remove global config * Fix event stream tests * Fix sandbox issue * Change config * Removed transferred tests * Add swe env box * Fixes on testing * Fixed some tests * Merge with stashed changes * Fix typing * Fix ipython test * Revive function * Make temp_dir fixture * Remove test to avoid circular import * fix eventstream filestore for test_runtime * fix parse arg issue that cause integration test to fail * support swebench pull from custom namespace * add back simple tests for runtime * move multi-line bash tests to test_runtime; support multi-line bash for esruntime; * add testcase to handle PS2 prompt * use bashlex for bash parsing to handle multi-line commands; add testcases for multi-line commands * revert ghcr runtime change --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-07-27 20:12:57 +00:00
Engel Nyst	9ed95abf83	Fix max budget per task error in headless mode (#3147 ) * set agent in ERROR instead of PAUSED when in headless mode * fallback to config value for budget	2024-07-27 17:35:40 +00:00
Xingyao Wang	b5d3fcaba8	[docs] Update README.md for new OpenDevin Runtime (#3142 ) * Update README.md * address comment * remove DEBUG flag	2024-07-27 15:45:09 +00:00
Engel Nyst	a29c795418	clear last_error when restoring a session (#3146 )	2024-07-27 15:34:36 +00:00
Engel Nyst	f07280153a	restore logging of user messages when using cli (#3145 )	2024-07-27 12:58:23 +00:00
Graham Neubig	275ea706cf	Remove remaining global config (#3099 ) * Remove global config from memory * Remove runtime global config * Remove from storage * Remove global config * Fix event stream tests * Fix sandbox issue * Change config * Removed transferred tests * Add swe env box * Fixes on testing * Fixed some tests * Fix typing * Fix ipython test * Revive function * Make temp_dir fixture * Remove test to avoid circular import	2024-07-26 18:43:32 +00:00
tobitege	d0217b84ef	test_runtime: run tests per runtime, not alternating (#3103 )	2024-07-26 03:01:50 +08:00
tobitege	d50a8447ad	fix: add llm `drop_params` parameter to LLMConfig (#2471 ) * feat: add drop_params to LLMConfig * Update opendevin/llm/llm.py Fix use of unknown method. Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com> --------- Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>	2024-07-24 16:25:36 +02:00
Xingyao Wang	405c8a0456	[Arch] Add runtime image build CI & clean up runtime build using `jinja2` template (#3055 ) * test_runtime_client.py to test _execute_bash() * runtime_build and runtime tweaks * fix in docker script * revert bash changes * use sandbox_config.update_source_code to control source code update * add od_version to the sandbox tag * add doc instruction for update source code * do not remove whole poetry folder; add mamba clean * add missing newlines * cleanup runtime dockerfile into jinja template * make prep temp file a separate function; make that function accessible through cli * modify `runtime_build.py` so it can generate directory for building docker img * add dockerfile and sdist of runtime to gitignore since it will be dynamically generated * add runtime to build * do not rebuild new image when an `od_runtime` is provided * use default container_image for testing if possible * move runtime tests to ghcr runtime workflow * update docker base dir for runtime * fix unittest * fix image name * fix image name for test case * rename to make it consistent --------- Co-authored-by: tobitege <tobitege@gmx.de>	2024-07-24 21:56:12 +08:00
Boxuan Li	445f290beb	Validate to_replace in edit_file_by_replace AgentSkill (#3073 ) * Validate to_replace in edit_file_by_replace AgentSkill * Remove redundant replace reminder prompt * Add unit tests * Fix prompt	2024-07-22 21:01:35 -07:00
Xingyao Wang	41a8bb3cf1	[eval,fix]: metrics get carried across eval instances (#3072 ) * fix: make max_budget_per_task optional in `run_agent_controller` * update arg for each run infer * fix: metrics logging carried along; reset llm metric with the agent; --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-07-23 03:30:28 +00:00
Xingyao Wang	da17665cab	fix: make max_budget_per_task optional in `run_agent_controller` (#3071 ) * fix: make max_budget_per_task optional in `run_agent_controller` * update arg for each run infer	2024-07-22 21:47:00 -04:00
Graham Neubig	4099e48122	Removed config from agent controller (#3038 ) * Removed config from agent controller * Fix tests * Increase budget * Update tests * Update prompts * Add missing prompt * Fix mistaken deletions * Fix browsing test * Fixed browse tests	2024-07-22 17:42:57 +00:00
Xingyao Wang	ce8a11a62f	[Arch] Shrink runtime image size (#3051 ) * test_runtime_client.py to test _execute_bash() * runtime_build and runtime tweaks * fix in docker script * revert bash changes * use sandbox_config.update_source_code to control source code update * add od_version to the sandbox tag * add doc instruction for update source code * do not remove whole poetry folder; add mamba clean * add missing newlines --------- Co-authored-by: tobitege <tobitege@gmx.de>	2024-07-22 02:34:45 +08:00

546 Commits