OpenHands

mirror of https://github.com/OpenHands/OpenHands.git synced 2026-03-22 13:47:19 +08:00

Author	SHA1	Message	Date
Engel Nyst	6b1f23a20a	Fix browsing actions to be more robust (#4226 )	2024-10-06 22:03:13 -04:00
Engel Nyst	9d0e6a24bc	Refactor embeddings (#4219 )	2024-10-05 18:59:08 +00:00
Xingyao Wang	42649745bd	fix(runtime): fix bash interrupt on program that cannot be stopped via ctrl+c (#4161 )	2024-10-04 06:48:44 +08:00
tofarr	152f99c64f	Chore Bump python version (#3545 )	2024-10-03 13:40:55 -04:00
Xingyao Wang	e81c5597d6	feat(runtime): use micromamba instead of mamba and fix build issue (#4154 )	2024-10-02 21:23:18 +00:00
Rehan Ganapathy	c8a933590a	(feat) allow specification of config.toml location via args (solves #3947 ) (#4168 ) Co-authored-by: Rehan Ganapathy <rehanganapathy@MACASF.local>	2024-10-02 20:30:12 +00:00
Graham Neubig	178dbfaf4a	Run pre-commit (#4163 )	2024-10-02 04:52:02 +00:00
Xingyao Wang	240a470a1d	Revert "add few seconds to properly receive timeout error from client" This reverts commit `dd2cb4399a`.	2024-10-01 23:44:05 -04:00
Xingyao Wang	dd2cb4399a	add few seconds to properly receive timeout error from client	2024-10-01 23:43:50 -04:00
Engel Nyst	5a45c648a8	attributes for BE/FE should not be sent (#4150 )	2024-10-01 23:00:03 +00:00
Xingyao Wang	54ac340e0b	refactor: standardize linter output data structure and interface (#4077 ) Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-10-01 02:40:23 +08:00
Engel Nyst	e582806004	Vision and prompt caching fixes (#4014 )	2024-09-28 14:37:29 +02:00
tobitege	575a829d94	(enh) add test_python_version to test_bash.py runtime tests (#4098 )	2024-09-28 08:21:14 +08:00
Amir	3e5c01dfc8	Remove param from docstring that does not exist in the append_file (#4060 )	2024-09-26 22:25:11 +02:00
Engel Nyst	0a03c802f5	Refactor llm.py (#4057 )	2024-09-26 17:44:18 +00:00
tobitege	2cc1c3ef42	(enh) Docker runtime builder with BuildKit support, enh. caching (#4009 )	2024-09-26 08:50:53 +02:00
mamoodi	1d052818ae	Set runtime container image so it doesn't need to be rebuilt (#4035 )	2024-09-25 05:20:45 +02:00
tobitege	fbef93b762	Refactor `config.py` file into package (own folder with separate files) (#3987 )	2024-09-23 12:42:54 -04:00
tobitege	01462e11d7	(fix) CodeActAgent/LLM: react on should_exit flag (user cancellation) (#3968 )	2024-09-20 23:49:45 +02:00
tobitege	6682e0f1dd	(fix) CodeActAgent: use `content` of AgentDelegateObservation (#3970 ) Co-authored-by: Ryan H. Tran <descience.thh10@gmail.com>	2024-09-20 18:31:11 +02:00
Engel Nyst	8fdfece059	Refactor messages serialization (#3832 ) Co-authored-by: Robert Brennan <accounts@rbren.io>	2024-09-18 23:48:58 +02:00
tobitege	b4408b41c9	(feat) LLM class: add safety_settings for Gemini; improve max_output_tokens defaulting (#3925 )	2024-09-18 11:51:23 -04:00
niliy01	804674bb9f	refactor the logic in agent_controller to imporve readability (#3873 ) Signed-off-by: Yi Lin <teroincn@gmail.com>	2024-09-16 14:13:52 -04:00
tobitege	a33f61c025	(feat) Show messages' timestamp in UI (#3869 )	2024-09-16 05:41:29 +02:00
tobitege	554636cf2a	(fix) Fix runtime (RT) tests and split tests in 2 actions (openhands/root) (#3791 ) Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-09-14 21:51:30 +02:00
tobitege	6111f530c2	(fix) StuckDetector: syntax error loops were not detected (#3663 ) Co-authored-by: mamoodi <mamoodiha@gmail.com>	2024-09-13 16:53:52 +02:00
Xingyao Wang	2fe2f4c530	[eval] increase timeout for SWEBench eval init/complete (#3829 ) * [eval] increase timeout for swebench eval init/complete * allow CmdRunAction to optionally block when .timeout is setted * fix unit test for serialization * fix unit tests for security analyzer * fix integration tests * add more timeout	2024-09-12 15:20:58 +00:00
tofarr	e5cb80d59d	docs: Update steps for running integration tests in a local environment (#3830 ) * docs: Update steps for running integration tests in a local environment	2024-09-12 03:22:53 -06:00
Frank Xu	fe5ecb6da8	add url info in browsing observation (#3815 ) * add url info in browsing observation * fix integration tests for url --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-09-12 02:46:39 +02:00
mamoodi	f3b2085f9b	Reduce runtime tests duration by running them across CPUs (#3779 ) * Reduce runtime tests duration by running them across CPUs * fix hardcoded image name * test two cpus * Test folder change * Up the CPU to 4 again to test * Change to 3 CPUs * Down to 2 * Add param to remove all openhands containers * Add comment * Add reruns just in case * Fix ordering of if	2024-09-10 14:31:17 -04:00
Cole Murray	97a03faf33	Add Handling of Cache Prompt When Formatting Messages (#3773 ) * Add Handling of Cache Prompt When Formatting Messages * Fix Value for Cache Control * Fix Value for Cache Control * Update openhands/core/message.py Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * Fix lint error * Serialize Messages if Propt Caching Is Enabled * Remove formatting message change --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>	2024-09-10 16:34:41 +00:00
dependabot[bot]	822de89394	chore(deps): bump browsergym from 0.3.4 to 0.4.3 (#3762 ) * chore(deps): bump browsergym from 0.3.4 to 0.4.3 Bumps browsergym from 0.3.4 to 0.4.3. --- updated-dependencies: - dependency-name: browsergym dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * integration tests updated to browsergym --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-09-10 17:24:40 +02:00
tobitege	5ffff742de	Regression fixes: LLM logging; client readiness (EventStreamRuntime) (#3776 ) * Regression fixes: LLM logging; client readiness (EventStreamRuntime) * fix llm.async_completion_wrapper bad edit in previous commit * regen couple of mock files * client: always log initialized status	2024-09-09 21:02:43 +02:00
tobitege	2b7517e542	(enh) add caching@v4 action in workflows (#3780 ) * dummy test change * regen yml: 1st install python 3.11, then poetry * fix caching for poetry; old entry for python was rather useless * fix steps order (cache before poetry) * add poetry caching to ghcr_runtime; fix fork conditions * ghcr_runtime: more caching actions; condition fixes * fix interim action error (order of steps) * cache@v4 instead of v3 * fixed interim typo for 2 fork conditions * runtime/test_env_vars: compacted multiple tests into one to reduce time * ugh if fork condition changes again	2024-09-09 10:49:49 +02:00
Robert Brennan	ab3851593d	Support interactive commands (#3653 ) * hacky solution for interactive commands * add more behavior * debug * fix continue functionality * remove prints * refactor a bit * reduce test sleep * fix python version * fix pre-commit issue * Regenerate integration tests * Update openhands/runtime/client/client.py * revert some prompt stuff * several integration mock files regenerated * execute_action: remove duplicate exception logging --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>	2024-09-08 21:45:51 +02:00
niliy01	82a154f7e7	(feat) making prompt caching optional instead of enabled default (#3689 ) * (feat) making prompt caching optional instead of enabled default At present, only the Claude models support prompt caching as a experimental feature, therefore, this feature should be implemented as an optional setting rather than being enabled by default. Signed-off-by: Yi Lin <teroincn@gmail.com> * handle the conflict * fix unittest mock return value * fix lint error in whitespace --------- Signed-off-by: Yi Lin <teroincn@gmail.com>	2024-09-05 18:52:26 +02:00
Xingyao Wang	688068a44e	Fix issues for running `RemoteRuntime` in parallel on SWE-Bench (#3716 ) * feat: add SWE-bench fullset support * fix instance image list * update eval script and documentation * increase timeout for remote runtime * add push script * handle the case when ret push is an generator * update pbar * set SWE-Bench default to run SWE-Bench lite * add script to cleanup remote runtime * fix the cases when tag is too long * update README * update readme for cleanup * rename od to oh * Update evaluation/swe_bench/README.md Co-authored-by: Graham Neubig <neubig@gmail.com> * Update evaluation/swe_bench/README.md Co-authored-by: Graham Neubig <neubig@gmail.com> * Update evaluation/swe_bench/scripts/cleanup_remote_runtime.sh Co-authored-by: Graham Neubig <neubig@gmail.com> * Update evaluation/swe_bench/scripts/cleanup_remote_runtime.sh Co-authored-by: Graham Neubig <neubig@gmail.com> * Update evaluation/swe_bench/scripts/cleanup_remote_runtime.sh Co-authored-by: Graham Neubig <neubig@gmail.com> * gets API key and Runtime from env var --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-09-05 10:34:31 +08:00
tobitege	bc31fb15fe	(fix) CodeActAgent: fix issues with vision support in prompts (#3665 ) * CodeActAgent: fix message prep if prompt caching is not supported * fix python version in regen tests workflow * fix in conftest "mock_completion" method * add disable_vision to LLMConfig; revert change in message parsing in llm.py * format messages in several files for completion * refactored message(s) formatting (llm.py); added vision_is_active() * fix a unit test * regenerate: added LOG_TO_FILE and FORCE_REGENERATE env flags * try to fix path to logs folder in workflow * llm: prevent index error * try FORCE_USE_LLM in regenerate * tweaks everywhere... * fix 2 random unit test errors :( * added FORCE_REGENERATE_TESTS=true to regenerate CLI * fix test_lint_file_fail_typescript again * double-quotes for env vars in workflow; llm logger set to debug * fix typo in regenerate * regenerate iterations now 20; applied iteration counter fix by Li * regenerate: pass FORCE_REGENERATE flag into env * fixes for int tests. several mock files updated. * browsing_agent: fix response_parser.py adding ) to empty response * test_browse_internet: fix skipif and revert obsolete mock files * regenerate: fi bracketing for http server start/kill conditions * disable test_browse_internet for CodeActAgents; mock files updated after merge missed to include more mock files earlier * reverts after review feedback from Li * forgot one * browsing agent test, partial fixes and updated mock files * test_browse_internet works in my WSL now! * adapt unit test test_prompt_caching.py * add DEBUG to regenerate workflow command * convert regenerate workflow params to inputs * more integration test mock files updated * more files * test_prompt_caching: restored test_prompt_caching_headers purpose * file_ops: fix potential exception, like "cross device copy"; fixed mock files accordingly * reverts/changes wrt feedback from xingyao * updated docs and config template * code cleanup wrt review feedback	2024-09-04 17:58:30 +02:00
tobitege	7068a73ae7	(enh) Improve CodeActAgent's file editing reliability (#3610 ) * improve file editing prompts and unit test converted most raise calls to a _output_error call in file_ops.py * tweaks in test_agent_skill.py wrt to SEP separator * tweaked the separator * remove server runtime remnants and TEST_RUNTIME references * restore use of TEST_RUNTIME args and variables * fix integration tests * added hint to properly escape docstrings * revert latest prompt change --------- Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>	2024-09-02 06:03:22 +02:00
tobitege	1ef83a8554	fix test_is_stuck.py to not fail on macos (#3662 )	2024-08-30 05:06:30 +00:00
Xingyao Wang	090c911a50	(refactor) Make `Runtime` class synchronous (#3661 ) * change runtime to be synchronous * fix test runtime with the new interface * fix arg * fix eval * fix missing config attribute * fix plugins * fix on_event by revert it back to async * update upload_file endpoint * fix argument to upload file * remove unncessary async for eval; fix evaluation run in parallel * use asyncio to run controller for eval * revert file upload * truncate eval test result output	2024-08-30 01:37:03 +00:00
Xingyao Wang	8b1f207d39	feat: support remote runtime (#3406 ) * feat: refactor building logic into runtime builder * return image name * fix testcases * use runtime builder for eventstream runtime * have runtime builder return str * add api_key to sandbox config * draft remote runtime * remove extra if clause * initialize runtime based on box class * add build logic * use base64 for file upload * get runtime image prefix from API * replace ___ with _s_ to make it a valid image name * use /build to start build and /build_status to check the build progress * update logging * fix exit code * always use port * add remote runtime * rename runtime * fix tests import * make dir first if work_dir does not exists; * update debug print to remote runtime * fix exit close_sync * update logging * add retry for stop * use all box class for test keep prompt * fix test browsing * add retry stop * merge init commands to save startup time * fix await * remove sandbox url * support execute through specific runtime url * fix file ops * simplify close * factor out runtime retry code * fix exception handling * fix content type error (e.g., bad gateway when runtime is not ready) * add retry for wait until alive; add retry for check image exists * Revert "add retry for wait until alive;" This reverts commit `dd013cd268`. * retry when wait until alive * clean up msg * directly save sdist to temp dir for _put_source_code_to_dir * support running testcases in parallel * tweak logging; try to close session * try to close session even on exception * update poetry lock * support remote to run integration tests * add warning for workspace base on remote runtime * set default runtime api * remove server runtime * update poetry lock * support running swe-bench (n=1) eval on remoteruntime * add a timeout of 30 min * add todo for docker namespace * update poetry loc	2024-08-29 15:53:37 +00:00
Shubham raj	296fa8182a	Mock config env variables (#3621 )	2024-08-29 15:48:23 +00:00
tobitege	a2d94c9cb1	(enh) StuckDetector: fix+enhance syntax error loop detection (#3628 ) * fix StuckDetector and add more errors for detection * more stringent error detection and more unit tests	2024-08-29 17:33:54 +02:00
tobitege	8fca5a5354	linter and test_aider_linter extensions for eslint (#3543 ) * linter and test_aider_linter extensions for eslint * linter tweaks * try enabling verbose output in linter test * one more option for linter test * try conftest.py for tests/unit folder * enable verbose mode in workflow; remove conftest.py again * debug print statements of linter results * skip some tests if eslint is not installed at all * more tweaks * final test skip setups * code quality revisions * fix test again --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-08-29 10:40:43 +02:00
Graham Neubig	c6ba0e8339	Remove singleton config (#3614 ) * Remove singleton config * Fix tests * Fix logging reset * Fix pre-commit	2024-08-28 20:05:49 +01:00
tobitege	f8c4d1df45	(test) Fix test_agent_controller.py mock exceptions (#3577 ) * fix test_agent_controller.py mock exceptions * revert change to agent_controller.py	2024-08-28 19:05:22 +02:00
Xingyao Wang	d9a8b53bc2	feat: specialize CodeAct into micro agents by providing markdown files (#3511 ) * update microagent name and update template.toml * substitute actual micro_agent_name for prompt manager * add python-frontmatter * support micro agent in codeact * add test cases * add instruction from require env var * add draft gh micro agent * update poetry lock * update poetry lock	2024-08-28 14:58:16 +00:00
Kaushik Deka	5bb931e4d6	Add prompt caching (Sonnet, Haiku only) (#3411 ) * Add prompt caching * remove anthropic-version from extra_headers * change supports_prompt_caching method to attribute * change caching strat and log cache statistics * add reminder as a new message to fix caching * fix unit test * append reminder to the end of the last message content * move token logs to post completion function * fix unit test failure * fix reminder and prompt caching * unit tests for prompt caching * add test * clean up tests * separate reminder, use latest two messages * fix tests --------- Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com> Co-authored-by: Xingyao Wang <xingyao6@illinois.edu> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-08-26 20:46:44 -04:00
tobitege	8fcf0817d4	(eval) Aider_bench: add eval_ids arg to run specific instance id's (#3592 ) * add eval_ids arg to run specific instance id's; fix/extend README * fix description in parser for --eval-ids * fix test_arg_parser.py to account for added arg * fix typo in README to say "summarize" instead of "summarise" for script	2024-08-27 00:49:26 +08:00


269 Commits