Tim O'Farrell
4b303ec9b4
Fixes to unblock frontend (#11488)
Co-authored-by: Ray Myers <ray.myers@gmail.com>
2025-10-23 14:43:45 -06:00
Xingyao Wang
4507a25b85
Evaluation: redirect sessions to repo-local .eval_sessions via helper; apply across entrypoints; add tests (#10540)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-22 13:34:02 +00:00
Xingyao Wang
c2f46200c0
chore(lint): Apply comprehensive linting and formatting fixes (#10287)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-13 21:13:19 +02:00
Xingyao Wang
04ff4a025b
feat(cli): Use CLI to launch OpenHands UI server via Docker (#9783)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-09 02:04:07 +08:00
Robert Brennan
205f0234e8
Rename Conversation to ServerConversation and AppConfig to OpenHandsConfig (#8754)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-28 21:48:34 +02:00
Graham Neubig
689d3c9046
Update pre-commit hook versions to most recent versions (#8343)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-08 03:59:13 +00:00
Boxuan Li
d7c49a0656
[Evaluation] Fix sandbox config in TAC (#7684)
2025-04-03 08:19:10 +00:00
Boxuan Li
34bf6a6402
[Evaluation] Fix run_infer.py path in TAC (#7683)
2025-04-03 04:34:02 +00:00
Rohit Malhotra
5ffb1ef704
Fix typing (#7083)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-03-03 20:41:11 +00:00
Engel Nyst
395c1ea9e3
[Refactor] split runtime initialization (create, connect, init) in cli scripts (#7036)
2025-03-03 00:19:25 +01:00
Engel Nyst
4f98bce6df
Add selected_repo to command line (#6949)
2025-02-26 20:42:59 +01:00
Mateusz Kwiatkowski
6562297615
Replace shebang with /usr/bin/env bash for improved portability (#6876)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-02-24 18:07:28 +00:00
Xingyao Wang
1a7003a705
Add sysbox support to remote runtime for eval; Add memory monitor, stress tests to help debug memory issue (#6684)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2025-02-18 20:02:28 +00:00
Boxuan Li
4443417c75
A few fixes for TAC evaluation harness (#6586)
2025-02-14 21:01:57 -08:00
Boxuan Li
ef12bc5381
Evaluation harness: Add agent config option (#6662)
2025-02-13 15:05:03 -05:00
tofarr
bbfdc62139
Fix for issue where retries continue on a closed runtime (#6564)
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2025-02-03 08:44:09 -07:00
Boxuan Li
62402cd617
The-Agent-Company evaluation harness: Support splits (#6577)
2025-02-02 13:12:01 +08:00
Calvin Smith
a12087243a
Pydantic-based configuration and setting objects (#6321)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-01-17 12:33:22 -07:00
Xingyao Wang
0bed17758f
fix: incorrect soft-timeout implementation & fix hard-timeout follow-up command (#6280)
2025-01-17 01:27:00 +08:00
Boxuan Li
92b8d55c2d
Rename trajectories_path config to save_trajectory_path (#6216)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-01-14 04:32:45 +00:00
tofarr
23473070b9
Revert "Config objects as Pydantic BaseModels (#6176)" (#6214)
2025-01-13 07:36:25 -07:00
Calvin Smith
873dddb4e8
Config objects as Pydantic BaseModels (#6176)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2025-01-12 15:09:45 -05:00
Boxuan Li
6a4442e590
[Evaluation] Add summarise_results script for TheAgentCompany benchmark (#5811)
2024-12-27 20:33:41 -08:00
Boxuan Li
5ed80b5c32
[doc] Fix link in TheAgentCompany benchmark's README.md (#5848)
2024-12-27 22:21:02 +08:00
Boxuan Li
b1719bb3db
Add TheAgentCompany evaluation harness (#5731)
2024-12-22 14:12:30 -05:00