Ryan H. Tran | ddaa186971 | [GAIA] Add prompt improvement to alleviate solution parsing issue & support Tavily search tools (#9057) | 2025-06-17 13:16:50 +07:00
Zach | 0b3d15a4d7 | Fix missing 'fi' statement in GAIA benchmark scripts/run_infer.sh (#7465) | 2025-03-24 16:04:25 +00:00
Mateusz Kwiatkowski | 6562297615 | Replace shebang with /usr/bin/env bash for improved portability (#6876); Co-authored-by: Xingyao Wang <xingyao@all-hands.dev> | 2025-02-24 18:07:28 +00:00
Boxuan Li | ef12bc5381 | Evaluation harness: Add agent config option (#6662) | 2025-02-13 15:05:03 -05:00
Xingyao Wang | 9908e1b285 | [Evaluation]: Log openhands version in eval output folder, instead of agent version (#5394) | 2024-12-04 03:33:43 +00:00
OpenHands | 678436da30 | Fix issue #5222: [Refactor]: Refactor the evaluation directory (#5223); Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> | 2024-11-25 08:35:52 -05:00