Xingyao Wang
2c0a2dbc61
fix yet another swe_bench issue ( #2069 )
2024-05-26 10:01:43 -07:00
Gant
f0271f9f91
need to run as root to use SWEBench container ( #2068 )
2024-05-26 14:21:33 +00:00
Xingyao Wang
5114230e53
Some SWE-Bench infer fixes and improvements ( #2065 )
...
* reset workspace base properly
* support running without hint
* support running without hint
* bump swe-bench eval docker to v1.2 for latest agentskills
* only give hint when use hint text is true
* add swe-agent instructions for validation
* update dockerfile
* pin the python interpreter for execute_cli
* avoid initialize plugins twice
* default to use hint
* save results to swe_bench_lite
* unset gh token and increase max iter to 50
* remove printing of use hint status
* refactor ssh login into one function
* ok drop to 30 turns bc it is so expensive :(
* remove reproduce comments to avoid stuck
2024-05-26 10:02:11 +00:00
Xingyao Wang
a6b3ce866d
refactor ssh login into one function ( #2066 )
2024-05-26 08:56:13 +00:00
Boxuan Li
7d6cb69a51
main.py: Fix redundant ChangeAgentStateAction ( #2064 )
2024-05-26 15:20:56 +08:00
Chris Mamatas
1891fd88d5
feat(frontend): Added React Router ( #2061 )
...
* Added React Router
* Moved router import above ./App
---------
Co-authored-by: Chris Mamatas <chrismamatas1@gmail.com>
2024-05-25 21:57:15 +03:00
Shimada666
be1c2ad60d
feat: use retry decorator instead of retrying in a loop ( #2058 )
...
* feat: use retry decorator instead of retrying in a loop
* update code logic
* update poetry lock
2024-05-25 16:04:40 +00:00
Yizhe Zhang
0c829cd067
Support Entity-Deduction-Arena (EDA) Benchmark ( #1931 )
...
* adding draft evaluation code for EDA, using chatgpt as the temporal agent for now
* Update README.md
* Delete frontend/package.json
* reverse the irrelevant changes
* reverse package.json
* use chatgpt as the codeactagent
* integrate with opendevin
* Update evaluation/EDA/README.md
* Update evaluation/EDA/README.md
* Use poetry to manage packages
* integrate with opendevin
* minor update
* minor update
* update poetry
* update README
* clean-up infer scripts
* add run_infer script and improve readme
* log final success and final message & ground truth
---------
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-25 23:17:04 +08:00
Xingyao Wang
28ab00946b
update README for GAIA ( #2054 )
...
* update README for GAIA
* Update evaluation/gaia/README.md
* Update evaluation/gaia/README.md
* Update evaluation/gaia/README.md
---------
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-25 15:01:03 +00:00
Xingyao Wang
ec68af5b83
fix the openai_api_key detected by agentskills ( #2052 )
2024-05-25 22:09:07 +08:00
Xingyao Wang
221035d39a
Add retry logic to ssh login ( #2053 )
...
* add retry logic to ssh login
* Update opendevin/runtime/docker/ssh_box.py
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-25 12:16:24 +00:00
Shimada666
b31f7701eb
Integrate Multimodal tools to agentskills. ( #2016 )
...
* support reading multimodal files
* move file
* update dependency
* remove useless pip install
* add comments
* update the comment
* Apply suggestions from code review
* Add unit test for TXTReader
* pre-commit hook corrupted utf16 test txt
* Revert unnecessary dependency upgrades
* feat: import some readers for agentskill
* add dependencies
* Integrate some multimodal tools
* add shell pip dependency
* update dependencies
* update dependencies
* update print window
* remove __main__
* locally import cv2
* add c library for opencv
* update lock file
* update prompt
* remove unuseful file
* add some unittest
* add unittest & remove excel-related parser
* rollback poetry lock
* remove markdown
* remove requests
* optimize parse_video output
* Fix integration tests for CodeActAgent
* remove test_parse_image unittest
* Add a TODO to containers/sandbox/Dockerfile
* update dependencies
* remove pyproject.toml useless package
* change document via openai key
* Fix prompts after removing some actions
---------
Co-authored-by: Mingchen Zhuge <mczhuge@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Mingchen Zhuge <64179323+mczhuge@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-25 18:58:49 +08:00
Boxuan Li
91f313c914
BrowserEnv: init exception handling ( #2050 )
...
* BrowserEnv: init exception handling
* Revert irrelevant changes
* Remove type ignore
2024-05-25 00:17:25 -07:00
மனோஜ்குமார் பழனிச்சாமி
36ff060c1a
Added links in docs ( #2051 )
2024-05-25 11:23:20 +05:30
மனோஜ்குமார் பழனிச்சாமி
cfae6821fa
refactored timeout ( #2044 )
2024-05-24 18:19:14 +02:00
mamoodi
752ce8c4ea
Update bug template to include os version ( #1982 )
2024-05-24 15:58:05 +00:00
dependabot[bot]
cc6895a65c
Bump streamlit from 1.34.0 to 1.35.0 ( #2037 )
...
Bumps [streamlit](https://github.com/streamlit/streamlit) from 1.34.0 to 1.35.0.
- [Release notes](https://github.com/streamlit/streamlit/releases)
- [Commits](https://github.com/streamlit/streamlit/compare/1.34.0...1.35.0)
---
updated-dependencies:
- dependency-name: streamlit
  dependency-type: direct:development
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 23:00:37 +08:00
dependabot[bot]
5538ee9bde
Bump @types/react from 18.3.2 to 18.3.3 in /frontend ( #2039 )
...
Bumps [@types/react](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react) from 18.3.2 to 18.3.3.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react)
---
updated-dependencies:
- dependency-name: "@types/react"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 23:00:08 +08:00
dependabot[bot]
9a0bae6d9b
Bump @testing-library/react from 13.4.0 to 15.0.7 in /frontend ( #2040 )
...
Bumps [@testing-library/react](https://github.com/testing-library/react-testing-library) from 13.4.0 to 15.0.7.
- [Release notes](https://github.com/testing-library/react-testing-library/releases)
- [Changelog](https://github.com/testing-library/react-testing-library/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/react-testing-library/compare/v13.4.0...v15.0.7)
---
updated-dependencies:
- dependency-name: "@testing-library/react"
  dependency-type: direct:development
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 22:59:53 +08:00
dependabot[bot]
de0f30f6cc
Bump eslint-plugin-react-hooks from 4.6.0 to 4.6.2 in /frontend ( #2041 )
...
Bumps [eslint-plugin-react-hooks](https://github.com/facebook/react/tree/HEAD/packages/eslint-plugin-react-hooks) from 4.6.0 to 4.6.2.
- [Release notes](https://github.com/facebook/react/releases)
- [Changelog](https://github.com/facebook/react/blob/main/packages/eslint-plugin-react-hooks/CHANGELOG.md)
- [Commits](https://github.com/facebook/react/commits/HEAD/packages/eslint-plugin-react-hooks)
---
updated-dependencies:
- dependency-name: eslint-plugin-react-hooks
  dependency-type: direct:development
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 22:59:37 +08:00
dependabot[bot]
6ae16dbc48
Bump react-i18next from 14.1.1 to 14.1.2 in /frontend ( #2043 )
...
Bumps [react-i18next](https://github.com/i18next/react-i18next) from 14.1.1 to 14.1.2.
- [Changelog](https://github.com/i18next/react-i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/react-i18next/compare/v14.1.1...v14.1.2)
---
updated-dependencies:
- dependency-name: react-i18next
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 22:59:18 +08:00
dependabot[bot]
b6be108f49
Bump monaco-editor from 0.48.0 to 0.49.0 in /frontend ( #2042 )
...
Bumps [monaco-editor](https://github.com/microsoft/monaco-editor) from 0.48.0 to 0.49.0.
- [Release notes](https://github.com/microsoft/monaco-editor/releases)
- [Changelog](https://github.com/microsoft/monaco-editor/blob/main/CHANGELOG.md)
- [Commits](https://github.com/microsoft/monaco-editor/compare/v0.48.0...v0.49.0)
---
updated-dependencies:
- dependency-name: monaco-editor
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 22:58:36 +08:00
dependabot[bot]
ef813af9d7
Bump litellm from 1.38.0 to 1.38.2 ( #2038 )
...
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.38.0 to 1.38.2.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.38.0...v1.38.2)
---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 14:51:22 +00:00
dependabot[bot]
909d7b45ef
Bump boto3 from 1.34.111 to 1.34.112 ( #2036 )
...
Bumps [boto3](https://github.com/boto/boto3) from 1.34.111 to 1.34.112.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.111...1.34.112)
---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 14:50:02 +00:00
Xingyao Wang
e731048ccf
Improve action and observation logging for the CLI interface ( #2035 )
...
* properly log user messages;
format browser action/obs, summarize action, messages properly for logging
* add source to message
* add spaces for printing
2024-05-24 08:21:25 -04:00
Jiayi Pan
2d52298a1d
Support GAIA benchmark ( #1911 )
...
* Add gaia test
* Improve gaia prompts
* Fix browser_env hang bug
* Fix gaia bugs
* add gaia to eval readme
* Fix gaia bugs
* minor fix
* add run_infer.sh and update readme
* set num eval worker to 1
* default to 2023 gaia level1 subset
* default to level 1
* add prompt to instruct model to enclose answer within <solution> tag
* add missing break
---------
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-24 11:22:28 +00:00
dependabot[bot]
2f6167b953
Bump framer-motion from 11.2.5 to 11.2.6 in /frontend ( #2010 )
...
Bumps [framer-motion](https://github.com/framer/motion) from 11.2.5 to 11.2.6.
- [Changelog](https://github.com/framer/motion/blob/main/CHANGELOG.md)
- [Commits](https://github.com/framer/motion/compare/v11.2.5...v11.2.6)
---
updated-dependencies:
- dependency-name: framer-motion
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-24 10:01:42 +00:00
Boxuan Li
78241d9d43
Add tests for browser agent ( #2031 )
...
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-24 09:59:40 +00:00
Boxuan Li
b13a40c05c
README.md: Add CodeCov badge ( #2022 )
...
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-24 09:54:25 +00:00
dependabot[bot]
ad2784d534
Bump ruff from 0.4.4 to 0.4.5 ( #2004 )
...
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.4.4 to 0.4.5.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/v0.4.4...v0.4.5)
---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 05:51:05 -04:00
Boxuan Li
593b8d468b
Fix CI workflows [mac-test] ( #2025 )
...
* Fix CI settings
* Stop saving cpu cycles for GitHub
* Conditionally run mac tests
* Random push to trigger CI checks again
---------
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-24 09:25:00 +00:00
sp.wack
ae105c2faf
feat(frontend): Add actions to send feedback to backend ( #2020 )
...
* Add feedback actions to send to backend
* Uncomment request
* Refactor and disable feedback when sending
* disable defaultProp error
---------
Co-authored-by: amanape <stephanpsaras@gmail.com>
2024-05-24 04:26:06 -04:00
dependabot[bot]
9207a8da01
Bump browsergym from 0.2.6 to 0.3.2 ( #2013 )
...
Bumps [browsergym](https://github.com/ServiceNow/BrowserGym) from 0.2.6 to 0.3.2.
- [Release notes](https://github.com/ServiceNow/BrowserGym/releases)
- [Commits](https://github.com/ServiceNow/BrowserGym/compare/v0.2.6...v0.3.2)
---
updated-dependencies:
- dependency-name: browsergym
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-24 16:03:56 +08:00
Frank Xu
53f64ffa06
Improve browsing agent prompts, allowing agent to properly finish when done ( #1993 )
...
* improve browsing agent, allowing it to properly finish.
* handle parsing errors, show the user the agent's browsing thoughts in the frontend
---------
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-24 00:02:19 -07:00
Boxuan Li
c59bcbbffd
Minor docstring & prompt fixes for AgentSkills ( #2028 )
...
* A few minor fixes to agentskills
* Regenerate prompts
* Remove redundant comment
2024-05-24 13:30:48 +08:00
Xingyao Wang
cbf4c4b4c4
fix ExceptionPxssh ( #2023 )
...
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-23 21:24:21 -07:00
Boxuan Li
633ece5f9c
Fix integration tests ( #2024 )
2024-05-23 20:24:31 -07:00
Robert Brennan
9ca2007201
fix json encoding ( #2018 )
...
* fix json encoding
* add test
* add another test
* fix integration tests
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-23 23:36:15 +00:00
dependabot[bot]
b492b6293a
Bump lint-staged from 15.2.2 to 15.2.4 in /frontend ( #2009 )
...
Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.2 to 15.2.4.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/okonet/lint-staged/compare/v15.2.2...v15.2.4)
---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 12:25:00 -04:00
dependabot[bot]
0a6b26735b
Bump @typescript-eslint/parser from 7.9.0 to 7.10.0 in /frontend ( #2008 )
...
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.9.0 to 7.10.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.10.0/packages/parser)
---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 12:24:42 -04:00
dependabot[bot]
dff0f1be13
Bump @types/react-syntax-highlighter in /frontend ( #2007 )
...
Bumps [@types/react-syntax-highlighter](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-syntax-highlighter) from 15.5.11 to 15.5.13.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-syntax-highlighter)
---
updated-dependencies:
- dependency-name: "@types/react-syntax-highlighter"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 12:24:07 -04:00
dependabot[bot]
e7306b7226
Bump @react-types/shared from 3.23.0 to 3.23.1 in /frontend ( #2006 )
...
Bumps [@react-types/shared](https://github.com/adobe/react-spectrum) from 3.23.0 to 3.23.1.
- [Release notes](https://github.com/adobe/react-spectrum/releases)
- [Commits](https://github.com/adobe/react-spectrum/compare/@react-types/shared@3.23.0...@react-types/shared@3.23.1)
---
updated-dependencies:
- dependency-name: "@react-types/shared"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 12:23:49 -04:00
DaxServer
b118df606f
build: Add poetry command to use Python 3.11 for environment setup ( #1972 )
2024-05-23 12:05:19 -04:00
Xingyao Wang
602ffcdffb
Implement agentskills for OpenDevin to helpfully improve edit AND including more useful tools/skills ( #1941 )
...
* add draft for skills
* Implement and test agentskills functions: open_file, goto_line, scroll_down, scroll_up, create_file, search_dir, search_file, find_file
* Remove new_sample.txt file
* add some work from opendevin w/ fixes
* Add unit tests for agentskills module
* fix some issues and updated tests
* add more tests for open
* tweak and handle goto_line
* add tests for some edge cases
* add tests for scrolling
* add tests for edit
* add tests for search_dir
* update tests to use pytest
* use pytest --forked to keep file op unit tests from interfering with each other via global var
* update doc based on swe agent tool
* update and add tests for find_file and search_file
* move agent_skills to plugins
* add agentskills as plugin and docs
* add agentskill to ssh box and fix sandbox integration
* remove extra returns in doc
* add agentskills to initial tool for jupyter
* support re-init jupyter kernel (for agentskills) after restart
* fix print window's issue with indentation and add testcases
* add prompt for codeact with the newest edit primitives
* modify the way line number is presented (remove leading space)
* change prompt to the newest display format
* support tracking of costs via metrics
* Update opendevin/runtime/plugins/agent_skills/README.md
* Update opendevin/runtime/plugins/agent_skills/README.md
* implement and add tests for py linting
* remove extra text arg for incompatible subprocess ver
* remove sample.txt
* update test_edits integration tests
* fix all integration
* Update opendevin/runtime/plugins/agent_skills/README.md
* Update opendevin/runtime/plugins/agent_skills/README.md
* Update opendevin/runtime/plugins/agent_skills/README.md
* Update agenthub/codeact_agent/prompt.py
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
* Update agenthub/codeact_agent/prompt.py
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
* Update agenthub/codeact_agent/prompt.py
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
* Update opendevin/runtime/plugins/agent_skills/agentskills.py
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
* correctly setup plugins for swebench eval
* bump swe-bench version and add logging
* correctly setup plugins for swebench eval
* bump swe-bench version and add logging
* Revert "correctly setup plugins for swebench eval"
This reverts commit 2bd1055673.
* bump version
* remove _AGENT_SKILLS_DOCS
* move flake8 to test dep
* update poetry.lock
* remove extra arg
* reduce max iter for eval
* update poetry
* fix integration tests
---------
Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-23 16:04:09 +00:00
Robert Brennan
ea9c785075
fix session state after resuming ( #1999 )
...
* fix state resuming
* fix session reconnection
* fix lint
2024-05-23 11:47:36 -04:00
Xingyao Wang
6ff50ed369
Fix SWE-Bench evaluation due to setuptools version ( #1995 )
...
* correctly setup plugins for swebench eval
* bump swe-bench version and add logging
* Revert "correctly setup plugins for swebench eval"
This reverts commit 2bd1055673.
* bump version
2024-05-23 23:17:42 +08:00
dependabot[bot]
d6327f99ce
Bump litellm from 1.37.20 to 1.38.0 ( #2005 )
...
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.37.20 to 1.38.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.37.20...v1.38.0)
---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 14:57:56 +00:00
dependabot[bot]
1c40ea5222
Bump docker from 7.0.0 to 7.1.0 ( #2002 )
...
Bumps [docker](https://github.com/docker/docker-py) from 7.0.0 to 7.1.0.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](https://github.com/docker/docker-py/compare/7.0.0...7.1.0)
---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 16:45:46 +02:00
dependabot[bot]
58d45a1a8a
Bump boto3 from 1.34.110 to 1.34.111 ( #2001 )
...
Bumps [boto3](https://github.com/boto/boto3) from 1.34.110 to 1.34.111.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.110...1.34.111)
---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 14:42:31 +00:00
Niklas Muennighoff
ef6cdb7532
HumanEvalFix integration ( #1908 )
...
* Preliminary HumanEvalFix integration
* Clean paths
* fix: set workspace path correctly for config
fix: task in that contains /
* add missing run_infer.sh
* update run_infer w/o hard coded agent
* fix typo
* change `instance_id` to `task_id`
* add the warning and env var setting to run_infer.sh
* reset back workspace mount at the end of each instance
* 10 max iter is probably enough for humanevalfix
* Remove unneeded section
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
* Fix link
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
* Use logger
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
* Update run_infer.py
fix a bug:
ERROR:concurrent.futures:exception calling callback for <Future at 0x309cbc470 state=finished raised NameError>
concurrent.futures.process._RemoteTraceback:
* Update README.md
* Update README.md
* Update README.md
* Update README.md
added an example
* Update README.md
added: enable_auto_lint = true
* Update pyproject.toml
add: evaluate package
* Delete poetry.lock
update poetry.lock
* update poetry.lock
update poetry.lock
* Update README.md
* Update README.md
---------
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Robert <871607149@qq.com>
2024-05-23 13:09:40 +00:00