Commit Graph

986 Commits

Author SHA1 Message Date
tobitege
b431fce938 tests: more Agentskills tests; updated .gitignore (#2307)
* added tests related to backticks

* updated .gitignore

* added extra linter test for #2210

* hotfix for integration test

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-06-07 16:29:03 +00:00
Yufan Song
6aba337416 fix (#2318) 2024-06-07 09:22:29 -07:00
Frank Xu
4455260290 [bugfix] browse actions shouldn't change url and screenshot, only observations (#2311)
* browse related actions shouldn't change url and screenshot, only the observations should

* fix linting

* fix integrat

* update integration test

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-06-08 00:03:32 +08:00
Boxuan Li
45ce09d70e CodeActAgent: Delegate to BrowsingAgent for browsing tasks (#2103) 2024-06-07 00:53:47 -07:00
Biraj Silwal
001cc33664 fix: ExplorerActions overlapping with file name. (#2287)
* fix ExplorerActions overlapping with file name.

* Update frontend/src/components/file-explorer/FileExplorer.tsx

---------

Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-07 03:30:16 +00:00
dependabot[bot]
1df9649c7e Bump tailwindcss from 3.4.3 to 3.4.4 in /frontend (#2298) 2024-06-07 09:03:03 +08:00
Mohammad Sadoughi
19788cbad8 updated Makefile setup-config to store the persist_sandbox boolean value to config.toml (#2304)
Co-authored-by: msadough <msadough@amazon.com>
2024-06-06 22:14:09 +00:00
dependabot[bot]
dea9b5c258 Bump boto3 from 1.34.119 to 1.34.120 (#2299)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.119 to 1.34.120.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.119...1.34.120)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-06 18:50:12 +02:00
dependabot[bot]
07423c9277 Bump ruff from 0.4.7 to 0.4.8 (#2297)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.4.7 to 0.4.8.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/v0.4.7...v0.4.8)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-06 18:49:38 +02:00
dependabot[bot]
bb757223a2 Bump litellm from 1.40.2 to 1.40.4 (#2300)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.2 to 1.40.4.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.2...v1.40.4)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-06 18:48:51 +02:00
dependabot[bot]
ac0c6efc82 Bump openai from 1.31.0 to 1.31.2 (#2301)
Bumps [openai](https://github.com/openai/openai-python) from 1.31.0 to 1.31.2.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.31.0...v1.31.2)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-06 18:48:04 +02:00
tobitege
1ce4d383d3 doc: add Python keyring to Troubleshooting documentation (#2289)
* fix: set Python keyring for Poetry

* Python keyring troubleshooting added

* Revert Makefile change

* Troubleshooting extended

* setup config: added absolute path hint
2024-06-06 12:26:58 +00:00
Aleksandar
b0b19e6c25 Update AgentHubREADME.md (#2290)
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-06-06 11:14:41 +00:00
dependabot[bot]
08137d1968 Bump boto3 from 1.34.118 to 1.34.119 (#2280)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.118 to 1.34.119.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.118...1.34.119)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-06 14:13:41 +08:00
super-dainiu
beabcce16d [Hotfix] Fix ML-Bench continue `run_inference.py` (#2284)
* add ml-bench w/o exec env

* fix typos (#1956)

no functional change

* Refactored Logs (#1939)

* [Feat] A competitive Web Browsing agent (#1856)

* initial attempt at a browsing only agent

* add browsing agent

* update

* implement agent

* update

* fix comments

* remove unnecessary things from memory extras

* update image processing

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update README.md SWE-bench score (#1959)

* Update README.md SWE-bench score

Our most recent results on swe-bench lite are 25%, so this updates the README accordingly.

* Update

* fix: llm is_local function logic error (#1961)

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* doc: update documentation about poetry update (#1962)

* add doc

* Update Development.md

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* feat: add metrics related to cost for better observability (#1944)

* add metrics for total_cost

* make lint

* refact codeact

* change metrics into llm

* add costs list, add into state

* refactor log completion

* refactor and test others

* make lint

* Update opendevin/core/metrics.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/llm/llm.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor

* add code

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* doc: add more cmd in unit test documentation (#1963)

* --- (#1975)

updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1976)

updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Logging security (#1943)

* update .gitignore

* Rename the confusing 'INFO' style to 'DETAIL'

* override str and repr

* feat: api_key desensitize

* feat: add SensitiveDataFilter in file handler

* tweak regex, add tests

* more tweaks, include other attrs

* add env vars, those with equivalent config

* fix tests

* tests are invaluable

---------

Co-authored-by: Shimada666 <649940882@qq.com>

* --- (#1967)

updated-dependencies:
- dependency-name: react-dom
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: "@types/react-dom"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1968)

updated-dependencies:
- dependency-name: "@reduxjs/toolkit"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1969)

updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1970)

updated-dependencies:
- dependency-name: tailwind-merge
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1971)

updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Refactor session management (#1810)

* refactor session mgmt

* defer file handling to runtime

* add todo

* refactor sessions a bit more

* remove messages logic from FE

* fix up socket handshake

* refactor frontend auth a bit

* first pass at redoing file explorer

* implement directory suffix

* fix up file tree

* close agent on websocket close

* remove session saving

* move file refresh

* remove getWorkspace

* plumb path/code differently

* fix build issues

* fix the tests

* fix npm build

* add session rehydration

* fix event serialization

* logspam

* fix user message rehydration

* add get_event fn

* agent state restoration

* change history tracking for codeact

* fix responsiveness of init

* fix lint

* lint

* delint

* fix prop

* update tests

* logspam

* lint

* fix test

* revert codeact

* change fileService to use API

* fix up session loading

* delint

* delint

* fix integration tests

* revert test

* fix up access to options endpoints

* fix initial files load

* delint

* fix file initialization

* fix mock server

* fix lint

* fix auth for html

* Update frontend/src/i18n/translation.json

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor sessions and sockets

* avoid reinitializing the same session

* fix reconnect issue

* change up intro message

* more guards on reinit

* rename agent_session

* delint

* fix a bunch of tests

* delint

* fix last test

* remove code editor context

* fix build

* fix any

* fix dot notation

* Update frontend/src/services/api.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix up error handling

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update frontend/src/services/session.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix build errs

* fix else

* add closed state

* delint

* Update opendevin/server/session/session.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* fix #1960 (#1964)

* Add ruff for shared mutable defaults (B) (#1938)

* Add ruff for shared mutable defaults (B)

* Apply B006, B008 on current files, except fast API

* Update agenthub/SWE_agent/prompts.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* fix unintended behavior change

* this is correct, tell Ruff to leave it alone

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Refactor integration testing CI, add optional Mac tests, and mark a few agents as deprecated (#1888)

* Add MacOS to integration tests

* Switch back to python 3.11

* Install Docker for macos pipeline

* regenerate.sh: Use environmental variable for sandbox type

* Pack different agents' tests into a single check

* Fix CodeAct tests

* Reduce file match and extensive debug logs

* Add TEST_IN_CI mode that reports codecov

* Small fix: don't quit if reusing old responses failed

* Merge codecov results

* Fix typos

* Remove coverage merge step - codecov automatically does that

* Make Mac integration tests optional - too slow

* Fix codecov args

* Add comments in yaml

* Include sandbox type in codecov report name

* Fix codecov report merge

* Revert renaming of test_matrix_success

* Remove SWEAgent and PlannerAgent from tests

* Mark planner agent and SWE agent as deprecated

* CodeCov: Ignore planner and sweagent

* Revert "Remove SWEAgent and PlannerAgent from tests"

This reverts commit 040cb3bfb9.

* Remove all tests for SWE Agent

* Only keep basic tests for MonologueAgent and PlannerAgent

* Mark SWE Agent as deprecated, and ignore code coverage for it

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Fix Repeated Responses in Chat by Adding IPythonRunCellObservation (#1987)

Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* Save CI cycles for backend tests (#1985)

* Fix typo in prompt (#1992)

* Refactor monologue and SWE agent to use the messages in state history (#1863)

* Refactor monologue to use the messages in state history

* add messages, clean up

* fix monologue

* update integration tests

* move private method

* update SWE agent to use the history from State

* integration tests for SWE agent

* rename monologue to initial_thoughts, since that is what it is

* fix: catch the exception raised when the session file does not exist while initializing EventStream (e.g. when creating a new session with no session files stored). (#1994)

* add ml-bench in readme

* Bump boto3 from 1.34.110 to 1.34.111 (#2001)

Bumps [boto3](https://github.com/boto/boto3) from 1.34.110 to 1.34.111.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.110...1.34.111)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump docker from 7.0.0 to 7.1.0 (#2002)

Bumps [docker](https://github.com/docker/docker-py) from 7.0.0 to 7.1.0.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](https://github.com/docker/docker-py/compare/7.0.0...7.1.0)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump litellm from 1.37.20 to 1.38.0 (#2005)

Bumps [litellm](https://github.com/BerriAI/litellm) from 1.37.20 to 1.38.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.37.20...v1.38.0)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix SWE-Bench evaluation due to setuptools version (#1995)

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055673.

* bump version

* fix session state after resuming (#1999)

* fix state resuming

* fix session reconnection

* fix lint

* Implement `agentskills` for OpenDevin to helpfully improve edit AND including more useful tools/skills (#1941)

* add draft for skills

* Implement and test agentskills functions: open_file, goto_line, scroll_down, scroll_up, create_file, search_dir, search_file, find_file

* Remove new_sample.txt file

* add some work from opendevin w/ fixes

* Add unit tests for agentskills module

* fix some issues and updated tests

* add more tests for open

* tweak and handle goto_line

* add tests for some edge cases

* add tests for scrolling

* add tests for edit

* add tests for search_dir

* update tests to use pytest

* use pytest --forked to keep file-op unit tests from interfering with each other via a global var

* update doc based on swe agent tool

* update and add tests for find_file and search_file

* move agent_skills to plugins

* add agentskills as plugin and docs

* add agentskill to ssh box and fix sandbox integration

* remove extra returns in doc

* add agentskills to initial tool for jupyter

* support re-init jupyter kernel (for agentskills) after restart

* fix print window's issue with indentation and add testcases

* add prompt for codeact with the newest edit primitives

* modify the way line number is presented (remove leading space)

* change prompt to the newest display format

* support tracking of costs via metrics

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* implement and add tests for py linting

* remove extra text arg for incompatible subprocess ver

* remove sample.txt

* update test_edits integration tests

* fix all integration

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/runtime/plugins/agent_skills/agentskills.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055673.

* bump version

* remove _AGENT_SKILLS_DOCS

* move flake8 to test dep

* update poetry.lock

* remove extra arg

* reduce max iter for eval

* update poetry

* fix integration tests

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* build: Add poetry command to use Python 3.11 for environment setup (#1972)

* Bump @react-types/shared from 3.23.0 to 3.23.1 in /frontend (#2006)

Bumps [@react-types/shared](https://github.com/adobe/react-spectrum) from 3.23.0 to 3.23.1.
- [Release notes](https://github.com/adobe/react-spectrum/releases)
- [Commits](https://github.com/adobe/react-spectrum/compare/@react-types/shared@3.23.0...@react-types/shared@3.23.1)

---
updated-dependencies:
- dependency-name: "@react-types/shared"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @types/react-syntax-highlighter in /frontend (#2007)

Bumps [@types/react-syntax-highlighter](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-syntax-highlighter) from 15.5.11 to 15.5.13.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-syntax-highlighter)

---
updated-dependencies:
- dependency-name: "@types/react-syntax-highlighter"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @typescript-eslint/parser from 7.9.0 to 7.10.0 in /frontend (#2008)

Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.9.0 to 7.10.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.10.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump lint-staged from 15.2.2 to 15.2.4 in /frontend (#2009)

Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.2 to 15.2.4.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/okonet/lint-staged/compare/v15.2.2...v15.2.4)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update README.md

* Update README.md

* add run_infer.sh

* fix input output

* fix docker sandbox

* fix run

* update and clean run_infer.py

* add script to clean up dockers

* update repo uid

* add description

* new

* Update README.md

* use root for sandbox

* update readme

* update ml-bench conda env

* update readme

* update readme

* use try except

* modify raise exception

* add int

* update README

* longer time

* fix existing issues

* fix existing issue

* new docker image

* add metrics of cost

* add result parsing cost

* fix

* fix

* update summarize

* fix

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-31-157.ec2.internal>
Co-authored-by: RainRat <rainrat78@yahoo.ca>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Frank Xu <frankxu2004@gmail.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Shimada666 <649940882@qq.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Rahul Anand <62982824+zeul22@users.noreply.github.com>
Co-authored-by: jiangleo <jiangleo@users.noreply.github.com>
Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: Jeremi Joslin <jeremi@newlogic.com>
Co-authored-by: Aaron Xia <zhhuaxia@gmail.com>
Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: DaxServer <7479937+DaxServer@users.noreply.github.com>
Co-authored-by: Robert <871607149@qq.com>
2024-06-06 03:53:21 +00:00
tobitege
1fa09e0414 fix: test_sandbox tests didn't close dockers (#2274)
* fix test_sandbox tests to close dockers

* removed try/finally

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-06-06 03:45:45 +00:00
Frank Xu
48151bdbb0 [feat] WebArena benchmark, MiniWoB++ benchmark and related arch changes (#2170)
* add webarena, and revamp messaging for webarena eval

* add changes for browsergym

* update infer script

* fix unit tests

* update

* add multiple run for miniwob

* update instruction, remove personal path

* update

* add code for getting final reward, fix integration, add results

* add avg cost calculation
2024-06-06 09:01:20 +08:00
dependabot[bot]
99c6333e1a Bump openai from 1.30.5 to 1.31.0 (#2283)
Bumps [openai](https://github.com/openai/openai-python) from 1.30.5 to 1.31.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.30.5...v1.31.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 21:14:49 +00:00
Xingyao Wang
42d3dd8a2e Update screenshot (#2286)
* Add files via upload

* Update screenshot.png
2024-06-05 22:43:46 +02:00
மனோஜ்குமார் பழனிச்சாமி
971ad68431 Solved Hugging Face cache issue. (#2277) (tag: 0.6.2) 2024-06-05 21:18:33 +05:30
dependabot[bot]
3bf0636a53 Bump litellm from 1.40.0 to 1.40.2 (#2282)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.0 to 1.40.2.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.0...v1.40.2)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 23:46:00 +08:00
dependabot[bot]
105b5b9103 Bump json-repair from 0.21.0 to 0.23.1 (#2278)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.21.0 to 0.23.1.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/0.21.0...0.23.1)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 23:45:42 +08:00
dependabot[bot]
a4bccfc6aa Bump @types/node from 20.14.1 to 20.14.2 in /frontend (#2279)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 20.14.1 to 20.14.2.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 23:40:53 +08:00
dependabot[bot]
4e540da85e Bump prettier from 3.3.0 to 3.3.1 in /frontend (#2281)
Bumps [prettier](https://github.com/prettier/prettier) from 3.3.0 to 3.3.1.
- [Release notes](https://github.com/prettier/prettier/releases)
- [Changelog](https://github.com/prettier/prettier/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prettier/prettier/compare/3.3.0...3.3.1)

---
updated-dependencies:
- dependency-name: prettier
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 23:39:41 +08:00
RainRat
3b0e1361a4 fix typos (#2267)
* fix typos

no functional change

* fix typos

* fix typos

* fix integration test

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Leo <ifuryst@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-06-05 23:06:40 +08:00
மனோஜ்குமார் பழனிச்சாமி
ae815b20d2 Improved logs (#2272) 2024-06-05 17:54:40 +05:30
Aaron Xia
69542c9999 fix: there may be unexpected files in event file list, not like 1.json… (#2270)
* fix: there may be unexpected files in the event file list: not only 1.json, 2.json, but also files such as .DS_Store on macOS.

* log

---------

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-06-05 17:56:39 +08:00
dependabot[bot]
95a9be2dc5 Bump @typescript-eslint/eslint-plugin from 7.11.0 to 7.12.0 in /frontend (#2260) 2024-06-05 08:10:05 +00:00
Boxuan Li
208b1461ca [AgentBench evaluation] set run_as_devin to true (#2269)
Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-05 07:53:33 +00:00
dependabot[bot]
1b25a37ad4 Bump @testing-library/react from 15.0.7 to 16.0.0 in /frontend (#2227)
* Bump @testing-library/react from 15.0.7 to 16.0.0 in /frontend

Bumps [@testing-library/react](https://github.com/testing-library/react-testing-library) from 15.0.7 to 16.0.0.
- [Release notes](https://github.com/testing-library/react-testing-library/releases)
- [Changelog](https://github.com/testing-library/react-testing-library/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/react-testing-library/compare/v15.0.7...v16.0.0)

---
updated-dependencies:
- dependency-name: "@testing-library/react"
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* resolve error during test teardown

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
2024-06-05 07:51:58 +00:00
Ryan H. Tran
0584e428b2 [Mint evaluation] Fix bug in stopping when the agent reaches max steps or solution proposals (#2268)
* fix: bug in stopping when the agent reaches max steps or solution proposals

* remove --eval-num-workers

* update env.py
2024-06-05 06:47:07 +00:00
super-dainiu
ebafb702e5 Add ML-Bench Evaluation with OpenDevin (#2015)
* add ml-bench w/o exec env

* fix typos (#1956)

no functional change

* Refactored Logs (#1939)

* [Feat] A competitive Web Browsing agent (#1856)

* initial attempt at a browsing only agent

* add browsing agent

* update

* implement agent

* update

* fix comments

* remove unnecessary things from memory extras

* update image processing

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update README.md SWE-bench score (#1959)

* Update README.md SWE-bench score

Our most recent results on swe-bench lite are 25%, so this updates the README accordingly.

* Update

* fix: llm is_local function logic error (#1961)

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* doc: update documentation about poetry update (#1962)

* add doc

* Update Development.md

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* feat: add metrics related to cost for better observability (#1944)

* add metrics for total_cost

* make lint

* refact codeact

* change metrics into llm

* add costs list, add into state

* refactor log completion

* refactor and test others

* make lint

* Update opendevin/core/metrics.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/llm/llm.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor

* add code

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* doc: add more cmd in unit test documentation (#1963)

* --- (#1975)

updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1976)

updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Logging security (#1943)

* update .gitignore

* Rename the confusing 'INFO' style to 'DETAIL'

* override str and repr

* feat: api_key desensitize

* feat: add SensitiveDataFilter in file handler

* tweak regex, add tests

* more tweaks, include other attrs

* add env vars, those with equivalent config

* fix tests

* tests are invaluable

---------

Co-authored-by: Shimada666 <649940882@qq.com>

* --- (#1967)

updated-dependencies:
- dependency-name: react-dom
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: "@types/react-dom"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1968)

updated-dependencies:
- dependency-name: "@reduxjs/toolkit"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1969)

updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1970)

updated-dependencies:
- dependency-name: tailwind-merge
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1971)

updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Refactor session management (#1810)

* refactor session mgmt

* defer file handling to runtime

* add todo

* refactor sessions a bit more

* remove messages logic from FE

* fix up socket handshake

* refactor frontend auth a bit

* first pass at redoing file explorer

* implement directory suffix

* fix up file tree

* close agent on websocket close

* remove session saving

* move file refresh

* remove getWorkspace

* plumb path/code differently

* fix build issues

* fix the tests

* fix npm build

* add session rehydration

* fix event serialization

* logspam

* fix user message rehydration

* add get_event fn

* agent state restoration

* change history tracking for codeact

* fix responsiveness of init

* fix lint

* lint

* delint

* fix prop

* update tests

* logspam

* lint

* fix test

* revert codeact

* change fileService to use API

* fix up session loading

* delint

* delint

* fix integration tests

* revert test

* fix up access to options endpoints

* fix initial files load

* delint

* fix file initialization

* fix mock server

* fix lint

* fix auth for html

* Update frontend/src/i18n/translation.json

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor sessions and sockets

* avoid reinitializing the same session

* fix reconnect issue

* change up intro message

* more guards on reinit

* rename agent_session

* delint

* fix a bunch of tests

* delint

* fix last test

* remove code editor context

* fix build

* fix any

* fix dot notation

* Update frontend/src/services/api.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix up error handling

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update frontend/src/services/session.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix build errs

* fix else

* add closed state

* delint

* Update opendevin/server/session/session.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* fix #1960 (#1964)

* Add ruff for shared mutable defaults (B) (#1938)

* Add ruff for shared mutable defaults (B)

* Apply B006, B008 on current files, except fast API

* Update agenthub/SWE_agent/prompts.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* fix unintended behavior change

* this is correct, tell Ruff to leave it alone

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Refactor integration testing CI, add optional Mac tests, and mark a few agents as deprecated (#1888)

* Add MacOS to integration tests

* Switch back to python 3.11

* Install Docker for macos pipeline

* regenerate.sh: Use environmental variable for sandbox type

* Pack different agents' tests into a single check

* Fix CodeAct tests

* Reduce file match and extensive debug logs

* Add TEST_IN_CI mode that reports codecov

* Small fix: don't quit if reusing old responses failed

* Merge codecov results

* Fix typos

* Remove coverage merge step - codecov automatically does that

* Make Mac integration tests optional - too slow

* Fix codecov args

* Add comments in yaml

* Include sandbox type in codecov report name

* Fix codecov report merge

* Revert renaming of test_matrix_success

* Remove SWEAgent and PlannerAgent from tests

* Mark planner agent and SWE agent as deprecated

* CodeCov: Ignore planner and sweagent

* Revert "Remove SWEAgent and PlannerAgent from tests"

This reverts commit 040cb3bfb9.

* Remove all tests for SWE Agent

* Only keep basic tests for MonologueAgent and PlannerAgent

* Mark SWE Agent as deprecated, and ignore code coverage for it

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Fix Repeated Responses in Chat by Adding IPythonRunCellObservation (#1987)

Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* Save CI cycles for backend tests (#1985)

* Fix typo in prompt (#1992)

* Refactor monologue and SWE agent to use the messages in state history (#1863)

* Refactor monologue to use the messages in state history

* add messages, clean up

* fix monologue

* update integration tests

* move private method

* update SWE agent to use the history from State

* integration tests for SWE agent

* rename monologue to initial_thoughts, since that is what it is

* fix: catch the exception raised when the session file does not exist during EventStream init (e.g. when creating a new session with no stored session files). (#1994)

* add ml-bench in readme

* Bump boto3 from 1.34.110 to 1.34.111 (#2001)

Bumps [boto3](https://github.com/boto/boto3) from 1.34.110 to 1.34.111.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.110...1.34.111)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump docker from 7.0.0 to 7.1.0 (#2002)

Bumps [docker](https://github.com/docker/docker-py) from 7.0.0 to 7.1.0.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](https://github.com/docker/docker-py/compare/7.0.0...7.1.0)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump litellm from 1.37.20 to 1.38.0 (#2005)

Bumps [litellm](https://github.com/BerriAI/litellm) from 1.37.20 to 1.38.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.37.20...v1.38.0)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix SWE-Bench evaluation due to setuptools version (#1995)

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055673.

* bump version

* fix session state after resuming (#1999)

* fix state resuming

* fix session reconnection

* fix lint

* Implement `agentskills` for OpenDevin to helpfully improve edit AND including more useful tools/skills (#1941)

* add draft for skills

* Implement and test agentskills functions: open_file, goto_line, scroll_down, scroll_up, create_file, search_dir, search_file, find_file

* Remove new_sample.txt file

* add some work from opendevin w/ fixes

* Add unit tests for agentskills module

* fix some issues and updated tests

* add more tests for open

* tweak and handle goto_line

* add tests for some edge cases

* add tests for scrolling

* add tests for edit

* add tests for search_dir

* update tests to use pytest

* use pytest --forked to keep file op unit tests from interfering with each other via a global var

* update doc based on swe agent tool

* update and add tests for find_file and search_file

* move agent_skills to plugins

* add agentskills as plugin and docs

* add agentskill to ssh box and fix sandbox integration

* remove extra returns in doc

* add agentskills to initial tool for jupyter

* support re-init jupyter kernel (for agentskills) after restart

* fix print window's issue with indentation and add testcases

* add prompt for codeact with the newest edit primitives

* modify the way line number is presented (remove leading space)

* change prompt to the newest display format

* support tracking of costs via metrics

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* implement and add tests for py linting

* remove extra text arg for incompatible subprocess ver

* remove sample.txt

* update test_edits integration tests

* fix all integration

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/runtime/plugins/agent_skills/agentskills.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055673.

* bump version

* remove _AGENT_SKILLS_DOCS

* move flake8 to test dep

* update poetry.lock

* remove extra arg

* reduce max iter for eval

* update poetry

* fix integration tests

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* build: Add poetry command to use Python 3.11 for environment setup (#1972)

* Bump @react-types/shared from 3.23.0 to 3.23.1 in /frontend (#2006)

Bumps [@react-types/shared](https://github.com/adobe/react-spectrum) from 3.23.0 to 3.23.1.
- [Release notes](https://github.com/adobe/react-spectrum/releases)
- [Commits](https://github.com/adobe/react-spectrum/compare/@react-types/shared@3.23.0...@react-types/shared@3.23.1)

---
updated-dependencies:
- dependency-name: "@react-types/shared"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @types/react-syntax-highlighter in /frontend (#2007)

Bumps [@types/react-syntax-highlighter](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-syntax-highlighter) from 15.5.11 to 15.5.13.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-syntax-highlighter)

---
updated-dependencies:
- dependency-name: "@types/react-syntax-highlighter"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @typescript-eslint/parser from 7.9.0 to 7.10.0 in /frontend (#2008)

Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.9.0 to 7.10.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.10.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump lint-staged from 15.2.2 to 15.2.4 in /frontend (#2009)

Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.2 to 15.2.4.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/okonet/lint-staged/compare/v15.2.2...v15.2.4)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update README.md

* Update README.md

* add run_infer.sh

* fix input output

* fix docker sandbox

* fix run

* update and clean run_infer.py

* add script to clean up dockers

* update repo uid

* add description

* new

* Update README.md

* use root for sandbox

* update readme

* update ml-bench conda env

* update readme

* update readme

* use try except

* modify raise exception

* add int

* update README

* longer time

* fix existing issues

* fix existing issue

* new docker image

* add metrics of cost

* add result parsing cost

* fix

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-31-157.ec2.internal>
Co-authored-by: RainRat <rainrat78@yahoo.ca>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Frank Xu <frankxu2004@gmail.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Shimada666 <649940882@qq.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Rahul Anand <62982824+zeul22@users.noreply.github.com>
Co-authored-by: jiangleo <jiangleo@users.noreply.github.com>
Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: Jeremi Joslin <jeremi@newlogic.com>
Co-authored-by: Aaron Xia <zhhuaxia@gmail.com>
Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: DaxServer <7479937+DaxServer@users.noreply.github.com>
Co-authored-by: Robert <871607149@qq.com>
2024-06-05 01:56:39 +00:00
Leo
040d6bd806 fix: add an early exit check for agent answers in agent bench. (#2257)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-06-04 18:45:07 -07:00
tobitege
5776474dcf Fix SWE-Bench README typos (#2250) 2024-06-05 01:18:02 +00:00
tobitege
44bbe5e208 Fix agentskills tests (#2242)
* Fix agentskills tests

* Improved test_agent_skill

---------

Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-04 21:33:32 +00:00
tobitege
0082640ac8 fix test_config to prevent leaks (#2245) 2024-06-04 21:32:46 +02:00
tobitege
7263705492 fix frontend tests; minor readme update (#2219)
* fix frontend tests; minor readme update

* Fix indent in ChatInput.test

* Fix linting errors, finally

* lint: minor fixes (per make lint)

* All tests passed!
2024-06-04 20:46:47 +03:00
dependabot[bot]
4de08a9c00 Bump @types/node from 20.14.0 to 20.14.1 in /frontend (#2258)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 20.14.0 to 20.14.1.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-04 16:09:17 +00:00
dependabot[bot]
7e3e740616 Bump jose from 5.3.0 to 5.4.0 in /frontend (#2259)
Bumps [jose](https://github.com/panva/jose) from 5.3.0 to 5.4.0.
- [Release notes](https://github.com/panva/jose/releases)
- [Changelog](https://github.com/panva/jose/blob/main/CHANGELOG.md)
- [Commits](https://github.com/panva/jose/compare/v5.3.0...v5.4.0)

---
updated-dependencies:
- dependency-name: jose
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 16:06:38 +00:00
மனோஜ்குமார் பழனிச்சாமி
2ffd54d258 fixed output logging (#2244)
Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-04 16:05:23 +00:00
dependabot[bot]
6dd6e6c087 Bump @typescript-eslint/parser from 7.11.0 to 7.12.0 in /frontend (#2261)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.11.0 to 7.12.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.12.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-04 15:59:35 +00:00
dependabot[bot]
aec3e18836 Bump litellm from 1.39.5 to 1.40.0 (#2256)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.39.5 to 1.40.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.39.5...v1.40.0)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 15:36:15 +00:00
dependabot[bot]
d85c548bf5 Bump opencv-python from 4.9.0.80 to 4.10.0.82 (#2255)
Bumps [opencv-python](https://github.com/opencv/opencv-python) from 4.9.0.80 to 4.10.0.82.
- [Release notes](https://github.com/opencv/opencv-python/releases)
- [Commits](https://github.com/opencv/opencv-python/commits)

---
updated-dependencies:
- dependency-name: opencv-python
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 15:08:05 +00:00
dependabot[bot]
0f60899ee0 Bump google-generativeai from 0.5.4 to 0.6.0 (#2254)
Bumps [google-generativeai](https://github.com/google/generative-ai-python) from 0.5.4 to 0.6.0.
- [Release notes](https://github.com/google/generative-ai-python/releases)
- [Changelog](https://github.com/google-gemini/generative-ai-python/blob/main/RELEASE.md)
- [Commits](https://github.com/google/generative-ai-python/compare/v0.5.4...v0.6.0)

---
updated-dependencies:
- dependency-name: google-generativeai
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 15:06:51 +00:00
மனோஜ்குமார் பழனிச்சாமி
4e479038f9 Bugfix: add config to disable plugin initialization for persistent sandbox (#2179)
* refactored source bashrc logic

* added initialize_plugins config

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-06-04 10:59:30 -04:00
dependabot[bot]
11b66bd733 Bump boto3 from 1.34.117 to 1.34.118 (#2253)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.117 to 1.34.118.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.117...1.34.118)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 14:57:58 +00:00
dependabot[bot]
62c179be6c Bump pytest from 8.2.1 to 8.2.2 (#2252)
Bumps [pytest](https://github.com/pytest-dev/pytest) from 8.2.1 to 8.2.2.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/8.2.1...8.2.2)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 14:55:09 +00:00
மனோஜ்குமார் பழனிச்சாமி
4afd85e591 Quick doc fix (#2243) 0.6.1 2024-06-04 07:00:44 +00:00
Leo
9ada36e30b fix: restore python linting. (#2228)
* fix: restore python linting.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* update: extend the Python lint check to evaluation.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Update evaluation/logic_reasoning/instruction.txt

---------

Signed-off-by: ifuryst <ifuryst@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-04 06:36:19 +00:00
Xida Ren (Cedar)
1314a09ce9 One-step launch instructions (#2189)
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-06-03 23:28:50 -07:00