1666 Commits

Author SHA1 Message Date
dependabot[bot]
985b16a459
chore(deps-dev): bump ruff from 0.5.6 to 0.5.7 (#3322)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.6 to 0.5.7.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.5.6...0.5.7)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-09 11:35:28 -04:00
tofarr
f8c815279d
Bug fix drag resize (#3307)
* Fix issue where mouse drag fails on first attempt

* Detach event correctly

* Use old variable names

---------

Co-authored-by: Tim O'Farrell <tofarr@Tims-MacBook-Pro-2.local>
Co-authored-by: tobitege <tobitege@gmx.de>
2024-08-09 18:31:29 +03:00
tobitege
d2a8ff0918
fix double quotes around env vars typo in ghcr.yml (#3313) 2024-08-09 14:22:10 +00:00
Graham Neubig
b4db5b9cae
Remove some obsolete examples of persist_sandbox in the doc (#3312) 2024-08-09 14:06:55 +00:00
tobitege
5e6fd58f56
try to fix DOCKER_IMAGE_TAG in ghcr_push_runtime (od_runtime, arm64) (#3311) 2024-08-09 12:02:43 +00:00
tobitege
ac7badc236
(fix) revert #3240 ignore_paths in ghcr.yml (#3308) 2024-08-09 11:06:39 +02:00
Xingyao Wang
2e6b08db4f
fix: workspace folder permission & app container cannot access client API (#3300)
* also copy over pyproject and poetry lock

* add missing readme

* remove extra git config init since it is already done in client.py

* only chown if the /workspace dir does not exist

* Revert "remove extra git config init since it is already done in client.py"

This reverts commit e8556cd76dcb1720b33f5e06904c56efda2e7d9f.

* remove extra git config init since it is already done in client.py

* fix test runtime

* print container log while reconnecting

* print log in more readable format

* print log in more readable format

* increase lines

* clean up sandbox and ssh related stuff

* remove ssh hostname

* remove ssh hostname

* fix docker app cannot access runtime API issue

* remove ssh password

* API HOSTNAME should be prefixed with SANDBOX

* update config

* fix typo that breaks the test
2024-08-08 19:28:34 -04:00
dependabot[bot]
ddd2565035
chore(deps-dev): bump @tailwindcss/typography in /frontend (#3294)
Bumps [@tailwindcss/typography](https://github.com/tailwindlabs/tailwindcss-typography) from 0.5.13 to 0.5.14.
- [Release notes](https://github.com/tailwindlabs/tailwindcss-typography/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss-typography/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss-typography/compare/v0.5.13...v0.5.14)

---
updated-dependencies:
- dependency-name: "@tailwindcss/typography"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 18:25:52 -04:00
dependabot[bot]
81db5aefc7
chore(deps): bump vite from 5.3.5 to 5.4.0 in /frontend (#3295)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.3.5 to 5.4.0.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/create-vite@5.4.0/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 18:25:36 -04:00
dependabot[bot]
512f56ea80
chore(deps-dev): bump tailwindcss from 3.4.7 to 3.4.9 in /frontend (#3297)
Bumps [tailwindcss](https://github.com/tailwindlabs/tailwindcss) from 3.4.7 to 3.4.9.
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/v3.4.9/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/compare/v3.4.7...v3.4.9)

---
updated-dependencies:
- dependency-name: tailwindcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 18:25:24 -04:00
Xingyao Wang
a5195b0e65
chore: clean up sandbox and ssh related configs (#3301)
* clean up sandbox and ssh related stuff

* remove ssh hostname

* remove ssh hostname

* remove ssh password

* update config

* fix typo that breaks the test
2024-08-08 22:15:40 +00:00
tofarr
040b9cb75c
Chore Readme updates (#3302)
* Readme updates

Added explicit installation instructions to server and frontend README

* Documentation update

* WIP

* WIP

---------

Co-authored-by: Tim O'Farrell <tofarr@Tims-MacBook-Pro-2.local>
2024-08-08 18:06:58 -04:00
Xingyao Wang
4915168da2
fix: copy over pyproject and poetry lock for App docker (#3299)
* also copy over pyproject and poetry lock

* add missing readme
2024-08-08 21:06:26 +00:00
dependabot[bot]
8a008773ee
chore(deps-dev): bump python-pptx from 1.0.1 to 1.0.2 (#3292)
Bumps [python-pptx](https://github.com/scanny/python-pptx) from 1.0.1 to 1.0.2.
- [Changelog](https://github.com/scanny/python-pptx/blob/master/HISTORY.rst)
- [Commits](https://github.com/scanny/python-pptx/compare/v1.0.1...v1.0.2)

---
updated-dependencies:
- dependency-name: python-pptx
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 23:49:45 +08:00
dependabot[bot]
9f18172982
chore(deps): bump zope-interface from 7.0 to 7.0.1 (#3276)
Bumps [zope-interface](https://github.com/zopefoundation/zope.interface) from 7.0 to 7.0.1.
- [Changelog](https://github.com/zopefoundation/zope.interface/blob/master/CHANGES.rst)
- [Commits](https://github.com/zopefoundation/zope.interface/compare/7.0...7.0.1)

---
updated-dependencies:
- dependency-name: zope-interface
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Leo <ifuryst@gmail.com>
2024-08-08 23:49:34 +08:00
dependabot[bot]
7bd4af80f0
chore(deps): bump boto3 from 1.34.155 to 1.34.156 (#3293)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.155 to 1.34.156.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.155...1.34.156)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 23:49:23 +08:00
dependabot[bot]
c741df58e3
chore(deps-dev): bump openai from 1.40.0 to 1.40.1 (#3291)
Bumps [openai](https://github.com/openai/openai-python) from 1.40.0 to 1.40.1.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.40.0...v1.40.1)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 23:05:39 +08:00
dependabot[bot]
50b6ce8578
chore(deps): bump litellm from 1.43.1 to 1.43.3 (#3290)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.43.1 to 1.43.3.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.43.1...v1.43.3)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 23:05:24 +08:00
Graham Neubig
f36639be28
Improve listen.py test coverage (#3289)
* Add unit tests for listen.py

* Added new tests

* Improve test coverage for listen.py

* Update tests

---------

Co-authored-by: opendevin <opendevin@all-hands.dev>
2024-08-08 14:25:12 +00:00
Xingyao Wang
db302fd33c
fix: dubious ownership when running git (#3282)
* switch default to eventstream runtime

* remove pull docker from makefile

* fix unittest

* fix file store path

* try deprecate server runtime

* remove persist sandbox

* move file utils

* remove server runtime related workflow

* remove unused method

* attempt to remove the reliance on filestore for BE

* fix async for list file

* fix list_files to post

* fix list files

* add suffix to directory

* make sure list file returns abs path;
make sure other backend endpoints accept abs path

* remove server runtime test workflow

* set git config in runtime

* chown for workspace in client;
use INIT_COMMANDS to maintain all commands that need to be run before bash start;

* fix client issue;
add test case for git;

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-08-08 13:14:45 +00:00
Xingyao Wang
2669a5378c
update video; (#3284)
clean up docs site;
remove faq;
2024-08-08 08:39:32 -04:00
Xingyao Wang
73bd165118
ci: try to fix runtime build when exact hash for runtime image is found (#3272) 2024-08-08 03:46:05 +00:00
dependabot[bot]
e922e2666b
chore(deps-dev): bump postcss from 8.4.40 to 8.4.41 in /frontend (#3260)
Bumps [postcss](https://github.com/postcss/postcss) from 8.4.40 to 8.4.41.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.4.40...8.4.41)

---
updated-dependencies:
- dependency-name: postcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-08-07 22:25:12 -04:00
dependabot[bot]
a627fabfc5
chore(deps): bump react-i18next from 15.0.0 to 15.0.1 in /frontend (#3279)
Bumps [react-i18next](https://github.com/i18next/react-i18next) from 15.0.0 to 15.0.1.
- [Changelog](https://github.com/i18next/react-i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/react-i18next/compare/v15.0.0...v15.0.1)

---
updated-dependencies:
- dependency-name: react-i18next
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 22:24:45 -04:00
Xingyao Wang
90d0a62469
(arch) Switch default runtime to EventStream Runtime (#3271)
* switch default to eventstream runtime

* remove pull docker from makefile

* fix unittest

* fix file store path

* try deprecate server runtime

* remove persist sandbox

* move file utils

* remove server runtime related workflow

* remove unused method

* attempt to remove the reliance on filestore for BE

* fix async for list file

* fix list_files to post

* fix list files

* add suffix to directory

* make sure list file returns abs path;
make sure other backend endpoints accept abs path

* remove server runtime test workflow

* set git config in runtime
2024-08-08 10:11:49 +08:00
sven
71ad979ffd
fixed list rendering (#3273) 2024-08-08 01:50:16 +08:00
Xingyao Wang
b30a2dd87a
completely remove update_source_code (#3280) 2024-08-07 16:57:11 +00:00
dependabot[bot]
36ae44f6ef
chore(deps-dev): bump flake8 from 7.1.0 to 7.1.1 (#3248)
Bumps [flake8](https://github.com/pycqa/flake8) from 7.1.0 to 7.1.1.
- [Commits](https://github.com/pycqa/flake8/compare/7.1.0...7.1.1)

---
updated-dependencies:
- dependency-name: flake8
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-08-08 00:00:02 +08:00
dependabot[bot]
74264c8902
chore(deps-dev): bump openai from 1.39.0 to 1.40.0 (#3278)
Bumps [openai](https://github.com/openai/openai-python) from 1.39.0 to 1.40.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.39.0...v1.40.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 23:59:21 +08:00
dependabot[bot]
33ad4e27ee
chore(deps): bump litellm from 1.43.0 to 1.43.1 (#3274)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.43.0 to 1.43.1.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.43.0...v1.43.1)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 23:59:08 +08:00
dependabot[bot]
069b40e8b4
chore(deps-dev): bump sympy from 1.12.1 to 1.13.1 (#3277)
Bumps [sympy](https://github.com/sympy/sympy) from 1.12.1 to 1.13.1.
- [Release notes](https://github.com/sympy/sympy/releases)
- [Commits](https://github.com/sympy/sympy/compare/sympy-1.12.1...sympy-1.13.1)

---
updated-dependencies:
- dependency-name: sympy
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 23:58:57 +08:00
dependabot[bot]
9c39c8b250
chore(deps): bump boto3 from 1.34.154 to 1.34.155 (#3275)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.154 to 1.34.155.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.154...1.34.155)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 23:57:44 +08:00
mamoodi
2915679adc
Make all workflows consistent formatting and some more comments (#3270) 2024-08-07 02:00:15 +00:00
Xingyao Wang
c941d028b9
make sure mermaid flowchart is displayed correctly (#3269) 2024-08-07 09:27:17 +08:00
dependabot[bot]
60e11b0dd2
chore(deps-dev): bump ruff from 0.5.5 to 0.5.6 (#3251)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.5 to 0.5.6.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.5.5...0.5.6)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-08-06 15:13:34 -07:00
Xingyao Wang
bb66f15ff6
[Arch] Streamline EventStream Runtime Image Building Logic (#3259)
* remove nocache

* simplify runtime build to use hash & always update source

* style

* try to fix temp folder issue

* fix rm tree

* create build folder first (to get correct hash), then copy it over to actual build folder

* fix assert

* fix indentation

* fix copy over

* add runtime documentation

* fix runtime docs

* fix typo

* Update docs/modules/usage/runtime.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update docs/modules/usage/runtime.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-08-07 06:09:38 +08:00
Xingyao Wang
a22ee6656b
[docs] Clean up evaluation docs (#3268)
* remove llm copy pasta

* add emoji
2024-08-07 05:05:45 +08:00
Xingyao Wang
7270d21cf9
update documentation for evaluation tutorial 2024-08-06 14:55:42 -04:00
dependabot[bot]
9c44d94cef
chore(deps-dev): bump lint-staged from 15.2.7 to 15.2.8 in /frontend (#3255)
Bumps [lint-staged](https://github.com/lint-staged/lint-staged) from 15.2.7 to 15.2.8.
- [Release notes](https://github.com/lint-staged/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/lint-staged/lint-staged/compare/v15.2.7...v15.2.8)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 13:26:15 -04:00
dependabot[bot]
a8ed63ef23
chore(deps-dev): bump autoprefixer from 10.4.19 to 10.4.20 in /frontend (#3254)
Bumps [autoprefixer](https://github.com/postcss/autoprefixer) from 10.4.19 to 10.4.20.
- [Release notes](https://github.com/postcss/autoprefixer/releases)
- [Changelog](https://github.com/postcss/autoprefixer/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/autoprefixer/compare/10.4.19...10.4.20)

---
updated-dependencies:
- dependency-name: autoprefixer
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 13:26:01 -04:00
Xingyao Wang
31b244f95e
[Refactor, Evaluation] Refactor and clean up evaluation harness to remove global config and use EventStreamRuntime (#3230)
* move multi-line bash tests to test_runtime;
support multi-line bash for esruntime;

* add testcase to handle PS2 prompt

* use bashlex for bash parsing to handle multi-line commands;
add testcases for multi-line commands

* revert ghcr runtime change

* Apply stash

* fix run as other user;
make test async;

* fix test runtime for run as od

* add run-as-devin to all the runtime tests

* handle the case when username is root

* move all run-as-devin tests from sandbox;
only tests a few cases on different user to save time;

* move over multi-line echo related tests to test_runtime

* fix user-specific jupyter by fixing the pypoetry virtualenv folder

* make plugin's init async;
chdir at initialization of jupyter plugin;
move ipy simple testcase to test runtime;

* support agentskills import in
move tests for jupyter pwd tests;
overload `add_env_vars` for EventStreamRuntime to update env var also in Jupyter;
make agentskills read env var lazily, in case env var is updated;

* fix ServerRuntime agentskills issue

* move agnostic image test to test_runtime

* merge runtime tests in CI

* fix enable auto lint as env var

* update warning message

* update warning message

* test for different container images

* change parsing output as debug

* add exception handling for update_pwd_decorator

* fix unit test indentation

* add plugins as default input to Runtime class;
remove init_sandbox_plugins;
implement add_env_var (include jupyter) in the base class;

* fix server runtime auto lint

* Revert "add exception handling for update_pwd_decorator"

This reverts commit 2b668b1506e02145cb8f87e321aad62febca3d50.

* tries to print debugging info for agentskills

* explicitly setting uid (try fix permission issue)

* Revert "tries to print debugging info for agentskills"

This reverts commit 8be4c86756f0e3fc62957b327ba2ac4999c419de.

* set sandbox user id during testing to hopefully fix the permission issue

* add browser tools for server runtime

* try to debug for old pwd

* update debug cmd

* only test agnostic runtime when TEST_RUNTIME is Server

* fix temp dir mkdir

* load TEST_RUNTIME at the beginning

* remove ipython tests

* only log to file when DEBUG

* default logging to project root

* temporarily remove log to file

* fix LLM logger dir

* fix logger

* make set pwd an optional aux action

* fix prev pwd

* fix infinite recursion

* simplify

* do not import the whole od library to avoid logger folder by jupyter

* fix browsing

* increase timeout

* attempt to fix agentskills yet again

* clean up in testcases, since CI maybe run as non-root

* add _cause attribute for event.id

* remove parent

* add a bunch of debugging statement again for CI :(

* fix temp_dir fixture

* change all temp dir to follow pytest's tmp_path_factory

* remove extra bracket

* clean up error printing a bit

* jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization

* jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization

* add typing for tmp dir fixture

* clear the directory before running the test to avoid weird CI temp dir

* remove agnostic test case for server runtime

* Revert "remove agnostic test case for server runtime"

This reverts commit 30e2181c3fc1410e69596c2dcd06be01f1d016b3.

* disable agnostic tests in CI

* fix test

* make sure plugin arg is not passed when no plugin is specified;
remove redundant on_event function;

* move mock prompt

* rename runtime

* remove extra logging

* refactor run_controller's interface;
support multiple runtime for integration test;
filter out hostname for prompt

* uncomment other tests

* pass the right runtime to controller

* log runtime when start

* uncomment tests

* improve symbol filters

* add integration test prompts that seemed ok

* add integration test workflow

* add python3 to default ubuntu image

* symlink python and fix permission to jupyter pip

* add retry for jupyter execute server

* fix jupyter pip install;
add post-process for jupyter pip install;
simplify init by add agent_skills path to PYTHONPATH;
add testcase to tests jupyter pip install;

* fix bug

* use ubuntu:22.04 for eventstream integration tests

* add todo

* update testcase

* remove redundant code

* fix unit test

* reduce dependency for runtime

* try making llama-index an optional dependency that's not installed by default

* remove pip install since it seemed not needed

* log ipython execution;
await write message since it returns a future

* update ipy testcase

* do not install llama-index in CI

* do not install llama-index in the app docker as well

* set sandbox container image in the integration test script

* log plugins & env var for runtime

* update conftest for sha256

* add git

* remove all non-alphanumeric characters

* add working ipy module tests!

* default to use host network

* remove is_async from browser to make things a little more reliable;
retry loading browser when error;

* add sleep to wait a bit for http server

* kill http server before regenerate browsing tests

* fix browsing

* only set sandbox container image if undefined

* skip empty config value

* update evaluation to use the latest run_controller

* revert logger in execute_server to be compatible with server runtime

* revert logging level to fix jupyter

* set logger level

* revert the logging

* chmod for workspace to fix permission

* support getting timeout from action

* update test for server runtime

* try to fix file permission

* fix test_cmd_run_action_serialization_deserialization test (added timeout)

* poetry: pip 24.2, torch 2.2.2

* revert adding pip to pyproject.toml

* add build to dependencies in pyproject.toml

* forgot poetry lock --no-update

* fix a DelegatorAgent prompt_002.log (timeout)

* fix a DelegatorAgent prompt_003.log (timeout)

* couple more timeout attribs in prompt files

* some more prompt files

* prompts galore

* add clarification comment for timeout

* default timeout to config

* add assert

* update integration tests for eventstream

* update integration tests

* fix timeout for action<->dict

* remove redundant on_event

* default to use instance image

* update run_controller interface

* add logging for copy

* refactor swe_bench for the new design

* fix action execution timeout

* update lock

* remove build sandbox locally

* fix runtime

* use plain for-loop for single process

* remove extra print

* get swebench inference working

* print whole `test_result` dict

* got swebench patch post-process working

* update swe-bench evaluation readme

* refactor using shared reset_logger function

* move messy swebench prompt to a different file

* support the ability to specify whether to keep prompt

* support the ability to specify whether to keep prompt

* fix dockerfile

* fix import and remove unnecessary strip logic

* fix action serialization

* get agentbench running

* remove extra ls for agent bench

* fix agentbench metric

* factor out common documentation for eval

* update biocoder doc

* remove swe_env_box since it is no longer needed

* get biocoder working

* add func timeout for bird

* fix jupyter pwd with ~ as user name

* fix jupyter pwd with ~ as user name

* get bird working

* get browsing evaluation working

* make eda runnable

* fix id column

* fix eda run_infer

* unify eval output using a structured format;
make swebench compatible with that format;
update client source code for every swebench run;
do not inject testcmd for swebench

* standardize existing benchs for the new eval output

* set update source code = true

* get gaia standardized

* fix gaia

* gorilla refactored but stuck at language.so to test

* refactor and make gpqa work

* refactor humanevalfix and get it working

* refactor logic reasoning and get it working

* refactor browser env so it works with eventstream runtime for eval

* add initial version of miniwob refactor

* fix browsergym environment

* get miniwob working!!

* allowing injecting additional dependency to OD runtime docker image

* allowing injecting additional dependency to OD runtime docker image

* support logic reasoning with pre-injected dependency

* get mint working

* update runtime build

* fix mint docker

* add test for keep_prompt;
add missing await close for some tests

* update integration tests for eventstream runtime

* fix integration tests for server runtime

* refactor ml bench and toolqa

* refactor webarena

* fix default factory

* Update run_infer.py

* add APIError to retry

* increase timeout for swebench

* make sure to hide api key when dump eval output

* update the behavior of put source code to put files instead of tarball

* add dirhash to dependency

* sendintr when timeout

* fix dockerfile copy

* reduce timeout

* use dirhash to avoid repeat building for update source

* fix runtime_build testcase

* add dir_hash to docker build pipeline

* revert api error

* update poetry lock

* add retries for swebench run infer

* fix git patch

* update poetry lock

* adjust config order

* fix mount volumes

* enforce all eval to use "instance_id"

* remove file store from runtime

* make file_store public inside eventstream

* move the runtime logic inside `main` out

* support using async function for process_instance_fn

* refactor run_infer with the create_time

* fix file store

* Update evaluation/toolqa/utils.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* fix typo

---------

Co-authored-by: tobitege <tobitege@gmx.de>
Co-authored-by: super-dainiu <78588128+super-dainiu@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-08-06 17:21:45 +00:00
dependabot[bot]
9029cd77d3
chore(deps): bump litellm from 1.42.12 to 1.43.0 (#3267)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.42.12 to 1.43.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.42.12...v1.43.0)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 23:47:02 +08:00
dependabot[bot]
adedfe5e7f
chore(deps-dev): bump python-pptx from 1.0.0 to 1.0.1 (#3263)
Bumps [python-pptx](https://github.com/scanny/python-pptx) from 1.0.0 to 1.0.1.
- [Changelog](https://github.com/scanny/python-pptx/blob/master/HISTORY.rst)
- [Commits](https://github.com/scanny/python-pptx/compare/v1.0.0...v1.0.1)

---
updated-dependencies:
- dependency-name: python-pptx
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 23:46:29 +08:00
dependabot[bot]
43768684d9
chore(deps): bump zope-interface from 6.4.post2 to 7.0 (#3262)
Bumps [zope-interface](https://github.com/zopefoundation/zope.interface) from 6.4.post2 to 7.0.
- [Changelog](https://github.com/zopefoundation/zope.interface/blob/master/CHANGES.rst)
- [Commits](https://github.com/zopefoundation/zope.interface/compare/6.4.post2...7.0)

---
updated-dependencies:
- dependency-name: zope-interface
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 23:46:17 +08:00
dependabot[bot]
0f6ce9717f
chore(deps): bump boto3 from 1.34.153 to 1.34.154 (#3264)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.153 to 1.34.154.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.153...1.34.154)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 23:46:00 +08:00
dependabot[bot]
d5170448e3
chore(deps-dev): bump openai from 1.38.0 to 1.39.0 (#3266)
Bumps [openai](https://github.com/openai/openai-python) from 1.38.0 to 1.39.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.38.0...v1.39.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 23:45:46 +08:00
dependabot[bot]
91de16a88b
chore(deps-dev): bump streamlit from 1.37.0 to 1.37.1 (#3265)
Bumps [streamlit](https://github.com/streamlit/streamlit) from 1.37.0 to 1.37.1.
- [Release notes](https://github.com/streamlit/streamlit/releases)
- [Commits](https://github.com/streamlit/streamlit/compare/1.37.0...1.37.1)

---
updated-dependencies:
- dependency-name: streamlit
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 23:45:13 +08:00
dependabot[bot]
b4a018ee57
chore(deps-dev): bump python-pptx from 0.6.23 to 1.0.0 (#3252)
Bumps [python-pptx](https://github.com/scanny/python-pptx) from 0.6.23 to 1.0.0.
- [Changelog](https://github.com/scanny/python-pptx/blob/master/HISTORY.rst)
- [Commits](https://github.com/scanny/python-pptx/compare/v0.6.23...v1.0.0)

---
updated-dependencies:
- dependency-name: python-pptx
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 04:13:23 +08:00
dependabot[bot]
a26b0b3744
chore(deps-dev): bump openai from 1.37.2 to 1.38.0 (#3249)
Bumps [openai](https://github.com/openai/openai-python) from 1.37.2 to 1.38.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.37.2...v1.38.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 04:12:59 +08:00
dependabot[bot]
a6f6350d98
chore(deps): bump litellm from 1.42.9 to 1.42.12 (#3253)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.42.9 to 1.42.12.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.42.9...v1.42.12)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 04:12:45 +08:00