* add a single-threaded server serving browsergym
* update poetry
* update browser page content
* add import to make sure browsergym environments are registered properly
* remove flask server, use multiprocess impl and Pipe
* fix
* refactor BrowserEnv
* update browser action and obs to include more complete info
* fix screenshot
* update poetry lock
* add playwright install to workflow
* update
* add better html to text conversion
* update for better text conversion to maintain parity with the current handling of browseurlaction
* update
* update poetry
* update multiprocessing mp
* fix multiprocessing
* update
* update github workflow
---------
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
* ci/lint: fix calling Ruff's format
* Transition for ruff lint. Only checking the modified files.
---------
Co-authored-by: ifuryst <ifuryst@gmail.com>
* update dev md for instruction of running unit tests
* add back unit tests
* revert back to the original sandbox implementation to fix testcases
* allow unit test workflow to find docker.sock
* make sandbox test work via patch
* fix arg parser that's broken for some reason
* fix integration test
* Revert "fix arg parser that's broken for some reason"
This reverts commit 6cc8961133.
* update Development.md
* get RUN_AS_DEVIN and network=host working with app sandbox
* attempt to fix the workspace base permission
* sandbox might fail in chown due to mounting, but it won't be fatal
* update sshbox instruction
* remove default user id since it will be passed in the instruction
* revert permission fix since it should be resolved by correct SANDBOX_USER_ID
* the permission issue can be fixed by simply providing the correct env var
* remove log
* set sandbox user id to getuid by default
* move logging to initializer
* make the uid consistent across host, app container, and sandbox
* remove hostname as it causes sudo issue
* fix permission of entrypoint script
* make the uvicorn app run as host user uid for jupyter plugin
* revert use host network
* get docker socket gid and usermod instead of chmod 777
* try to fix app build disk space issue
* ci: refine job matrix and enable cache for poetry
- Replace direct installation of Poetry with pipx to ensure isolated environment setups.
- Enable caching for Poetry dependencies using setup-python action to improve build efficiency.
- Refactor the job matrix specifications.
* ci: enable Homebrew caching in Actions
* ci: optimize Docker and Colima installation in GitHub Actions
- Check if Docker and Colima are already installed before attempting to install them. Link and start each service appropriately to avoid unnecessary reinstallation and ensure they are ready for immediate use in the CI pipeline.
* ci: remove Homebrew cache.
* fix typo
* fix: specified the Python version to avoid errors in actions.
---------
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
* Add integration test framework with mock llm
* Fix MonologueAgent and PlannerAgent tests
* Remove adhoc logging
* Use existing logs
* Fix SWEAgent and PlannerAgent
* Check in test log files
* conftest: look up under test name folder only
* Add docstring to conftest
* Finish dev doc
* Avoid non-determinism
* Remove dependency on llm embedding model
* Init embedding model only for MonologueAgent
* Add adhoc fix for sandbox discrepancy
* Test ssh and exec sandboxes
* CI: fix missing sandbox type
* conftest: Remove hack
* Reword comment for TODO
* DogFood: Use OpenDevin to review PR
* Use diff rather than patch
* Fix prompt
* Return if label not present
* Don't write review to environment variable
* Fix label check
* Use better name for labels
* Fix pre-commit and linter versions to avoid surprise
To avoid surprising results on GitHub Actions, e.g. a new release of pre-commit starting to
reject all PRs, pin it to the latest version, 3.7.0. This PR also pins the ruff and mypy
versions in pyproject.toml since we very likely don't need the latest upgrades from
linters, and upgrades can always bring surprises.
* pre-commit-config: Use v0.3.7 for Ruff as in pyproject.toml
* CI: Add autopep8 linter
Currently, we have autopep8 as part of pre-commit-hook. To ensure
consistent behaviour, we should have it in CI as well.
Moreover, pre-commit-hook contains a double-quote-string-fixer hook
which changes all double quotes to single quotes, but I have observed
some PRs with massive changes that do the opposite. I suspect
that these authors 1) disable or circumvent the pre-commit-hook,
and 2) have other linters such as black in their IDE, which
automatically change all single quotes to double quotes. This
has caused a lot of unnecessary diffs, made review really hard,
and led to a lot of conflicts.
* Use --diff for autopep8
* autopep8: Freeze version in CI
* Ultimate fix
* Remove pep8 long line disable workaround
* Fix lint.yml
* Fix all files under opendevin and agenthub