Boxuan Li dd32fa6f4a
Unify linter behaviour across CI and pre-commit-hook (#1071)
* CI: Add autopep8 linter

Currently, we have autopep8 as part of pre-commit-hook. To ensure
consistent behaviour, we should have it in CI as well.

Moreover, pre-commit-hook contains a double-quote-string-fixer hook
which changes all double quotes to single quotes, but I do observe
some PRs with massive changes that do the opposite way. I suspect
that these authors 1) disable or circumvent the pre-commit-hook,
and 2) have other linters such as black in their IDE, which
automatically change all single quotes to double quotes. This
has caused a lot of unnecessary diff, made review really hard,
and led to a lot of conflicts.

* Use -diff for autopep8

* autopep8: Freeze version in CI

* Ultimate fix

* Remove pep8 long line disable workaround

* Fix lint.yml

* Fix all files under opendevin and agenthub
2024-04-14 00:19:56 -04:00

1.8 KiB

CodeAct-based Agent Framework

This folder implements the CodeAct idea that relies on LLM to autonomously perform actions in a Bash shell. It requires more from the LLM itself: LLM needs to be capable enough to do all the stuff autonomously, instead of stuck in an infinite loop.

NOTE: This agent is still highly experimental and under active development to reach the capability described in the original paper & repo.

Demo of the expected capability - work-in-progress.

mkdir workspace
PYTHONPATH=`pwd`:$PYTHONPATH python3 opendevin/main.py -d ./workspace -c CodeActAgent -t "Please write a flask app that returns 'Hello, World\!' at the root URL, then start the app on port 5000. python3 has already been installed for you."

Example: prompts gpt-4-0125-preview to write a flask server, install flask library, and start the server.

image image

Most of the things are working as expected, except at the end, the model did not follow the instruction to stop the interaction by outputting <execute> exit </execute> as instructed.

TODO: This should be fixable by either (1) including a complete in-context example like this, OR (2) collect some interaction data like this and fine-tune a model (like this, a more complex route).