* Replace OpenDevin with OpenHands * Update CONTRIBUTING.md * Update README.md * Update README.md * update poetry lock; move opendevin folder to openhands * fix env var * revert image references in docs * revert permissions * revert permissions --------- Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
4.0 KiB
sidebar_position
| sidebar_position |
|---|
| 3 |
🧠 Agents and Capabilities
CodeAct Agent
Description
This agent implements the CodeAct idea (paper, tweet) that consolidates LLM agents’ actions into a unified code action space for both simplicity and performance (see paper for more details).
The conceptual idea is illustrated below. At each turn, the agent can:
- Converse: Communicate with humans in natural language to ask for clarification, confirmation, etc.
- CodeAct: Choose to perform the task by executing code
- Execute any valid Linux
bashcommand - Execute any valid
Pythoncode with an interactive Python interpreter. This is simulated throughbashcommand, see plugin system below for more details.
Plugin System
To make the CodeAct agent more powerful with only access to bash action space, CodeAct agent leverages OpenHands's plugin system:
- Jupyter plugin: for IPython execution via bash command
- SWE-agent tool plugin: Powerful bash command line tools for software development tasks introduced by swe-agent.
Demo
https://github.com/All-Hands-AI/OpenHands/assets/38853559/f592a192-e86c-4f48-ad31-d69282d5f6ac
Example of CodeActAgent with gpt-4-turbo-2024-04-09 performing a data science task (linear regression)
Actions
Action,
CmdRunAction,
IPythonRunCellAction,
AgentEchoAction,
AgentFinishAction,
AgentTalkAction
Observations
CmdOutputObservation,
IPythonRunCellObservation,
AgentMessageObservation,
UserMessageObservation
Methods
| Method | Description |
|---|---|
__init__ |
Initializes an agent with llm and a list of messages list[Mapping[str, str]] |
step |
Performs one step using the CodeAct Agent. This includes gathering info on previous steps and prompting the model to make a command to execute. |
Planner Agent
Description
The planner agent utilizes a special prompting strategy to create long term plans for solving problems. The agent is given its previous action-observation pairs, current task, and hint based on last action taken at every step.
Actions
NullAction,
CmdRunAction,
BrowseURLAction,
GithubPushAction,
FileReadAction,
FileWriteAction,
AgentThinkAction,
AgentFinishAction,
AgentSummarizeAction,
AddTaskAction,
ModifyTaskAction,
Observations
Observation,
NullObservation,
CmdOutputObservation,
FileReadObservation,
BrowserOutputObservation
Methods
| Method | Description |
|---|---|
__init__ |
Initializes an agent with llm |
step |
Checks to see if current step is completed, returns AgentFinishAction if True. Otherwise, creates a plan prompt and sends to model for inference, adding the result as the next action. |