mirror of https://github.com/OpenHands/OpenHands.git synced 2025-12-26 05:48:36 +08:00

[Agent] Improve edits by adding back edit_file_by_line (#2722)

* add replace-based block edit & preliminary test case fix

* further fix the insert behavior

* make edit only work on first occurrence

* bump codeact version since we now use new edit agentskills

* update prompt for new agentskills

* update integration tests

* make run_infer.sh executable

* remove code block for edit_file

* update integration test for prompt changes

* default to not use hint for eval

* fix insert emptyfile bug

* throw value error when `to_replace` is empty

* make `_edit_or_insert_file` return string so we can try to fix some linter errors (best attempt)

* add todo

* update integration test

* fix sandbox test for this PR

* fix inserting with additional newline

* rename to edit_file_by_replace

* add back `edit_file_by_line`

* update prompt for new editing tool

* fix integration tests

* bump codeact version since there are more changes

* add back append file

* fix current line for append

* fix append unit tests

* change the location where we show edited line no to agent and fix tests

* update integration tests

* fix global window size affect by open_file bug

* fix global window size affect by open_file bug

* increase window size to 300

* add file beginning and ending marker to avoid looping

* expand the editor window to better display edit error for model

* refactor to break down edit into internal functions

* reduce window to 200

* move window to 100

* refactor to clean up some logic into _calculate_window_bounds

* fix integration tests

* fix sandbox test on new prompt

* update demonstration with new changes

* fix integration

* initialize llm inside process_instance to circumvent "AttributeError: Can't pickle local object"

* update kwargs

* retry for internal server error

* fix max iteration

* override max iter from config

* fix integration tests

* remove edit file by line

* fix integration tests

* add instruction to avoid hanging

* Revert "add instruction to avoid hanging"

This reverts commit 06fd2c59387c1c2348bc95cb487af1eb913c6ddd.

* handle content policy violation error

* fix integration tests

* fix typo in prompt - the window is 100

* update all integration tests

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>

2024-07-11 15:30:20 +00:00

browsing_agent

Refactoring: event stream based agent history (#2709)

2024-07-07 21:04:23 +00:00

codeact_agent

[Agent] Improve edits by adding back edit_file_by_line (#2722)

2024-07-11 15:30:20 +00:00

codeact_swe_agent

Customize LLM config per agent (#2756)

2024-07-09 22:05:54 -07:00

delegator_agent

Refactoring: event stream based agent history (#2709)

2024-07-07 21:04:23 +00:00

dummy_agent

Refactoring: event stream based agent history (#2709)

2024-07-07 21:04:23 +00:00

micro

Customize LLM config per agent (#2756)

2024-07-09 22:05:54 -07:00

monologue_agent

Customize LLM config per agent (#2756)

2024-07-09 22:05:54 -07:00

planner_agent

Customize LLM config per agent (#2756)

2024-07-09 22:05:54 -07:00

__init__.py

remove swe agent (#2708)

2024-07-01 12:27:14 +09:00

README.md

[Arch] Remove supports for Background Commands (#2803)

2024-07-06 03:38:05 +08:00

README.md

Agent Framework Research

This folder holds the Agent implementations used by the framework. There may be several of them.

One example is agenthub/codeact_agent. Contributors from different backgrounds and interests can choose to contribute to any (or all!) of these directions.

Constructing an Agent

The abstraction for an agent can be found here.

Agents are run inside of a loop. At each iteration, agent.step() is called with a State input, and the agent must output an Action.
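
The loop described above can be sketched as follows. This is a minimal illustration, not the framework's actual controller: the `Action`, `State`, `EchoAgent`, and `run_loop` names are hypothetical stand-ins, and the real classes carry more fields.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """Hypothetical stand-in for the framework's Action base class."""
    name: str


@dataclass
class State:
    """Hypothetical stand-in that only tracks the actions taken so far."""
    history: list = field(default_factory=list)


class EchoAgent:
    """Toy agent: emits one 'run' action, then asks to finish."""

    def step(self, state: State) -> Action:
        if not state.history:
            return Action("run")
        return Action("finish")


def run_loop(agent, state: State, max_iterations: int = 10) -> State:
    """Call agent.step() once per iteration until the agent finishes."""
    for _ in range(max_iterations):
        action = agent.step(state)
        state.history.append(action)
        if action.name == "finish":
            break
    return state


final_state = run_loop(EchoAgent(), State())
print([a.name for a in final_state.history])  # prints ['run', 'finish']
```

The `max_iterations` cap is an assumption added here so that a misbehaving agent cannot loop forever.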

Every agent also has a self.llm which it can use to interact with the LLM configured by the user. See the LiteLLM docs for self.llm.completion.

State

The state contains:

- A history of actions taken by the agent, as well as any observations (e.g. file content, command output) from those actions
- A list of actions/observations that have happened since the most recent step
- A root_task, which contains a plan of action
  - The agent can add and modify subtasks through the AddTaskAction and ModifyTaskAction

Actions

Here is a list of available Actions, which can be returned by agent.step():

- CmdRunAction - Runs a command inside a sandboxed terminal
- IPythonRunCellAction - Executes a block of Python code interactively (in a Jupyter notebook) and receives a CmdOutputObservation. Requires the jupyter plugin to be set up.
- FileReadAction - Reads the content of a file
- FileWriteAction - Writes new content to a file
- BrowseURLAction - Gets the content of a URL
- AgentRecallAction - Searches memory (e.g. a vector database)
- AddTaskAction - Adds a subtask to the plan
- ModifyTaskAction - Changes the state of a subtask
- AgentFinishAction - Stops the control loop, allowing the user/delegator agent to enter a new task
- AgentRejectAction - Stops the control loop, allowing the user/delegator agent to enter a new task
- MessageAction - Represents a message from an agent or the user

You can use action.to_dict() and action_from_dict to serialize and deserialize actions.
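
As a rough sketch of what such a round trip looks like, the example below serializes an action to a plain dict and rebuilds it. The classes, the `ACTION_TYPES` registry, and the `{"action": ..., "args": ...}` dict layout are assumptions made for illustration and may not match the framework's real format.

```python
from dataclasses import dataclass


@dataclass
class CmdRunAction:
    """Hypothetical stand-in; the real class has more fields."""
    command: str
    action: str = "run"

    def to_dict(self) -> dict:
        return {"action": self.action, "args": {"command": self.command}}


@dataclass
class MessageAction:
    """Hypothetical stand-in for a message from an agent or the user."""
    content: str
    action: str = "message"

    def to_dict(self) -> dict:
        return {"action": self.action, "args": {"content": self.content}}


# Assumed registry mapping the "action" key to the class that handles it.
ACTION_TYPES = {"run": CmdRunAction, "message": MessageAction}


def action_from_dict(data: dict):
    """Rebuild an action from its dict form, rejecting unknown types."""
    cls = ACTION_TYPES.get(data.get("action"))
    if cls is None:
        raise ValueError(f"unknown action type: {data.get('action')!r}")
    return cls(**data.get("args", {}))


original = CmdRunAction(command="ls -la")
restored = action_from_dict(original.to_dict())
print(restored == original)  # prints True
```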

Observations

There are also several types of Observations. These are typically available in the step following the corresponding Action, but they may also appear as a result of asynchronous events (e.g. a message from the user).

Here is a list of available Observations:

You can use observation.to_dict() and observation_from_dict to serialize and deserialize observations.

Interface

Every agent must implement the following methods:

`step`

def step(self, state: "State") -> "Action":

step moves the agent forward one step towards its goal. This typically means sending a prompt to the LLM, then parsing the response into an Action.
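
A minimal sketch of that prompt-then-parse pattern is shown below. Everything in it is an assumption for illustration: the LLM is replaced by a plain callable (`fake_llm`) rather than the real `self.llm.completion`, the `RUN:` reply prefix is an invented convention, and the classes are simplified stand-ins.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CmdRunAction:
    command: str


@dataclass
class MessageAction:
    content: str


@dataclass
class State:
    """Hypothetical stand-in that only carries the task description."""
    task: str


def parse_response(text: str):
    """Turn raw LLM text into an action ('RUN:' is an invented convention)."""
    text = text.strip()
    if text.startswith("RUN:"):
        return CmdRunAction(command=text[len("RUN:"):].strip())
    return MessageAction(content=text)


class PromptingAgent:
    def __init__(self, llm: Callable[[str], str]):
        # Assumption: the LLM is modelled as a function from prompt to reply.
        self.llm = llm

    def step(self, state: State):
        prompt = f"Task: {state.task}\nReply with 'RUN: <command>' or a message."
        return parse_response(self.llm(prompt))


def fake_llm(prompt: str) -> str:
    """Canned reply so the example runs without network access."""
    return "RUN: ls -la"


agent = PromptingAgent(fake_llm)
print(agent.step(State(task="list files")))
```

Keeping the parsing in its own function makes it easy to test the response handling separately from the LLM call.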

`search_memory`

def search_memory(self, query: str) -> list[str]:

search_memory should return a list of events that match the query. This will be used for the recall action.

You can optionally just return [] for this method, meaning the agent has no long-term memory.
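
Both options can be sketched as below. The class names, the `remember` helper, and the case-insensitive substring matching are hypothetical choices made for this example. A real agent might query a vector database instead.

```python
class NoMemoryAgent:
    """Agent without long-term memory: every query returns no events."""

    def search_memory(self, query: str) -> list[str]:
        return []


class KeywordMemoryAgent:
    """Agent that stores events as strings and matches them by substring."""

    def __init__(self) -> None:
        self.events: list[str] = []

    def remember(self, event: str) -> None:
        # Hypothetical helper for this sketch, not part of the interface.
        self.events.append(event)

    def search_memory(self, query: str) -> list[str]:
        needle = query.lower()
        return [event for event in self.events if needle in event.lower()]


agent = KeywordMemoryAgent()
agent.remember("Ran `pytest`: 3 tests failed")
agent.remember("Opened README.md")
print(agent.search_memory("readme"))  # prints ['Opened README.md']
```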