Agent Framework Research
This folder may contain multiple Agent implementations that the framework can use, for example agenthub/codeact_agent.
Contributors from different backgrounds and interests can choose to contribute to any (or all!) of these directions.
Constructing an Agent
The abstraction for an agent can be found here.
Agents are run inside of a loop. At each iteration, agent.step() is called with a
State input, and the agent must output an Action.
Every agent also has a self.llm which it can use to interact with the LLM configured by the user.
See the LiteLLM docs for self.llm.completion.
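The loop described above can be sketched roughly as follows. This is a minimal illustration and not the framework's actual controller: the Action and State classes, CountingAgent, and run_loop are hypothetical stand-ins defined only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """Hypothetical stand-in for the framework's Action base class."""
    name: str


@dataclass
class State:
    """Hypothetical stand-in that only tracks past actions."""
    history: list = field(default_factory=list)


class CountingAgent:
    """Toy agent: issues a fixed number of 'run' actions, then finishes."""

    def __init__(self, max_commands: int = 2):
        self.max_commands = max_commands

    def step(self, state: State) -> Action:
        if len(state.history) < self.max_commands:
            return Action("run")
        return Action("finish")


def run_loop(agent, state: State, max_iterations: int = 10) -> list:
    """Call agent.step() once per iteration until the agent finishes
    or the iteration budget is exhausted (assumed loop shape)."""
    for _ in range(max_iterations):
        action = agent.step(state)
        state.history.append(action)
        if action.name == "finish":
            break
    return [a.name for a in state.history]


print(run_loop(CountingAgent(), State()))  # ['run', 'run', 'finish']
```

The iteration cap is there so that an agent which never emits a finishing action cannot loop forever.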
State
The state contains:
- A history of actions taken by the agent, as well as any observations (e.g. file content, command output) from those actions
- A list of actions/observations that have happened since the most recent step
- A root_task, which contains a plan of action
  - The agent can add and modify subtasks through the AddTaskAction and ModifyTaskAction
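As a rough picture of the data involved, the three items in the state could be modelled as below. The Task class, the recent_events field name, and the two helper functions are assumptions made for illustration; the real State class may be organised differently.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical plan node: a goal plus nested subtasks."""
    goal: str
    state: str = "open"
    subtasks: list["Task"] = field(default_factory=list)


@dataclass
class State:
    """Hypothetical stand-in mirroring the three items listed above."""
    # (action, observation) pairs for everything the agent has done
    history: list[tuple[str, str]] = field(default_factory=list)
    # events that happened since the most recent step
    recent_events: list[tuple[str, str]] = field(default_factory=list)
    # the plan of action
    root_task: Task = field(default_factory=lambda: Task("root"))


def add_subtask(state: State, goal: str) -> Task:
    """Roughly the effect an AddTaskAction would have on the plan."""
    task = Task(goal)
    state.root_task.subtasks.append(task)
    return task


def modify_subtask(task: Task, new_state: str) -> None:
    """Roughly the effect a ModifyTaskAction would have on a subtask."""
    task.state = new_state
```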
Actions
Here is a list of available Actions, which can be returned by agent.step():
- CmdRunAction - Runs a command inside a sandboxed terminal
- IPythonRunCellAction - Executes a block of Python code interactively (in a Jupyter notebook) and receives a CmdOutputObservation. Requires the jupyter plugin to be set up.
- FileReadAction - Reads the content of a file
- FileWriteAction - Writes new content to a file
- BrowseURLAction - Gets the content of a URL
- AgentRecallAction - Searches memory (e.g. a vector database)
- AddTaskAction - Adds a subtask to the plan
- ModifyTaskAction - Changes the state of a subtask
- AgentFinishAction - Stops the control loop, allowing the user/delegator agent to enter a new task
- AgentRejectAction - Stops the control loop, allowing the user/delegator agent to enter a new task
- MessageAction - Represents a message from an agent or the user
You can use action.to_dict() and action_from_dict to serialize and deserialize actions.
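A serialization round trip might look like the sketch below. The classes, the "action" tag, and the registry are simplified stand-ins written for this example; the dictionary layout the framework actually produces is not shown in this document and may differ.

```python
from dataclasses import dataclass, asdict


@dataclass
class CmdRunAction:
    """Hypothetical stand-in; the real class likely has more fields."""
    command: str
    action: str = "run"  # assumed type tag used to pick the class on load

    def to_dict(self) -> dict:
        return asdict(self)


@dataclass
class MessageAction:
    """Hypothetical stand-in for a message action."""
    content: str
    action: str = "message"

    def to_dict(self) -> dict:
        return asdict(self)


# Assumed registry mapping the type tag back to a class.
ACTION_TYPES = {"run": CmdRunAction, "message": MessageAction}


def action_from_dict(data: dict):
    """Rebuild an action from its dictionary form."""
    data = dict(data)  # copy so the caller's dict is not mutated
    cls = ACTION_TYPES[data.pop("action")]
    return cls(**data)


original = CmdRunAction(command="ls -la")
restored = action_from_dict(original.to_dict())
assert restored == original
```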
Observations
There are also several types of Observations. They are typically available in the step following the corresponding Action, but they may also appear as a result of asynchronous events (e.g. a message from the user).
Here is a list of available Observations:
- CmdOutputObservation
- BrowserOutputObservation
- FileReadObservation
- FileWriteObservation
- AgentRecallObservation
- ErrorObservation
- SuccessObservation
You can use observation.to_dict() and observation_from_dict to serialize and deserialize observations.
Interface
Every agent must implement the following methods:
step
def step(self, state: "State") -> "Action":
step moves the agent forward one step towards its goal. This typically means
sending a prompt to the LLM, then parsing the response into an Action.
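A minimal sketch of that prompt-then-parse pattern is shown below. It assumes a plain callable that maps a prompt string to a reply string in place of self.llm.completion (whose real call signature and return type are described in the LiteLLM docs), and the action and state classes are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CmdRunAction:
    """Hypothetical stand-in for the real action class."""
    command: str


@dataclass
class AgentFinishAction:
    """Hypothetical stand-in for the real action class."""


@dataclass
class State:
    """Hypothetical stand-in that only tracks past events."""
    history: list = field(default_factory=list)


class SimpleAgent:
    def __init__(self, complete: Callable[[str], str]):
        # Stand-in for self.llm: any function from prompt text to reply text.
        self.complete = complete

    def step(self, state: State):
        # 1. Build a prompt from what has happened so far.
        past = "\n".join(str(event) for event in state.history)
        prompt = f"Past events:\n{past}\nReply with a shell command, or DONE."
        # 2. Ask the (stand-in) LLM.
        reply = self.complete(prompt).strip()
        # 3. Parse the reply into an Action.
        if reply == "DONE":
            return AgentFinishAction()
        return CmdRunAction(command=reply)
```

Passing the LLM in as a callable keeps the example testable without network access; a real agent would call its configured self.llm instead.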
search_memory
def search_memory(self, query: str) -> list[str]:
search_memory should return a list of events that match the query. This will be used
for the recall action.
You can optionally just return [] for this method, meaning the agent has no long-term memory.
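For illustration, here is one very simple way search_memory could be backed by an in-memory list with keyword matching. The KeywordMemoryAgent class and its remember method are hypothetical; a real agent might query a vector database instead.

```python
class KeywordMemoryAgent:
    """Toy agent whose long-term memory is a list of event strings."""

    def __init__(self):
        self.memory: list[str] = []

    def remember(self, event: str) -> None:
        """Store an event description (hypothetical helper)."""
        self.memory.append(event)

    def search_memory(self, query: str) -> list[str]:
        """Return stored events containing every word of the query,
        compared case-insensitively."""
        words = query.lower().split()
        return [
            event for event in self.memory
            if all(word in event.lower() for word in words)
        ]


class NoMemoryAgent:
    """Agent with no long-term memory: always returns an empty list."""

    def search_memory(self, query: str) -> list[str]:
        return []
```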