Agent Framework Research
This folder may contain multiple implementations of Agent that can be used by the framework,
for example agenthub/codeact_agent.
Contributors from different backgrounds and interests can choose to contribute to any (or all!) of these directions.
Constructing an Agent
The abstraction for an agent can be found here.
Agents are run inside a loop. At each iteration, agent.step() is called with a
State input, and the agent must output an Action.
Every agent also has a self.llm which it can use to interact with the LLM configured by the user.
See the LiteLLM docs for self.llm.completion.
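For illustration, here is a minimal sketch of calling self.llm.completion from inside an agent. The message contents are placeholders, and the response indexing follows LiteLLM's OpenAI-style response format.

```python
# Inside an Agent subclass: self.llm.completion is pre-configured with the
# model and credentials chosen by the user, so only messages are needed here.
messages = [
    {'role': 'system', 'content': 'You are a helpful coding agent.'},      # placeholder prompt
    {'role': 'user', 'content': 'List the files in the current directory.'},
]
response = self.llm.completion(messages=messages)
# LiteLLM returns an OpenAI-style response object.
text = response['choices'][0]['message']['content']
```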
State
The state contains:
- A history of actions taken by the agent, as well as any observations (e.g. file content, command output) from those actions
- A list of actions/observations that have happened since the most recent step
- A root_task, which contains a plan of action
  - The agent can add and modify subtasks through the AddTaskAction and ModifyTaskAction
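As a rough sketch only: the shapes assumed below (state.history as (action, observation) pairs, plus the hypothetical build_messages helper and self.system_prompt attribute) are illustrations, not part of the documented interface; check the State class for the exact structure.

```python
def build_messages(self, state: 'State') -> list[dict]:
    """Hypothetical helper: turn the state's history into chat messages."""
    # Assumption: state.history yields (action, observation) pairs; verify
    # this against the State definition in your version of the codebase.
    messages = [{'role': 'system', 'content': self.system_prompt}]  # hypothetical attribute
    for action, observation in state.history:
        messages.append({'role': 'assistant', 'content': str(action)})
        messages.append({'role': 'user', 'content': str(observation)})
    return messages
```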
Actions
Here is a list of available Actions, which can be returned by agent.step():
- CmdRunAction - Runs a command inside a sandboxed terminal
- CmdKillAction - Kills a background command
- IPythonRunCellAction - Executes a block of Python code interactively (in a Jupyter notebook) and receives a CmdOutputObservation. Requires the jupyter plugin to be set up.
- FileReadAction - Reads the content of a file
- FileWriteAction - Writes new content to a file
- BrowseURLAction - Gets the content of a URL
- AgentRecallAction - Searches memory (e.g. a vector database)
- AddTaskAction - Adds a subtask to the plan
- ModifyTaskAction - Changes the state of a subtask
- AgentFinishAction - Stops the control loop, allowing the user/delegator agent to enter a new task
- AgentRejectAction - Stops the control loop, allowing the user/delegator agent to enter a new task
- MessageAction - Represents a message from an agent or the user
You can use action.to_dict() and action_from_dict to serialize and deserialize actions.
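A hedged sketch of the round trip: the import paths below are assumptions and may differ in your checkout, but to_dict() and action_from_dict themselves are the documented entry points.

```python
# Import paths are an assumption; adjust them to your checkout.
from opendevin.events.action import CmdRunAction
from opendevin.events.serialization import action_from_dict

action = CmdRunAction(command='ls -la')
payload = action.to_dict()             # plain dict, easy to log or send over the wire
restored = action_from_dict(payload)   # rebuilds an equivalent CmdRunAction
assert restored.command == action.command
```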
Observations
There are also several types of Observations. These are typically available in the step following the corresponding Action. But they may also appear as a result of asynchronous events (e.g. a message from the user, logs from a command running in the background).
Here is a list of available Observations:
- CmdOutputObservation
- BrowserOutputObservation
- FileReadObservation
- FileWriteObservation
- AgentRecallObservation
- ErrorObservation
- SuccessObservation
You can use observation.to_dict() and observation_from_dict to serialize and deserialize observations.
Interface
Every agent must implement the following methods:
step
```python
def step(self, state: "State") -> "Action"
```
step moves the agent forward one step towards its goal. This probably means
sending a prompt to the LLM, then parsing the response into an Action.
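A minimal sketch of a step implementation, assuming the import path below and reusing the hypothetical build_messages helper from the State section; real agents such as agenthub/codeact_agent do far more prompt construction and output parsing.

```python
from opendevin.events.action import AgentFinishAction, CmdRunAction  # assumed import path

def step(self, state: 'State') -> 'Action':
    # Turn the task and prior history into chat messages (hypothetical helper).
    messages = self.build_messages(state)
    response = self.llm.completion(messages=messages)
    text = response['choices'][0]['message']['content']

    # Parse the model output into one of the available Actions.
    if text.strip().lower() == 'finish':
        return AgentFinishAction()
    return CmdRunAction(command=text.strip())
```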
search_memory
```python
def search_memory(self, query: str) -> list[str]:
```
search_memory should return a list of events that match the query. This will be used
for the recall action.
You can optionally just return [] for this method, meaning the agent has no long-term memory.
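For instance, a naive in-memory variant (a sketch only; self.memory is a hypothetical list of strings the agent would maintain itself, not part of the interface):

```python
def search_memory(self, query: str) -> list[str]:
    # Naive substring match over previously stored events.
    # self.memory is a hypothetical list[str] maintained by the agent itself.
    return [event for event in self.memory if query.lower() in event.lower()]
```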