mirror of
https://github.com/OpenHands/OpenHands.git
synced 2025-12-26 05:48:36 +08:00
OpenHands Architecture
This directory contains the core components of OpenHands.
This diagram provides an overview of the roles of each component and how they communicate and collaborate.

Classes
The key classes in OpenHands are:
- LLM: brokers all interactions with large language models. Works with any underlying completion model, thanks to LiteLLM.
- Agent: responsible for looking at the current State, and producing an Action that moves one step closer toward the end-goal.
- AgentController: initializes the Agent, manages State, and drives the main loop that pushes the Agent forward, step by step.
- State: represents the current state of the Agent's task. Includes things like the current step, a history of recent events, the Agent's long-term plan, etc.
- EventStream: a central hub for Events, where any component can publish Events, or listen for Events published by other components
- Event: an Action or Observation
- Action: represents a request to e.g. edit a file, run a command, or send a message
- Observation: represents information collected from the environment, e.g. file contents or command output
- Runtime: responsible for performing Actions, and sending back Observations
- Sandbox: the part of the runtime responsible for running commands, e.g. inside of Docker
- Server: brokers OpenHands sessions over HTTP, e.g. to drive the frontend
- Session: holds a single EventStream, a single AgentController, and a single Runtime. Generally represents a single task (but potentially including several user prompts)
- SessionManager: keeps a list of active sessions, and ensures requests are routed to the correct Session
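To make the relationship between Event, Action, Observation, and EventStream more concrete, here is a minimal publish/subscribe sketch. It is an illustration only: the `command` and `content` fields, the `subscribe` and `publish` methods, and the `history` attribute are assumptions made for this example and are not taken from the actual OpenHands code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Event:
    """Base type: every Event is either an Action or an Observation."""


@dataclass
class Action(Event):
    """A request to do something, e.g. run a command (field name is hypothetical)."""
    command: str


@dataclass
class Observation(Event):
    """Information collected from the environment (field name is hypothetical)."""
    content: str


class EventStream:
    """Hypothetical central hub: any component can publish or listen."""

    def __init__(self) -> None:
        self._subscribers: list[Callable[[Event], None]] = []
        self.history: list[Event] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        # Register a listener that is called for every published Event.
        self._subscribers.append(callback)

    def publish(self, event: Event) -> None:
        # Record the Event, then notify every listener in registration order.
        self.history.append(event)
        for callback in list(self._subscribers):
            callback(event)


stream = EventStream()
seen: list[str] = []
stream.subscribe(lambda e: seen.append(type(e).__name__))
stream.publish(Action(command="ls"))
stream.publish(Observation(content="README.md"))
print(seen)  # ['Action', 'Observation']
```

In this sketch the hub knows nothing about who produces or consumes Events, which is the property that lets the Agent, the Runtime, and the frontend stay decoupled from one another.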
Control Flow
Here's the basic loop (in pseudocode) that drives agents.
while True:
    prompt = agent.generate_prompt(state)
    response = llm.completion(prompt)
    action = agent.parse_response(response)
    observation = runtime.run(action)
    state = state.update(action, observation)
In reality, most of this is achieved through message passing, via the EventStream. The EventStream serves as the backbone for all communication in OpenHands.
flowchart LR
  Agent--Actions-->AgentController
  AgentController--State-->Agent
  AgentController--Actions-->EventStream
  EventStream--Observations-->AgentController
  Runtime--Observations-->EventStream
  EventStream--Actions-->Runtime
  Frontend--Actions-->EventStream
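The flowchart above can be approximated in a few dozen lines. The sketch below replaces the direct calls of the pseudocode loop with message passing: a controller publishes Actions, a runtime reacts to them and publishes Observations, and the controller reacts to those in turn. Every class here (`EchoRuntime`, `ScriptedAgent`, `Controller`) is a hypothetical stand-in written for this example, the agent returns canned Actions instead of calling an LLM, and none of the method names are taken from the real OpenHands API.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Action:
    command: str


@dataclass
class Observation:
    content: str


class EventStream:
    """Hypothetical hub that delivers queued events to all subscribers."""

    def __init__(self):
        self.subscribers = []
        self.queue = deque()

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, event):
        self.queue.append(event)

    def run(self):
        # Deliver events in order until no component publishes anything new.
        while self.queue:
            event = self.queue.popleft()
            for callback in self.subscribers:
                callback(event)


class EchoRuntime:
    """Stand-in Runtime: 'performs' an Action by echoing its command."""

    def __init__(self, stream):
        self.stream = stream
        stream.subscribe(self.on_event)

    def on_event(self, event):
        if isinstance(event, Action):
            self.stream.publish(Observation(content=f"ran: {event.command}"))


class ScriptedAgent:
    """Stand-in Agent: returns canned Actions instead of calling an LLM."""

    def __init__(self, commands):
        self.commands = list(commands)

    def step(self, state):
        return Action(self.commands.pop(0)) if self.commands else None


class Controller:
    """Stand-in AgentController: keeps State and asks the Agent for Actions."""

    def __init__(self, agent, stream):
        self.agent, self.stream = agent, stream
        self.state = []  # simplified State: a history of observation contents
        stream.subscribe(self.on_event)

    def start(self):
        self._step()

    def on_event(self, event):
        if isinstance(event, Observation):
            self.state.append(event.content)
            self._step()

    def _step(self):
        action = self.agent.step(self.state)
        if action is not None:
            self.stream.publish(action)


stream = EventStream()
runtime = EchoRuntime(stream)
controller = Controller(ScriptedAgent(["ls", "pwd"]), stream)
controller.start()
stream.run()
print(controller.state)  # ['ran: ls', 'ran: pwd']
```

Note that the controller and the runtime never call each other directly; each only subscribes to the stream and publishes to it, so a frontend could publish Actions into the same stream without either of them changing.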
Runtime
Please refer to the documentation to learn more about Runtime.