Xingyao Wang b19b724eae
feat: show exact python interpreter to the agent in IPython and Bash (#3448)
* try to fix pip unavailable

* update test case for pip

* force rebuild in CI

* remove extra symlink

* fix newline

* added semi-colon to line 31

* Dockerfile.j2: activate env at the end

* Revert "Dockerfile.j2: activate env at the end"

This reverts commit cf2f5651021fe80d4ab69a35a85f0a35b29dc3d7.

* cleanup Dockerfile

* switch default python image

* remove image agnostic (no longer used)

* fix tests

* simplify integration tests default image

* add nodejs specific runtime tests

* update tests and workflows

* switch to nikolaik/python-nodejs:python3.11-nodejs22

* update build sh to output image name correctly

* increase custom images to test

* fix test

* fix test

* fix double quote

* try fixing ci

* update ghcr workflow

* fix artifact name

* try to fix ghcr again

* fix workflow

* save built image to correct dir

* remove extra -docker-image

* make last tag to be human readable image tag

* fix hyphen to underscore

* run test runtime on all tags

* revert app build

* separate ghcr workflow

* update dockerfile for eval

* fix tag for test run

* try fix tag

* try fix tag via matrix output

* try workflow again

* update comments

* try fixing test matrix

* fix artifact name

* try fix tag again

* Revert "try fix tag again"

This reverts commit b369badd8cccf4a526e36d27eafb77ea2d32f6be.

* tweak filename

* try different path

* fix filepath

* try fix tag artifact path again

* save json instead of line

* update matrix

* print all tags in workflow

* support only streaming diff logs from the runtime client

* remove strip from log line to fix indentation

* get py interpreter for jupyter

* rstrip to remove newline on the rightside for logging

* fix blocking issue for stream logs

* set python interpreter path in bash ps1

* update testcase for jupyter py interpreter path

* remove accidentally added changes

* remove accidentally added changes

* only print dockerfile when debug

* add docs

* remove extra tests that weren't supposed to be in this pr

* add back missing test

* revert

* make LogBuffer synchronous to fix hang in integration tests

* fix integration tests

* Update opendevin/runtime/client/client.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* fix test case

* fix integration tests

* change deque to list

* update integration tests

* rename test runtime

* fix docs

* rename opendevin to openhands in tests

---------

Co-authored-by: tobitege <tobitege@gmx.de>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-08-21 20:08:50 +00:00
..
2024-08-20 00:44:54 +08:00

OpenHands Architecture

This directory contains the core components of OpenHands.

This diagram provides an overview of the roles of each component and how they communicate and collaborate. OpenHands System Architecture Diagram (July 4, 2024)

Classes

The key classes in OpenHands are:

  • LLM: brokers all interactions with large language models. Works with any underlying completion model, thanks to LiteLLM.
  • Agent: responsible for looking at the current State, and producing an Action that moves one step closer toward the end-goal.
  • AgentController: initializes the Agent, manages State, and drive the main loop that pushes the Agent forward, step by step
  • State: represents the current state of the Agent's task. Includes things like the current step, a history of recent events, the Agent's long-term plan, etc
  • EventStream: a central hub for Events, where any component can publish Events, or listen for Events published by other components
    • Event: an Action or Observeration
      • Action: represents a request to e.g. edit a file, run a command, or send a message
      • Observation: represents information collected from the environment, e.g. file contents or command output
  • Runtime: responsible for performing Actions, and sending back Observations
    • Sandbox: the part of the runtime responsible for running commands, e.g. inside of Docker
  • Server: brokers OpenHands sessions over HTTP, e.g. to drive the frontend
    • Session: holds a single EventStream, a single AgentController, and a single Runtime. Generally represents a single task (but potentially including several user prompts)
    • SessionManager: keeps a list of active sessions, and ensures requests are routed to the correct Session

Control Flow

Here's the basic loop (in pseudocode) that drives agents.

while True:
  prompt = agent.generate_prompt(state)
  response = llm.completion(prompt)
  action = agent.parse_response(response)
  observation = runtime.run(action)
  state = state.update(action, observation)

In reality, most of this is achieved through message passing, via the EventStream. The EventStream serves as the backbone for all communication in OpenHands.

flowchart LR
  Agent--Actions-->AgentController
  AgentController--State-->Agent
  AgentController--Actions-->EventStream
  EventStream--Observations-->AgentController
  Runtime--Observations-->EventStream
  EventStream--Actions-->Runtime
  Frontend--Actions-->EventStream

Runtime

Please refer to the documentation to learn more about Runtime.