Add architecture diagrams explaining system components and WebSocket flow (#12542)

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Saurya <saurya@openhands.dev>
Co-authored-by: Ray Myers <ray.myers@gmail.com>
This commit is contained in:
Saurya Velagapudi
2026-03-17 08:52:40 -07:00
committed by GitHub
parent d58e12ad74
commit b68c75252d
9 changed files with 510 additions and 2 deletions

View File

@@ -0,0 +1,13 @@
# Enterprise Architecture Documentation
Architecture diagrams specific to the OpenHands SaaS/Enterprise deployment.
## Documentation
- [Authentication Flow](./authentication.md) - Keycloak-based authentication for SaaS deployment
- [External Integrations](./external-integrations.md) - GitHub, Slack, Jira, and other service integrations
## Related Documentation
For core OpenHands architecture (applicable to all deployments), see:
- [Core Architecture Documentation](../../../openhands/architecture/README.md)

View File

@@ -0,0 +1,58 @@
# Authentication Flow (SaaS Deployment)
OpenHands uses Keycloak for identity management in the SaaS deployment. The authentication flow involves multiple services:
```mermaid
sequenceDiagram
autonumber
participant User as User (Browser)
participant App as App Server
participant KC as Keycloak
participant IdP as Identity Provider<br/>(GitHub, Google, etc.)
participant DB as User Database
Note over User,DB: OAuth 2.0 / OIDC Authentication Flow
User->>App: Access OpenHands
App->>User: Redirect to Keycloak
User->>KC: Login request
KC->>User: Show login options
User->>KC: Select provider (e.g., GitHub)
KC->>IdP: OAuth redirect
User->>IdP: Authenticate
IdP-->>KC: OAuth callback + tokens
Note over KC: Create/update user session
KC-->>User: Redirect with auth code
User->>App: Auth code
App->>KC: Exchange code for tokens
KC-->>App: Access token + Refresh token
Note over App: Create signed JWT cookie
App->>DB: Store/update user record
App-->>User: Set keycloak_auth cookie
Note over User,DB: Subsequent Requests
User->>App: Request with cookie
Note over App: Verify JWT signature
App->>KC: Validate token (if needed)
KC-->>App: Token valid
Note over App: Extract user context
App-->>User: Authorized response
```
### Authentication Components
| Component | Purpose | Location |
|-----------|---------|----------|
| **Keycloak** | Identity provider, SSO, token management | External service |
| **UserAuth** | Abstract auth interface | `openhands/server/user_auth/user_auth.py` |
| **SaasUserAuth** | Keycloak implementation | `enterprise/server/auth/saas_user_auth.py` |
| **JWT Service** | Token signing/verification | `openhands/app_server/services/jwt_service.py` |
| **Auth Routes** | Login/logout endpoints | `enterprise/server/routes/auth.py` |
### Token Flow
1. **Keycloak Access Token**: Short-lived token for API access
2. **Keycloak Refresh Token**: Long-lived token to obtain new access tokens
3. **Signed JWT Cookie**: App Server's session cookie containing encrypted Keycloak tokens
4. **Provider Tokens**: OAuth tokens for GitHub, GitLab, etc. (stored separately for git operations)

View File

@@ -0,0 +1,88 @@
# External Integrations
OpenHands integrates with external services (GitHub, Slack, Jira, etc.) through webhook-based event handling:
```mermaid
sequenceDiagram
autonumber
participant Ext as External Service<br/>(GitHub/Slack/Jira)
participant App as App Server
participant IntRouter as Integration Router
participant Manager as Integration Manager
participant Conv as Conversation Service
participant Sandbox as Sandbox
Note over Ext,Sandbox: Webhook Event Flow (e.g., GitHub Issue Created)
Ext->>App: POST /api/integration/{service}/events
App->>IntRouter: Route to service handler
Note over IntRouter: Verify signature (HMAC)
IntRouter->>Manager: Parse event payload
Note over Manager: Extract context (repo, issue, user)
Note over Manager: Map external user → OpenHands user
Manager->>Conv: Create conversation (with issue context)
Conv->>Sandbox: Provision sandbox
Sandbox-->>Conv: Ready
Manager->>Sandbox: Start agent with task
Note over Ext,Sandbox: Agent Works on Task...
Sandbox-->>Manager: Task complete
Manager->>Ext: POST result<br/>(PR, comment, etc.)
Note over Ext,Sandbox: Callback Flow (Agent → External Service)
Sandbox->>App: Webhook callback<br/>/api/v1/webhooks
App->>Manager: Process callback
Manager->>Ext: Update external service
```
### Supported Integrations
| Integration | Trigger Events | Agent Actions |
|-------------|----------------|---------------|
| **GitHub** | Issue created, PR opened, @mention | Create PR, comment, push commits |
| **GitLab** | Issue created, MR opened | Create MR, comment, push commits |
| **Slack** | @mention in channel | Reply in thread, create tasks |
| **Jira** | Issue created/updated | Update ticket, add comments |
| **Linear** | Issue created | Update status, add comments |
### Integration Components
| Component | Purpose | Location |
|-----------|---------|----------|
| **Integration Routes** | Webhook endpoints per service | `enterprise/server/routes/integration/` |
| **Integration Managers** | Business logic per service | `enterprise/integrations/{service}/` |
| **Token Manager** | Store/retrieve OAuth tokens | `enterprise/server/auth/token_manager.py` |
| **Callback Processor** | Handle agent → service updates | `enterprise/integrations/{service}/*_callback_processor.py` |
### Integration Authentication
```
External Service (e.g., GitHub)
┌─────────────────────────────────┐
│ GitHub App Installation │
│ - Webhook secret for signature │
│ - App private key for API calls │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ User Account Linking │
│ - Keycloak user ID │
│ - GitHub user ID │
│ - Stored OAuth tokens │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ Agent Execution │
│ - Uses linked tokens for API │
│ - Can push, create PRs, comment │
└─────────────────────────────────┘
```

View File

@@ -1,8 +1,14 @@
# OpenHands Architecture
This directory contains the core components of OpenHands. This directory contains the core components of OpenHands.
For an overview of the system architecture, see the [architecture documentation](https://docs.openhands.dev/usage/architecture/backend) (v0 backend architecture). ## Documentation
**[Architecture Documentation](./architecture/README.md)** with diagrams covering:
- System Architecture Overview
- Conversation Startup & WebSocket Flow
- Agent Execution & LLM Flow
- **[External Architecture Docs](https://docs.openhands.dev/usage/architecture/backend)** - Official documentation (v0 backend architecture)
## Classes ## Classes

View File

@@ -0,0 +1,10 @@
# OpenHands Architecture
Architecture diagrams and explanations for the OpenHands system.
## Documentation Sections
- [System Architecture Overview](./system-architecture.md) - Multi-tier architecture and component responsibilities
- [Conversation Startup & WebSocket Flow](./conversation-startup.md) - Runtime provisioning and real-time communication
- [Agent Execution & LLM Flow](./agent-execution.md) - LLM integration and action execution loop
- [Observability](./observability.md) - Logging, metrics, and monitoring

View File

@@ -0,0 +1,92 @@
# Agent Execution & LLM Flow
When the agent executes inside the sandbox, it makes LLM calls through LiteLLM:
```mermaid
sequenceDiagram
autonumber
participant User as User (Browser)
participant AS as Agent Server
participant Agent as Agent<br/>(CodeAct)
participant LLM as LLM Class
participant Lite as LiteLLM
participant Proxy as LLM Proxy<br/>(llm-proxy.app.all-hands.dev)
participant Provider as LLM Provider<br/>(OpenAI, Anthropic, etc.)
participant AES as Action Execution Server
Note over User,AES: Agent Loop - LLM Call Flow
User->>AS: WebSocket: User message
AS->>Agent: Process message
Note over Agent: Build prompt from state
Agent->>LLM: completion(messages, tools)
Note over LLM: Apply config (model, temp, etc.)
alt Using OpenHands Provider
LLM->>Lite: litellm_proxy/{model}
Lite->>Proxy: POST /chat/completions
Note over Proxy: Auth, rate limit, routing
Proxy->>Provider: Forward request
Provider-->>Proxy: Response
Proxy-->>Lite: Response
else Using Direct Provider
LLM->>Lite: {provider}/{model}
Lite->>Provider: Direct API call
Provider-->>Lite: Response
end
Lite-->>LLM: ModelResponse
Note over LLM: Track metrics (cost, tokens)
LLM-->>Agent: Parsed response
Note over Agent: Parse action from response
AS->>User: WebSocket: Action event
Note over User,AES: Action Execution
AS->>AES: HTTP: Execute action
Note over AES: Run command/edit file
AES-->>AS: Observation
AS->>User: WebSocket: Observation event
Note over Agent: Update state
Note over Agent: Loop continues...
```
### LLM Components
| Component | Purpose | Location |
|-----------|---------|----------|
| **LLM Class** | Wrapper with retries, metrics, config | `openhands/llm/llm.py` |
| **LiteLLM** | Universal LLM API adapter | External library |
| **LLM Proxy** | OpenHands managed proxy for billing/routing | `llm-proxy.app.all-hands.dev` |
| **LLM Registry** | Manages multiple LLM instances | `openhands/llm/llm_registry.py` |
### Model Routing
```
User selects model
┌───────────────────┐
│ Model prefix? │
└───────────────────┘
├── openhands/claude-3-5 ──► Rewrite to litellm_proxy/claude-3-5
│ Base URL: llm-proxy.app.all-hands.dev
├── anthropic/claude-3-5 ──► Direct to Anthropic API
│ (User's API key)
├── openai/gpt-4 ──► Direct to OpenAI API
│ (User's API key)
└── azure/gpt-4 ──► Direct to Azure OpenAI
(User's API key + endpoint)
```
### LLM Proxy
When using `openhands/` prefixed models, requests are routed through a managed proxy.
See the [OpenHands documentation](https://docs.openhands.dev/) for details on supported models.

View File

@@ -0,0 +1,68 @@
# Conversation Startup & WebSocket Flow
When a user starts a conversation, this sequence occurs:
```mermaid
sequenceDiagram
autonumber
participant User as User (Browser)
participant App as App Server
participant SS as Sandbox Service
participant RAPI as Runtime API
participant Pool as Warm Pool
participant Sandbox as Sandbox (Container)
participant AS as Agent Server
participant AES as Action Execution Server
Note over User,AES: Phase 1: Conversation Creation
User->>App: POST /api/conversations
Note over App: Authenticate user
App->>SS: Create sandbox
Note over SS,Pool: Phase 2: Runtime Provisioning
SS->>RAPI: POST /start (image, env, config)
RAPI->>Pool: Check for warm runtime
alt Warm runtime available
Pool-->>RAPI: Return warm runtime
Note over RAPI: Assign to session
else No warm runtime
RAPI->>Sandbox: Create new container
Sandbox->>AS: Start Agent Server
Sandbox->>AES: Start Action Execution Server
AES-->>AS: Ready
end
RAPI-->>SS: Runtime URL + session API key
SS-->>App: Sandbox info
App-->>User: Conversation ID + Sandbox URL
Note over User,AES: Phase 3: Direct WebSocket Connection
User->>AS: WebSocket: /sockets/events/{id}
AS-->>User: Connection accepted
AS->>User: Replay historical events
Note over User,AES: Phase 4: User Sends Message
User->>AS: WebSocket: SendMessageRequest
Note over AS: Agent processes message
Note over AS: LLM call → generate action
Note over User,AES: Phase 5: Action Execution Loop
loop Agent Loop
AS->>AES: HTTP: Execute action
Note over AES: Run in sandbox
AES-->>AS: Observation result
AS->>User: WebSocket: Event update
Note over AS: Update state, next action
end
Note over User,AES: Phase 6: Task Complete
AS->>User: WebSocket: AgentStateChanged (FINISHED)
```
### Key Points
1. **Initial Setup via App Server**: The App Server handles authentication and coordinates with the Sandbox Service
2. **Runtime API Provisioning**: The Sandbox Service calls the Runtime API, which checks for warm runtimes before creating new containers
3. **Warm Pool Optimization**: Pre-warmed runtimes reduce startup latency significantly
4. **Direct WebSocket to Sandbox**: Once created, the user's browser connects **directly** to the Agent Server inside the sandbox
5. **App Server Not in Hot Path**: After connection, all real-time communication bypasses the App Server entirely
6. **Agent Server Orchestrates**: The Agent Server manages the AI loop, calling the Action Execution Server for actual command execution

View File

@@ -0,0 +1,85 @@
# Observability
OpenHands provides structured logging and metrics collection for monitoring and debugging.
> **SDK Documentation**: For detailed guidance on observability and metrics in agent development, see:
> - [SDK Observability Guide](https://docs.openhands.dev/sdk/guides/observability)
> - [SDK Metrics Guide](https://docs.openhands.dev/sdk/guides/metrics)
```mermaid
flowchart LR
subgraph Sources["Sources"]
Agent["Agent Server"]
App["App Server"]
Frontend["Frontend"]
end
subgraph Collection["Collection"]
JSONLog["JSON Logs<br/>(stdout)"]
Metrics["Metrics<br/>(Internal)"]
end
subgraph External["External (Optional)"]
LogAgg["Log Aggregator"]
Analytics["Analytics Service"]
end
Agent --> JSONLog
App --> JSONLog
App --> Metrics
JSONLog --> LogAgg
Frontend --> Analytics
```
### Structured Logging
OpenHands uses Python's standard logging library with structured JSON output support.
| Component | Format | Destination | Purpose |
|-----------|--------|-------------|---------|
| **Application Logs** | JSON (when `LOG_JSON=1`) | stdout | Debugging, error tracking |
| **Access Logs** | JSON (Uvicorn) | stdout | Request tracing |
| **LLM Debug Logs** | Plain text | File (optional) | LLM call debugging |
### JSON Log Format
When `LOG_JSON=1` is set, logs are emitted as single-line JSON for ingestion by log aggregators:
```json
{
"message": "Conversation started",
"severity": "INFO",
"conversation_id": "abc-123",
"user_id": "user-456",
"timestamp": "2024-01-15T10:30:00Z"
}
```
Additional context can be added using Python's logger `extra=` parameter (see [Python logging docs](https://docs.python.org/3/library/logging.html)).
### Metrics
| Metric | Tracked By | Storage | Purpose |
|--------|------------|---------|---------|
| **LLM Cost** | `Metrics` class | Conversation stats file | Billing, budget limits |
| **Token Usage** | `Metrics` class | Conversation stats file | Usage analytics |
| **Response Latency** | `Metrics` class | Conversation stats file | Performance monitoring |
### Conversation Stats Persistence
Per-conversation metrics are persisted for analytics:
```python
# Location: openhands/server/services/conversation_stats.py
ConversationStats:
- service_to_metrics: Dict[str, Metrics]
- accumulated_cost: float
- token_usage: TokenUsage
# Stored at: {file_store}/conversation_stats/{conversation_id}.pkl
```
### Integration with External Services
Structured JSON logging allows integration with any log aggregation service (e.g., ELK Stack, Loki, Splunk). Configure your log collector to ingest from container stdout/stderr.

View File

@@ -0,0 +1,88 @@
# System Architecture Overview
OpenHands supports multiple deployment configurations. This document describes the core components and how they interact.
## Local/Docker Deployment
The simplest deployment runs everything locally or in Docker containers:
```mermaid
flowchart TB
subgraph Server["OpenHands Server"]
API["REST API<br/>(FastAPI)"]
ConvMgr["Conversation<br/>Manager"]
Runtime["Runtime<br/>Manager"]
end
subgraph Sandbox["Sandbox (Docker Container)"]
AES["Action Execution<br/>Server"]
Browser["Browser<br/>Environment"]
FS["File System"]
end
User["User"] -->|"HTTP/WebSocket"| API
API --> ConvMgr
ConvMgr --> Runtime
Runtime -->|"Provision"| Sandbox
Server -->|"Execute actions"| AES
AES --> Browser
AES --> FS
```
### Core Components
| Component | Purpose | Location |
|-----------|---------|----------|
| **Server** | REST API, conversation management, runtime orchestration | `openhands/server/` |
| **Runtime** | Abstract interface for sandbox execution | `openhands/runtime/` |
| **Action Execution Server** | Execute bash, file ops, browser actions | Inside sandbox |
| **EventStream** | Central event bus for all communication | `openhands/events/` |
## Scalable Deployment
For production deployments, OpenHands can be configured with a separate Runtime API service:
```mermaid
flowchart TB
subgraph AppServer["App Server"]
API["REST API"]
ConvMgr["Conversation<br/>Manager"]
end
subgraph RuntimeAPI["Runtime API (Optional)"]
RuntimeMgr["Runtime<br/>Manager"]
WarmPool["Warm Pool"]
end
subgraph Sandbox["Sandbox"]
AS["Agent Server"]
AES["Action Execution<br/>Server"]
end
User["User"] -->|"HTTP"| API
API --> ConvMgr
ConvMgr -->|"Provision"| RuntimeMgr
RuntimeMgr --> WarmPool
RuntimeMgr --> Sandbox
User -.->|"WebSocket"| AS
AS -->|"HTTP"| AES
```
This configuration enables:
- **Warm pool**: Pre-provisioned runtimes for faster startup
- **Direct WebSocket**: Users connect directly to their sandbox, bypassing the App Server
- **Horizontal scaling**: App Server and Runtime API can scale independently
### Runtime Options
OpenHands supports multiple runtime implementations:
| Runtime | Use Case |
|---------|----------|
| **DockerRuntime** | Local development, single-machine deployments |
| **RemoteRuntime** | Connect to externally managed sandboxes |
| **ModalRuntime** | Serverless execution via Modal |
See the [Runtime documentation](https://docs.openhands.dev/usage/architecture/runtime) for details.