diff --git a/enterprise/doc/architecture/README.md b/enterprise/doc/architecture/README.md new file mode 100644 index 0000000000..47d0217e71 --- /dev/null +++ b/enterprise/doc/architecture/README.md @@ -0,0 +1,13 @@ +# Enterprise Architecture Documentation + +Architecture diagrams specific to the OpenHands SaaS/Enterprise deployment. + +## Documentation + +- [Authentication Flow](./authentication.md) - Keycloak-based authentication for SaaS deployment +- [External Integrations](./external-integrations.md) - GitHub, Slack, Jira, and other service integrations + +## Related Documentation + +For core OpenHands architecture (applicable to all deployments), see: +- [Core Architecture Documentation](../../../openhands/architecture/README.md) diff --git a/enterprise/doc/architecture/authentication.md b/enterprise/doc/architecture/authentication.md new file mode 100644 index 0000000000..dedb201ae3 --- /dev/null +++ b/enterprise/doc/architecture/authentication.md @@ -0,0 +1,58 @@ +# Authentication Flow (SaaS Deployment) + +OpenHands uses Keycloak for identity management in the SaaS deployment. The authentication flow involves multiple services: + +```mermaid +sequenceDiagram + autonumber + participant User as User (Browser) + participant App as App Server + participant KC as Keycloak + participant IdP as Identity Provider
(GitHub, Google, etc.) + participant DB as User Database + + Note over User,DB: OAuth 2.0 / OIDC Authentication Flow + + User->>App: Access OpenHands + App->>User: Redirect to Keycloak + User->>KC: Login request + KC->>User: Show login options + User->>KC: Select provider (e.g., GitHub) + KC->>IdP: OAuth redirect + User->>IdP: Authenticate + IdP-->>KC: OAuth callback + tokens + Note over KC: Create/update user session + KC-->>User: Redirect with auth code + User->>App: Auth code + App->>KC: Exchange code for tokens + KC-->>App: Access token + Refresh token + Note over App: Create signed JWT cookie + App->>DB: Store/update user record + App-->>User: Set keycloak_auth cookie + + Note over User,DB: Subsequent Requests + + User->>App: Request with cookie + Note over App: Verify JWT signature + App->>KC: Validate token (if needed) + KC-->>App: Token valid + Note over App: Extract user context + App-->>User: Authorized response +``` + +### Authentication Components + +| Component | Purpose | Location | +|-----------|---------|----------| +| **Keycloak** | Identity provider, SSO, token management | External service | +| **UserAuth** | Abstract auth interface | `openhands/server/user_auth/user_auth.py` | +| **SaasUserAuth** | Keycloak implementation | `enterprise/server/auth/saas_user_auth.py` | +| **JWT Service** | Token signing/verification | `openhands/app_server/services/jwt_service.py` | +| **Auth Routes** | Login/logout endpoints | `enterprise/server/routes/auth.py` | + +### Token Flow + +1. **Keycloak Access Token**: Short-lived token for API access +2. **Keycloak Refresh Token**: Long-lived token to obtain new access tokens +3. **Signed JWT Cookie**: App Server's session cookie containing encrypted Keycloak tokens +4. **Provider Tokens**: OAuth tokens for GitHub, GitLab, etc. 
(stored separately for git operations) diff --git a/enterprise/doc/architecture/external-integrations.md b/enterprise/doc/architecture/external-integrations.md new file mode 100644 index 0000000000..d5e16a7590 --- /dev/null +++ b/enterprise/doc/architecture/external-integrations.md @@ -0,0 +1,88 @@ +# External Integrations + +OpenHands integrates with external services (GitHub, Slack, Jira, etc.) through webhook-based event handling: + +```mermaid +sequenceDiagram + autonumber + participant Ext as External Service
(GitHub/Slack/Jira) + participant App as App Server + participant IntRouter as Integration Router + participant Manager as Integration Manager + participant Conv as Conversation Service + participant Sandbox as Sandbox + + Note over Ext,Sandbox: Webhook Event Flow (e.g., GitHub Issue Created) + + Ext->>App: POST /api/integration/{service}/events + App->>IntRouter: Route to service handler + Note over IntRouter: Verify signature (HMAC) + + IntRouter->>Manager: Parse event payload + Note over Manager: Extract context (repo, issue, user) + Note over Manager: Map external user → OpenHands user + + Manager->>Conv: Create conversation (with issue context) + Conv->>Sandbox: Provision sandbox + Sandbox-->>Conv: Ready + + Manager->>Sandbox: Start agent with task + + Note over Ext,Sandbox: Agent Works on Task... + + Sandbox-->>Manager: Task complete + Manager->>Ext: POST result
(PR, comment, etc.) + + Note over Ext,Sandbox: Callback Flow (Agent → External Service) + + Sandbox->>App: Webhook callback
/api/v1/webhooks + App->>Manager: Process callback + Manager->>Ext: Update external service +``` + +### Supported Integrations + +| Integration | Trigger Events | Agent Actions | +|-------------|----------------|---------------| +| **GitHub** | Issue created, PR opened, @mention | Create PR, comment, push commits | +| **GitLab** | Issue created, MR opened | Create MR, comment, push commits | +| **Slack** | @mention in channel | Reply in thread, create tasks | +| **Jira** | Issue created/updated | Update ticket, add comments | +| **Linear** | Issue created | Update status, add comments | + +### Integration Components + +| Component | Purpose | Location | +|-----------|---------|----------| +| **Integration Routes** | Webhook endpoints per service | `enterprise/server/routes/integration/` | +| **Integration Managers** | Business logic per service | `enterprise/integrations/{service}/` | +| **Token Manager** | Store/retrieve OAuth tokens | `enterprise/server/auth/token_manager.py` | +| **Callback Processor** | Handle agent → service updates | `enterprise/integrations/{service}/*_callback_processor.py` | + +### Integration Authentication + +``` +External Service (e.g., GitHub) + │ + ▼ +┌─────────────────────────────────┐ +│ GitHub App Installation │ +│ - Webhook secret for signature │ +│ - App private key for API calls │ +└─────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────┐ +│ User Account Linking │ +│ - Keycloak user ID │ +│ - GitHub user ID │ +│ - Stored OAuth tokens │ +└─────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────┐ +│ Agent Execution │ +│ - Uses linked tokens for API │ +│ - Can push, create PRs, comment │ +└─────────────────────────────────┘ +``` diff --git a/openhands/README.md b/openhands/README.md index 12c599fcd8..85eefc48c0 100644 --- a/openhands/README.md +++ b/openhands/README.md @@ -1,8 +1,14 @@ -# OpenHands Architecture This directory contains the core components of OpenHands. 
-For an overview of the system architecture, see the [architecture documentation](https://docs.openhands.dev/usage/architecture/backend) (v0 backend architecture). +## Documentation + +**[Architecture Documentation](./architecture/README.md)** with diagrams covering: + - System Architecture Overview + - Conversation Startup & WebSocket Flow + - Agent Execution & LLM Flow + +- **[External Architecture Docs](https://docs.openhands.dev/usage/architecture/backend)** - Official documentation (v0 backend architecture) ## Classes diff --git a/openhands/architecture/README.md b/openhands/architecture/README.md new file mode 100644 index 0000000000..095a34db23 --- /dev/null +++ b/openhands/architecture/README.md @@ -0,0 +1,10 @@ +# OpenHands Architecture + +Architecture diagrams and explanations for the OpenHands system. + +## Documentation Sections + +- [System Architecture Overview](./system-architecture.md) - Multi-tier architecture and component responsibilities +- [Conversation Startup & WebSocket Flow](./conversation-startup.md) - Runtime provisioning and real-time communication +- [Agent Execution & LLM Flow](./agent-execution.md) - LLM integration and action execution loop +- [Observability](./observability.md) - Logging, metrics, and monitoring diff --git a/openhands/architecture/agent-execution.md b/openhands/architecture/agent-execution.md new file mode 100644 index 0000000000..4d2df3c130 --- /dev/null +++ b/openhands/architecture/agent-execution.md @@ -0,0 +1,92 @@ +# Agent Execution & LLM Flow + +When the agent executes inside the sandbox, it makes LLM calls through LiteLLM: + +```mermaid +sequenceDiagram + autonumber + participant User as User (Browser) + participant AS as Agent Server + participant Agent as Agent
(CodeAct) + participant LLM as LLM Class + participant Lite as LiteLLM + participant Proxy as LLM Proxy<br/>(llm-proxy.app.all-hands.dev) + participant Provider as LLM Provider
(OpenAI, Anthropic, etc.) + participant AES as Action Execution Server + + Note over User,AES: Agent Loop - LLM Call Flow + + User->>AS: WebSocket: User message + AS->>Agent: Process message + Note over Agent: Build prompt from state + + Agent->>LLM: completion(messages, tools) + Note over LLM: Apply config (model, temp, etc.) + + alt Using OpenHands Provider + LLM->>Lite: litellm_proxy/{model} + Lite->>Proxy: POST /chat/completions + Note over Proxy: Auth, rate limit, routing + Proxy->>Provider: Forward request + Provider-->>Proxy: Response + Proxy-->>Lite: Response + else Using Direct Provider + LLM->>Lite: {provider}/{model} + Lite->>Provider: Direct API call + Provider-->>Lite: Response + end + + Lite-->>LLM: ModelResponse + Note over LLM: Track metrics (cost, tokens) + LLM-->>Agent: Parsed response + + Note over Agent: Parse action from response + AS->>User: WebSocket: Action event + + Note over User,AES: Action Execution + + AS->>AES: HTTP: Execute action + Note over AES: Run command/edit file + AES-->>AS: Observation + AS->>User: WebSocket: Observation event + + Note over Agent: Update state + Note over Agent: Loop continues... +``` + +### LLM Components + +| Component | Purpose | Location | +|-----------|---------|----------| +| **LLM Class** | Wrapper with retries, metrics, config | `openhands/llm/llm.py` | +| **LiteLLM** | Universal LLM API adapter | External library | +| **LLM Proxy** | OpenHands managed proxy for billing/routing | `llm-proxy.app.all-hands.dev` | +| **LLM Registry** | Manages multiple LLM instances | `openhands/llm/llm_registry.py` | + +### Model Routing + +``` +User selects model + │ + ▼ +┌───────────────────┐ +│ Model prefix? 
│ +└───────────────────┘ + │ + ├── openhands/claude-3-5 ──► Rewrite to litellm_proxy/claude-3-5 + │ Base URL: llm-proxy.app.all-hands.dev + │ + ├── anthropic/claude-3-5 ──► Direct to Anthropic API + │ (User's API key) + │ + ├── openai/gpt-4 ──► Direct to OpenAI API + │ (User's API key) + │ + └── azure/gpt-4 ──► Direct to Azure OpenAI + (User's API key + endpoint) +``` + +### LLM Proxy + +When using `openhands/` prefixed models, requests are routed through a managed proxy. +See the [OpenHands documentation](https://docs.openhands.dev/) for details on supported models. diff --git a/openhands/architecture/conversation-startup.md b/openhands/architecture/conversation-startup.md new file mode 100644 index 0000000000..4da15aba1d --- /dev/null +++ b/openhands/architecture/conversation-startup.md @@ -0,0 +1,68 @@ +# Conversation Startup & WebSocket Flow + +When a user starts a conversation, this sequence occurs: + +```mermaid +sequenceDiagram + autonumber + participant User as User (Browser) + participant App as App Server + participant SS as Sandbox Service + participant RAPI as Runtime API + participant Pool as Warm Pool + participant Sandbox as Sandbox (Container) + participant AS as Agent Server + participant AES as Action Execution Server + + Note over User,AES: Phase 1: Conversation Creation + User->>App: POST /api/conversations + Note over App: Authenticate user + App->>SS: Create sandbox + + Note over SS,Pool: Phase 2: Runtime Provisioning + SS->>RAPI: POST /start (image, env, config) + RAPI->>Pool: Check for warm runtime + alt Warm runtime available + Pool-->>RAPI: Return warm runtime + Note over RAPI: Assign to session + else No warm runtime + RAPI->>Sandbox: Create new container + Sandbox->>AS: Start Agent Server + Sandbox->>AES: Start Action Execution Server + AES-->>AS: Ready + end + RAPI-->>SS: Runtime URL + session API key + SS-->>App: Sandbox info + App-->>User: Conversation ID + Sandbox URL + + Note over User,AES: Phase 3: Direct WebSocket Connection + 
User->>AS: WebSocket: /sockets/events/{id} + AS-->>User: Connection accepted + AS->>User: Replay historical events + + Note over User,AES: Phase 4: User Sends Message + User->>AS: WebSocket: SendMessageRequest + Note over AS: Agent processes message + Note over AS: LLM call → generate action + + Note over User,AES: Phase 5: Action Execution Loop + loop Agent Loop + AS->>AES: HTTP: Execute action + Note over AES: Run in sandbox + AES-->>AS: Observation result + AS->>User: WebSocket: Event update + Note over AS: Update state, next action + end + + Note over User,AES: Phase 6: Task Complete + AS->>User: WebSocket: AgentStateChanged (FINISHED) +``` + +### Key Points + +1. **Initial Setup via App Server**: The App Server handles authentication and coordinates with the Sandbox Service +2. **Runtime API Provisioning**: The Sandbox Service calls the Runtime API, which checks for warm runtimes before creating new containers +3. **Warm Pool Optimization**: Pre-warmed runtimes reduce startup latency significantly +4. **Direct WebSocket to Sandbox**: Once created, the user's browser connects **directly** to the Agent Server inside the sandbox +5. **App Server Not in Hot Path**: After connection, all real-time communication bypasses the App Server entirely +6. **Agent Server Orchestrates**: The Agent Server manages the AI loop, calling the Action Execution Server for actual command execution diff --git a/openhands/architecture/observability.md b/openhands/architecture/observability.md new file mode 100644 index 0000000000..d7c798c309 --- /dev/null +++ b/openhands/architecture/observability.md @@ -0,0 +1,85 @@ +# Observability + +OpenHands provides structured logging and metrics collection for monitoring and debugging. 
+ +> **SDK Documentation**: For detailed guidance on observability and metrics in agent development, see: +> - [SDK Observability Guide](https://docs.openhands.dev/sdk/guides/observability) +> - [SDK Metrics Guide](https://docs.openhands.dev/sdk/guides/metrics) + +```mermaid +flowchart LR + subgraph Sources["Sources"] + Agent["Agent Server"] + App["App Server"] + Frontend["Frontend"] + end + + subgraph Collection["Collection"] + JSONLog["JSON Logs<br/>(stdout)"] + Metrics["Metrics
(Internal)"] + end + + subgraph External["External (Optional)"] + LogAgg["Log Aggregator"] + Analytics["Analytics Service"] + end + + Agent --> JSONLog + App --> JSONLog + App --> Metrics + + JSONLog --> LogAgg + Frontend --> Analytics +``` + +### Structured Logging + +OpenHands uses Python's standard logging library with structured JSON output support. + +| Component | Format | Destination | Purpose | +|-----------|--------|-------------|---------| +| **Application Logs** | JSON (when `LOG_JSON=1`) | stdout | Debugging, error tracking | +| **Access Logs** | JSON (Uvicorn) | stdout | Request tracing | +| **LLM Debug Logs** | Plain text | File (optional) | LLM call debugging | + +### JSON Log Format + +When `LOG_JSON=1` is set, logs are emitted as single-line JSON for ingestion by log aggregators: + +```json +{ + "message": "Conversation started", + "severity": "INFO", + "conversation_id": "abc-123", + "user_id": "user-456", + "timestamp": "2024-01-15T10:30:00Z" +} +``` + +Additional context can be added using Python's logger `extra=` parameter (see [Python logging docs](https://docs.python.org/3/library/logging.html)). 
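As a rough illustration of the `extra=` mechanism described above, the following sketch uses only the Python standard library. The `JsonFormatter` class, the `emit_example` helper, and the logger name are hypothetical and written for this example; they are not the formatter OpenHands ships, which may select or name fields differently.

```python
import io
import json
import logging

# Attribute names present on every LogRecord; anything else on a record
# was supplied by the caller through `extra=`.
_STANDARD_ATTRS = set(
    logging.LogRecord("", 0, "", 0, "", (), None).__dict__
) | {"message", "asctime"}


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per log line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "message": record.getMessage(),
            "severity": record.levelname,
        }
        # Copy caller-supplied context (the `extra=` keys) into the payload.
        for key, value in record.__dict__.items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        return json.dumps(payload, default=str)


def emit_example() -> str:
    """Log one message with extra context and return the emitted line."""
    stream = io.StringIO()  # stands in for stdout so the output can be inspected
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())

    logger = logging.getLogger("example.conversation")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers = [handler]

    logger.info(
        "Conversation started",
        extra={"conversation_id": "abc-123", "user_id": "user-456"},
    )
    return stream.getvalue().strip()


if __name__ == "__main__":
    print(emit_example())
```

Keys passed through `extra=` become attributes on the `LogRecord`, which is why the formatter can recover them by comparing against the standard attribute set. Note that `extra=` keys must not collide with built-in record attributes such as `message` or `name`.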
+ +### Metrics + +| Metric | Tracked By | Storage | Purpose | +|--------|------------|---------|---------| +| **LLM Cost** | `Metrics` class | Conversation stats file | Billing, budget limits | +| **Token Usage** | `Metrics` class | Conversation stats file | Usage analytics | +| **Response Latency** | `Metrics` class | Conversation stats file | Performance monitoring | + +### Conversation Stats Persistence + +Per-conversation metrics are persisted for analytics: + +```python +# Location: openhands/server/services/conversation_stats.py +ConversationStats: + - service_to_metrics: Dict[str, Metrics] + - accumulated_cost: float + - token_usage: TokenUsage + +# Stored at: {file_store}/conversation_stats/{conversation_id}.pkl +``` + +### Integration with External Services + +Structured JSON logging allows integration with any log aggregation service (e.g., ELK Stack, Loki, Splunk). Configure your log collector to ingest from container stdout/stderr. diff --git a/openhands/architecture/system-architecture.md b/openhands/architecture/system-architecture.md new file mode 100644 index 0000000000..7745270b31 --- /dev/null +++ b/openhands/architecture/system-architecture.md @@ -0,0 +1,88 @@ +# System Architecture Overview + +OpenHands supports multiple deployment configurations. This document describes the core components and how they interact. + +## Local/Docker Deployment + +The simplest deployment runs everything locally or in Docker containers: + +```mermaid +flowchart TB + subgraph Server["OpenHands Server"] + API["REST API
(FastAPI)"] + ConvMgr["Conversation<br/>Manager"] + Runtime["Runtime<br/>Manager"] + end + + subgraph Sandbox["Sandbox (Docker Container)"] + AES["Action Execution<br/>Server"] + Browser["Browser
Environment"] + FS["File System"] + end + + User["User"] -->|"HTTP/WebSocket"| API + API --> ConvMgr + ConvMgr --> Runtime + Runtime -->|"Provision"| Sandbox + + Server -->|"Execute actions"| AES + AES --> Browser + AES --> FS +``` + +### Core Components + +| Component | Purpose | Location | +|-----------|---------|----------| +| **Server** | REST API, conversation management, runtime orchestration | `openhands/server/` | +| **Runtime** | Abstract interface for sandbox execution | `openhands/runtime/` | +| **Action Execution Server** | Execute bash, file ops, browser actions | Inside sandbox | +| **EventStream** | Central event bus for all communication | `openhands/events/` | + +## Scalable Deployment + +For production deployments, OpenHands can be configured with a separate Runtime API service: + +```mermaid +flowchart TB + subgraph AppServer["App Server"] + API["REST API"] + ConvMgr["Conversation
Manager"] + end + + subgraph RuntimeAPI["Runtime API (Optional)"] + RuntimeMgr["Runtime<br/>Manager"] + WarmPool["Warm Pool"] + end + + subgraph Sandbox["Sandbox"] + AS["Agent Server"] + AES["Action Execution
Server"] + end + + User["User"] -->|"HTTP"| API + API --> ConvMgr + ConvMgr -->|"Provision"| RuntimeMgr + RuntimeMgr --> WarmPool + RuntimeMgr --> Sandbox + + User -.->|"WebSocket"| AS + AS -->|"HTTP"| AES +``` + +This configuration enables: +- **Warm pool**: Pre-provisioned runtimes for faster startup +- **Direct WebSocket**: Users connect directly to their sandbox, bypassing the App Server +- **Horizontal scaling**: App Server and Runtime API can scale independently + +### Runtime Options + +OpenHands supports multiple runtime implementations: + +| Runtime | Use Case | +|---------|----------| +| **DockerRuntime** | Local development, single-machine deployments | +| **RemoteRuntime** | Connect to externally managed sandboxes | +| **ModalRuntime** | Serverless execution via Modal | + +See the [Runtime documentation](https://docs.openhands.dev/usage/architecture/runtime) for details.
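The warm-pool behaviour in the scalable deployment (serve a pre-provisioned runtime when one is idle, otherwise create a new one) can be sketched as below. This is a minimal in-memory model under stated assumptions: `WarmPool`, `Runtime`, `prewarm`, `acquire`, and the `create_runtime` factory are hypothetical names for illustration and do not reflect the Runtime API's actual interface.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import count
from typing import Callable, Deque


@dataclass
class Runtime:
    runtime_id: str
    warm: bool  # True if the runtime was provisioned before it was requested


@dataclass
class WarmPool:
    """Hypothetical warm pool: hands out idle runtimes before creating new ones."""

    create_runtime: Callable[[], str]  # assumed factory returning a new runtime id
    _idle: Deque[str] = field(default_factory=deque)

    def prewarm(self, n: int) -> None:
        """Provision n runtimes ahead of demand."""
        for _ in range(n):
            self._idle.append(self.create_runtime())

    def acquire(self) -> Runtime:
        """Return a warm runtime if one is idle, otherwise do a cold start."""
        if self._idle:
            return Runtime(self._idle.popleft(), warm=True)
        return Runtime(self.create_runtime(), warm=False)


# Usage: one runtime is pre-warmed, so the first request skips provisioning
# and the second falls back to creating a new runtime.
_ids = count(1)
pool = WarmPool(create_runtime=lambda: f"runtime-{next(_ids)}")
pool.prewarm(1)
first = pool.acquire()
second = pool.acquire()
```

In a real deployment the factory call is the slow step (container creation and server startup), so the benefit of the pool is that this cost is paid before a user is waiting on it.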