diff --git a/enterprise/doc/architecture/README.md b/enterprise/doc/architecture/README.md
new file mode 100644
index 0000000000..47d0217e71
--- /dev/null
+++ b/enterprise/doc/architecture/README.md
@@ -0,0 +1,13 @@
+# Enterprise Architecture Documentation
+
+Architecture diagrams specific to the OpenHands SaaS/Enterprise deployment.
+
+## Documentation
+
+- [Authentication Flow](./authentication.md) - Keycloak-based authentication for SaaS deployment
+- [External Integrations](./external-integrations.md) - GitHub, Slack, Jira, and other service integrations
+
+## Related Documentation
+
+For core OpenHands architecture (applicable to all deployments), see:
+- [Core Architecture Documentation](../../../openhands/architecture/README.md)
diff --git a/enterprise/doc/architecture/authentication.md b/enterprise/doc/architecture/authentication.md
new file mode 100644
index 0000000000..dedb201ae3
--- /dev/null
+++ b/enterprise/doc/architecture/authentication.md
@@ -0,0 +1,58 @@
+# Authentication Flow (SaaS Deployment)
+
+OpenHands uses Keycloak for identity management in the SaaS deployment. The authentication flow involves multiple services:
+
+```mermaid
+sequenceDiagram
+ autonumber
+ participant User as User (Browser)
+ participant App as App Server
+ participant KC as Keycloak
+ participant IdP as Identity Provider<br/>(GitHub, Google, etc.)
+ participant DB as User Database
+
+ Note over User,DB: OAuth 2.0 / OIDC Authentication Flow
+
+ User->>App: Access OpenHands
+ App->>User: Redirect to Keycloak
+ User->>KC: Login request
+ KC->>User: Show login options
+ User->>KC: Select provider (e.g., GitHub)
+ KC->>IdP: OAuth redirect
+ User->>IdP: Authenticate
+ IdP-->>KC: OAuth callback + tokens
+ Note over KC: Create/update user session
+ KC-->>User: Redirect with auth code
+ User->>App: Auth code
+ App->>KC: Exchange code for tokens
+ KC-->>App: Access token + Refresh token
+ Note over App: Create signed JWT cookie
+ App->>DB: Store/update user record
+ App-->>User: Set keycloak_auth cookie
+
+ Note over User,DB: Subsequent Requests
+
+ User->>App: Request with cookie
+ Note over App: Verify JWT signature
+ App->>KC: Validate token (if needed)
+ KC-->>App: Token valid
+ Note over App: Extract user context
+ App-->>User: Authorized response
+```
+
+### Authentication Components
+
+| Component | Purpose | Location |
+|-----------|---------|----------|
+| **Keycloak** | Identity provider, SSO, token management | External service |
+| **UserAuth** | Abstract auth interface | `openhands/server/user_auth/user_auth.py` |
+| **SaasUserAuth** | Keycloak implementation | `enterprise/server/auth/saas_user_auth.py` |
+| **JWT Service** | Token signing/verification | `openhands/app_server/services/jwt_service.py` |
+| **Auth Routes** | Login/logout endpoints | `enterprise/server/routes/auth.py` |
+
+### Token Flow
+
+1. **Keycloak Access Token**: Short-lived token for API access
+2. **Keycloak Refresh Token**: Long-lived token to obtain new access tokens
+3. **Signed JWT Cookie**: App Server's session cookie containing encrypted Keycloak tokens
+4. **Provider Tokens**: OAuth tokens for GitHub, GitLab, etc. (stored separately for git operations)
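
The signed cookie in item 3 can be illustrated with a minimal HS256 JWT, built from the standard library only. This is a sketch under stated assumptions, not the code in `jwt_service.py`: the function names `sign_jwt` and `verify_jwt` are hypothetical, the choice of HS256 is an assumption, and the encryption of the embedded Keycloak tokens and any expiry check are omitted.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(payload: dict, secret: bytes) -> str:
    """Hypothetical helper: build header.payload.signature with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_jwt(token: str, secret: bytes) -> Optional[dict]:
    """Hypothetical helper: return the payload if the signature matches, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        return json.loads(_b64url_decode(payload_b64))
    except ValueError:
        # Wrong number of segments, bad base64, or bad JSON.
        return None
```

Verifying the signature locally is what lets the App Server trust the cookie on most requests and contact Keycloak only when needed, as in the "Subsequent Requests" part of the diagram.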
diff --git a/enterprise/doc/architecture/external-integrations.md b/enterprise/doc/architecture/external-integrations.md
new file mode 100644
index 0000000000..d5e16a7590
--- /dev/null
+++ b/enterprise/doc/architecture/external-integrations.md
@@ -0,0 +1,88 @@
+# External Integrations
+
+OpenHands integrates with external services (GitHub, Slack, Jira, etc.) through webhook-based event handling:
+
+```mermaid
+sequenceDiagram
+ autonumber
+ participant Ext as External Service<br/>(GitHub/Slack/Jira)
+ participant App as App Server
+ participant IntRouter as Integration Router
+ participant Manager as Integration Manager
+ participant Conv as Conversation Service
+ participant Sandbox as Sandbox
+
+ Note over Ext,Sandbox: Webhook Event Flow (e.g., GitHub Issue Created)
+
+ Ext->>App: POST /api/integration/{service}/events
+ App->>IntRouter: Route to service handler
+ Note over IntRouter: Verify signature (HMAC)
+
+ IntRouter->>Manager: Parse event payload
+ Note over Manager: Extract context (repo, issue, user)
+ Note over Manager: Map external user → OpenHands user
+
+ Manager->>Conv: Create conversation (with issue context)
+ Conv->>Sandbox: Provision sandbox
+ Sandbox-->>Conv: Ready
+
+ Manager->>Sandbox: Start agent with task
+
+ Note over Ext,Sandbox: Agent Works on Task...
+
+ Sandbox-->>Manager: Task complete
+ Manager->>Ext: POST result<br/>(PR, comment, etc.)
+
+ Note over Ext,Sandbox: Callback Flow (Agent → External Service)
+
+ Sandbox->>App: Webhook callback<br/>/api/v1/webhooks
+ App->>Manager: Process callback
+ Manager->>Ext: Update external service
+```
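
The "Verify signature (HMAC)" step in the diagram can be sketched as follows. The example follows GitHub's scheme, where the `X-Hub-Signature-256` header carries `sha256=` followed by the hex HMAC-SHA256 of the raw request body; other services use different headers and encodings. The function names are hypothetical and this is not the Integration Router's actual code.

```python
import hashlib
import hmac


def compute_signature(secret: bytes, body: bytes) -> str:
    """Hypothetical helper: GitHub-style signature for a raw webhook body."""
    return "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Compare the received header against the expected value in constant time."""
    expected = compute_signature(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

The check has to run on the raw bytes of the request body, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the signature.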
+
+### Supported Integrations
+
+| Integration | Trigger Events | Agent Actions |
+|-------------|----------------|---------------|
+| **GitHub** | Issue created, PR opened, @mention | Create PR, comment, push commits |
+| **GitLab** | Issue created, MR opened | Create MR, comment, push commits |
+| **Slack** | @mention in channel | Reply in thread, create tasks |
+| **Jira** | Issue created/updated | Update ticket, add comments |
+| **Linear** | Issue created | Update status, add comments |
+
+### Integration Components
+
+| Component | Purpose | Location |
+|-----------|---------|----------|
+| **Integration Routes** | Webhook endpoints per service | `enterprise/server/routes/integration/` |
+| **Integration Managers** | Business logic per service | `enterprise/integrations/{service}/` |
+| **Token Manager** | Store/retrieve OAuth tokens | `enterprise/server/auth/token_manager.py` |
+| **Callback Processor** | Handle agent → service updates | `enterprise/integrations/{service}/*_callback_processor.py` |
+
+### Integration Authentication
+
+```
+External Service (e.g., GitHub)
+ │
+ ▼
+┌─────────────────────────────────┐
+│ GitHub App Installation │
+│ - Webhook secret for signature │
+│ - App private key for API calls │
+└─────────────────────────────────┘
+ │
+ ▼
+┌─────────────────────────────────┐
+│ User Account Linking │
+│ - Keycloak user ID │
+│ - GitHub user ID │
+│ - Stored OAuth tokens │
+└─────────────────────────────────┘
+ │
+ ▼
+┌─────────────────────────────────┐
+│ Agent Execution │
+│ - Uses linked tokens for API │
+│ - Can push, create PRs, comment │
+└─────────────────────────────────┘
+```
diff --git a/openhands/README.md b/openhands/README.md
index 12c599fcd8..85eefc48c0 100644
--- a/openhands/README.md
+++ b/openhands/README.md
@@ -1,8 +1,14 @@
-# OpenHands Architecture
This directory contains the core components of OpenHands.
-For an overview of the system architecture, see the [architecture documentation](https://docs.openhands.dev/usage/architecture/backend) (v0 backend architecture).
+## Documentation
+
+- **[Architecture Documentation](./architecture/README.md)** with diagrams covering:
+ - System Architecture Overview
+ - Conversation Startup & WebSocket Flow
+ - Agent Execution & LLM Flow
+
+- **[External Architecture Docs](https://docs.openhands.dev/usage/architecture/backend)** - Official documentation (v0 backend architecture)
## Classes
diff --git a/openhands/architecture/README.md b/openhands/architecture/README.md
new file mode 100644
index 0000000000..095a34db23
--- /dev/null
+++ b/openhands/architecture/README.md
@@ -0,0 +1,10 @@
+# OpenHands Architecture
+
+Architecture diagrams and explanations for the OpenHands system.
+
+## Documentation Sections
+
+- [System Architecture Overview](./system-architecture.md) - Multi-tier architecture and component responsibilities
+- [Conversation Startup & WebSocket Flow](./conversation-startup.md) - Runtime provisioning and real-time communication
+- [Agent Execution & LLM Flow](./agent-execution.md) - LLM integration and action execution loop
+- [Observability](./observability.md) - Logging, metrics, and monitoring
diff --git a/openhands/architecture/agent-execution.md b/openhands/architecture/agent-execution.md
new file mode 100644
index 0000000000..4d2df3c130
--- /dev/null
+++ b/openhands/architecture/agent-execution.md
@@ -0,0 +1,92 @@
+# Agent Execution & LLM Flow
+
+When the agent executes inside the sandbox, it makes LLM calls through LiteLLM:
+
+```mermaid
+sequenceDiagram
+ autonumber
+ participant User as User (Browser)
+ participant AS as Agent Server
+ participant Agent as Agent<br/>(CodeAct)
+ participant LLM as LLM Class
+ participant Lite as LiteLLM
+ participant Proxy as LLM Proxy<br/>(llm-proxy.app.all-hands.dev)
+ participant Provider as LLM Provider<br/>(OpenAI, Anthropic, etc.)
+ participant AES as Action Execution Server
+
+ Note over User,AES: Agent Loop - LLM Call Flow
+
+ User->>AS: WebSocket: User message
+ AS->>Agent: Process message
+ Note over Agent: Build prompt from state
+
+ Agent->>LLM: completion(messages, tools)
+ Note over LLM: Apply config (model, temp, etc.)
+
+ alt Using OpenHands Provider
+ LLM->>Lite: litellm_proxy/{model}
+ Lite->>Proxy: POST /chat/completions
+ Note over Proxy: Auth, rate limit, routing
+ Proxy->>Provider: Forward request
+ Provider-->>Proxy: Response
+ Proxy-->>Lite: Response
+ else Using Direct Provider
+ LLM->>Lite: {provider}/{model}
+ Lite->>Provider: Direct API call
+ Provider-->>Lite: Response
+ end
+
+ Lite-->>LLM: ModelResponse
+ Note over LLM: Track metrics (cost, tokens)
+ LLM-->>Agent: Parsed response
+
+ Note over Agent: Parse action from response
+ AS->>User: WebSocket: Action event
+
+ Note over User,AES: Action Execution
+
+ AS->>AES: HTTP: Execute action
+ Note over AES: Run command/edit file
+ AES-->>AS: Observation
+ AS->>User: WebSocket: Observation event
+
+ Note over Agent: Update state
+ Note over Agent: Loop continues...
+```
+
+### LLM Components
+
+| Component | Purpose | Location |
+|-----------|---------|----------|
+| **LLM Class** | Wrapper with retries, metrics, config | `openhands/llm/llm.py` |
+| **LiteLLM** | Universal LLM API adapter | External library |
+| **LLM Proxy** | OpenHands managed proxy for billing/routing | `llm-proxy.app.all-hands.dev` |
+| **LLM Registry** | Manages multiple LLM instances | `openhands/llm/llm_registry.py` |
+
+### Model Routing
+
+```
+User selects model
+ │
+ ▼
+┌───────────────────┐
+│ Model prefix? │
+└───────────────────┘
+ │
+ ├── openhands/claude-3-5 ──► Rewrite to litellm_proxy/claude-3-5
+ │ Base URL: llm-proxy.app.all-hands.dev
+ │
+ ├── anthropic/claude-3-5 ──► Direct to Anthropic API
+ │ (User's API key)
+ │
+ ├── openai/gpt-4 ──► Direct to OpenAI API
+ │ (User's API key)
+ │
+ └── azure/gpt-4 ──► Direct to Azure OpenAI
+ (User's API key + endpoint)
+```
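
The prefix-based routing above can be sketched as a small function. This is an illustration only: `Route` and `resolve_route` are hypothetical names, the `https://` scheme on the proxy host is an assumption, and the real logic in `openhands/llm/llm.py` may differ.

```python
from dataclasses import dataclass
from typing import Optional

# Host taken from the components table; the https scheme is an assumption.
PROXY_BASE_URL = "https://llm-proxy.app.all-hands.dev"


@dataclass(frozen=True)
class Route:
    model: str               # model string handed to LiteLLM
    base_url: Optional[str]  # None means "use the provider's default endpoint"


def resolve_route(model: str) -> Route:
    """Hypothetical helper: rewrite openhands/* models to the managed proxy."""
    prefix, sep, name = model.partition("/")
    if sep and prefix == "openhands":
        return Route(model=f"litellm_proxy/{name}", base_url=PROXY_BASE_URL)
    # Any other prefix (anthropic/, openai/, azure/, ...) goes to the provider directly.
    return Route(model=model, base_url=None)
```

For example, `resolve_route("openhands/claude-3-5")` yields the `litellm_proxy/claude-3-5` model with the proxy base URL, while `resolve_route("anthropic/claude-3-5")` is passed through unchanged.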
+
+### LLM Proxy
+
+When using `openhands/` prefixed models, requests are routed through a managed proxy.
+See the [OpenHands documentation](https://docs.openhands.dev/) for details on supported models.
diff --git a/openhands/architecture/conversation-startup.md b/openhands/architecture/conversation-startup.md
new file mode 100644
index 0000000000..4da15aba1d
--- /dev/null
+++ b/openhands/architecture/conversation-startup.md
@@ -0,0 +1,68 @@
+# Conversation Startup & WebSocket Flow
+
+When a user starts a conversation, this sequence occurs:
+
+```mermaid
+sequenceDiagram
+ autonumber
+ participant User as User (Browser)
+ participant App as App Server
+ participant SS as Sandbox Service
+ participant RAPI as Runtime API
+ participant Pool as Warm Pool
+ participant Sandbox as Sandbox (Container)
+ participant AS as Agent Server
+ participant AES as Action Execution Server
+
+ Note over User,AES: Phase 1: Conversation Creation
+ User->>App: POST /api/conversations
+ Note over App: Authenticate user
+ App->>SS: Create sandbox
+
+ Note over SS,Pool: Phase 2: Runtime Provisioning
+ SS->>RAPI: POST /start (image, env, config)
+ RAPI->>Pool: Check for warm runtime
+ alt Warm runtime available
+ Pool-->>RAPI: Return warm runtime
+ Note over RAPI: Assign to session
+ else No warm runtime
+ RAPI->>Sandbox: Create new container
+ Sandbox->>AS: Start Agent Server
+ Sandbox->>AES: Start Action Execution Server
+ AES-->>AS: Ready
+ end
+ RAPI-->>SS: Runtime URL + session API key
+ SS-->>App: Sandbox info
+ App-->>User: Conversation ID + Sandbox URL
+
+ Note over User,AES: Phase 3: Direct WebSocket Connection
+ User->>AS: WebSocket: /sockets/events/{id}
+ AS-->>User: Connection accepted
+ AS->>User: Replay historical events
+
+ Note over User,AES: Phase 4: User Sends Message
+ User->>AS: WebSocket: SendMessageRequest
+ Note over AS: Agent processes message
+ Note over AS: LLM call → generate action
+
+ Note over User,AES: Phase 5: Action Execution Loop
+ loop Agent Loop
+ AS->>AES: HTTP: Execute action
+ Note over AES: Run in sandbox
+ AES-->>AS: Observation result
+ AS->>User: WebSocket: Event update
+ Note over AS: Update state, next action
+ end
+
+ Note over User,AES: Phase 6: Task Complete
+ AS->>User: WebSocket: AgentStateChanged (FINISHED)
+```
+
+### Key Points
+
+1. **Initial Setup via App Server**: The App Server handles authentication and coordinates with the Sandbox Service
+2. **Runtime API Provisioning**: The Sandbox Service calls the Runtime API, which checks for warm runtimes before creating new containers
+3. **Warm Pool Optimization**: A pre-warmed runtime skips container creation and server startup, so it only needs to be assigned to the session, which reduces startup latency
+4. **Direct WebSocket to Sandbox**: Once created, the user's browser connects **directly** to the Agent Server inside the sandbox
+5. **App Server Not in Hot Path**: After connection, all real-time communication bypasses the App Server entirely
+6. **Agent Server Orchestrates**: The Agent Server manages the AI loop, calling the Action Execution Server for actual command execution
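
The warm-pool branch of Phase 2 can be sketched as below. `Runtime`, `RuntimeApi`, `prewarm`, and `start` are hypothetical names for illustration; the actual Runtime API is a separate service and its interface is not shown in this document.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Optional, Tuple


@dataclass
class Runtime:
    url: str
    session_id: Optional[str] = None


class RuntimeApi:
    """Hypothetical stand-in for the Runtime API's start logic."""

    def __init__(self, create_runtime: Callable[[], Runtime]) -> None:
        self._create_runtime = create_runtime
        self._warm_pool: Deque[Runtime] = deque()

    def prewarm(self, count: int) -> None:
        # Create runtimes ahead of time so later requests can skip startup.
        for _ in range(count):
            self._warm_pool.append(self._create_runtime())

    def start(self, session_id: str) -> Tuple[Runtime, bool]:
        # Prefer a warm runtime; fall back to creating a new one.
        from_pool = bool(self._warm_pool)
        runtime = self._warm_pool.popleft() if from_pool else self._create_runtime()
        runtime.session_id = session_id
        return runtime, from_pool
```

The second element of the returned tuple reports whether the runtime came from the pool, which is the branch taken in the `alt` block of the diagram.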
diff --git a/openhands/architecture/observability.md b/openhands/architecture/observability.md
new file mode 100644
index 0000000000..d7c798c309
--- /dev/null
+++ b/openhands/architecture/observability.md
@@ -0,0 +1,85 @@
+# Observability
+
+OpenHands provides structured logging and metrics collection for monitoring and debugging.
+
+> **SDK Documentation**: For detailed guidance on observability and metrics in agent development, see:
+> - [SDK Observability Guide](https://docs.openhands.dev/sdk/guides/observability)
+> - [SDK Metrics Guide](https://docs.openhands.dev/sdk/guides/metrics)
+
+```mermaid
+flowchart LR
+ subgraph Sources["Sources"]
+ Agent["Agent Server"]
+ App["App Server"]
+ Frontend["Frontend"]
+ end
+
+ subgraph Collection["Collection"]
+ JSONLog["JSON Logs<br/>(stdout)"]
+ Metrics["Metrics<br/>(Internal)"]
+ end
+
+ subgraph External["External (Optional)"]
+ LogAgg["Log Aggregator"]
+ Analytics["Analytics Service"]
+ end
+
+ Agent --> JSONLog
+ App --> JSONLog
+ App --> Metrics
+
+ JSONLog --> LogAgg
+ Frontend --> Analytics
+```
+
+### Structured Logging
+
+OpenHands uses Python's standard logging library with structured JSON output support.
+
+| Component | Format | Destination | Purpose |
+|-----------|--------|-------------|---------|
+| **Application Logs** | JSON (when `LOG_JSON=1`) | stdout | Debugging, error tracking |
+| **Access Logs** | JSON (Uvicorn) | stdout | Request tracing |
+| **LLM Debug Logs** | Plain text | File (optional) | LLM call debugging |
+
+### JSON Log Format
+
+When `LOG_JSON=1` is set, logs are emitted as single-line JSON for ingestion by log aggregators:
+
+```json
+{
+ "message": "Conversation started",
+ "severity": "INFO",
+ "conversation_id": "abc-123",
+ "user_id": "user-456",
+ "timestamp": "2024-01-15T10:30:00Z"
+}
+```
+
+Additional context can be added using Python's logger `extra=` parameter (see [Python logging docs](https://docs.python.org/3/library/logging.html)).
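
A minimal sketch of how `extra=` fields can end up in a single-line JSON record, using only the standard library. `JsonFormatter` is a hypothetical class written for this example, not the formatter OpenHands ships; the field names mirror the sample record above.

```python
import json
import logging
import time

# Attributes present on every LogRecord; anything else was passed via `extra=`.
_STANDARD_ATTRS = set(logging.makeLogRecord({}).__dict__) | {"message", "asctime"}


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter that emits one JSON object per log line."""

    converter = time.gmtime  # format timestamps in UTC

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "message": record.getMessage(),
            "severity": record.levelname,
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%SZ"),
        }
        # Copy over the caller-supplied context fields.
        for key, value in record.__dict__.items():
            if key not in _STANDARD_ATTRS:
                entry[key] = value
        return json.dumps(entry, default=str)


handler = logging.StreamHandler()  # writes to stderr by default
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("json_logging_sketch")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("Conversation started", extra={"conversation_id": "abc-123"})
```

Keeping context in `extra=` rather than interpolating it into the message string means a log aggregator can filter on `conversation_id` as a field.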
+
+### Metrics
+
+| Metric | Tracked By | Storage | Purpose |
+|--------|------------|---------|---------|
+| **LLM Cost** | `Metrics` class | Conversation stats file | Billing, budget limits |
+| **Token Usage** | `Metrics` class | Conversation stats file | Usage analytics |
+| **Response Latency** | `Metrics` class | Conversation stats file | Performance monitoring |
+
+### Conversation Stats Persistence
+
+Per-conversation metrics are persisted for analytics:
+
+```python
+# Location: openhands/server/services/conversation_stats.py (field outline only)
+class ConversationStats:
+    service_to_metrics: "dict[str, Metrics]"
+    accumulated_cost: float
+    token_usage: "TokenUsage"
+
+# Stored at: {file_store}/conversation_stats/{conversation_id}.pkl
+```
+
+### Integration with External Services
+
+Structured JSON logging allows integration with any log aggregation service (e.g., ELK Stack, Loki, Splunk). Configure your log collector to ingest from container stdout/stderr.
diff --git a/openhands/architecture/system-architecture.md b/openhands/architecture/system-architecture.md
new file mode 100644
index 0000000000..7745270b31
--- /dev/null
+++ b/openhands/architecture/system-architecture.md
@@ -0,0 +1,88 @@
+# System Architecture Overview
+
+OpenHands supports multiple deployment configurations. This document describes the core components and how they interact.
+
+## Local/Docker Deployment
+
+The simplest deployment runs everything locally or in Docker containers:
+
+```mermaid
+flowchart TB
+ subgraph Server["OpenHands Server"]
+ API["REST API<br/>(FastAPI)"]
+ ConvMgr["Conversation<br/>Manager"]
+ Runtime["Runtime<br/>Manager"]
+ end
+
+ subgraph Sandbox["Sandbox (Docker Container)"]
+ AES["Action Execution<br/>Server"]
+ Browser["Browser<br/>Environment"]
+ FS["File System"]
+ end
+
+ User["User"] -->|"HTTP/WebSocket"| API
+ API --> ConvMgr
+ ConvMgr --> Runtime
+ Runtime -->|"Provision"| Sandbox
+
+ Server -->|"Execute actions"| AES
+ AES --> Browser
+ AES --> FS
+```
+
+### Core Components
+
+| Component | Purpose | Location |
+|-----------|---------|----------|
+| **Server** | REST API, conversation management, runtime orchestration | `openhands/server/` |
+| **Runtime** | Abstract interface for sandbox execution | `openhands/runtime/` |
+| **Action Execution Server** | Execute bash, file ops, browser actions | Inside sandbox |
+| **EventStream** | Central event bus for all communication | `openhands/events/` |
+
+## Scalable Deployment
+
+For production deployments, OpenHands can be configured with a separate Runtime API service:
+
+```mermaid
+flowchart TB
+ subgraph AppServer["App Server"]
+ API["REST API"]
+ ConvMgr["Conversation<br/>Manager"]
+ end
+
+ subgraph RuntimeAPI["Runtime API (Optional)"]
+ RuntimeMgr["Runtime<br/>Manager"]
+ WarmPool["Warm Pool"]
+ end
+
+ subgraph Sandbox["Sandbox"]
+ AS["Agent Server"]
+ AES["Action Execution<br/>Server"]
+ end
+
+ User["User"] -->|"HTTP"| API
+ API --> ConvMgr
+ ConvMgr -->|"Provision"| RuntimeMgr
+ RuntimeMgr --> WarmPool
+ RuntimeMgr --> Sandbox
+
+ User -.->|"WebSocket"| AS
+ AS -->|"HTTP"| AES
+```
+
+This configuration enables:
+- **Warm pool**: Pre-provisioned runtimes for faster startup
+- **Direct WebSocket**: Users connect directly to their sandbox, bypassing the App Server
+- **Horizontal scaling**: App Server and Runtime API can scale independently
+
+### Runtime Options
+
+OpenHands supports multiple runtime implementations:
+
+| Runtime | Use Case |
+|---------|----------|
+| **DockerRuntime** | Local development, single-machine deployments |
+| **RemoteRuntime** | Connect to externally managed sandboxes |
+| **ModalRuntime** | Serverless execution via Modal |
+
+See the [Runtime documentation](https://docs.openhands.dev/usage/architecture/runtime) for details.