diff --git a/.agents/skills/cross-repo-testing/SKILL.md b/.agents/skills/cross-repo-testing/SKILL.md new file mode 100644 index 0000000000..3ca98ac58d --- /dev/null +++ b/.agents/skills/cross-repo-testing/SKILL.md @@ -0,0 +1,202 @@ +--- +name: cross-repo-testing +description: This skill should be used when the user asks to "test a cross-repo feature", "deploy a feature branch to staging", "test SDK against OH Cloud", "e2e test a cloud workspace feature", "test provider tokens", "test secrets inheritance", or when changes span the SDK and OpenHands server repos and need end-to-end validation against a staging deployment. +triggers: +- cross-repo +- staging deployment +- feature branch deploy +- test against cloud +- e2e cloud +--- + +# Cross-Repo Testing: SDK ↔ OpenHands Cloud + +How to end-to-end test features that span `OpenHands/software-agent-sdk` and `OpenHands/OpenHands` (the Cloud backend). + +## Repository Map + +| Repo | Role | What lives here | +|------|------|-----------------| +| [`software-agent-sdk`](https://github.com/OpenHands/software-agent-sdk) | Agent core | `openhands-sdk`, `openhands-workspace`, `openhands-tools` packages. `OpenHandsCloudWorkspace` lives here. | +| [`OpenHands`](https://github.com/OpenHands/OpenHands) | Cloud backend | FastAPI server (`openhands/app_server/`), sandbox management, auth, enterprise integrations. Deployed as OH Cloud. | +| [`deploy`](https://github.com/OpenHands/deploy) | Infrastructure | Helm charts + GitHub Actions that build the enterprise Docker image and deploy to staging/production. | + +**Data flow:** SDK client → OH Cloud API (`/api/v1/...`) → sandbox agent-server (inside runtime container) + +## When You Need This + +There are **two flows** depending on which direction the dependency goes: + +| Flow | When | Example | +|------|------|---------| +| **A — SDK client → new Cloud API** | The SDK calls an API that doesn't exist yet on production | `workspace.get_llm()` calling `GET /api/v1/users/me?expose_secrets=true` | +| **B — OH server → new SDK code** | The Cloud server needs unreleased SDK packages or a new agent-server image | Server consumes a new tool, agent behavior, or workspace method from the SDK | + +Flow A only requires deploying the server PR. Flow B requires pinning the SDK to an unreleased commit in the server PR **and** using the SDK PR's agent-server image. Both flows may apply simultaneously. + +--- + +## Flow A: SDK Client Tests Against New Cloud API + +Use this when the SDK calls an endpoint that only exists on the server PR branch. + +### A1. Write and test the server-side changes + +In the `OpenHands` repo, implement the new API endpoint(s). Run unit tests: + +```bash +cd OpenHands +poetry run pytest tests/unit/app_server/test_.py -v +``` + +Push a PR. Wait for the **"Push Enterprise Image" (Docker) CI job** to succeed — this builds `ghcr.io/openhands/enterprise-server:sha-`. + +### A2. Write the SDK-side changes + +In `software-agent-sdk`, implement the client code (e.g., new methods on `OpenHandsCloudWorkspace`). Run SDK unit tests: + +```bash +cd software-agent-sdk +pip install -e openhands-sdk -e openhands-workspace +pytest tests/ -v +``` + +Push a PR. SDK CI is independent — it doesn't need the server changes to pass unit tests. + +### A3. Deploy the server PR to staging + +See [Deploying to a Staging Feature Environment](#deploying-to-a-staging-feature-environment) below. + +### A4. Run the SDK e2e test against staging + +See [Running E2E Tests Against Staging](#running-e2e-tests-against-staging) below. + +--- + +## Flow B: OH Server Needs Unreleased SDK Code + +Use this when the Cloud server depends on SDK changes that haven't been released to PyPI yet. The server's runtime containers run the `agent-server` image built from the SDK repo, so the server PR must be configured to use the SDK PR's image and packages. + +### B1. Get the SDK PR merged (or identify the commit) + +The SDK PR must have CI pass so its agent-server Docker image is built. The image is tagged with the **merge-commit SHA** from GitHub Actions — NOT the head-commit SHA shown in the PR. + +Find the correct image tag: +- Check the SDK PR description for an `AGENT_SERVER_IMAGES` section +- Or check the "Consolidate Build Information" CI job for `"short_sha": ""` + +### B2. Pin SDK packages to the commit in the OpenHands PR + +In the `OpenHands` repo PR, pin all 3 SDK packages (`openhands-sdk`, `openhands-agent-server`, `openhands-tools`) to the unreleased commit and update the agent-server image tag. This involves editing 3 files and regenerating 3 lock files. + +Follow the **`update-sdk` skill** → "Development: Pin SDK to an Unreleased Commit" section for the full procedure and file-by-file instructions. + +### B3. Wait for the OpenHands enterprise image to build + +Push the pinned changes. The OpenHands CI will build a new enterprise Docker image (`ghcr.io/openhands/enterprise-server:sha-`) that bundles the unreleased SDK. Wait for the "Push Enterprise Image" job to succeed. + +### B4. Deploy and test + +Follow [Deploying to a Staging Feature Environment](#deploying-to-a-staging-feature-environment) using the new OpenHands commit SHA. + +### B5. Before merging: remove the pin + +**CI guard:** `check-package-versions.yml` blocks merge to `main` if `[tool.poetry.dependencies]` contains `rev` fields. Before the OpenHands PR can merge, the SDK PR must be merged and released to PyPI, then the pin must be replaced with the released version number. + +--- + +## Deploying to a Staging Feature Environment + +The `deploy` repo creates preview environments from OpenHands PRs. + +**Option A — GitHub Actions UI (preferred):** +Go to `OpenHands/deploy` → Actions → "Create OpenHands preview PR" → enter the OpenHands PR number. This creates a branch `ohpr--` and opens a deploy PR. + +**Option B — Update an existing feature branch:** +```bash +cd deploy +git checkout ohpr-- +# In .github/workflows/deploy.yaml, update BOTH: +# OPENHANDS_SHA: "" +# OPENHANDS_RUNTIME_IMAGE_TAG: "-nikolaik" +git commit -am "Update OPENHANDS_SHA to " && git push +``` + +**Before updating the SHA**, verify the enterprise Docker image exists: +```bash +gh api repos/OpenHands/OpenHands/actions/runs \ + --jq '.workflow_runs[] | select(.head_sha=="") | "\(.name): \(.conclusion)"' \ + | grep Docker +# Must show: "Docker: success" +``` + +The deploy CI auto-triggers and creates the environment at: +``` +https://ohpr--.staging.all-hands.dev +``` + +**Wait for it to be live:** +```bash +curl -s -o /dev/null -w "%{http_code}" https://ohpr--.staging.all-hands.dev/api/v1/health +# 401 = server is up (auth required). DNS may take 1-2 min on first deploy. +``` + +## Running E2E Tests Against Staging + +**Critical: Feature deployments have their own Keycloak instance.** API keys from `app.all-hands.dev` or `$OPENHANDS_API_KEY` will NOT work. You need a test API key issued by the specific feature deployment's Keycloak. + +**You (the agent) cannot obtain this key yourself** — the feature environment requires interactive browser login with credentials you do not have. You must **ask the user** to: +1. Log in to the feature deployment at `https://ohpr--.staging.all-hands.dev` in their browser +2. Generate a test API key from the UI +3. Provide the key to you so you can proceed with e2e testing + +Do **not** attempt to log in via the browser or guess credentials. Wait for the user to supply the key before running any e2e tests. + +```python +from openhands.workspace import OpenHandsCloudWorkspace + +STAGING = "https://ohpr--.staging.all-hands.dev" + +with OpenHandsCloudWorkspace( + cloud_api_url=STAGING, + cloud_api_key="", +) as workspace: + # Test the new feature + llm = workspace.get_llm() + secrets = workspace.get_secrets() + print(f"LLM: {llm.model}, secrets: {list(secrets.keys())}") +``` + +Or run an example script: +```bash +OPENHANDS_CLOUD_API_KEY="" \ +OPENHANDS_CLOUD_API_URL="https://ohpr--.staging.all-hands.dev" \ +python examples/02_remote_agent_server/10_cloud_workspace_saas_credentials.py +``` + +### Recording results + +Both repos support a `.pr/` directory for temporary PR artifacts (design docs, test logs, scripts). These files are automatically removed when the PR is approved — see `.github/workflows/pr-artifacts.yml` and the "PR-Specific Artifacts" section in each repo's `AGENTS.md`. + +Push test output to the `.pr/logs/` directory of whichever repo you're working in: +```bash +mkdir -p .pr/logs +python test_script.py 2>&1 | tee .pr/logs/.log +git add -f .pr/logs/ +git commit -m "docs: add e2e test results" && git push +``` + +Comment on **both PRs** with pass/fail summary and link to logs. + +## Key Gotchas + +| Gotcha | Details | +|--------|---------| +| **Feature env auth is isolated** | Each `ohpr-*` deployment has its own Keycloak. Production API keys don't work. Agents cannot log in — you must ask the user to provide a test API key from the feature deployment's UI. | +| **Two SHAs in deploy.yaml** | `OPENHANDS_SHA` and `OPENHANDS_RUNTIME_IMAGE_TAG` must both be updated. The runtime tag is `-nikolaik`. | +| **Enterprise image must exist** | The Docker CI job on the OpenHands PR must succeed before you can deploy. If it hasn't run, push an empty commit to trigger it. | +| **DNS propagation** | First deployment of a new branch takes 1-2 min for DNS. Subsequent deploys are instant. | +| **Merge-commit SHA ≠ head SHA** | SDK CI tags Docker images with GitHub Actions' merge-commit SHA, not the PR head SHA. Check the SDK PR description or CI logs for the correct tag. | +| **SDK pin blocks merge** | `check-package-versions.yml` prevents merging an OpenHands PR that has `rev` fields in `[tool.poetry.dependencies]`. The SDK must be released to PyPI first. | +| **Flow A: stock agent-server is fine** | When only the Cloud API changes, `OpenHandsCloudWorkspace` talks to the Cloud server, not the agent-server. No custom image needed. | +| **Flow B: agent-server image is required** | When the server needs new SDK code inside runtime containers, you must pin to the SDK PR's agent-server image. | diff --git a/.github/workflows/pr-artifacts.yml b/.github/workflows/pr-artifacts.yml new file mode 100644 index 0000000000..ef656c1119 --- /dev/null +++ b/.github/workflows/pr-artifacts.yml @@ -0,0 +1,136 @@ +--- +name: PR Artifacts + +on: + workflow_dispatch: # Manual trigger for testing + pull_request: + types: [opened, synchronize, reopened] + branches: [main] + pull_request_review: + types: [submitted] + +jobs: + # Auto-remove .pr/ directory when a reviewer approves + cleanup-on-approval: + concurrency: + group: cleanup-pr-artifacts-${{ github.event.pull_request.number }} + cancel-in-progress: false + if: github.event_name == 'pull_request_review' && github.event.review.state == 'approved' + runs-on: ubuntu-latest + permissions: + contents: write + pull-requests: write + steps: + - name: Check if fork PR + id: check-fork + run: | + if [ "${{ github.event.pull_request.head.repo.full_name }}" != "${{ github.event.pull_request.base.repo.full_name }}" ]; then + echo "is_fork=true" >> $GITHUB_OUTPUT + echo "::notice::Fork PR detected - skipping auto-cleanup (manual removal required)" + else + echo "is_fork=false" >> $GITHUB_OUTPUT + fi + + - uses: actions/checkout@v5 + if: steps.check-fork.outputs.is_fork == 'false' + with: + ref: ${{ github.event.pull_request.head.ref }} + token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }} + + - name: Remove .pr/ directory + id: remove + if: steps.check-fork.outputs.is_fork == 'false' + run: | + if [ -d ".pr" ]; then + git config user.name "allhands-bot" + git config user.email "allhands-bot@users.noreply.github.com" + git rm -rf .pr/ + git commit -m "chore: Remove PR-only artifacts [automated]" + git push || { + echo "::error::Failed to push cleanup commit. Check branch protection rules." + exit 1 + } + echo "removed=true" >> $GITHUB_OUTPUT + echo "::notice::Removed .pr/ directory" + else + echo "removed=false" >> $GITHUB_OUTPUT + echo "::notice::No .pr/ directory to remove" + fi + + - name: Update PR comment after cleanup + if: steps.check-fork.outputs.is_fork == 'false' && steps.remove.outputs.removed == 'true' + uses: actions/github-script@v7 + with: + script: | + const marker = ''; + const body = `${marker} + ✅ **PR Artifacts Cleaned Up** + + The \`.pr/\` directory has been automatically removed. + `; + + const { data: comments } = await github.rest.issues.listComments({ + owner: context.repo.owner, + repo: context.repo.repo, + issue_number: context.issue.number, + }); + + const existing = comments.find(c => c.body.includes(marker)); + if (existing) { + await github.rest.issues.updateComment({ + owner: context.repo.owner, + repo: context.repo.repo, + comment_id: existing.id, + body: body, + }); + } + + # Warn if .pr/ directory exists (will be auto-removed on approval) + check-pr-artifacts: + if: github.event_name == 'pull_request' + runs-on: ubuntu-latest + permissions: + contents: read + pull-requests: write + steps: + - uses: actions/checkout@v5 + + - name: Check for .pr/ directory + id: check + run: | + if [ -d ".pr" ]; then + echo "exists=true" >> $GITHUB_OUTPUT + echo "::warning::.pr/ directory exists and will be automatically removed when the PR is approved. For fork PRs, manual removal is required before merging." + else + echo "exists=false" >> $GITHUB_OUTPUT + fi + + - name: Post or update PR comment + if: steps.check.outputs.exists == 'true' + uses: actions/github-script@v7 + with: + script: | + const marker = ''; + const body = `${marker} + 📁 **PR Artifacts Notice** + + This PR contains a \`.pr/\` directory with PR-specific documents. This directory will be **automatically removed** when the PR is approved. + + > For fork PRs: Manual removal is required before merging. + `; + + const { data: comments } = await github.rest.issues.listComments({ + owner: context.repo.owner, + repo: context.repo.repo, + issue_number: context.issue.number, + }); + + const existing = comments.find(c => c.body.includes(marker)); + if (!existing) { + await github.rest.issues.createComment({ + owner: context.repo.owner, + repo: context.repo.repo, + issue_number: context.issue.number, + body: body, + }); + } diff --git a/AGENTS.md b/AGENTS.md index 7a0bbc044a..811f3bdcf0 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -36,6 +36,40 @@ then re-run the command to ensure it passes. Common issues include: - Be especially careful with `git reset --hard` after staging files, as it will remove accidentally staged files - When remote has new changes, use `git fetch upstream && git rebase upstream/` on the same branch +## PR-Specific Artifacts (`.pr/` directory) + +When working on a PR that requires design documents, scripts meant for development-only, or other temporary artifacts that should NOT be merged to main, store them in a `.pr/` directory at the repository root. + +### Usage + +``` +.pr/ +├── design.md # Design decisions and architecture notes +├── analysis.md # Investigation or debugging notes +├── logs/ # Test output or CI logs for reviewer reference +└── notes.md # Any other PR-specific content +``` + +### How It Works + +1. **Notification**: When `.pr/` exists, a comment is posted to the PR conversation alerting reviewers +2. **Auto-cleanup**: When the PR is approved, the `.pr/` directory is automatically removed via `.github/workflows/pr-artifacts.yml` +3. **Fork PRs**: Auto-cleanup cannot push to forks, so manual removal is required before merging + +### Important Notes + +- Do NOT put anything in `.pr/` that needs to be preserved after merge +- The `.pr/` check passes (green ✅) during development — it only posts a notification, not a blocking error +- For fork PRs: You must manually remove `.pr/` before the PR can be merged + +### When to Use + +- Complex refactoring that benefits from written design rationale +- Debugging sessions where you want to document your investigation +- E2E test results or logs that demonstrate a cross-repo feature works +- Feature implementations that need temporary planning docs +- Any analysis that helps reviewers understand the PR but isn't needed long-term + ## Repository Structure Backend: - Located in the `openhands` directory