Mirror of https://github.com/OpenHands/OpenHands.git, synced 2025-12-26 05:48:36 +08:00
chore: add claude 4 to verified mode & global replace 3.7 to claude 4 (#8665)
Co-authored-by: openhands <openhands@all-hands.dev>
Parent: 5e43dbadcb
Commit: 31ad7fc175
.github/workflows/openhands-resolver.yml (vendored): 2 changes
@@ -24,7 +24,7 @@ on:
       LLM_MODEL:
         required: false
         type: string
-        default: "anthropic/claude-3-7-sonnet-20250219"
+        default: "anthropic/claude-sonnet-4-20250514"
       LLM_API_VERSION:
         required: false
         type: string
@@ -67,7 +67,7 @@ docker run -it --rm --pull=always \
 You'll find OpenHands running at [http://localhost:3000](http://localhost:3000)!
 
 When you open the application, you'll be asked to choose an LLM provider and add an API key.
-[Anthropic's Claude 3.7 Sonnet](https://www.anthropic.com/api) (`anthropic/claude-3-7-sonnet-20250219`)
+[Anthropic's Claude Sonnet 4](https://www.anthropic.com/api) (`anthropic/claude-sonnet-4-20250514`)
 works best, but you have [many options](https://docs.all-hands.dev/modules/usage/llms).
 
 ## 💡 Other ways to run OpenHands
@@ -52,4 +52,4 @@ $ poetry run python docs/translation_updater.py
 # ...
 ```
 
-This process uses `claude-3-7-sonnet-20250219` as base model and each language consumes at least ~30k input tokens and ~35k output tokens.
+This process uses `claude-sonnet-4-20250514` as base model and each language consumes at least ~30k input tokens and ~35k output tokens.
@@ -13,7 +13,7 @@ recommandations pour la sélection de modèles. Nos derniers résultats d'évalu
 
 Sur la base de ces résultats et des retours de la communauté, les modèles suivants ont été vérifiés comme fonctionnant raisonnablement bien avec OpenHands :
 
-- [anthropic/claude-3-7-sonnet-20250219](https://www.anthropic.com/api) (recommandé)
+- [anthropic/claude-sonnet-4-20250514](https://www.anthropic.com/api) (recommandé)
 - [gemini/gemini-2.5-pro](https://blog.google/technology/google-deepmind/gemini-model-thinking-updates-march-2025/)
 - [deepseek/deepseek-chat](https://api-docs.deepseek.com/)
 - [openai/o3-mini](https://openai.com/index/openai-o3-mini/)
@@ -13,7 +13,7 @@ OpenHandsはLiteLLMでサポートされているあらゆるLLMに接続でき
 
 これらの調査結果とコミュニティからのフィードバックに基づき、以下のモデルはOpenHandsでうまく動作することが確認されています:
 
-- [anthropic/claude-3-7-sonnet-20250219](https://www.anthropic.com/api) (推奨)
+- [anthropic/claude-sonnet-4-20250514](https://www.anthropic.com/api) (推奨)
 - [gemini/gemini-2.5-pro](https://blog.google/technology/google-deepmind/gemini-model-thinking-updates-march-2025/)
 - [deepseek/deepseek-chat](https://api-docs.deepseek.com/)
 - [openai/o3-mini](https://openai.com/index/openai-o3-mini/)
@@ -13,7 +13,7 @@ recomendações para seleção de modelos. Nossos resultados de benchmarking mai
 
 Com base nessas descobertas e feedback da comunidade, os seguintes modelos foram verificados e funcionam razoavelmente bem com o OpenHands:
 
-- [anthropic/claude-3-7-sonnet-20250219](https://www.anthropic.com/api) (recomendado)
+- [anthropic/claude-sonnet-4-20250514](https://www.anthropic.com/api) (recomendado)
 - [gemini/gemini-2.5-pro](https://blog.google/technology/google-deepmind/gemini-model-thinking-updates-march-2025/)
 - [deepseek/deepseek-chat](https://api-docs.deepseek.com/)
 - [openai/o3-mini](https://openai.com/index/openai-o3-mini/)
@@ -12,7 +12,7 @@ OpenHands 可以连接到任何 LiteLLM 支持的 LLM。但是，它需要一个
 
 基于这些发现和社区反馈，以下模型已被验证可以与 OpenHands 合理地配合使用：
 
-- [anthropic/claude-3-7-sonnet-20250219](https://www.anthropic.com/api)（推荐）
+- [anthropic/claude-sonnet-4-20250514](https://www.anthropic.com/api)（推荐）
 - [gemini/gemini-2.5-pro](https://blog.google/technology/google-deepmind/gemini-model-thinking-updates-march-2025/)
 - [deepseek/deepseek-chat](https://api-docs.deepseek.com/)
 - [openai/o3-mini](https://openai.com/index/openai-o3-mini/)
@@ -23,7 +23,7 @@ This command opens an interactive prompt where you can type tasks or commands an
 
 1. Set the following environment variables in your terminal:
    - `SANDBOX_VOLUMES` to specify the directory you want OpenHands to access ([See using SANDBOX_VOLUMES for more info](../runtimes/docker#using-sandbox_volumes))
-   - `LLM_MODEL` - the LLM model to use (e.g. `export LLM_MODEL="anthropic/claude-3-7-sonnet-20250219"`)
+   - `LLM_MODEL` - the LLM model to use (e.g. `export LLM_MODEL="anthropic/claude-sonnet-4-20250514"`)
    - `LLM_API_KEY` - your API key (e.g. `export LLM_API_KEY="sk_test_12345"`)
 
 2. Run the following command:
@@ -23,7 +23,7 @@ To run OpenHands in Headless mode with Docker:
 
 1. Set the following environment variables in your terminal:
    - `SANDBOX_VOLUMES` to specify the directory you want OpenHands to access ([See using SANDBOX_VOLUMES for more info](../runtimes/docker#using-sandbox_volumes))
-   - `LLM_MODEL` - the LLM model to use (e.g. `export LLM_MODEL="anthropic/claude-3-7-sonnet-20250219"`)
+   - `LLM_MODEL` - the LLM model to use (e.g. `export LLM_MODEL="anthropic/claude-sonnet-4-20250514"`)
    - `LLM_API_KEY` - your API key (e.g. `export LLM_API_KEY="sk_test_12345"`)
 
 2. Run the following Docker command:
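Both the CLI and Headless hunks above drive the model choice purely through environment variables. As a minimal sketch of what a launcher script could do with those variables, assuming only the variable names documented above (the helper itself is hypothetical, not OpenHands code), with the new Claude Sonnet 4 id as the fallback:

import os

def resolve_llm_env() -> dict[str, str]:
    # Hypothetical helper, for illustration only: read the variables the docs
    # above tell the user to export, falling back to the recommended model.
    model = os.environ.get("LLM_MODEL", "anthropic/claude-sonnet-4-20250514")
    api_key = os.environ.get("LLM_API_KEY")
    if not api_key:
        raise RuntimeError('LLM_API_KEY is not set (e.g. export LLM_API_KEY="sk_test_12345")')
    return {"LLM_MODEL": model, "LLM_API_KEY": api_key}

if __name__ == "__main__":
    os.environ.setdefault("LLM_API_KEY", "sk_test_12345")  # placeholder value from the docs above
    print(resolve_llm_env())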
@@ -13,7 +13,7 @@ recommendations for model selection. Our latest benchmarking results can be foun
 
 Based on these findings and community feedback, these are the latest models that have been verified to work reasonably well with OpenHands:
 
-- [anthropic/claude-3-7-sonnet-20250219](https://www.anthropic.com/api) (recommended)
+- [anthropic/claude-sonnet-4-20250514](https://www.anthropic.com/api) (recommended)
 - [openai/o4-mini](https://openai.com/index/introducing-o3-and-o4-mini/)
 - [gemini/gemini-2.5-pro](https://blog.google/technology/google-deepmind/gemini-model-thinking-updates-march-2025/)
 - [deepseek/deepseek-chat](https://api-docs.deepseek.com/)
@@ -57,7 +57,7 @@ def translate_content(content, target_lang):
     system_prompt = f'You are a professional translator. Translate the following content into {target_lang}. Preserve all Markdown formatting, code blocks, and front matter. Keep any {{% jsx %}} tags and similar intact. Do not translate code examples, URLs, or technical terms.'
 
     message = client.messages.create(
-        model='claude-3-7-sonnet-20250219',
+        model='claude-sonnet-4-20250514',
         max_tokens=4096,
         temperature=0,
         system=system_prompt,
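The call being retargeted here is the Anthropic Messages API. A self-contained sketch of how docs/translation_updater.py presumably drives it after this change; the client construction and the user-message payload are assumptions, since the hunk only shows the keyword arguments:

import anthropic  # official Anthropic SDK; reads ANTHROPIC_API_KEY from the environment

client = anthropic.Anthropic()

def translate_content(content: str, target_lang: str) -> str:
    # Mirrors the keyword arguments visible in the hunk above; how the
    # untranslated content is passed in is an assumption for illustration.
    system_prompt = (
        f'You are a professional translator. Translate the following content into {target_lang}. '
        'Preserve all Markdown formatting, code blocks, and front matter.'
    )
    message = client.messages.create(
        model='claude-sonnet-4-20250514',
        max_tokens=4096,
        temperature=0,
        system=system_prompt,
        messages=[{'role': 'user', 'content': content}],
    )
    return message.content[0].text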
@@ -48,7 +48,7 @@ describe("Content", () => {
 
     await waitFor(() => {
       expect(provider).toHaveValue("Anthropic");
-      expect(model).toHaveValue("claude-3-7-sonnet-20250219");
+      expect(model).toHaveValue("claude-sonnet-4-20250514");
 
       expect(apiKey).toHaveValue("");
       expect(apiKey).toHaveProperty("placeholder", "");
@@ -135,7 +135,7 @@ describe("Content", () => {
     );
     const condensor = screen.getByTestId("enable-memory-condenser-switch");
 
-    expect(model).toHaveValue("anthropic/claude-3-7-sonnet-20250219");
+    expect(model).toHaveValue("anthropic/claude-sonnet-4-20250514");
     expect(baseUrl).toHaveValue("");
     expect(apiKey).toHaveValue("");
     expect(apiKey).toHaveProperty("placeholder", "");
@@ -542,7 +542,7 @@ describe("Form submission", () => {
 
     // select model
    await userEvent.click(model);
-    const modelOption = screen.getByText("claude-3-7-sonnet-20250219");
+    const modelOption = screen.getByText("claude-sonnet-4-20250514");
    await userEvent.click(modelOption);
 
    const submitButton = screen.getByTestId("submit-button");
@@ -550,7 +550,7 @@ describe("Form submission", () => {
 
     expect(saveSettingsSpy).toHaveBeenCalledWith(
       expect.objectContaining({
-        llm_model: "anthropic/claude-3-7-sonnet-20250219",
+        llm_model: "anthropic/claude-sonnet-4-20250514",
         llm_base_url: "",
         confirmation_mode: false,
       }),
@@ -71,6 +71,18 @@ describe("extractModelAndProvider", () => {
       separator: "/",
     });
 
+    expect(extractModelAndProvider("claude-sonnet-4-20250514")).toEqual({
+      provider: "anthropic",
+      model: "claude-sonnet-4-20250514",
+      separator: "/",
+    });
+
+    expect(extractModelAndProvider("claude-opus-4-20250514")).toEqual({
+      provider: "anthropic",
+      model: "claude-opus-4-20250514",
+      separator: "/",
+    });
+
     expect(extractModelAndProvider("claude-3-haiku-20240307")).toEqual({
       provider: "anthropic",
       model: "claude-3-haiku-20240307",
@@ -100,7 +100,7 @@ const openHandsHandlers = [
         "gpt-4o",
         "gpt-4o-mini",
         "anthropic/claude-3.5",
-        "anthropic/claude-3-7-sonnet-20250219",
+        "anthropic/claude-sonnet-4-20250514",
       ]),
     ),
 
@@ -279,7 +279,7 @@ function LlmSettingsScreen() {
             <ModelSelector
               models={modelsAndProviders}
               currentModel={
-                settings.LLM_MODEL || "anthropic/claude-3-5-sonnet-20241022"
+                settings.LLM_MODEL || "anthropic/claude-sonnet-4-20250514"
               }
               onChange={handleModelIsDirty}
             />
@@ -342,9 +342,9 @@ function LlmSettingsScreen() {
               name="llm-custom-model-input"
               label={t(I18nKey.SETTINGS$CUSTOM_MODEL)}
               defaultValue={
-                settings.LLM_MODEL || "anthropic/claude-3-7-sonnet-20250219"
+                settings.LLM_MODEL || "anthropic/claude-sonnet-4-20250514"
               }
-              placeholder="anthropic/claude-3-7-sonnet-20250219"
+              placeholder="anthropic/claude-sonnet-4-20250514"
               type="text"
               className="w-[680px]"
               onChange={handleCustomModelIsDirty}
@@ -3,7 +3,7 @@ import { Settings } from "#/types/settings";
 export const LATEST_SETTINGS_VERSION = 5;
 
 export const DEFAULT_SETTINGS: Settings = {
-  LLM_MODEL: "anthropic/claude-3-7-sonnet-20250219",
+  LLM_MODEL: "anthropic/claude-sonnet-4-20250514",
   LLM_BASE_URL: "",
   AGENT: "CodeActAgent",
   LANGUAGE: "en",
@@ -6,6 +6,8 @@ export const VERIFIED_MODELS = [
   "o4-mini-2025-04-16",
   "claude-3-5-sonnet-20241022",
   "claude-3-7-sonnet-20250219",
+  "claude-sonnet-4-20250514",
+  "claude-opus-4-20250514",
   "deepseek-chat",
 ];
@@ -39,4 +41,6 @@ export const VERIFIED_ANTHROPIC_MODELS = [
   "claude-3-opus-20240229",
   "claude-3-sonnet-20240229",
   "claude-3-7-sonnet-20250219",
+  "claude-sonnet-4-20250514",
+  "claude-opus-4-20250514",
 ];
@@ -167,6 +167,8 @@ VERIFIED_ANTHROPIC_MODELS = [
     'claude-3-opus-20240229',
     'claude-3-sonnet-20240229',
     'claude-3-7-sonnet-20250219',
+    'claude-sonnet-4-20250514',
+    'claude-opus-4-20250514',
 ]
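Both the frontend and the backend keep these "verified" ids as plain string lists, and the commit appends the two Claude 4 ids to each. A small sketch of the kind of membership check such a list enables; the is_verified helper is illustrative only and not a function from the codebase:

VERIFIED_ANTHROPIC_MODELS = [
    'claude-3-opus-20240229',
    'claude-3-sonnet-20240229',
    'claude-3-7-sonnet-20250219',
    'claude-sonnet-4-20250514',
    'claude-opus-4-20250514',
]

def is_verified(model: str) -> bool:
    # Illustrative only: strip an optional "anthropic/" prefix, then check the list.
    return model.split('/', 1)[-1] in VERIFIED_ANTHROPIC_MODELS

assert is_verified('anthropic/claude-sonnet-4-20250514')
assert not is_verified('anthropic/claude-2')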
@@ -47,7 +47,7 @@ class LLMConfig(BaseModel):
         seed: The seed to use for the LLM.
     """
 
-    model: str = Field(default='claude-3-7-sonnet-20250219')
+    model: str = Field(default='claude-sonnet-4-20250514')
     api_key: SecretStr | None = Field(default=None)
     base_url: str | None = Field(default=None)
     api_version: str | None = Field(default=None)
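Since the hunk only exposes four fields of LLMConfig, here is a trimmed, self-contained sketch of the class as it reads after the change, just to show the new default in action; the real OpenHands class has many more fields:

from pydantic import BaseModel, Field, SecretStr

class LLMConfig(BaseModel):
    # Trimmed to the fields visible in the hunk above.
    model: str = Field(default='claude-sonnet-4-20250514')
    api_key: SecretStr | None = Field(default=None)
    base_url: str | None = Field(default=None)
    api_version: str | None = Field(default=None)

config = LLMConfig()  # picks up the new default model
assert config.model == 'claude-sonnet-4-20250514'
custom = LLMConfig(model='anthropic/claude-opus-4-20250514', api_key='sk-placeholder')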
@@ -109,7 +109,7 @@ export GIT_USERNAME="your-gitlab-username" # Optional, defaults to token owner
 
 # LLM configuration
-export LLM_MODEL="anthropic/claude-3-7-sonnet-20250219" # Recommended
+export LLM_MODEL="anthropic/claude-sonnet-4-20250514" # Recommended
 export LLM_API_KEY="your-llm-api-key"
 export LLM_BASE_URL="your-api-url" # Optional, for API proxies
 ```
@@ -24,7 +24,7 @@ jobs:
       macro: ${{ vars.OPENHANDS_MACRO || '@openhands-agent' }}
       max_iterations: ${{ fromJson(vars.OPENHANDS_MAX_ITER || 50) }}
       base_container_image: ${{ vars.OPENHANDS_BASE_CONTAINER_IMAGE || '' }}
-      LLM_MODEL: ${{ vars.LLM_MODEL || 'anthropic/claude-3-7-sonnet-20250219' }}
+      LLM_MODEL: ${{ vars.LLM_MODEL || 'anthropic/claude-sonnet-4-20250514' }}
       target_branch: ${{ vars.TARGET_BRANCH || 'main' }}
       runner: ${{ vars.TARGET_RUNNER }}
     secrets:
@@ -354,11 +354,11 @@ class TestModelAndProviderFunctions:
         assert result['separator'] == '/'
 
     def test_extract_model_and_provider_anthropic_implicit(self):
-        model = 'claude-3-7-sonnet-20250219'
+        model = 'claude-sonnet-4-20250514'
         result = extract_model_and_provider(model)
 
         assert result['provider'] == 'anthropic'
-        assert result['model'] == 'claude-3-7-sonnet-20250219'
+        assert result['model'] == 'claude-sonnet-4-20250514'
         assert result['separator'] == '/'
 
     def test_extract_model_and_provider_versioned(self):
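The updated test expects a bare claude-sonnet-4-20250514 string to be attributed to the anthropic provider with a "/" separator. A rough sketch of that implicit-provider mapping, not the actual extract_model_and_provider implementation, whose full rules are not shown in this diff:

def extract_model_and_provider(model: str) -> dict[str, str]:
    # Sketch only: explicit "provider/model" ids are split as-is, and bare
    # "claude-*" ids are treated as implicitly Anthropic, which is the
    # behaviour the test above asserts for claude-sonnet-4-20250514.
    if '/' in model:
        provider, name = model.split('/', 1)
        return {'provider': provider, 'model': name, 'separator': '/'}
    if model.startswith('claude'):
        return {'provider': 'anthropic', 'model': model, 'separator': '/'}
    return {'provider': '', 'model': model, 'separator': ''}

result = extract_model_and_provider('claude-sonnet-4-20250514')
assert result == {'provider': 'anthropic', 'model': 'claude-sonnet-4-20250514', 'separator': '/'}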
@@ -380,7 +380,7 @@ class TestModelAndProviderFunctions:
     def test_organize_models_and_providers(self):
         models = [
             'openai/gpt-4o',
-            'anthropic/claude-3-7-sonnet-20250219',
+            'anthropic/claude-sonnet-4-20250514',
             'o3-mini',
             'anthropic.claude-3-5',  # Should be ignored as it uses dot separator for anthropic
             'unknown-model',
@@ -397,7 +397,7 @@ class TestModelAndProviderFunctions:
         assert 'o3-mini' in result['openai']['models']
 
         assert len(result['anthropic']['models']) == 1
-        assert 'claude-3-7-sonnet-20250219' in result['anthropic']['models']
+        assert 'claude-sonnet-4-20250514' in result['anthropic']['models']
 
         assert len(result['other']['models']) == 1
         assert 'unknown-model' in result['other']['models']
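Finally, test_organize_models_and_providers pins down how mixed model ids are bucketed: provider-prefixed ids and implicit Anthropic or OpenAI ids are grouped under their provider, dot-separated "anthropic." ids are dropped, and unrecognised ids fall into "other". A grouping sketch consistent with those asserts, again illustrative rather than the real implementation:

def organize_models_and_providers(models: list[str]) -> dict[str, dict[str, list[str]]]:
    groups: dict[str, dict[str, list[str]]] = {}
    for raw in models:
        if raw.startswith('anthropic.'):
            continue  # dot-separated Anthropic ids are ignored, per the test comment
        if '/' in raw:
            provider, name = raw.split('/', 1)
        elif raw.startswith('claude'):
            provider, name = 'anthropic', raw  # implicit Anthropic
        elif raw.startswith(('gpt', 'o1', 'o3', 'o4')):
            provider, name = 'openai', raw     # implicit OpenAI
        else:
            provider, name = 'other', raw
        groups.setdefault(provider, {'models': []})['models'].append(name)
    return groups

result = organize_models_and_providers(
    ['openai/gpt-4o', 'anthropic/claude-sonnet-4-20250514', 'o3-mini',
     'anthropic.claude-3-5', 'unknown-model']
)
assert result['anthropic']['models'] == ['claude-sonnet-4-20250514']
assert 'o3-mini' in result['openai']['models']
assert result['other']['models'] == ['unknown-model']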