Refactor llm config from toml and clean up (#6923)

Engel Nyst
2025-02-26 15:20:58 +01:00
committed by GitHub
parent 34fa9ed4db
commit f8045784b6
8 changed files with 191 additions and 218 deletions


@@ -42,10 +42,11 @@ Créez un fichier ```config.toml``` dans le répertoire OpenHands et entrez ces
[core]
workspace_base="./workspace"
run_as_openhands=true
sandbox_base_container_image="image_personnalisée"
[sandbox]
base_container_image="image_personnalisée"
```
> Make sure ```sandbox_base_container_image``` is set to the name of the custom image you built earlier.
> Make sure ```base_container_image``` is set to the name of the custom image you built earlier.
## Running
@@ -82,14 +83,15 @@ dockerfile_content = (
## Troubleshooting / Errors
### Error: ```useradd: UID 1000 is not unique```
If you see this error in the console output, OpenHands is trying to create the openhands user in the sandbox with a user ID of 1000, but that user ID is already used in the image (for an unknown reason). To fix this, change the value of the sandbox_user_id field in the config.toml file to a different value:
If you see this error in the console output, OpenHands is trying to create the openhands user in the sandbox with a user ID of 1000, but that user ID is already used in the image (for an unknown reason). To fix this, change the value of the user_id field in the config.toml file to a different value:
```toml
[core]
workspace_base="./workspace"
run_as_openhands=true
sandbox_base_container_image="image_personnalisée"
sandbox_user_id="1001"
[sandbox]
base_container_image="image_personnalisée"
user_id="1001"
```
### Port use errors


@@ -44,12 +44,13 @@ Tout d'abord, assurez-vous de pouvoir exécuter OpenHands en suivant les instruc
### Specify the Base Sandbox Image
In the `config.toml` file within the OpenHands directory, set `sandbox_base_container_image` to the image you want to use. This can be an image you have already pulled or one you have built:
In the `config.toml` file within the OpenHands directory, set `base_container_image` to the image you want to use. This can be an image you have already pulled or one you have built:
```toml
[core]
...
sandbox_base_container_image="custom-image"
[sandbox]
base_container_image="custom-image"
```
### Running


@@ -58,10 +58,11 @@ docker build -t custom_image .
[core]
workspace_base="./workspace"
run_as_openhands=true
sandbox_base_container_image="custom_image"
[sandbox]
base_container_image="custom_image"
```
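For the value of `sandbox_base_container_image`, you can choose any of the following:
For the value of `base_container_image`, you can choose any of the following:
1. The name of the custom image you built in the previous step (e.g., `"custom_image"`)
2. An image pulled from Docker Hub (e.g., `"node:20"`, if you need a sandbox environment with `Node.js` preinstalled)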
@@ -83,14 +84,15 @@ sandbox_base_container_image="custom_image"
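### Error: ```useradd: UID 1000 is not unique```
If you see this error in the console output, OpenHands is trying to create the openhands user in the sandbox with UID 1000, but that UID is already used elsewhere in the image (for some reason). To fix this, change the sandbox_user_id field in the config.toml file to a different value:
If you see this error in the console output, OpenHands is trying to create the openhands user in the sandbox with UID 1000, but that UID is already used elsewhere in the image (for some reason). To fix this, change the user_id field in the config.toml file to a different value: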
```
[core]
workspace_base="./workspace"
run_as_openhands=true
sandbox_base_container_image="custom_image"
sandbox_user_id="1001"
[sandbox]
base_container_image="custom_image"
user_id="1001"
```
### Port use errors


@@ -42,12 +42,13 @@ docker build -t custom-image .
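### Specify the Base Sandbox Image
In the `config.toml` file within the OpenHands directory, set `sandbox_base_container_image` to the image you want to use. This can be an image you have already pulled or one you have built:
In the `config.toml` file within the OpenHands directory, set `base_container_image` to the image you want to use. This can be an image you have already pulled or one you have built: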
```toml
[core]
...
sandbox_base_container_image="custom-image"
[sandbox]
base_container_image="custom-image"
```
### Running


@@ -60,13 +60,14 @@ First, ensure you can run OpenHands by following the instructions in [Developmen
### Specify the Base Sandbox Image
In the `config.toml` file within the OpenHands directory, set the `sandbox_base_container_image` to the image you want to use.
In the `config.toml` file within the OpenHands directory, set the `base_container_image` to the image you want to use.
This can be an image you've already pulled or one you've built:
```toml
[core]
...
sandbox_base_container_image="custom-image"
[sandbox]
base_container_image="custom-image"
```
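For an existing `config.toml` that still keeps `sandbox_*` keys under `[core]`, the move to a `[sandbox]` table can be sketched on plain dictionaries as below. This is only an illustration: `migrate_sandbox_keys` is a hypothetical helper, not part of OpenHands, and it assumes the only change needed is dropping the `sandbox_` prefix.

```python
def migrate_sandbox_keys(config: dict) -> dict:
    """Move old-style `sandbox_*` keys from [core] into a [sandbox] table.

    Hypothetical helper for illustration; it returns a new dict and leaves
    the input untouched.
    """
    core = dict(config.get("core", {}))
    sandbox = dict(config.get("sandbox", {}))
    for key in [k for k in core if k.startswith("sandbox_")]:
        new_key = key[len("sandbox_"):]
        value = core.pop(key)
        # Values already present under [sandbox] win over migrated ones.
        sandbox.setdefault(new_key, value)
    migrated = {**config, "core": core}
    if sandbox:
        migrated["sandbox"] = sandbox
    return migrated


old_style = {
    "core": {
        "workspace_base": "./workspace",
        "sandbox_base_container_image": "custom-image",
    }
}
new_style = migrate_sandbox_keys(old_style)
print(new_style["sandbox"])
```

The resulting dictionary has `base_container_image` under `sandbox` and no `sandbox_*` keys left under `core`.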
### Additional Configuration Options


@@ -3,9 +3,10 @@ from __future__ import annotations
import os
from typing import Any
from pydantic import BaseModel, Field, SecretStr
from pydantic import BaseModel, Field, SecretStr, ValidationError
from openhands.core.logger import LOG_DIR
from openhands.core.logger import openhands_logger as logger
class LLMConfig(BaseModel):
@@ -90,6 +91,70 @@ class LLMConfig(BaseModel):
    model_config = {'extra': 'forbid'}

    @classmethod
    def from_toml_section(cls, data: dict) -> dict[str, LLMConfig]:
        """
        Create a mapping of LLMConfig instances from a toml dictionary representing the [llm] section.

        The default configuration is built from all non-dict keys in data.
        Then, each key with a dict value (e.g. [llm.random_name]) is treated as a custom LLM configuration,
        and its values override the default configuration.

        Example:
        Apply generic LLM config with custom LLM overrides, e.g.
            [llm]
            model=...
            num_retries = 5
            [llm.claude]
            model="claude-3-5-sonnet"
        results in num_retries APPLIED to claude-3-5-sonnet.

        Returns:
            dict[str, LLMConfig]: A mapping where the key "llm" corresponds to the default configuration
            and additional keys represent custom configurations.
        """
        # Initialize the result mapping
        llm_mapping: dict[str, LLMConfig] = {}

        # Extract base config data (non-dict values)
        base_data = {}
        custom_sections: dict[str, dict] = {}
        for key, value in data.items():
            if isinstance(value, dict):
                custom_sections[key] = value
            else:
                base_data[key] = value

        # Try to create the base config
        try:
            base_config = cls.model_validate(base_data)
            llm_mapping['llm'] = base_config
        except ValidationError:
            logger.warning(
                'Cannot parse [llm] config from toml. Continuing with defaults.'
            )
            # If base config fails, create a default one
            base_config = cls()
            # Still add it to the mapping
            llm_mapping['llm'] = base_config

        # Process each custom section independently
        for name, overrides in custom_sections.items():
            try:
                # Merge base config with overrides
                merged = {**base_config.model_dump(), **overrides}
                custom_config = cls.model_validate(merged)
                llm_mapping[name] = custom_config
            except ValidationError:
                logger.warning(
                    f'Cannot parse [{name}] config from toml. This section will be skipped.'
                )
                # Skip this custom section but continue with others
                continue

        return llm_mapping

    def model_post_init(self, __context: Any):
        """Post-initialization hook to assign OpenRouter-related variables to environment variables.
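The override behaviour documented in `from_toml_section` can be illustrated without pydantic. The sketch below is a simplified stand-in, not the OpenHands implementation: `SimpleLLMConfig` and `from_section` are hypothetical names, the dataclass has only two fields with made-up defaults, and it catches `TypeError` (what a dataclass raises on an unknown field) where the real method catches `ValidationError`. It also skips the logging.

```python
from dataclasses import asdict, dataclass


@dataclass
class SimpleLLMConfig:
    """Hypothetical two-field stand-in for LLMConfig."""

    model: str = "default-model"
    num_retries: int = 8


def from_section(data: dict) -> dict:
    """Build a default config plus one config per nested table."""
    # Scalars form the default config; nested dicts are named overrides.
    base_data = {k: v for k, v in data.items() if not isinstance(v, dict)}
    custom = {k: v for k, v in data.items() if isinstance(v, dict)}

    try:
        base = SimpleLLMConfig(**base_data)
    except TypeError:
        # Invalid base section: fall back to defaults, as the diff does.
        base = SimpleLLMConfig()

    mapping = {"llm": base}
    for name, overrides in custom.items():
        try:
            # Later keys win, so overrides replace the inherited values.
            mapping[name] = SimpleLLMConfig(**{**asdict(base), **overrides})
        except TypeError:
            # A broken custom section is skipped; the others still load.
            continue
    return mapping


section = {
    "model": "base-model",
    "num_retries": 5,
    "claude": {"model": "claude-3-5-sonnet"},
}
configs = from_section(section)
print(configs["claude"])
```

Here the `claude` entry keeps its own `model` but inherits `num_retries = 5` from the generic section, which is the behaviour the docstring describes.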


@@ -123,142 +123,104 @@ def load_from_toml(cfg: AppConfig, toml_file: str = 'config.toml') -> None:
)
return
    # if there was an exception or core is not in the toml, try to use the old-style toml
    # Check for the [core] section
    if 'core' not in toml_config:
        # re-use the env loader to set the config from env-style vars
        load_from_env(cfg, toml_config)
        return
    core_config = toml_config['core']
    # load llm configs and agent configs
    for key, value in toml_config.items():
        if isinstance(value, dict):
            try:
                if key.lower() == 'extended':
                    # For ExtendedConfig (RootModel), pass the entire dict as the root value
                    cfg.extended = ExtendedConfig(value)
                    continue
                if key is not None and key.lower() == 'agent':
                    # Every entry here is either a field for the default `agent` config group, or itself a group
                    # The best way to tell the difference is to try to parse it as an AgentConfig object
                    agent_group_ids: set[str] = set()
                    for nested_key, nested_value in value.items():
                        if isinstance(nested_value, dict):
                            try:
                                agent_config = AgentConfig(**nested_value)
                            except ValidationError:
                                continue
                            agent_group_ids.add(nested_key)
                            cfg.set_agent_config(agent_config, nested_key)
                    logger.openhands_logger.debug(
                        'Attempt to load default agent config from config toml'
                    )
                    value_without_groups = {
                        k: v for k, v in value.items() if k not in agent_group_ids
                    }
                    agent_config = AgentConfig(**value_without_groups)
                    cfg.set_agent_config(agent_config, 'agent')
                elif key is not None and key.lower() == 'llm':
                    # Every entry here is either a field for the default `llm` config group, or itself a group
                    # The best way to tell the difference is to try to parse it as an LLMConfig object
                    llm_group_ids: set[str] = set()
                    for nested_key, nested_value in value.items():
                        if isinstance(nested_value, dict):
                            try:
                                llm_config = LLMConfig(**nested_value)
                            except ValidationError:
                                continue
                            llm_group_ids.add(nested_key)
                            cfg.set_llm_config(llm_config, nested_key)
                    logger.openhands_logger.debug(
                        'Attempt to load default LLM config from config toml'
                    )
                    # Extract generic LLM fields, which are not nested LLM configs
                    generic_llm_fields = {}
                    for k, v in value.items():
                        if not isinstance(v, dict):
                            generic_llm_fields[k] = v
                    generic_llm_config = LLMConfig(**generic_llm_fields)
                    cfg.set_llm_config(generic_llm_config, 'llm')
                    # Process custom named LLM configs
                    for nested_key, nested_value in value.items():
                        if isinstance(nested_value, dict):
                            logger.openhands_logger.debug(
                                f'Processing custom LLM config "{nested_key}":'
                            )
                            # Apply generic LLM config with custom LLM overrides, e.g.
                            # [llm]
                            # model="..."
                            # num_retries = 5
                            # [llm.claude]
                            # model="claude-3-5-sonnet"
                            # results in num_retries APPLIED to claude-3-5-sonnet
                            custom_fields = {}
                            for k, v in nested_value.items():
                                if not isinstance(v, dict):
                                    custom_fields[k] = v
                            merged_llm_dict = generic_llm_fields.copy()
                            merged_llm_dict.update(custom_fields)
                            custom_llm_config = LLMConfig(**merged_llm_dict)
                            cfg.set_llm_config(custom_llm_config, nested_key)
                elif key is not None and key.lower() == 'security':
                    logger.openhands_logger.debug(
                        'Attempt to load security config from config toml'
                    )
                    security_config = SecurityConfig(**value)
                    cfg.security = security_config
                elif not key.startswith('sandbox') and key.lower() != 'core':
                    logger.openhands_logger.warning(
                        f'Unknown key in {toml_file}: "{key}"'
                    )
            except (TypeError, KeyError, ValidationError) as e:
                logger.openhands_logger.warning(
                    f'Cannot parse [{key}] config from toml, values have not been applied.\nError: {e}',
                )
        else:
            logger.openhands_logger.warning(f'Unknown section [{key}] in {toml_file}')
    try:
        # set sandbox config from the toml file
        sandbox_config = cfg.sandbox
        # migrate old sandbox configs from [core] section to sandbox config
        keys_to_migrate = [key for key in core_config if key.startswith('sandbox_')]
        for key in keys_to_migrate:
            new_key = key.replace('sandbox_', '')
            if new_key in sandbox_config.__annotations__:
                # read the key in sandbox and remove it from core
                setattr(sandbox_config, new_key, core_config.pop(key))
            else:
                logger.openhands_logger.warning(
                    f'Unknown config key "{key}" in [sandbox] section'
                )
        # the new style values override the old style values
        if 'sandbox' in toml_config:
            sandbox_config = SandboxConfig(**toml_config['sandbox'])
        # update the config object with the new values
        cfg.sandbox = sandbox_config
        for key, value in core_config.items():
            if hasattr(cfg, key):
                setattr(cfg, key, value)
            else:
                logger.openhands_logger.warning(
                    f'Unknown config key "{key}" in [core] section'
                )
    except (TypeError, KeyError, ValidationError) as e:
        logger.openhands_logger.warning(
            f'Cannot parse [sandbox] config from toml, values have not been applied.\nError: {e}',
            f'No [core] section found in {toml_file}. Core settings will use defaults.'
        )
        core_config = {}
    else:
        core_config = toml_config['core']
    # Process core section if present
    for key, value in core_config.items():
        if hasattr(cfg, key):
            setattr(cfg, key, value)
        else:
            logger.openhands_logger.warning(
                f'Unknown config key "{key}" in [core] section'
            )
    # Process agent section if present
    if 'agent' in toml_config:
        try:
            value = toml_config['agent']
            # Every entry here is either a field for the default `agent` config group, or itself a group
            # The best way to tell the difference is to try to parse it as an AgentConfig object
            agent_group_ids: set[str] = set()
            for nested_key, nested_value in value.items():
                if isinstance(nested_value, dict):
                    try:
                        agent_config = AgentConfig(**nested_value)
                    except ValidationError:
                        continue
                    agent_group_ids.add(nested_key)
                    cfg.set_agent_config(agent_config, nested_key)
            logger.openhands_logger.debug(
                'Attempt to load default agent config from config toml'
            )
            value_without_groups = {
                k: v for k, v in value.items() if k not in agent_group_ids
            }
            agent_config = AgentConfig(**value_without_groups)
            cfg.set_agent_config(agent_config, 'agent')
        except (TypeError, KeyError, ValidationError) as e:
            logger.openhands_logger.warning(
                f'Cannot parse [agent] config from toml, values have not been applied.\nError: {e}'
            )
    # Process llm section if present
    if 'llm' in toml_config:
        try:
            llm_mapping = LLMConfig.from_toml_section(toml_config['llm'])
            for llm_key, llm_conf in llm_mapping.items():
                cfg.set_llm_config(llm_conf, llm_key)
        except (TypeError, KeyError, ValidationError) as e:
            logger.openhands_logger.warning(
                f'Cannot parse [llm] config from toml, values have not been applied.\nError: {e}'
            )
    # Process security section if present
    if 'security' in toml_config:
        try:
            logger.openhands_logger.debug(
                'Attempt to load security config from config toml'
            )
            security_config = SecurityConfig(**toml_config['security'])
            cfg.security = security_config
        except (TypeError, KeyError, ValidationError) as e:
            logger.openhands_logger.warning(
                f'Cannot parse [security] config from toml, values have not been applied.\nError: {e}'
            )
    # Process sandbox section if present
    if 'sandbox' in toml_config:
        try:
            logger.openhands_logger.debug(
                'Attempt to load sandbox config from config toml'
            )
            sandbox_config = SandboxConfig(**toml_config['sandbox'])
            cfg.sandbox = sandbox_config
        except (TypeError, KeyError, ValidationError) as e:
            logger.openhands_logger.warning(
                f'Cannot parse [sandbox] config from toml, values have not been applied.\nError: {e}'
            )
    # Process extended section if present
    if 'extended' in toml_config:
        try:
            cfg.extended = ExtendedConfig(toml_config['extended'])
        except (TypeError, KeyError, ValidationError) as e:
            logger.openhands_logger.warning(
                f'Cannot parse [extended] config from toml, values have not been applied.\nError: {e}'
            )
    # Check for unknown sections
    known_sections = {'core', 'extended', 'agent', 'llm', 'security', 'sandbox'}
    for key in toml_config:
        if key.lower() not in known_sections:
            logger.openhands_logger.warning(f'Unknown section [{key}] in {toml_file}')
def get_or_create_jwt_secret(file_store: FileStore) -> str:
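The rewritten `load_from_toml` handles each section in its own `try` block, so one malformed section no longer prevents the others from loading, and then reports any section name it does not recognise. A minimal sketch of that pattern follows. It rests on stated assumptions: `load_sections` and the `parsers` mapping are hypothetical, plain callables stand in for the config classes, and `ValueError` stands in for pydantic's `ValidationError`.

```python
import logging

logger = logging.getLogger("config_sketch")

KNOWN_SECTIONS = {"core", "extended", "agent", "llm", "security", "sandbox"}


def load_sections(toml_config: dict, parsers: dict) -> tuple:
    """Parse each present section independently.

    Returns (loaded, unknown): the successfully parsed sections and the
    list of section names that are not recognised.
    """
    loaded = {}
    for name, parse in parsers.items():
        if name not in toml_config:
            continue
        try:
            loaded[name] = parse(toml_config[name])
        except (TypeError, KeyError, ValueError) as e:
            # A failure here only affects this one section.
            logger.warning(
                "Cannot parse [%s] config from toml, values have not been applied.\nError: %s",
                name,
                e,
            )
    unknown = [key for key in toml_config if key.lower() not in KNOWN_SECTIONS]
    for key in unknown:
        logger.warning("Unknown section [%s]", key)
    return loaded, unknown


# Hypothetical parsers: the sandbox one fails on a non-numeric user_id.
example_parsers = {
    "sandbox": lambda section: {"user_id": int(section["user_id"])},
    "security": lambda section: dict(section),
}
example_config = {
    "sandbox": {"user_id": "not-a-number"},
    "security": {"security_analyzer": "semgrep"},
    "typo_section": {},
}
loaded, unknown = load_sections(example_config, example_parsers)
print(loaded, unknown)
```

In this example the broken `sandbox` table is skipped with a warning while `security` still loads. One detail worth checking in the diff itself: the per-section lookups such as `'agent' in toml_config` are case-sensitive, whereas the unknown-section check lowercases the key, so a section spelled `[Agent]` would apparently be neither loaded nor reported.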


@@ -166,19 +166,6 @@ def test_llm_config_native_tool_calling(default_config, temp_toml_file, monkeypa
# default is None
assert default_config.get_llm_config().native_tool_calling is None
# without `[core]` section, native_tool_calling is not set because the file is not loaded
with open(temp_toml_file, 'w', encoding='utf-8') as toml_file:
toml_file.write(
"""
[llm.gpt4o-mini]
native_tool_calling = true
"""
)
load_from_toml(default_config, temp_toml_file)
assert default_config.get_llm_config().native_tool_calling is None
assert default_config.get_llm_config('gpt4o-mini').native_tool_calling is None
# set to false
with open(temp_toml_file, 'w', encoding='utf-8') as toml_file:
toml_file.write(
@@ -216,51 +203,6 @@ native_tool_calling = true
) # load_from_env didn't override the named config set in the toml file under [llm.gpt4o-mini]
def test_compat_load_sandbox_from_toml(default_config: AppConfig, temp_toml_file: str):
# test loading configuration from a new-style TOML file
# uses a toml file with sandbox_vars instead of a sandbox section
with open(temp_toml_file, 'w', encoding='utf-8') as toml_file:
toml_file.write(
"""
[llm]
model = "test-model"
[agent]
memory_enabled = true
[core]
workspace_base = "/opt/files2/workspace"
sandbox_timeout = 500
sandbox_base_container_image = "node:14"
sandbox_user_id = 1001
default_agent = "TestAgent"
"""
)
load_from_toml(default_config, temp_toml_file)
assert default_config.get_llm_config().model == 'test-model'
assert default_config.get_llm_config_from_agent().model == 'test-model'
assert default_config.default_agent == 'TestAgent'
assert default_config.get_agent_config().memory_enabled is True
assert default_config.workspace_base == '/opt/files2/workspace'
assert default_config.sandbox.timeout == 500
assert default_config.sandbox.base_container_image == 'node:14'
assert default_config.sandbox.user_id == 1001
assert default_config.workspace_mount_path_in_sandbox == '/workspace'
finalize_config(default_config)
# app config doesn't have fields sandbox_*
assert not hasattr(default_config, 'sandbox_timeout')
assert not hasattr(default_config, 'sandbox_base_container_image')
assert not hasattr(default_config, 'sandbox_user_id')
# after finalize_config, workspace_mount_path is set to the absolute path of workspace_base
# if it was undefined
assert default_config.workspace_mount_path == '/opt/files2/workspace'
def test_env_overrides_compat_toml(monkeypatch, default_config, temp_toml_file):
# test that environment variables override TOML values using monkeypatch
# uses a toml file with sandbox_vars instead of a sandbox section
@@ -506,14 +448,11 @@ security_analyzer = "semgrep"
""")
load_from_toml(default_config, temp_toml_file)
assert default_config.get_llm_config().model == 'claude-3-5-sonnet-20241022'
assert default_config.get_agent_config().memory_enabled is False
assert (
default_config.sandbox.base_container_image
== 'nikolaik/python-nodejs:python3.12-nodejs22'
)
# assert default_config.sandbox.user_id == 1007
assert default_config.security.security_analyzer is None
assert default_config.get_llm_config().model == 'test-model'
assert default_config.get_agent_config().memory_enabled is True
assert default_config.sandbox.base_container_image == 'custom_image'
assert default_config.sandbox.user_id == 1001
assert default_config.security.security_analyzer == 'semgrep'
def test_load_from_toml_partial_invalid(default_config, temp_toml_file, caplog):
@@ -562,7 +501,7 @@ invalid_field_in_sandbox = "test"
assert 'Cannot parse [llm] config from toml' in log_content
assert 'values have not been applied' in log_content
# Error: LLMConfig.__init__() got an unexpected keyword argume
assert 'Error: 1 validation error for LLMConfig' in log_content
assert 'Cannot parse [llm] config from toml' in log_content
assert 'invalid_field' in log_content
# invalid [sandbox] config
@@ -572,7 +511,7 @@ invalid_field_in_sandbox = "test"
# Verify valid configurations are loaded. Load from default instead of `config.toml`
# assert default_config.debug is True
assert default_config.debug is False
assert default_config.debug is True
assert default_config.get_llm_config().model == 'claude-3-5-sonnet-20241022'
assert default_config.get_agent_config().memory_enabled is True
finally: