update gaia (#123)

Mengkang Hu
2025-03-10 07:52:27 +08:00
committed by GitHub
2 changed files with 15 additions and 4 deletions


@@ -249,9 +249,14 @@ The web interface is built using Gradio and runs locally on your machine. No dat
# 🧪 Experiments
We provide a script to reproduce the results on GAIA.
You can check the `run_gaia_roleplaying.py` file and run the following command:
To reproduce OWL's GAIA benchmark score of 58.18:
1. Switch to the `gaia58.18` branch:
```bash
git checkout gaia58.18
```
2. Run the evaluation script:
```bash
python run_gaia_roleplaying.py
```


@@ -244,9 +244,15 @@ python run_app.py
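# 🧪 Experiments
We provide a script to reproduce the experimental results on GAIA.
You can check the `run_gaia_roleplaying.py` file and run the following command:
We provide a script to reproduce the experimental results on GAIA.
To reproduce our GAIA benchmark score of 58.18:
1. Switch to the `gaia58.18` branch: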
```bash
git checkout gaia58.18
```
2. Run the evaluation script:
```bash
python run_gaia_roleplaying.py
```