mirror of
https://github.com/camel-ai/owl.git
synced 2026-03-22 05:57:17 +08:00
update gaia (#123)
@@ -249,9 +249,14 @@ The web interface is built using Gradio and runs locally on your machine. No dat
 
 # 🧪 Experiments
 
-We provided a script to reproduce the results on GAIA.
-You can check the `run_gaia_roleplaying.py` file and run the following command:
+To reproduce OWL's GAIA benchmark score of 58.18:
 
+1. Switch to the `gaia58.18` branch:
+```bash
+git checkout gaia58.18
+```
+
+2. Run the evaluation script:
 ```bash
 python run_gaia_roleplaying.py
 ```
10
README_zh.md
@@ -244,9 +244,15 @@ python run_app.py
 
 # 🧪 Experiments
 
 We provide a script to reproduce the experimental results on GAIA.
-You can check the `run_gaia_roleplaying.py` file and run the following command:
+To reproduce our score of 58.18 on the GAIA benchmark:
 
+1. Switch to the `gaia58.18` branch:
+```bash
+git checkout gaia58.18
+```
+
+2. Run the evaluation script:
 ```bash
 python run_gaia_roleplaying.py
 ```
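The two steps added to both READMEs can be chained so the evaluation only starts if the branch switch succeeds. The following is a minimal sketch, not part of the repository: the function name `reproduce_gaia` and the `DRY_RUN` variable are hypothetical, and it assumes you run it from the repository root with the `gaia58.18` branch already available and `python` on your `PATH`.

```bash
# Hypothetical helper: wraps the two README steps in one function.
# Set DRY_RUN to any non-empty value to print the commands instead of running them.
reproduce_gaia() {
  branch="gaia58.18"                # branch named in the README
  script="run_gaia_roleplaying.py"  # evaluation script named in the README
  run="${DRY_RUN:+echo}"            # expands to "echo" only when DRY_RUN is set

  # Run the script only if the checkout succeeded.
  $run git checkout "$branch" && $run python "$script"
}
```

Calling `DRY_RUN=1 reproduce_gaia` prints the two commands without touching the working tree, which is a cheap way to check the sequence before starting a full benchmark run.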