From 49979497d745fc5cc690f183402805551188eb27 Mon Sep 17 00:00:00 2001
From: lazychih114 <55657767+Aaron617@users.noreply.github.com>
Date: Mon, 10 Mar 2025 07:52:07 +0800
Subject: [PATCH] Update GAIA reproduction instructions in READMEs

---
 README.md    |  9 +++++++--
 README_zh.md | 10 ++++++++--
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index adb70af..b3419c6 100644
--- a/README.md
+++ b/README.md
@@ -172,9 +172,14 @@ Example tasks you can try:
 
 # 🧪 Experiments
 
-We provided a script to reproduce the results on GAIA. 
-You can check the `run_gaia_roleplaying.py` file and run the following command:
+To reproduce OWL's GAIA benchmark score of 58.18:
 
+1. Switch to the `gaia58.18` branch:
+```bash
+git checkout gaia58.18
+```
+
+2. Run the evaluation script:
 ```bash
 python run_gaia_roleplaying.py
 ```
diff --git a/README_zh.md b/README_zh.md
index e26338e..64af309 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -164,9 +164,15 @@ logger.success(f"Answer: {answer}")
 - "总结这篇研究论文的主要观点：[论文URL]"
 # 🧪 实验
 
-我们提供了一个脚本用于复现 GAIA 上的实验结果。  
-你可以查看 `run_gaia_roleplaying.py` 文件，并运行以下命令：
+我们提供了一个脚本用于复现 GAIA 上的实验结果。
+要复现我们在 GAIA 基准测试中获得的 58.18 分：
 
+1. 切换到 `gaia58.18` 分支：
+```bash
+git checkout gaia58.18
+```
+
+2. 运行评估脚本：
 ```bash
 python run_gaia_roleplaying.py
 ```