diff --git a/README.md b/README.md
index 61a7906..e31d00c 100644
--- a/README.md
+++ b/README.md
@@ -568,6 +568,9 @@ The web interface is built using Gradio and runs locally on your machine. No dat
 # 🧪 Experiments
 
 To reproduce OWL's GAIA benchmark score of 58.18:
+Furthermore, to ensure optimal performance on the GAIA benchmark, please note that our `gaia58.18` branch includes a customized version of the CAMEL framework in the `owl/camel` directory. This version contains enhanced toolkits with improved stability for gaia benchmark compared to the standard CAMEL installation.
+
+When running the benchmark evaluation:
 
 1. Switch to the `gaia58.18` branch:
    ```bash
@@ -581,6 +584,7 @@ To reproduce OWL's GAIA benchmark score of 58.18:
 
 This will execute the same configuration that achieved our top-ranking performance on the GAIA benchmark.
 
+
 # ⏱️ Future Plans
 
 We're continuously working to improve OWL. Here's what's on our roadmap: