From eb7f19a9834c66d6a263a0a66b916559a01a8003 Mon Sep 17 00:00:00 2001
From: Yuhang Zhou <86864241+Ralph-Zhou@users.noreply.github.com>
Date: Sun, 18 May 2025 17:29:25 +0800
Subject: [PATCH] Initial update README.md

---
 README.md | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..f41852a
--- /dev/null
+++ b/README.md
@@ -0,0 +1,47 @@
+# OWL: Optimized Workforce Learning for General Multi-Agent Assistance for Real-World Task Automation
+
+We present Workforce, a hierarchical multi-agent framework that decouples planning from execution through a modular
+architecture with a domain-agnostic Planner, Coordinator, and specialized Workers. This enables cross-domain transfer by
+allowing worker modification without full system retraining. On the GAIA benchmark, Workforce achieves state-of-the-art
+69.70% accuracy, outperforming commercial systems.
+
+This repository contains the inference code for the OWL framework (Workforce).
+
+## Inference
+
+We use camel version `0.2.46`. To reproduce the Workforce inference results on the GAIA benchmark (69.70% accuracy with
+Claude-3.7 and 60.61% with GPT-4o), follow the steps below:
+
+### Installation and Setup
+
+1. Create a Python 3.11 Conda environment:
+
+```bash
+conda create -n owl python=3.11
+```
+
+2. Install the required packages:
+
+```bash
+pip install -r requirements.txt
+```
+
+3. Set up environment variables:
+
+Copy `.env.example` to `.env`, then set the required API keys in the `.env` file.
+
+4. Run the inference:
+
+- To reproduce the results using GPT-4o, run:
+
+```bash
+python run_gaia_workforce.py
+```
+
+- To reproduce the results using Claude-3.7, run:
+
+```bash
+python run_gaia_workforce_claude.py
+```
+
+
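Step 3 of the README above depends on every key from `.env.example` being filled in before the run scripts start. As a minimal sketch (not part of this patch or the repository), the check below compares the two files and lists keys that are still missing or empty. The helper names `parse_env` and `missing_keys` are hypothetical, and the parser assumes plain `KEY=VALUE` lines with optional `#` comments, which may not cover every `.env` syntax.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments.

    Assumption: no 'export' prefixes or multi-line values.
    """
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and quote characters from the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(example_text: str, env_text: str) -> list[str]:
    """Return keys named in the example that are absent or empty in env_text."""
    env = parse_env(env_text)
    return [key for key in parse_env(example_text) if not env.get(key)]


if __name__ == "__main__":
    example, env = Path(".env.example"), Path(".env")
    if example.exists() and env.exists():
        for key in missing_keys(example.read_text(), env.read_text()):
            print(f"unset key: {key}")
    else:
        print("expected both .env.example and .env in the current directory")
```

Running it from the repository root before `run_gaia_workforce.py` or `run_gaia_workforce_claude.py` should surface an unset key early, rather than partway through a benchmark run.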