Mirror of https://github.com/camel-ai/owl.git (synced 2026-03-22 14:07:17 +08:00)
# OWL: Optimized Workforce Learning for General Multi-Agent Assistance for Real-World Task Automation

We present Workforce, a hierarchical multi-agent framework that decouples planning from execution through a modular
architecture with a domain-agnostic Planner, Coordinator, and specialized Workers. This enables cross-domain transfer by
allowing worker modification without full system retraining. On the GAIA benchmark, Workforce achieves state-of-the-art
69.70% accuracy, outperforming commercial systems.

This repository contains the inference code for the OWL framework (Workforce).

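
The split between Planner, Coordinator, and Workers described above can be illustrated with a small toy sketch. This is not the repository's code or the CAMEL API: every name here (`Subtask`, `Worker`, `Coordinator`, `plan`, `solve`) is hypothetical, and the planner and workers are stubs standing in for LLM-backed agents. It only shows why swapping a worker does not require touching the planner.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Subtask:
    description: str
    skill: str  # capability a worker must have to handle this subtask


@dataclass
class Worker:
    name: str
    skill: str
    run: Callable[[str], str]  # domain-specific execution logic (stubbed here)


def plan(task: str) -> List[Subtask]:
    """Toy domain-agnostic planner; a real one would query a language model."""
    return [
        Subtask(f"search the web for: {task}", skill="web"),
        Subtask(f"summarize findings for: {task}", skill="reasoning"),
    ]


@dataclass
class Coordinator:
    workers: Dict[str, Worker] = field(default_factory=dict)

    def register(self, worker: Worker) -> None:
        # Replacing a worker only changes this registry, never the planner.
        self.workers[worker.skill] = worker

    def dispatch(self, subtask: Subtask) -> str:
        # Route the subtask to whichever worker advertises the needed skill.
        return self.workers[subtask.skill].run(subtask.description)


def solve(task: str, coordinator: Coordinator) -> List[str]:
    """Plan once, then let the coordinator hand each subtask to a worker."""
    return [coordinator.dispatch(subtask) for subtask in plan(task)]


coordinator = Coordinator()
coordinator.register(Worker("web_agent", "web", lambda d: f"[web] {d}"))
coordinator.register(Worker("reasoner", "reasoning", lambda d: f"[reasoning] {d}"))
results = solve("capital of France", coordinator)

# Swap in a different web worker without changing plan() or solve().
coordinator.register(Worker("web_agent_v2", "web", lambda d: f"[web-v2] {d}"))
results_after_swap = solve("capital of France", coordinator)
```

In this sketch the planner only names the skill each subtask needs, so the worker registered for a skill can be replaced independently of the rest of the pipeline.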
## Inference

The framework is based on `camel-0.2.46` with minor modifications. To reproduce the Workforce inference performance on the GAIA benchmark (69.70% accuracy with Claude-3.7 at pass@1, and 60.61% accuracy with GPT-4o at pass@3), follow the steps below:

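
The two figures above use different attempt budgets, so they are not directly comparable. As a rough illustration, the sketch below assumes that pass@k counts a task as solved when at least one of k independent attempts is correct; this is an assumption about the metric, not a description of the repository's evaluation code. The function name and the toy data are hypothetical.

```python
from typing import Dict, List


def pass_at_k_accuracy(attempts: Dict[str, List[bool]], k: int) -> float:
    """Fraction of tasks solved by at least one of the first k attempts."""
    if not attempts:
        raise ValueError("no tasks to score")
    solved = sum(any(results[:k]) for results in attempts.values())
    return solved / len(attempts)


# Hypothetical outcomes: task id -> correctness of each independent attempt.
toy_runs = {
    "task_a": [True, True, True],
    "task_b": [False, True, False],
    "task_c": [False, False, False],
    "task_d": [False, False, True],
}

pass_1 = pass_at_k_accuracy(toy_runs, k=1)  # only task_a is solved on the first try
pass_3 = pass_at_k_accuracy(toy_runs, k=3)  # task_a, task_b, and task_d are solved
```

On the toy data, `pass_1` is 0.25 and `pass_3` is 0.75, which shows how the same set of runs scores higher as the attempt budget grows.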
### Installation and Setup

1. Create and activate a Python 3.11 Conda environment:

```bash
conda create -n owl python=3.11
conda activate owl
```
2. Install the required packages:

```bash
pip install -r requirements.txt
```
3. Set up environment variables:

Copy `.env.example` to `.env`, then fill in the required keys in the `.env` file.

4. Run the inference:

- To reproduce the GPT-4o results, run:

```bash
python run_gaia_workforce.py
```

- To reproduce the Claude-3.7 results, run:

```bash
python run_gaia_workforce_claude.py
```
You can modify the `test_idx` variable to specify which test case to run.
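
Before launching a run, it can help to confirm that step 3 left no key empty. The sketch below is a hypothetical helper, not part of the repository: it assumes both files use plain `KEY=VALUE` lines (it does not handle `export` prefixes or multi-line values) and reports every key listed in `.env.example` that is missing or blank in `.env`. It makes no assumption about which key names the project needs.

```python
from pathlib import Path
from typing import Dict, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def unset_keys(example_text: str, env_text: str) -> List[str]:
    """Keys listed in the example file that are missing or empty in .env."""
    expected = parse_env(example_text)
    actual = parse_env(env_text)
    return [key for key in expected if not actual.get(key)]


if __name__ == "__main__":
    example, env = Path(".env.example"), Path(".env")
    if example.exists() and env.exists():
        missing = unset_keys(example.read_text(), env.read_text())
        print("Unset keys:", missing or "none")
    else:
        print("Expected both .env.example and .env in the current directory.")
```

Run it from the directory that holds the two files; an empty result means every key named in the example file has a value.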