chore: update readme

2026-03-22 07:29:35 +08:00 · 2025-02-06 23:33:38 +08:00
parent c76ab3415c
commit d6811fc2eb
4 changed files with 147 additions and 4 deletions
--- a/README.md
+++ b/README.md
@@ -217,3 +217,13 @@ flowchart TD

    BeastMode --> FinalAnswer[Generate final answer] --> End
 ```
+
+## Evaluation
+
+I kept the evaluation simple: I use LLM-as-a-judge and collected some ego questions (i.e. questions about Jina AI to which I know the answer with 100% certainty).
+
+I mainly look at 3 things: total steps, total tokens, and the correctness of the final answer.
+
+```bash
+npm run eval ./src/evals/ego-questions.json
+```
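
The three things the README says it looks at (total steps, total tokens, and correctness of the final answer) could be aggregated per run roughly as follows. This is only a sketch: the `EvalResult` shape, its field names, and the `summarize` helper are hypothetical and are not taken from the repository's eval code.

```typescript
// Hypothetical record for one evaluated question; field names are assumptions.
interface EvalResult {
  question: string;
  steps: number;    // total steps taken to answer the question
  tokens: number;   // total tokens consumed
  correct: boolean; // verdict from the LLM judge
}

interface EvalSummary {
  count: number;
  avgSteps: number;
  avgTokens: number;
  accuracy: number; // fraction of questions judged correct, in [0, 1]
}

// Aggregate per-question results into the three headline metrics.
function summarize(results: EvalResult[]): EvalSummary {
  const count = results.length;
  if (count === 0) {
    // Avoid dividing by zero when no questions were evaluated.
    return { count: 0, avgSteps: 0, avgTokens: 0, accuracy: 0 };
  }
  const totalSteps = results.reduce((sum, r) => sum + r.steps, 0);
  const totalTokens = results.reduce((sum, r) => sum + r.tokens, 0);
  const correctCount = results.filter((r) => r.correct).length;
  return {
    count,
    avgSteps: totalSteps / count,
    avgTokens: totalTokens / count,
    accuracy: correctCount / count,
  };
}
```

Averaging per question (rather than reporting raw totals) keeps the numbers comparable if the question set grows; whether the actual eval script does this is not stated in the diff.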