chore: initialize sandbox and overwrite remote content

codex-bot
2026-03-02 22:32:27 +08:00
commit a64378956a
584 changed files with 93604 additions and 0 deletions

@@ -0,0 +1,18 @@
# ACEBench Example
This example demonstrates agent-oriented evaluation in AgentScope.
It uses [ACEBench](https://github.com/ACEBench/ACEBench) as the benchmark and runs
a ReAct agent with a [Ray](https://github.com/ray-project/ray)-based evaluator, which supports
**distributed** and **parallel** evaluation.
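As a rough illustration of what parallel evaluation means here, the sketch below fans independent benchmark tasks out to a worker pool and aggregates the results. It is not the AgentScope evaluator: it uses Python's standard `concurrent.futures` as a stand-in for Ray, and `evaluate_task`, `evaluate_all`, and the task fields are hypothetical names chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_task(task: dict) -> dict:
    """Hypothetical per-task check: exact match between prediction and expected answer."""
    return {"id": task["id"], "passed": task["prediction"] == task["expected"]}


def evaluate_all(tasks: list[dict], max_workers: int = 4) -> list[dict]:
    """Evaluate tasks concurrently; results come back in the same order as the input."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(evaluate_task, tasks))


# Made-up tasks, only to exercise the sketch.
tasks = [
    {"id": "t1", "prediction": "42", "expected": "42"},
    {"id": "t2", "prediction": "foo", "expected": "bar"},
    {"id": "t3", "prediction": "ok", "expected": "ok"},
]

results = evaluate_all(tasks)
accuracy = sum(r["passed"] for r in results) / len(results)
print(f"accuracy: {accuracy:.2f}")
```

Because each task is evaluated independently, the same fan-out pattern could in principle be spread across processes or machines, which is the role Ray plays in this example.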
To run the example, install AgentScope first, then run the evaluation with the following command, replacing `{data_dir}` and `{result_dir}` with your own directories:
```bash
python main.py --data_dir {data_dir} --result_dir {result_dir}
```
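For readers unfamiliar with how such flags are usually consumed, here is a minimal sketch of parsing `--data_dir` and `--result_dir` with `argparse`. The actual contents of `main.py` are not shown in this README, so the function name `build_parser`, the help strings, and the choice to make both flags required are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags shown in the command above (hypothetical layout)."""
    parser = argparse.ArgumentParser(description="Run an ACEBench evaluation (sketch).")
    # Assumption: both flags are required; the real script may define defaults instead.
    parser.add_argument("--data_dir", type=str, required=True,
                        help="Directory holding the benchmark data.")
    parser.add_argument("--result_dir", type=str, required=True,
                        help="Directory where evaluation results are written.")
    return parser


# Parse an explicit argument list so the sketch runs without real command-line input.
args = build_parser().parse_args(["--data_dir", "./data", "--result_dir", "./results"])
print(args.data_dir, args.result_dir)
```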
## Further Reading
- [ACEBench](https://github.com/ACEBench/ACEBench)
- [Ray](https://github.com/ray-project/ray)