Shanghai Jiao Tong University · 2026-05-04 · major
ARIS — SJTU's Open Research Harness Hits 8.1k Stars With Cross-Model Adversarial Review
ARIS is an open-source markdown-skill harness for autonomous ML research. An executor model drives progress while a reviewer from a different model family critiques each step before it is committed.
Open research harness that pits one model against another to catch unsupported claims before they ship.
Key specs
| Spec | Value |
|---|---|
| License | MIT |
| GitHub stars | 8,149 |
| Skills | 65+ |
| arXiv ID | 2605.03042 |
What is it?
ARIS (Autonomous Research via Adversarial Multi-Agent Collaboration) is an open-source harness from Shanghai Jiao Tong University for running long-horizon ML research workflows. It coordinates an executor LLM and a separate reviewer LLM that critiques intermediate artifacts and forces revisions, aiming to catch the 'plausible but unsupported' failure mode that plagues long agent runs.
How does it work?
Three layers sit on top of any agent host (Claude Code, Codex, OpenClaw, etc.). The execution layer ships 65+ markdown-defined skills, MCP-based model integrations, a persistent research wiki, and deterministic figure generation. The orchestration layer routes five end-to-end workflows between executor and reviewer with adjustable effort levels. The assurance layer adds a three-stage pipeline that checks evidence and rewrites scientific text.
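The executor/reviewer hand-off described above can be pictured as a bounded revise-and-review loop. The sketch below is illustrative only and is not the ARIS API: `Review`, `run_step`, `max_rounds`, and the two stub functions are hypothetical names, and treating an "effort level" as a cap on review rounds is an assumption. In a real setup the executor and reviewer callables would wrap two models from different families.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    """Reviewer verdict on one draft (hypothetical structure)."""
    approved: bool
    critique: str = ""


# Hypothetical call signatures, not taken from ARIS.
Executor = Callable[[str, Optional[str]], str]  # (task, last critique) -> draft
Reviewer = Callable[[str, str], Review]         # (task, draft) -> verdict


def run_step(task: str, executor: Executor, reviewer: Reviewer,
             max_rounds: int = 3) -> tuple[str, bool, int]:
    """Draft, review, and revise until approval or the round budget runs out.

    Returns (final draft, whether it was approved, rounds used).
    """
    critique: Optional[str] = None
    draft = ""
    for round_no in range(1, max_rounds + 1):
        draft = executor(task, critique)   # executor sees the last critique
        review = reviewer(task, draft)     # separate model checks the draft
        if review.approved:
            return draft, True, round_no
        critique = review.critique         # feed the objection back
    return draft, False, max_rounds        # budget spent, nothing committed


# Toy stand-ins for the two models, for demonstration only.
def stub_executor(task: str, critique: Optional[str]) -> str:
    if critique is None:
        return "Method A beats the baseline."
    return "Method A beats the baseline (see Table 1)."


def stub_reviewer(task: str, draft: str) -> Review:
    if "Table" in draft:
        return Review(approved=True)
    return Review(approved=False, critique="Cite the evidence for this claim.")


draft, committed, rounds = run_step("summarize results",
                                    stub_executor, stub_reviewer)
print(draft, committed, rounds)
```

With these stubs the first draft is rejected for lacking evidence and the revised draft is accepted on the second round. Keeping the loop bounded means a step that never satisfies the reviewer is reported as uncommitted rather than silently accepted.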
Why does it matter?
Most autonomous-research demos collapse on long tasks because the executor's framing leaks into its self-review. Splitting executor and reviewer across model families replaces that self-review with a deliberate cross-check, and because skills are plain markdown, teams can fork or adapt the harness without committing to a framework.
Who is it for?
ML researchers and agent builders running long-horizon experiments.
Try it
git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep