Shanghai Jiao Tong University · 2026-05-04 · major

ARIS — SJTU's Open Research Harness Hits 8.1k Stars With Cross-Model Adversarial Review

ARIS is an open-source markdown-skill harness for autonomous ML research. An executor model drives progress while a reviewer from a different model family critiques each step before it is committed.

Open research harness that pits one model against another to catch unsupported claims before they ship.

Key specs

License	MIT
GitHub stars	8,149
Skills	65+
arXiv ID	2605.03042

What is it?

ARIS (Autonomous Research via Adversarial Multi-Agent Collaboration) is an open-source harness from Shanghai Jiao Tong University for running long-horizon ML research workflows. It coordinates an executor LLM and a separate reviewer LLM that critiques intermediate artifacts and forces revisions, aiming to catch the 'plausible but unsupported' failure mode that plagues long agent runs.

How does it work?

Three layers sit on top of any agent host (Claude Code, Codex, OpenClaw, etc.). The execution layer ships 65+ markdown-defined skills, MCP-based model integrations, a persistent research wiki, and deterministic figure generation. The orchestration layer routes five end-to-end workflows between executor and reviewer with adjustable effort levels. The assurance layer adds a three-stage pipeline that checks evidence and rewrites scientific text.
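
The executor/reviewer hand-off described above can be pictured as a small revise-until-approved loop. The sketch below is only an illustration under stated assumptions, not ARIS's actual code: `Review`, `run_step`, `toy_executor`, `toy_reviewer`, and the `max_rounds` parameter are hypothetical names, and both models are replaced by plain Python functions so the example runs without any API access.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Review:
    """Verdict returned by the reviewer (hypothetical structure)."""
    approved: bool
    critique: str = ""


# Hypothetical signatures: the executor sees the task plus the last critique,
# the reviewer sees only the draft artifact.
Executor = Callable[[str, Optional[str]], str]
Reviewer = Callable[[str], Review]


def run_step(
    task: str,
    executor: Executor,
    reviewer: Reviewer,
    max_rounds: int = 3,
) -> Tuple[Optional[str], List[str]]:
    """Draft, review, and revise until approval or the round budget runs out.

    Returns (committed_artifact, all_drafts). The artifact is None when the
    reviewer never approved, so nothing is committed.
    """
    critique: Optional[str] = None
    drafts: List[str] = []
    for _ in range(max_rounds):
        draft = executor(task, critique)
        drafts.append(draft)
        review = reviewer(draft)
        if review.approved:
            return draft, drafts
        critique = review.critique  # feed the objection into the next draft
    return None, drafts


def toy_executor(task: str, critique: Optional[str]) -> str:
    """Stand-in for the executor model: adds evidence only after being challenged."""
    claim = f"{task}: accuracy improves"
    if critique is None:
        return claim
    return f"{claim} [evidence: results.csv]"


def toy_reviewer(draft: str) -> Review:
    """Stand-in for the reviewer model: rejects claims without an evidence tag."""
    if "[evidence:" in draft:
        return Review(approved=True)
    return Review(approved=False, critique="claim lacks supporting evidence")


committed, drafts = run_step("ablation summary", toy_executor, toy_reviewer)
print(committed)
print(len(drafts))
```

In a real deployment the two callables would wrap models from different families, which is the point of the design: the reviewer does not share the executor's framing. The `max_rounds` cap is an assumption here, loosely standing in for the adjustable effort levels mentioned above.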

Why does it matter?

Many autonomous-research demos collapse on long tasks because the executor's framing leaks into its own self-review. Splitting executor and reviewer across model families replaces that self-review with a deliberate cross-check, and because skills are plain markdown, teams can fork or adapt the harness without committing to a framework.

Who is it for?

ML researchers and agent builders running long-horizon experiments.

Try it

git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep