An automated AI research-paper writer that implements Google's PaperOrchestra paper as a skill pack, with benchmark and autoraters, for any coding agent (Claude Code, Cursor, Antigravity, Cline, Aider). No API keys, no LLM SDKs.
# PaperOrchestra
A pluggable skill pack that lets **any coding agent** (Claude Code, Cursor,
Antigravity, Cline, Aider, OpenCode, etc.) run the
[**PaperOrchestra**](https://arxiv.org/pdf/2604.05018) multi-agent pipeline for
turning unstructured research materials into a submission-ready LaTeX paper.
> Song, Y., Song, Y., Pfister, T., Yoon, J.
> *PaperOrchestra: A Multi-Agent Framework for Automated AI Research Paper Writing.*
> arXiv:2604.05018, 2026. <https://arxiv.org/pdf/2604.05018>
<p align="center">
<a href="https://arxiv.org/pdf/2604.05018">
<img src="docs/assets/paper-preview.png" alt="PaperOrchestra paper — first page preview" width="420"/>
</a>
<br/>
<em>Click to read the paper on arXiv</em>
</p>
## Why this exists
The paper defines a five-agent pipeline:
- Outline
- Plotting
- Literature Review
- Section Writing
- Content Refinement

This pipeline substantially outperforms single-agent and tree-search baselines on the `PaperWritingBench` benchmark (50–68% absolute win margin on literature review quality; 14–38% on overall quality). The paper ships the exact prompts for every agent in Appendix F.
This repo turns those prompts, schemas, halt rules, and verification pipelines into a set of **host-agent-executable skills**. There are **no API keys**, no SDK dependencies, no embedded LLM calls. The skills are instruction documents plus deterministic helpers; your coding agent does all LLM reasoning and web search using its own tools.
<img width="640" height="413" alt="image" src="https://github.com/user-attachments/assets/073630c8-9790-4b38-b8c4-184cec6eee06" />
## How skills work here
Each skill is:
- `SKILL.md` — a dense instruction document the host agent reads and follows.
- `references/` — reference material: verbatim paper prompts (Appendix F), JSON
schemas, rubrics, halt rules, example outputs.
- `scripts/` — **purely deterministic** local helpers: JSON schema validation,
Levenshtein fuzzy matching, BibTeX formatting, dedup,
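
To illustrate what a deterministic helper of this kind might look like, here is a minimal sketch of Levenshtein-based fuzzy matching used to deduplicate reference titles. It is not taken from this repo's `scripts/` directory: the function names (`levenshtein`, `normalize`, `similarity`, `dedup_titles`) and the `0.9` threshold are assumptions chosen for illustration, and the real helpers may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming, keeping only two rows in memory."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalize(title: str) -> str:
    """Lowercase, replace non-alphanumerics with spaces, collapse whitespace."""
    cleaned = "".join(c if c.isalnum() else " " for c in title.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical after normalization."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def dedup_titles(titles, threshold=0.9):
    """Keep the first occurrence of each title; drop later near-duplicates.

    The threshold is a hypothetical default, not a value from the paper.
    """
    kept = []
    for title in titles:
        if all(similarity(title, seen) < threshold for seen in kept):
            kept.append(title)
    return kept


if __name__ == "__main__":
    refs = [
        "Attention Is All You Need",
        "Attention is all you need.",
        "Deep Residual Learning for Image Recognition",
    ]
    print(dedup_titles(refs))
```

Because the helper is pure Python with no network or model calls, the same input always yields the same output, which is the property the `scripts/` layer is described as providing.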