AgentGym — DeepSeek AI Agent

Name: AgentGym
Author: The-Swarm-Corporation

The-Swarm-Corporation January 29, 2025

A framework for converting any LLM into a reasoning agent in the style of OpenAI's o1 or DeepSeek's R1.

Agent Gym

Convert any model into an R1-like reasoning agent. Agent Gym builds on TRL, Hugging Face, and other libraries. This is a work in progress; the goal is to make it easy to train any model into a reasoning agent.

Installation

pip3 install -U agentgym

Usage

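from agentgym.r1_pipeline import R1Pipeline, SFTConfig

r1_pipeline = R1Pipeline(
    # Hugging Face Hub IDs for the SFT base model and its tokenizer
    sft_model="Qwen/Qwen2-0.5B-Instruct",
    tokenizer_name="Qwen/Qwen2-0.5B-Instruct",
    # Dataset used for the supervised fine-tuning stage
    sft_dataset="trl-lib/tldr",
    # SFT training arguments; outputs are written to output_dir
    sft_args=SFTConfig(output_dir="/tmp"),
    # Judging by the name, this runs only the GRPO stage
    only_grpo=True,
    model_name="Qwen/Qwen2-0.5B-Instruct"
)

# Start training
r1_pipeline.run()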

Architecture

The architecture is as follows:

SFT: Supervised Fine-Tuning
GRPO: Group Relative Policy Optimization

model -> sft -> grpo -> reasoning model

graph TD;
    A[model] --> B[sft]
    B --> C[grpo]
    C --> D[reasoning model]
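
To make the GRPO stage more concrete, here is a minimal sketch of the group-relative advantage that gives the method its name: several completions are sampled for one prompt, each is scored, and every reward is normalized against the mean and standard deviation of its own group. This is not AgentGym's code. The function name `group_relative_advantages`, the `eps` term, and the choice of population standard deviation are assumptions made for illustration, and the rewards are made-up values.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the group sampled for the same prompt.

    Hypothetical helper for illustration; not part of AgentGym or TRL.
    """
    mu = mean(rewards)
    # Population standard deviation is an assumption; implementations may differ.
    sigma = pstdev(rewards)
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four completions of one prompt (1.0 = correct answer).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

Completions that score above their group's average get a positive advantage and are reinforced; those below get a negative one. Because the baseline comes from the group itself, this step needs no separate value model.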

License

MIT
