AgentGym — DeepSeek Agents | Neura Market
    AgentGym

    The-Swarm-Corporation January 29, 2025

    A framework for converting any LLM into a reasoning agent in the style of o1 or DeepSeek's R1.

    Agent Definition
    # Agent Gym
    ![Agent Gym](images/steps.png)
    
    
    [![Join our Discord](https://img.shields.io/badge/Discord-Join%20our%20server-5865F2?style=for-the-badge&logo=discord&logoColor=white)](https://discord.gg/swarms) [![Subscribe on YouTube](https://img.shields.io/badge/YouTube-Subscribe-red?style=for-the-badge&logo=youtube&logoColor=white)](https://www.youtube.com/@kyegomez3242) [![Connect on LinkedIn](https://img.shields.io/badge/LinkedIn-Connect-blue?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/kye-g-38759a207/) [![Follow on X.com](https://img.shields.io/badge/X.com-Follow-1DA1F2?style=for-the-badge&logo=x&logoColor=white)](https://x.com/kyegomezb)
    
    Convert any model into an R1-like reasoning agent. AgentGym builds on TRL, Hugging Face Transformers, and other libraries. This is a work in progress; the goal is to make it easy to train any model into a reasoning agent.
    
    
    Sources:
    - [Open R1 Blog](https://huggingface.co/blog/open-r1)
    - [GRPO Trainer documentation (TRL)](https://huggingface.co/docs/trl/main/en/grpo_trainer)
    - [Hugging Face Transformers documentation](https://huggingface.co/docs/transformers/main/en/index)
    
    
    ## Installation
    
    ```bash
    pip3 install -U agentgym
    ```
    
    ## Usage
    
    ```python
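    from agentgym.r1_pipeline import R1Pipeline, SFTConfig
    
    # Model, tokenizer, and dataset are referenced by their Hugging Face Hub IDs.
    r1_pipeline = R1Pipeline(
        sft_model="Qwen/Qwen2-0.5B-Instruct",
        tokenizer_name="Qwen/Qwen2-0.5B-Instruct",
        sft_dataset="trl-lib/tldr",
        sft_args=SFTConfig(output_dir="/tmp"),  # SFT outputs are written to /tmp
        only_grpo=True,  # judging by the flag name, presumably skips SFT and runs only GRPO
        model_name="Qwen/Qwen2-0.5B-Instruct",
    )
    
    # Run the configured pipeline.
    r1_pipeline.run()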
    ```
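
The GRPO stage optimizes a model against one or more reward functions. In TRL, a custom reward function is a Python callable that receives a batch of completions (plus extra keyword arguments) and returns one float per completion. The sketch below shows what such a function might look like for plain-string completions. The name `format_reward` and the `<think>...</think>` tag convention are illustrative assumptions, and this README does not document how `R1Pipeline` accepts reward functions.

```python
import re

# Hypothetical format: reasoning inside <think>...</think>, followed by a non-empty answer.
THINK_PATTERN = re.compile(r"^<think>.+?</think>\s*\S.*$", re.DOTALL)


def format_reward(completions, **kwargs):
    """Return 1.0 for each completion that follows the assumed format, else 0.0."""
    return [1.0 if THINK_PATTERN.match(c.strip()) else 0.0 for c in completions]


sample = [
    "<think>2 + 2 is 4.</think> The answer is 4.",  # reasoning block, then answer
    "The answer is 4.",                             # no reasoning block
]
print(format_reward(sample))
```

A binary format reward like this is usually combined with a task-specific correctness reward, since format alone says nothing about whether the answer is right.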
    
    ## Architecture
    
    The architecture is as follows:
    
    - SFT: Supervised Fine-Tuning
    - GRPO: Group Relative Policy Optimization
    
    model -> sft -> grpo -> reasoning model
    
    ```mermaid
    graph TD;
        A[model] --> B[sft]
        B --> C[grpo]
        C --> D[reasoning model]
    ```
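
The GRPO stage scores several sampled completions for the same prompt and judges each one relative to the rest of its group, which removes the need for a separate value model. A minimal sketch of that group-relative normalization in plain Python follows. The function name `group_relative_advantages` and the `eps` constant are illustrative and not part of AgentGym or TRL, and the population standard deviation is used here as a simplifying assumption; real implementations may differ in detail.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Rewards for four completions sampled from the same prompt.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above the group mean get a positive advantage and are reinforced; those below get a negative one. When all rewards in a group are equal, every advantage is zero and the group contributes no learning signal.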
    
    ## License
    MIT
    

    Tags: agents, ai, alibaba, deepseek, llms, o1, qwen, r1, rl

