crucible

Name: crucible
Author: TheApexWu

TheApexWu February 21, 2026


1st Place Winner (General Judge) - Datadog Self-Improving Agents Hack. Two identical AI agents play Split or Steal. No pre-programmed betrayal. They discover deception on their own. Built with @evancorrea.

CRUCIBLE

1st Place, Datadog Self-Improving Agents Hackathon (Feb 2026, NYC)

Two AI agents play 100 rounds of Split or Steal. Through private reflection and experience, they discover deception, trust manipulation, and counter-deception. Nothing is prompted. Everything emerges.

What this is

An adversarial simulation engine for studying emergent deception in LLM agents. Both agents start with identical naive prompts and zero strategic priming. Deceptive behavior develops purely through experience and private reflection. CRUCIBLE measures how it happens, when it happens, and distills defensive skills from the patterns that emerge.
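As a rough illustration of the game the agents are playing, here is a minimal sketch of how a single round could be resolved. It assumes the classic Split or Steal rules (both split: share the pot; one steals: the stealer takes everything; both steal: nobody gets anything). The names `Move` and `resolve_round` and the pot size are hypothetical and are not taken from the CRUCIBLE codebase.

```python
from enum import Enum


class Move(Enum):
    SPLIT = "split"
    STEAL = "steal"


def resolve_round(move_a: Move, move_b: Move, pot: float = 100.0) -> tuple[float, float]:
    """Return (payoff_a, payoff_b) under the assumed classic rules."""
    if move_a is Move.SPLIT and move_b is Move.SPLIT:
        # Mutual cooperation: the pot is shared equally.
        return pot / 2, pot / 2
    if move_a is Move.STEAL and move_b is Move.STEAL:
        # Mutual destruction: both agents walk away with nothing.
        return 0.0, 0.0
    if move_a is Move.STEAL:
        # A betrays a cooperating B and takes the whole pot.
        return pot, 0.0
    # B betrays a cooperating A and takes the whole pot.
    return 0.0, pot
```

Under these assumed rules, stealing never pays less than splitting within a single round, so any sustained cooperation has to come from what the agents learn over repeated play.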

The security application: AI copilots are entering enterprise workflows. CRUCIBLE stress-tests how these agents behave under adversarial pressure and produces deployable countermeasures.

Key findings

Metric	Gemini 2.0 Flash	Gemini 2.5 Flash
Mutual destruction rate	86%	0%
Cooperation rate	6%	100%
Deception Index	22.9 / 100	0 / 100
First betrayal	Round 6	Never

Same prompts, same environment. Swapping the model changes the security posture entirely. Five runs on 2.5 Flash: zero betrayal across all of them.

Round 6 is the inflection point in the Gemini 2.0 Flash run. After five rounds of cooperation, Agent A identifies Agent B's trust pattern and exploits it. Agent B develops a theory of mind about the attacker within one round. The run ends with an 86% mutual destruction rate, and the trust never recovers.
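The headline numbers (mutual destruction rate, cooperation rate, first betrayal) can all be derived from a per-round log of moves. The sketch below shows one way to compute them. It assumes that "cooperation" means both agents split and "mutual destruction" means both steal; the `summarize` function and the log format are hypothetical, and the history is a synthetic 10-round toy, not data from an actual run.

```python
def summarize(history):
    """Summarize a list of (move_a, move_b) pairs, each 'split' or 'steal'."""
    n = len(history)
    if n == 0:
        raise ValueError("history must contain at least one round")

    both_split = sum(1 for a, b in history if a == "split" and b == "split")
    both_steal = sum(1 for a, b in history if a == "steal" and b == "steal")

    # First round (1-indexed) in which either agent steals, or None if never.
    first_betrayal = next(
        (i for i, (a, b) in enumerate(history, start=1) if "steal" in (a, b)),
        None,
    )

    return {
        "cooperation_rate": both_split / n,
        "mutual_destruction_rate": both_steal / n,
        "first_betrayal_round": first_betrayal,
    }


# Synthetic toy history: five cooperative rounds, one betrayal, then collapse.
toy_history = (
    [("split", "split")] * 5
    + [("steal", "split")]
    + [("steal", "steal")] * 4
)
print(summarize(toy_history))
```

A run with no stealing at all reports `first_betrayal_round` as `None`, which corresponds to the "Never" entry in the table.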

Stack

Game engine: Google Gemini (configurable model, default gemini-2.5-flash)
Metrics pipeline: Mutual information decay, strategy entropy, exploitation windows, language drift, composite Deception Index
Skill distillation: Converts emergent strategy patterns into deployable prompt modules for hardening customer-facing agents
Voice rendering: ElevenLabs TTS with emotion-mapped parameters (two distinct agent voices)
Observability: Datadog LLM Observability integration
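Two of the metrics named in the pipeline, strategy entropy and mutual information, have standard information-theoretic definitions. The sketch below shows those textbook forms applied to move sequences. How CRUCIBLE actually defines them, which variables it pairs, and how they feed the composite Deception Index are not stated here, so the pairing (both agents' moves in the same round), the fixed-size windows, and all function names are assumptions.

```python
import math
from collections import Counter


def strategy_entropy(moves):
    """Shannon entropy (bits) of one agent's move distribution."""
    if not moves:
        raise ValueError("moves must be non-empty")
    n = len(moves)
    return -sum((c / n) * math.log2(c / n) for c in Counter(moves).values())


def mutual_information(moves_a, moves_b):
    """Mutual information (bits) between two equal-length move sequences."""
    if not moves_a or len(moves_a) != len(moves_b):
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(moves_a)
    count_a = Counter(moves_a)
    count_b = Counter(moves_b)
    count_ab = Counter(zip(moves_a, moves_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) )
        mi += p_ab * math.log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi


def windowed_mutual_information(moves_a, moves_b, window):
    """Mutual information over consecutive non-overlapping windows.

    A falling series is one possible way to read "mutual information decay".
    """
    return [
        mutual_information(moves_a[i:i + window], moves_b[i:i + window])
        for i in range(0, len(moves_a) - window + 1, window)
    ]
```

An agent that always plays the same move has zero entropy, and two agents whose moves are statistically independent have zero mutual information, so both quantities track how much structure remains in the interaction.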
