Multi-agent strategic deception evaluation framework for LLMs using Secret Hitler as a testbed. Analyzes AI reasoning, trust dynamics, and deceptive behavior patterns.
# Secret Hitler LLM Evaluation Framework

[](https://www.python.org/downloads/) [](https://creativecommons.org/licenses/by-nc-sa/4.0/) [](https://github.com/stchakwdev/Secret_H_Evals)

Multi-agent strategic deception evaluation for large language models using Secret Hitler as a testbed. This framework enables researchers to study AI reasoning, trust dynamics, and deceptive behavior patterns in a controlled game environment.

**Author**: Samuel T. Chakwera ([stchakwdev](https://github.com/stchakwdev))

---

## Table of Contents

- [Why This Project?](#why-this-project)
- [Quick Start](#quick-start)
- [Batch Evaluation Monitor](#batch-evaluation-monitor)
- [Evaluation Results](#evaluation-results-300-games)
- [Visual Analytics](#visual-analytics)
- [Features](#features)
- [Architecture](#architecture)
- [Documentation](#documentation)
- [Citation](#citation)
- [Recent Updates](#recent-updates)
- [Acknowledgments](#acknowledgments)
- [License](#license)
- [Contact](#contact)

---

## Why This Project?

Understanding how AI systems engage in strategic deception is critical for AI safety research. Secret Hitler provides an ideal testbed because it:

- **Requires hidden information management** - Players must reason about unknown roles and hidden agendas
- **Involves coalition formation** - Trust and betrayal dynamics emerge naturally from gameplay
- **Tests deceptive reasoning** - Fascists must convincingly lie while Liberals must detect deception
- **Produces measurable outcomes** - Win rates, voting patterns, and policy outcomes provide quantifiable metrics

This framework enables researchers to:

1. **Evaluate deception capabilities** across different LLM architectures
2. **Study emergent social behaviors** in mult
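HAL layered hybrid-model workflow: a strong model (Claude) handles understanding, task decomposition, and acceptance review, while a low-cost model (DeepSeek) handles retrieval, extraction, and data cleaning. Hermes Agent skill.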
An LLM agent fine-tuned on DeepSeek for spaced repetition, dynamically integrating knowledge points based on the Ebbinghaus forgetting curve.
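The forgetting-curve scheduling mentioned above can be illustrated with a minimal sketch. It assumes the common exponential form of the Ebbinghaus curve, R = exp(-t / S), where S is a per-item memory stability in days. The function names, the 0.9 retention target, and the doubling rule for stability are hypothetical choices for illustration, not taken from the project.

```python
import math


def retention(elapsed_days: float, stability: float) -> float:
    """Estimated recall probability under the assumed curve R = exp(-t / S)."""
    return math.exp(-elapsed_days / stability)


def days_until_review(stability: float, target_retention: float = 0.9) -> float:
    """Days until estimated retention decays to target_retention.

    Solves exp(-t / S) = target for t, giving t = -S * ln(target).
    """
    return -stability * math.log(target_retention)


def update_stability(stability: float, recalled: bool, growth: float = 2.0) -> float:
    """Hypothetical update rule: grow stability after a successful recall,
    shrink it (never below 1 day) after a failed one."""
    if recalled:
        return stability * growth
    return max(1.0, stability / growth)


if __name__ == "__main__":
    s = 2.0  # assumed initial stability in days
    for outcome in (True, True, False):
        print(f"stability={s:.2f}d, next review in {days_until_review(s):.2f}d")
        s = update_stability(s, outcome)
```

Under this model, a knowledge point is rescheduled each time it is reviewed, so items that are recalled reliably drift toward longer intervals while forgotten items return sooner.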
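An end-to-end AI smartwatch ecosystem built on the STM32F103. It features a self-developed "zero-relocation" engine for dynamically loading native machine code and a page-stack UI framework, and it integrates a production-grade OTA rollback protection mechanism and a high-bandwidth (921600 baud) serial protocol stack. A Node.js relay provides DeepSeek AI semantic control and full-duplex ASRPRO voice interaction. It is an embedded full-stack project that combines distributed computing, modern storage management, and an AI Agent.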
A Meta-Agent-Driven Self-Evolving Multi-Agent System for UAV Detection and Tracking
One command to run Hermes AI Agent with a browser UI. Zero prerequisites. One command, and the AI is ready.
A web application agent with support for DeepSeek, Mimo, and other models.