Persona Agent Changes

Summary of modifications to the OpenAI CUA sample app to support persona-driven user testing.
muhammadsr
May 2, 2026
# Persona Agent Changes

Summary of modifications to the OpenAI CUA sample app to support persona-driven user testing.

## Modified Files

### 1. `agent/agent.py` (Minimal Changes)

**Added**:
- `instructions` parameter to `__init__()` - accepts system-level persona instructions
- `enable_reasoning` parameter to `__init__()` - enables CUA reasoning API
- Logic in `run_full_turn()` to pass both parameters to the Responses API

**Why**: Allows dynamic persona instructions and captures reasoning for each action.

**Backward Compatible**: Yes - `instructions` defaults to `None` and `enable_reasoning` defaults to `False`, so existing callers are unaffected.
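
A minimal sketch of how two opt-in parameters like these could be folded into a request without changing the default behavior. The helper name `build_request_kwargs` and the shape of the `reasoning` payload are assumptions for illustration, not the actual code in `agent/agent.py`.

```python
from typing import Optional


def build_request_kwargs(
    model: str,
    input_items: list,
    tools: list,
    instructions: Optional[str] = None,
    enable_reasoning: bool = False,
) -> dict:
    """Assemble keyword arguments for a Responses API call (hypothetical helper)."""
    kwargs: dict = {"model": model, "input": input_items, "tools": tools}

    # Add the new fields only when the caller opts in, so a caller that
    # passes neither parameter produces the same request as before.
    if instructions is not None:
        kwargs["instructions"] = instructions
    if enable_reasoning:
        # Assumed payload shape; verify against the current API reference.
        kwargs["reasoning"] = {"summary": "concise"}
    return kwargs
```

With both parameters left at their defaults, the returned dictionary contains only `model`, `input`, and `tools`, which is what makes the change backward compatible.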

### 2. `persona_agent.py` (New File)

**Core Features**:
- Accepts instructions and scenario as command-line arguments (not hardcoded)
- `PersonaTestingReport` class for comprehensive tracking:
  - Action timeline with timestamps
  - Reasoning trail from CUA API
  - Auto-detection of friction points (keyword-based)
  - Final evaluation capture
  - JSON export and human-readable output
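
The tracking features listed above could look roughly like the sketch below. Method names, the keyword list, and the internal layout are assumptions; only the class name and the output field names come from this document.

```python
import json
import time

# Hypothetical keyword list; the real detector's vocabulary is not stated here.
FRICTION_KEYWORDS = ("confusing", "unclear", "can't find", "cannot find", "frustrat")


class PersonaTestingReport:
    """Collects actions, reasoning, and keyword-flagged friction points."""

    def __init__(self, persona_name, scenario, start_url):
        self.persona_name = persona_name
        self.scenario = scenario
        self.start_url = start_url
        self._start = time.monotonic()
        self.actions = []
        self.friction_points = []
        self.final_evaluation = None

    def log_action(self, action_type, details, reasoning=""):
        # Timestamps are seconds elapsed since the report was created.
        elapsed = round(time.monotonic() - self._start, 1)
        self.actions.append({
            "timestamp": elapsed,
            "action_type": action_type,
            "details": details,
            "reasoning": reasoning,
        })
        # Keyword-based friction detection on the reasoning text.
        lowered = reasoning.lower()
        if any(keyword in lowered for keyword in FRICTION_KEYWORDS):
            self.friction_points.append({
                "timestamp": elapsed,
                "action": action_type,
                "issue": reasoning,
            })

    def to_json(self):
        return json.dumps({
            "persona": self.persona_name,
            "scenario": self.scenario,
            "actions_taken": self.actions,
            "reasoning_trail": [a["reasoning"] for a in self.actions if a["reasoning"]],
            "friction_points": self.friction_points,
            "final_evaluation": self.final_evaluation,
        }, indent=2)
```

Substring matching is cheap but will produce false positives (for example, reasoning that says a page is "not confusing"), so flagged items are best treated as candidates for review.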

**Command-line Arguments**:
- `--instructions`: Path to file or inline string (required)
- `--scenario`: Path to JSON file or inline JSON (required)
- `--url`: Website to evaluate (required)
- `--persona-name`: Name for report (optional, default "Persona")
- `--computer`: Browser environment (optional, default "local-playwright")
- `--debug`: Debug output (optional)
- `--show`: Show screenshots (optional)
- `--output`: Save report to JSON file (optional)
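
For reference, the arguments above map onto `argparse` roughly as follows. This is a sketch: treating `--debug` and `--show` as boolean switches is an assumption, and the help strings are paraphrased.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Run a persona-driven evaluation of a website."
    )
    parser.add_argument("--instructions", required=True,
                        help="Path to an instructions file, or an inline string")
    parser.add_argument("--scenario", required=True,
                        help="Path to a scenario JSON file, or inline JSON")
    parser.add_argument("--url", required=True, help="Website to evaluate")
    parser.add_argument("--persona-name", default="Persona",
                        help="Name used in the report")
    parser.add_argument("--computer", default="local-playwright",
                        help="Browser environment")
    # Assumed to be on/off switches rather than options taking a value.
    parser.add_argument("--debug", action="store_true", help="Debug output")
    parser.add_argument("--show", action="store_true", help="Show screenshots")
    parser.add_argument("--output", default=None,
                        help="Save the report to this JSON file")
    return parser
```

`argparse` exposes `--persona-name` as `args.persona_name`, and omitting any of the three required arguments exits with a usage error.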

**Scenario Format** (Exactly 5 fields required):
```json
{
  "scenario": "string",
  "entry_point": "string",
  "device": "mobile|desktop|tablet",
  "time_pressure": "low|medium|high",
  "emotional_state": "string"
}
```
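
A sketch of how the "exactly 5 fields" rule might be enforced. The function name `validate_scenario` and the error messages are hypothetical; the field names and allowed values are taken from the format above.

```python
REQUIRED_FIELDS = {"scenario", "entry_point", "device", "time_pressure", "emotional_state"}
ALLOWED_VALUES = {
    "device": {"mobile", "desktop", "tablet"},
    "time_pressure": {"low", "medium", "high"},
}


def validate_scenario(scenario: dict) -> dict:
    """Return the scenario unchanged if valid, otherwise raise ValueError."""
    if not isinstance(scenario, dict):
        raise ValueError("scenario must be a JSON object")

    # "Exactly 5 fields": reject both missing and unexpected keys.
    keys = set(scenario)
    if keys != REQUIRED_FIELDS:
        missing = sorted(REQUIRED_FIELDS - keys)
        extra = sorted(keys - REQUIRED_FIELDS)
        raise ValueError(f"invalid scenario fields: missing={missing}, unexpected={extra}")

    # Enumerated fields must use one of the listed values.
    for field, allowed in ALLOWED_VALUES.items():
        if scenario[field] not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}")
    return scenario
```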

## New Files Created

### Documentation
- `PERSONA_AGENT.md` - Full documentation
- `QUICK_START.md` - Quick reference guide
- `examples/PERSONA_EXAMPLES.md` - Example personas guide

### Example Personas

**Sarah Kim** (New Parent):
- `examples/sarah_kim_instructions.txt`
- `examples/sarah_kim_scenario.json`

**Alex Chen** (Tech Shopper):
- `examples/alex_chen_instructions.txt`
- `examples/alex_chen_scenario.json`

## Key Design Decisions

### ✅ What We Did

1. **No Hardcoding**: Instructions and scenarios are inputs, not constants
2. **Generic Reports**: Work for any persona/scenario (not tied to specific domains)
3. **CUA Reasoning API**: Uses built-in reasoning instead of text parsing
4. **Friction Detection**: Automatically flags issues based on reasoning keywords
5. **Minimal Changes**: Only extended Agent class, didn't modify core logic
6. **Flexible Input**: Support both file-based and inline inputs
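
One way to support both file-based and inline inputs (item 6) is to treat the argument as a path when it names an existing file and as literal content otherwise. The helper names below are hypothetical and this is a sketch, not the code in `persona_agent.py`.

```python
import json
from pathlib import Path


def load_text(value: str) -> str:
    """Return the file's contents if `value` is an existing file, else `value` itself."""
    try:
        path = Path(value)
        if path.is_file():
            return path.read_text(encoding="utf-8")
    except (OSError, ValueError):
        # Long inline strings or odd characters may not be valid paths at all.
        pass
    return value


def load_scenario(value: str) -> dict:
    """Parse a scenario from a JSON file path or an inline JSON string."""
    return json.loads(load_text(value))
```

The trade-off is that a mistyped file path is silently treated as inline text; for scenarios this surfaces quickly as a JSON parse error, but for instructions it may go unnoticed.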

### ❌ What We Avoided

1. **Domain-Specific Parsing**: No hardcoded extraction of "pricing" or "delivery terms"
2. **Hardcoded Personas**: Everything is parameterized
3. **Complex Text Analysis**: Let the model provide structured reasoning
4. **Breaking Changes**: All changes are backward compatible

## Usage Examples

### Basic
```bash
python persona_agent.py \
  --instructions examples/sarah_kim_instructions.txt \
  --scenario examples/sarah_kim_scenario.json \
  --url https://example.com \
  --persona-name "Sarah Kim"
```

### Inline
```bash
python persona_agent.py \
  --instructions "You are a shopper..." \
  --scenario '{"scenario":"...","entry_point":"...","device":"mobile","time_pressure":"high","emotional_state":"..."}' \
  --url https://example.com
```

### With Output
```bash
python persona_agent.py \
  --instructions examples/sarah_kim_instructions.txt \
  --scenario examples/sarah_kim_scenario.json \
  --url https://example.com \
  --output report.json
```

## Testing Report Structure

```json
{
  "persona": "Sarah Kim",
  "scenario": {...},
  "test_details": {
    "start_url": "...",
    "duration_seconds": 67.3,
    "total_actions": 5,
    "timestamp": "2025-10-10T..."
  },
  "actions_taken": [
    {
      "timestamp": 3.2,
      "action_type": "scroll",
      "details": {"direction": "down"},
      "reasoning": "Looking for pricing..."
    }
  ],
  "reasoning_trail": [...],
  "friction_points": [
    {
      "timestamp": 12.1,
      "action": "scroll",
      "issue": "Pricing not clearly visible"
    }
  ],
  "final_evaluation": "...",
  "summary": {
    "action_types": {"scroll": 2, "click": 3},
    "friction_count": 1,
    "completed": true
  }
}
```
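
The `summary` block can be derived entirely from the other fields. A small sketch, assuming a hypothetical helper named `build_summary`:

```python
from collections import Counter


def build_summary(actions_taken: list, friction_points: list, completed: bool) -> dict:
    """Derive the report's summary section from the recorded data."""
    action_types = Counter(action["action_type"] for action in actions_taken)
    return {
        "action_types": dict(action_types),
        "friction_count": len(friction_points),
        "completed": completed,
    }
```

For the example report above (two scrolls, three clicks, one friction point), this yields the same `summary` object shown in the JSON.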

## Integration with Original Code

The persona agent reuses all existing infrastructure:
- `utils.create_response()` - API calls
- `Agent.run_full_turn()` - CUA loop
- `computers` module - Browser environments
- Safety checks and callbacks

The only addition to the original code is a pair of optional parameters on `Agent`: `instructions` and `enable_reasoning`.