# Intent × Shape Classification: Diagnosis & Robustness Strategy
## The Problem
A user says: *"Write 10 blogs on the topic XYZ."*
**Expected:** 10 blog titles (content mode, checklist/calendar shape)
**Got:** A project plan to write 10 blogs (action mode, project shape)
The engine classified this as `intent=create, shape=project, content_mode=false` — interpreting "write 10 blogs" as "here's a plan to produce 10 blogs" (action steps like "Research blog #1 topic", "Draft outline", "Write first draft") instead of "give me the 10 blog titles" (content items like "Why TypeScript Beats JavaScript in 2026").
This is a **fundamental ambiguity** in natural language that your engine must resolve correctly.
---
## How the Engine Currently Decides
### The 6 Orthogonal Dimensions
Your engine has **6 independent classification dimensions**. Every prompt is projected onto all 6:
```
┌─────────────────────────────────────────────────────────┐
│                       USER PROMPT                       │
│              "Write 10 blogs on AI safety"              │
└────────────────────┬────────────────────────────────────┘
                     │
         ┌───────────┼───────────────┐
         ▼           ▼               ▼
    ┌─────────┐ ┌─────────┐     ┌──────────┐
    │ INTENT  │ │  SHAPE  │     │ CONTENT  │
    │  (12)   │ │  (11)   │     │   MODE   │
    │         │ │         │     │  (bool)  │
    └────┬────┘ └────┬────┘     └────┬─────┘
         │           │               │
    ┌────┴────┐ ┌────┴────┐     ┌────┴──────┐
    │COMPLEX- │ │TIME     │     │CUSTOM     │
    │ITY (4)  │ │HORIZON  │     │PROPERTIES │
    │         │ │ (days)  │     │ (dynamic) │
    └─────────┘ └─────────┘     └───────────┘
```
| Dimension | Values | What it controls |
|-----------|--------|------------------|
| **Intent** | 12: build, launch, campaign, pipeline, roadmap, ops, event, create, learn, research, travel, personal | Domain knowledge injection (expert tips, expected phases, terminology) |
| **Shape** | 11: project, schedule, itinerary, calendar, routine, budget, curriculum, checklist, comparison, tracker, document | Output structure (phase semantics, task detail focus, scheduler mode) |
| **Content Mode** | true/false | Whether tasks ARE items (nouns) vs. action steps (verbs) |
| **Complexity** | 4: simple, medium, complex, massive | Scale (task count, group count, time horizon, token budget) |
| **Time Horizon** | integer (days) | Scheduling, estimate calibration |
| **Custom Properties** | dynamic list | Additional task dimensions (budget, sprint, etc.) |
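One way to picture the six dimensions is as a single typed record. The sketch below is illustrative only: `intent`, `output_shape` and `content_mode` mirror names used elsewhere in this document, while `complexity`, `time_horizon_days` and `custom_properties` are assumed field names, and the 30-day horizon is a placeholder value.

```python
from dataclasses import dataclass, field
from typing import Literal, get_args

Intent = Literal["build", "launch", "campaign", "pipeline", "roadmap", "ops",
                 "event", "create", "learn", "research", "travel", "personal"]
Shape = Literal["project", "schedule", "itinerary", "calendar", "routine", "budget",
                "curriculum", "checklist", "comparison", "tracker", "document"]
Complexity = Literal["simple", "medium", "complex", "massive"]


@dataclass
class Classification:
    """Hypothetical stand-in for the engine's classification result."""
    intent: Intent
    output_shape: Shape
    content_mode: bool
    complexity: Complexity
    time_horizon_days: int                                       # assumed field name
    custom_properties: list[str] = field(default_factory=list)   # assumed field name


def dimension_sizes() -> dict[str, int]:
    """Count the allowed values of each enumerated dimension."""
    return {
        "intent": len(get_args(Intent)),
        "shape": len(get_args(Shape)),
        "complexity": len(get_args(Complexity)),
    }


# The misclassified result described in "The Problem" (horizon is a placeholder).
got = Classification("create", "project", False, "simple", 30)
# What the user expected for the same prompt.
expected = Classification("create", "checklist", True, "simple", 30)
```

Only two of the six fields differ between `got` and `expected`, which is why the rest of this document concentrates on shape and content mode.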
### The Decision Flow
```
         User Prompt
              │
   ┌──────────┴──────────┐
   │ Pre-selected?       │
   │ (complexity+intent  │
   │  provided by UI)    │
   └──────┬──────┬───────┘
          │      │
      YES │      │ NO
          ▼      ▼
┌────────────┐ ┌────────────────┐
│ FAST PATH  │ │ ROUTER LLM     │
│            │ │                │
│ • intent   │ │ Classifies ALL │
│   = given  │ │ 6 dimensions   │
│ • complex  │ │ in one call    │
│   = given  │ │                │
│ • shape    │ │                │
│   = regex  │ │                │
│ • content  │ │                │
│   = regex  │ │                │
└────────────┘ └────────────────┘
       │              │
       └──────┬───────┘
              ▼
    Organization Proposal
     (grouping options)
              │
              ▼
       Task Skeletons
      (titles + groups)
              │
              ▼
       Task Enrichment
       (full details)
```
### Where Misclassification Happens
**Path A (Fast Path: pre-selected complexity + intent):** Shape and content_mode are inferred via regex patterns in `shapes.py`. The `document` shape pattern (`r"\b(write\s+(a|an|my|the)|draft\s+(a|an|my|the)|...)`) matches "write" **only** when it is followed by a/an/my/the. "Write 10 blogs" doesn't match because "10" is not in that list, so inference falls through to `_INTENT_SHAPE_DEFAULTS`. There is no default shape for `create`, so the complexity fallback decides: `checklist` for a simple, short prompt, and **"project"** for everything else.
For content_mode: "Write 10 blogs" doesn't match any `_CONTENT_MODE_PATTERNS` because blogs aren't in the food/books/movies/exercises/places/gifts lists. So content_mode=**false**.
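The shape miss is easy to reproduce in isolation. The snippet below is a sketch that uses only the visible part of the `document` pattern (the source elides the rest with `...`); the case-insensitive flag and the helper name `matches_document_shape` are assumptions.

```python
import re

# Only the part of the pattern quoted above; the remainder is elided in the source.
# re.IGNORECASE is an assumption about how the engine compiles its patterns.
DOCUMENT_PATTERN_EXCERPT = re.compile(
    r"\b(write\s+(a|an|my|the)|draft\s+(a|an|my|the))",
    re.IGNORECASE,
)


def matches_document_shape(prompt: str) -> bool:
    """Return True when the excerpted document-shape pattern fires on the prompt."""
    return DOCUMENT_PATTERN_EXCERPT.search(prompt) is not None


print(matches_document_shape("Write a blog on AI safety"))    # article follows "write"
print(matches_document_shape("Write 10 blogs on AI safety"))  # number follows "write"
```

The first prompt matches and the second does not, so a prompt that differs only by a quantity takes a different path through shape inference.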
**Path B (Router LLM):** The LLM must simultaneously decide intent, shape, AND content_mode. But the prompt is genuinely ambiguous — the word "write" could mean:
- "I want to write these blogs" → action mode, project shape
- "Give me 10 blog titles to write about" → content mode, checklist/calendar shape
- "Write a series of blog posts" → document shape, each task is a section
Without knowing the user's mental model, the LLM makes a guess — and often defaults to the more "complete" interpretation (a plan to write blogs), which is not what the user wanted.
---
## The Root Cause: Ambiguity in `content_mode × shape`
The critical insight is that **content_mode and shape are coupled**. For many prompts, choosing one constrains what the other can sensibly be, so they are not truly orthogonal:
```
"Write 10 blogs on AI"

Interpretation 1: intent=create, shape=project, content_mode=false
  → Phases: Research, Outline, Writing, Editing, Publishing
  → Tasks: "Research AI safety landscape", "Draft blog #1 outline", ...

Interpretation 2: intent=create, shape=checklist, content_mode=true
  → Phase: "Blog Topics"
  → Tasks: "The Hidden Costs of AI Alignment", "Why AI Safety ≠ AI Ethics", ...

Interpretation 3: intent=create, shape=calendar, content_mode=true
  → Phases: Week 1, Week 2, Week 3, ...
  → Tasks: titles scheduled across weeks

Interpretation 4: intent=create, shape=document, content_mode=false
  → Phase: "Sections"
  → Tasks: each blog as a document section with full content
```
**All 4 interpretations are valid.** The engine currently has no mechanism to disambiguate.
---
## Proposed Solutions
### Solution 1: User-Facing Shape + Mode Selector (Quick Win)
**Problem:** Users can pre-select intent and complexity in the UI, but not shape or content_mode. Those two are inferred by regex or the LLM, and the inference can be wrong.
**Fix:** Expose shape and content_mode as optional UI controls during generation:
```
┌─────────────────────────────────────┐
│ What do you want?                   │
│ ┌─────────────────────────────────┐ │
│ │ Write 10 blogs on AI safety     │ │
│ └─────────────────────────────────┘ │
│                                     │
│ Intent:     [Create ▼]              │  ← existing
│ Complexity: [Simple ▼]              │  ← existing
│                                     │
│ Output as:  [Auto ▼]                │  ← NEW: shape selector
│   • Auto (let AI decide)            │
│   • List of items                   │  → checklist + content_mode=true
│   • Action plan                     │  → project + content_mode=false
│   • Content calendar                │  → calendar + content_mode=true
│   • Document                        │  → document
│   • ...                             │
│                                     │
│ [Generate →]                        │
└─────────────────────────────────────┘
```
**Why this works:** The user knows what they want. Let them tell you directly instead of making the AI guess.
**Implementation cost:** Low — just pass the shape/content_mode through the existing `start_generation()` parameters and skip shape inference when provided.
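The selector-to-classification mapping could be as small as a lookup table. This is a minimal sketch: the option keys, `ShapeChoice` and `resolve_output_as` are hypothetical names, and how the result is passed into `start_generation()` depends on that function's real signature, which is not shown in this document.

```python
from typing import NamedTuple, Optional


class ShapeChoice(NamedTuple):
    shape: str
    content_mode: Optional[bool]  # None means: leave content_mode to inference


# Keys are hypothetical identifiers for the "Output as" dropdown options.
OUTPUT_AS_OPTIONS: dict[str, Optional[ShapeChoice]] = {
    "auto": None,  # nothing pre-selected, run the usual inference
    "list_of_items": ShapeChoice("checklist", True),
    "action_plan": ShapeChoice("project", False),
    "content_calendar": ShapeChoice("calendar", True),
    "document": ShapeChoice("document", None),
}


def resolve_output_as(option: str) -> Optional[ShapeChoice]:
    """Translate a dropdown value into an explicit choice, or None for 'auto'."""
    if option not in OUTPUT_AS_OPTIONS:
        raise ValueError(f"unknown 'Output as' option: {option!r}")
    return OUTPUT_AS_OPTIONS[option]


choice = resolve_output_as("list_of_items")
print(choice)  # explicit shape + content_mode, so inference can be skipped
```

Returning `None` for "auto" keeps the existing inference path untouched when the user makes no selection.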
---
### Solution 2: Explicit Classification Confirmation (Medium Effort)
**Problem:** The organization proposal already pauses for user confirmation — but it only confirms the grouping structure, not the fundamental interpretation (shape + content_mode).
**Fix:** Add a **classification confirmation step** before the organization proposal. After the Router classifies, show the user what the engine understood:
```json
{
  "type": "classification_proposal",
  "data": {
    "title": "10 AI Safety Blog Posts",
    "intent": "create",
    "shape": "checklist",
    "content_mode": true,
    "shape_label": "List of items",
    "content_mode_label": "I'll give you the actual blog titles",
    "alternatives": [
      {
        "shape": "project",
        "content_mode": false,
        "label": "Action plan to write 10 blogs",
        "description": "Phases: Research → Outline → Write → Edit → Publish"
      },
      {
        "shape": "calendar",
        "content_mode": true,
        "label": "Blog titles on a publishing schedule",
        "description": "Titles organized by week/month"
      }
    ]
  }
}
```
The user clicks the interpretation they want → engine proceeds with that classification.
**Why this works:** Moves the ambiguity resolution to the user instead of making the AI guess. Takes ~2 seconds of user time. The organization proposal already has this pattern — extend it.
**Implementation:**
1. Add a `classification_proposal` SSE event type
2. Have the Router return 2-3 alternative interpretations (shape × content_mode combos)
3. Wait for user confirmation (reuse `_org_confirmed` pattern)
4. Proceed with confirmed classification
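The four steps above could be wired together with an `asyncio.Event`, roughly as follows. This is a sketch under assumptions: `ClassificationGate` is a hypothetical helper (the real `_org_confirmed` mechanism may work differently), and falling back to the Router's top pick on timeout is a design choice made here for illustration.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ClassificationGate:
    """Pauses generation until the user picks an interpretation (hypothetical helper)."""

    options: list[dict]  # options[0] is the Router's top pick
    _choice: int = field(default=0, init=False)
    _confirmed: asyncio.Event = field(default_factory=asyncio.Event, init=False)

    def proposal_event(self) -> dict:
        """Build the payload for the classification_proposal SSE event."""
        primary, *alternatives = self.options
        return {"type": "classification_proposal",
                "data": {**primary, "alternatives": alternatives}}

    def confirm(self, index: int) -> None:
        """Record the user's pick; called by whatever endpoint receives the click."""
        if not 0 <= index < len(self.options):
            raise IndexError(f"no interpretation at index {index}")
        self._choice = index
        self._confirmed.set()

    async def wait(self, timeout: float = 60.0) -> dict:
        """Return the confirmed option, or the top pick if nobody answers in time."""
        try:
            await asyncio.wait_for(self._confirmed.wait(), timeout)
        except asyncio.TimeoutError:
            pass  # assumption: an unanswered proposal falls back to options[0]
        return self.options[self._choice]


async def demo() -> dict:
    gate = ClassificationGate(options=[
        {"shape": "checklist", "content_mode": True},
        {"shape": "project", "content_mode": False},
    ])
    # Simulate the user clicking the second interpretation shortly afterwards.
    asyncio.get_running_loop().call_later(0.01, gate.confirm, 1)
    return await gate.wait(timeout=1.0)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The gate is created inside the coroutine so the event belongs to the running loop on older Python versions as well.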
---
### Solution 3: Improve the Router's Disambiguation (LLM-side fix)
**Problem:** The Router prompt doesn't adequately teach the LLM about the `content_mode × shape` interaction. It treats them as independent decisions.
**Fix:** Add explicit disambiguation rules to `ROUTER_SYSTEM`:
```python
ROUTER_SYSTEM += """
DISAMBIGUATION RULES (content_mode × shape):

When a prompt could mean EITHER "give me the items" OR "give me a plan to create the items",
use these heuristics:

1. QUANTITY SIGNAL: If the user specifies a number ("10 blogs", "5 recipes", "20 questions"),
   they almost always want that many ITEMS, not a plan to produce them.
   → content_mode=true, shape=checklist (or calendar if scheduling matters)

2. "FOR" SIGNAL: "blogs FOR my website" → action plan (content_mode=false)
                 "blogs ON topic X" → content list (content_mode=true)

3. VERB SIGNAL:
   - "Write X" alone → ambiguous (use quantity signal to break tie)
   - "Write AND publish X" → action plan (content_mode=false)
   - "Help me write X" → action plan (content_mode=false)
   - "Give me X" → content list (content_mode=true)
   - "List X" → content list (content_mode=true)

4. PROCESS WORDS: If the prompt mentions process steps (outline, draft, edit, review,
   publish, schedule, coordinate), lean toward content_mode=false.

5. DEFAULT: When still ambiguous after all signals, prefer content_mode=true with
   shape=checklist. Users more often want the output than a meta-plan about producing it.
   It's less annoying to get items when you wanted a plan than to get a plan when you wanted items.
"""
```
**Why this works:** The LLM is already doing this classification — it just needs better instructions for the ambiguous cases. The "quantity signal" rule alone would fix your "10 blogs" example.
**Implementation cost:** Extremely low — just edit `ROUTER_SYSTEM` in `prompts.py`.
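To build regression cases for the revised prompt, the rules can be mirrored in a small deterministic function that states the expected `content_mode` for a sample prompt. This is a rough sketch, not the Router's behavior: the precedence (action verbs and process words before list verbs and quantity) is an assumption, the "FOR" signal is left out, and all names are hypothetical.

```python
import re

ACTION_VERBS = re.compile(r"\b(help\s+me\s+write|write\s+and\s+publish)\b", re.IGNORECASE)
PROCESS_WORDS = re.compile(
    r"\b(outline|draft|edit|review|publish|schedule|coordinate)\b", re.IGNORECASE
)
LIST_VERBS = re.compile(r"^\s*(give\s+me|list)\b", re.IGNORECASE)
QUANTITY = re.compile(r"\b\d+\s+\w+")


def expected_content_mode(prompt: str) -> tuple[bool, str]:
    """Return (content_mode, rule that decided) for a sample prompt."""
    if ACTION_VERBS.search(prompt):
        return False, "verb signal"
    if PROCESS_WORDS.search(prompt):
        return False, "process words"
    if LIST_VERBS.search(prompt):
        return True, "verb signal"
    if QUANTITY.search(prompt):
        return True, "quantity signal"
    return True, "default"  # rule 5: prefer items when nothing else fires


for sample in ("Write 10 blogs on AI safety", "Help me write 10 blogs", "List 5 recipes"):
    print(sample, "->", expected_content_mode(sample))
```

Comparing the Router's output against these expectations on a fixed prompt set gives a cheap check that a prompt edit did not regress the ambiguous cases.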
---
### Solution 4: Two-Pass Inference with Confidence Gating (Robust)
**Problem:** The fast path (pre-selected complexity+intent) uses regex for shape/content_mode — and regex can't handle semantic ambiguity.
**Fix:** When the fast path's regex inference has low confidence (no pattern matched, fell through to defaults), escalate to a lightweight LLM call specifically for shape+content_mode:
```python
async def infer_shape_with_fallback(intent: str, prompt: str, complexity: str) -> tuple[str, bool]:
    """Infer shape + content_mode. Uses regex first, LLM fallback for ambiguous cases."""
    # Try regex (fast, deterministic)
    shape = _regex_infer_shape(intent, prompt, complexity)
    content_mode = _regex_infer_content_mode(intent, prompt)

    # "project" is the catch-all fallback, so treat it as low confidence.
    # Caveat: the simple + short-prompt fallback to "checklist" is not caught here.
    confidence = "high" if shape != "project" else "low"
    if confidence == "high":
        return shape, content_mode

    # Ambiguous: ask a fast LLM (fast_llm, SHAPE_DISAMBIGUATOR_SYSTEM and
    # ShapeDecision are assumed to be defined elsewhere)
    result = await fast_llm.generate_structured(
        system=SHAPE_DISAMBIGUATOR_SYSTEM,
        user=f"Intent: {intent}\nPrompt: {prompt}",
        output_model=ShapeDecision,  # {shape: str, content_mode: bool, reasoning: str}
        max_tokens=200,
        temperature=0.1,
    )
    return result.shape, result.content_mode
```
**Why this works:** Keeps the fast path fast for unambiguous cases (wedding → schedule, trip → itinerary), but catches ambiguous cases with a cheap LLM call (~200 tokens, ~0.3s with Haiku).
---
### Solution 5: Expand Content Mode Pattern Coverage
**Problem:** `_CONTENT_MODE_PATTERNS` in `shapes.py` doesn't cover blogs, articles, emails, social posts, or creative writing — which are very common "give me items" prompts.
**Fix:** Add missing patterns:
```python
_CONTENT_MODE_PATTERNS.extend([
    # Blog / article titles
    r"\b(\d+\s+(blog|article|post|essay|piece|column|editorial|op-ed)s?\b)",
    # Social media / content ideas ("stories" is spelled out: "story" + "s" would miss it)
    r"\b(\d+\s+((tweet|reel|caption|hook|headline|tagline|slogan)s?|story|stories)\b)",
    # Email subjects / templates
    r"\b(\d+\s+(email|subject\s+line|newsletter|drip)s?\b)",
    # Names / titles as creative output ("pitches" takes "es")
    r"\b(\d+\s+((title|name|topic|theme|prompt|idea|concept)s?|pitch(es)?)\b)",
    # Generic "N things" pattern; the middle word is optional so "10 tips" matches
    # as well as "10 quick tips"
    r"\b(\d+\s+(\w+\s+)?((idea|tip|trick|hack|way|reason|example|tactic|step|lesson|rule|principle|mistake|myth|fact|stat|quote)s?|strategy|strategies)\b)",
    # "list of N" or "top N"
    r"\b(list\s+of\s+\d+|top\s+\d+|best\s+\d+|\d+\s+best)\b",
])
```
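**Why this works:** The "N blogs" pattern is a strong signal for content_mode=true: someone who says "10 blogs" usually wants titles, not a production plan. Process-oriented prompts such as "help me write and publish 10 blogs" still match these patterns, so this fix works best alongside the Router rules in Solution 3.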
**Implementation cost:** Trivial — just add regex patterns to `shapes.py`.
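Before shipping new patterns it helps to run them against a handful of prompts. The harness below is a sketch using a subset of the patterns, with the middle word of the "N things" pattern made optional; the case-insensitive flag, the sample prompts and the name `is_content_mode` are assumptions rather than the engine's actual code.

```python
import re

# Subset of the proposed patterns, compiled case-insensitively (an assumption).
CANDIDATE_PATTERNS = [
    r"\b\d+\s+(blog|article|post|essay)s?\b",
    r"\b\d+\s+(?:\w+\s+)?(idea|tip|trick|hack|way|reason)s?\b",
    r"\b(list\s+of\s+\d+|top\s+\d+|best\s+\d+|\d+\s+best)\b",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in CANDIDATE_PATTERNS]


def is_content_mode(prompt: str) -> bool:
    """True when any candidate pattern fires on the prompt."""
    return any(p.search(prompt) for p in COMPILED)


# Prompt -> what the patterns return (not necessarily what the user wants).
SAMPLES = {
    "Write 10 blogs on AI safety": True,
    "Give me 5 tips for remote work": True,
    "Top 10 productivity apps": True,
    "Plan my product launch": False,
    # Known over-trigger: a process-oriented prompt still matches "10 blogs".
    "Help me write and publish 10 blogs": True,
}

for prompt, expected in SAMPLES.items():
    status = "ok" if is_content_mode(prompt) == expected else "MISMATCH"
    print(f"{status:8} {prompt}")
```

The last sample shows the limit of the regex approach: quantity alone cannot tell a list request from a production plan, so the patterns need a second signal for process-heavy prompts.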
---
### Solution 6: Shape × Intent Compatibility Matrix (Guardrail)
**Problem:** Some shape × intent × content_mode combinations don't make sense, but the engine doesn't enforce constraints. For example, `intent=create + shape=project + content_mode=true` is contradictory.
**Fix:** Add a compatibility matrix that corrects invalid combinations:
```python
# shapes.py
_SHAPE_CONTENT_MODE_OVERRIDES: dict[str, bool | None] = {
    # These shapes ALWAYS imply a specific content_mode
    "document": False,    # Documents are always action/structure mode
    "comparison": False,  # Comparisons are analysis, not items
    "tracker": False,     # Trackers track items through stages
    # These shapes ALWAYS imply content_mode=true
    # (none currently; could be added later)
    # These shapes respect the inferred content_mode
    "project": None,      # Could go either way
    "checklist": None,    # Flat list of actions OR items
    "calendar": None,     # Content calendar OR schedule of actions
    "routine": None,      # Specific exercises OR "do X" actions
    "itinerary": None,    # Specific places OR travel logistics
    "schedule": None,
    "budget": None,
    "curriculum": None,
}

_VALID_COMBINATIONS: dict[tuple[str, str], list[str]] = {
    # (intent, content_mode) → allowed shapes, preferred shape first.
    # Shapes forced to content_mode=False above are left out of the "true" rows:
    # the override runs before this lookup, so they could never be reached there.
    ("create", "true"): ["checklist", "calendar"],
    ("create", "false"): ["project", "calendar", "document"],
    ("learn", "true"): ["checklist", "curriculum"],
    ("learn", "false"): ["curriculum", "project"],
    ("research", "true"): ["checklist"],
    ("research", "false"): ["comparison", "project"],
    # ... etc.
}


def validate_classification(cls: Classification) -> Classification:
    """Fix contradictory shape × intent × content_mode combos."""
    # Step 1: shapes with a fixed content_mode override whatever was inferred.
    override = _SHAPE_CONTENT_MODE_OVERRIDES.get(cls.output_shape)
    if override is not None:
        cls.content_mode = override

    # Step 2: if the shape is not allowed for (intent, content_mode), swap it.
    key = (cls.intent, str(cls.content_mode).lower())
    allowed = _VALID_COMBINATIONS.get(key)
    if allowed and cls.output_shape not in allowed:
        cls.output_shape = allowed[0]  # Use first (preferred) option
    return cls
```
**Why this works:** Even if the LLM or regex makes a mistake, this guardrail catches contradictory combinations and corrects them. Defense in depth.
---
## Recommended Implementation Order
| Priority | Solution | Effort | Impact | Description |
|----------|----------|--------|--------|-------------|
| **1** | #5: Expand content_mode patterns | 30 min | High | Catches "N blogs/articles/emails" immediately |
| **2** | #3: Improve Router disambiguation | 1 hour | High | Better LLM classification for ambiguous prompts |
| **3** | #6: Compatibility matrix | 2 hours | Medium | Guardrail against contradictory classifications |
| **4** | #1: UI shape selector | 4 hours | High | User explicitly chooses, bypasses all ambiguity |
| **5** | #4: Two-pass inference | 3 hours | Medium | LLM fallback for fast-path ambiguity |
| **6** | #2: Classification confirmation | 6 hours | High | Full user-in-the-loop disambiguation |
**Start with #5 + #3** — they're cheap and fix the immediate problem. Then add #6 as a safety net. Finally, #1 or #2 for full user control.
---
## Appendix: Current Parameter Reference
### All 12 Intents
| Intent | Domain | Default Shape | Content Mode Likely? |
|--------|--------|---------------|---------------------|
| build | Software/products | project | No |
| launch | Go-to-market | project | No |
| campaign | Marketing/outreach | calendar | Sometimes (content ideas) |
| pipeline | Stages/workflow | project | No |
| roadmap | Long-term direction | project | No |
| ops | Processes/compliance | routine | No |
| event | Coordinated moment | schedule | No |
| create | Body of work | project | **Often yes** (titles, topics) |
| learn | Knowledge/skills | curriculum | Sometimes (reading lists) |
| research | Evaluate/compare | comparison | Sometimes (tool lists) |
| travel | Trips/journeys | itinerary | Yes (places, activities) |
| personal | Life goals | routine | Sometimes (habits) |
### All 11 Shapes
| Shape | Phase Semantics | Scheduler | Content Mode Compatible? |
|-------|----------------|-----------|------------------------|
| project | Work stages | DAG | Both |
| schedule | Time periods | SLOT | Both |
| itinerary | Journey days | SLOT | Yes (places) |
| calendar | Publishing buckets | PASS | Yes (content items) |
| routine | Cycle segments | SLOT | Yes (exercises) |
| budget | Spending categories | PASS | No (always action) |
| curriculum | Learning stages | DAG | Both |
| checklist | Single flat group | PASS | Both |
| comparison | Options evaluated | PASS | No (always analysis) |
| tracker | Status stages | PASS | No (always tracking) |
| document | Document sections | PASS | No (always structure) |
### Content Mode Detection
Currently detected by regex patterns in `shapes.py:_CONTENT_MODE_PATTERNS`. **Missing coverage:**
| Category | Currently Covered | Missing |
|----------|------------------|---------|
| Meals/food | Yes | - |
| Books/movies/shows | Yes | - |
| Gifts/shopping | Yes | - |
| Travel activities | Yes | - |
| Exercises/workouts | Yes | - |
| Interview questions | Yes | - |
| Names/ideas | Yes | - |
| Tools/apps | Yes | - |
| Team activities | Yes | - |
| **Blog/article titles** | **No** | `\d+ blogs/articles/posts` |
| **Email subjects** | **No** | `\d+ emails/subject lines` |
| **Social media content** | **No** | `\d+ tweets/reels/captions` |
| **Creative titles** | **No** | `\d+ titles/topics/themes` |
| **Generic "N things"** | **No** | `\d+ tips/tricks/hacks/ways` |
| **"Top N" / "Best N"** | **No** | `top \d+, best \d+, list of \d+` |
### How Shape is Inferred (Fast Path)
```
1. Regex patterns (_PROMPT_SHAPE_PATTERNS)      ← first match wins
   └─ 12 patterns: checklist, document, comparison, tracker, calendar,
      budget, routine, schedule, itinerary, curriculum

2. Intent defaults (_INTENT_SHAPE_DEFAULTS)     ← if no regex matched
   └─ learn→curriculum, event→schedule, ops→routine,
      campaign→calendar, research→comparison, travel→itinerary

3. Complexity fallback                          ← if no intent default
   └─ simple + short prompt → checklist
   └─ else → project (THE CATCH-ALL)
```
**The "project" fallback is the source of most misclassifications.** When neither a regex nor an intent default matches, every prompt that is not both simple and short becomes a project plan, even when the user wanted a simple list of items.
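The three tiers can be sketched as one function that also reports which tier produced the answer, which gives downstream code a way to treat fallback results as low confidence. The regex list here is a two-entry illustration rather than the real `_PROMPT_SHAPE_PATTERNS`, the 12-word threshold is taken from the trace in the next appendix, and `ShapeGuess` and `infer_shape` are hypothetical names.

```python
import re
from typing import NamedTuple


class ShapeGuess(NamedTuple):
    shape: str
    source: str  # "regex", "intent_default" or "complexity_fallback"


# Illustrative stand-ins only; the engine's real pattern list is not shown here.
PROMPT_SHAPE_PATTERNS = [
    ("comparison", re.compile(r"\b(compare|versus|vs)\b", re.IGNORECASE)),
    ("itinerary", re.compile(r"\b(itinerary|trip\s+to)\b", re.IGNORECASE)),
]
INTENT_SHAPE_DEFAULTS = {
    "learn": "curriculum", "event": "schedule", "ops": "routine",
    "campaign": "calendar", "research": "comparison", "travel": "itinerary",
}
SHORT_PROMPT_WORDS = 12  # threshold as it appears in the "10 blogs" trace


def infer_shape(intent: str, prompt: str, complexity: str) -> ShapeGuess:
    """Walk the three tiers in order and record which one decided."""
    for shape, pattern in PROMPT_SHAPE_PATTERNS:  # tier 1: first regex match wins
        if pattern.search(prompt):
            return ShapeGuess(shape, "regex")
    if intent in INTENT_SHAPE_DEFAULTS:           # tier 2: intent default
        return ShapeGuess(INTENT_SHAPE_DEFAULTS[intent], "intent_default")
    if complexity == "simple" and len(prompt.split()) < SHORT_PROMPT_WORDS:
        return ShapeGuess("checklist", "complexity_fallback")
    return ShapeGuess("project", "complexity_fallback")  # the catch-all


print(infer_shape("create", "Write 10 blogs on AI safety", "simple"))
print(infer_shape("create", "Write 10 blogs on AI safety", "medium"))
```

Checking `source == "complexity_fallback"` flags both the `checklist` and the `project` fallbacks, whereas a test on the shape value alone would miss the former.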
---
## Appendix: The "10 Blogs" Example Traced Through the Engine
### Current Behavior
```
Input: "Write 10 blogs on AI safety"

Fast Path (complexity=simple, intent=create):
  1. Regex shape check:
     - "write\s+(a|an|my|the)" → doesn't match ("write 10" not "write a")
     - No other pattern matches
  2. Intent default for "create": NOT IN _INTENT_SHAPE_DEFAULTS → no match
  3. Fallback: complexity=simple, words=6 (< 12) → "checklist" ← Actually correct!

  But content_mode check:
  1. _CONTENT_MODE_PATTERNS: no pattern matches "blogs"
  2. Result: content_mode = false ← WRONG

  Final: shape=checklist, content_mode=false
  → Flat list of ACTION steps like "Research AI safety topics", "Draft blog #1 outline"
  → User wanted: "The Hidden Costs of AI Alignment", "Why AI Safety ≠ AI Ethics", ...

Router LLM Path (no pre-selection):
  1. LLM sees "Write 10 blogs on AI safety"
  2. LLM interprets "write" as an action verb → intent=create, content_mode=false
  3. LLM picks shape=project (most "complete" interpretation)
  4. Final: intent=create, shape=project, content_mode=false
  → Phases: Research, Outline, Writing, Editing, Publishing
  → Tasks: "Research AI safety landscape", "Create editorial calendar", ...
  → User wanted: just the 10 titles!
```
### After Fixes (#5 + #3)
```
Input: "Write 10 blogs on AI safety"

With expanded content_mode patterns (#5):
  Pattern: r"\b(\d+\s+(blog|article|post)s?\b)" → MATCHES "10 blogs"
  → content_mode = true ← CORRECT

With improved Router disambiguation (#3):
  Quantity Signal: "10 blogs" → user wants 10 items, not a plan
  → content_mode = true, shape = checklist ← CORRECT

Final: shape=checklist, content_mode=true
  → Phase: "Blog Topics"
  → Tasks: "The Hidden Costs of AI Alignment", "Why AI Safety ≠ AI Ethics", ...
  → Matches user expectation!
```