Reddit Virality Grading Rubric

This document defines the scoring criteria for evaluating Reddit post/rumour virality potential. Each attribute is scored from 0.0 (no presence) to 1.0 (very strong). These scores are used by the LLM to grade injected rumours before simulation.

Overview

The virality score is computed using a weighted combination of 6 active attributes (out of 7 scored):

ViralScore = 0.225 × EmotionalContent
           + 0.22  × Curiosity
           + 0.205 × NarrativeUrgency
           + 0.20  × EngagementPrompting
           + 0.10  × Complexity  (inverted-U: optimal ~0.5)
           + 0.05  × Authenticity

Note: community_fit is scored and stored in rubric data, but intentionally excluded from the aggregate virality score. It is used separately as a hard visibility constraint by the platform layer.

Maximum possible score: 1.0
Target for viral content: 0.65+
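
The maximum of 1.0 follows from the weights summing to 1.0, provided every weighted term (including the transformed complexity term) lies in [0, 1]. A quick check, with the weights copied from the formula above; the variable names are illustrative and not taken from the codebase:

```python
import math

# Weights copied from the ViralScore formula; community_fit is deliberately absent.
WEIGHTS = {
    "emotional_content": 0.225,
    "curiosity": 0.22,
    "narrative_urgency": 0.205,
    "engagement_prompting": 0.20,
    "complexity": 0.10,
    "authenticity": 0.05,
}

total_weight = sum(WEIGHTS.values())

# If every attribute term is capped at 1.0, the aggregate is capped at total_weight.
assert math.isclose(total_weight, 1.0)
```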

Core Narrative & Linguistic Attributes

1. Curiosity/Hook (Weight: 0.22)

Description: How strongly the title/content invokes unanswered questions or narrative tension.

Score	Criteria	Examples
0.0-0.2	No hook, purely informational, boring headline	"Company releases quarterly report"
0.3-0.4	Mild interest, standard news format	"New study finds link between X and Y"
0.5-0.6	Moderate curiosity, some tension or question	"Scientists discover unexpected result in X"
0.7-0.8	Strong hook, creates information gap	"You won't believe what happened when..."
0.9-1.0	Irresistible curiosity, must-click tension	"I found something in my attic that changes everything"

Key Indicators:

Open-ended questions ("What would you do if...?")
Incomplete information that demands resolution
Surprising claims or contradictions
Mystery or suspense elements
Superlatives ("largest", "first ever", "never before")

2. Narrative Urgency (Weight: 0.205)

Description: Does the post feel like something that needs to be read/discussed NOW?

Score	Criteria	Examples
0.0-0.2	Timeless content, no urgency	"History of hydrogen fuel cells"
0.3-0.4	General interest, could be read anytime	"Review of new technology"
0.5-0.6	Timely but not urgent	"This week's industry news"
0.7-0.8	Breaking news, immediate relevance	"Just announced: Major policy change"
0.9-1.0	Crisis/urgent, requires immediate action/discussion	"BREAKING: Critical safety issue discovered"

Key Indicators:

Time-sensitive language ("just", "breaking", "now", "urgent", "today")
Conflict or controversy requiring immediate resolution
Deadlines or time-limited opportunities
Unfolding situations ("developing", "update")
Policy changes or announcements with immediate effect

3. Emotional Content (Weight: 0.225)

Description: Sentiment intensity conveyed (positive, negative, or tense).

Score	Criteria	Examples
0.0-0.2	Neutral, factual, no emotional appeal	"Technical specifications released"
0.3-0.4	Slightly emotional, mild sentiment	"Good news for the industry"
0.5-0.6	Moderate emotion, clear sentiment	"Exciting breakthrough announced"
0.7-0.8	Strong emotion, evokes feelings	"Devastating blow to hopes for..."
0.9-1.0	Highly charged, anger/joy/outrage	"Outrageous decision destroys..."

Key Indicators:

Emotional adjectives (amazing, terrible, shocking, heartbreaking)
Exclamation marks and emphatic language
Personal stakes or human interest angles
Moral outrage or celebration
Words like "finally", "unbelievable", "devastating"

4. Authenticity/First-Person (Weight: 0.05)

Description: Presence of personal recounting or authentic voice.

Score	Criteria	Examples
0.0-0.2	Corporate/robotic, third-person only	"The company announced today..."
0.3-0.4	Professional but somewhat personal	"Our team discovered..."
0.5-0.6	Mixed personal and factual	"I read about this and think..."
0.7-0.8	Strong personal voice, experience-based	"I've been in this field for 10 years and..."
0.9-1.0	Deeply personal, vulnerable, authentic	"I need to share my experience because..."

Key Indicators:

First-person pronouns ("I", "my", "we")
Personal anecdotes or experiences
Admission of uncertainty or learning
Conversational tone
Sharing of personal journey or discovery

5. Complexity (Weight: 0.10)

Description: Structural richness of the title/text. Note: the optimal range is 0.3-0.6; the aggregate formula applies an inverted-U that peaks around 0.5.

Score	Criteria	Examples
0.0-0.2	Too simple, no depth	"Thing is good"
0.3-0.4	OPTIMAL: Clear and accessible	"New product launches today"
0.5-0.6	OPTIMAL: Clear but nuanced	"New study challenges assumptions about X"
0.7-0.8	Getting complex, multiple clauses	Multi-part title with conditions
0.9-1.0	Too complex, hard to parse	Wall of text, jargon-heavy

Scoring Note: For viral potential, scores of 0.3-0.6 are BEST. Content that is very simple (boring) or very complex (inaccessible) reduces virality.
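
This rubric does not spell out the exact shape of the inverted-U. One possible sketch, assuming a parabola that peaks at 0.5 and falls to zero at both extremes, is shown here. The function name, the `peak` and `half_width` parameters, and the parabolic shape are all assumptions for illustration, not the project's implementation:

```python
def inverted_u(complexity: float, peak: float = 0.5, half_width: float = 0.5) -> float:
    """Map a raw complexity score to [0, 1], highest at `peak`.

    Assumed shape: a downward parabola clipped at zero. The real transform
    in the scorer may differ.
    """
    distance = (complexity - peak) / half_width
    return max(0.0, 1.0 - distance * distance)


# Raw scores inside the optimal band keep most of their value;
# very simple or very complex content is penalised.
mid_band = inverted_u(0.4)
too_simple = inverted_u(0.1)
```

Under this assumed shape, the transformed value (not the raw complexity score) is what would be multiplied by the 0.10 weight.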

Engagement & Social Interaction Attributes

6. Engagement Prompting (Weight: 0.20)

Description: How well the post invites comments, discussion, or interaction.

Score	Criteria	Examples
0.0-0.2	No invitation to engage, closed statement	"This happened. The end."
0.3-0.4	Implicit discussion potential	"Interesting development in X"
0.5-0.6	Some engagement hooks	"What do you think about...?"
0.7-0.8	Strong call for opinions/debate	"Which side are you on? I think X but..."
0.9-1.0	Perfect engagement design	Open dilemma, poll, AMA, "prove me wrong"

Key Indicators:

Direct questions to the audience
Controversial or debatable claims
Requests for advice or opinions
"Change my view" or "prove me wrong" framing
Polls, choices, or "would you rather" scenarios
Incomplete information inviting speculation

7. Community Fit (Visibility Constraint — not in aggregate score)

Description: Relevance to subreddit interests, tone, and norms.

Score	Criteria	Examples
0.0-0.2	Completely off-topic, wrong subreddit	Posting memes in a serious discussion sub
0.3-0.4	Tangentially related	Adjacent topic, might get removed
0.5-0.6	On-topic but generic	Standard topic for the sub
0.7-0.8	Well-matched, uses community language	Uses subreddit-specific terms/culture
0.9-1.0	Perfect fit, hits community sweet spot	Addresses core community interest/pain point
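
Because community fit acts as a visibility constraint rather than a weighted term, it can be thought of as a gate applied before the viral score matters. The sketch below is hypothetical: the function name and the 0.3 cut-off (the lower edge of the "Tangentially related" band) are assumptions, and the platform layer may apply a different rule.

```python
def passes_visibility_gate(community_fit: float, min_fit: float = 0.3) -> bool:
    """Return True if the post may surface in the subreddit feed.

    `min_fit` is an assumed threshold; it is not specified in this rubric.
    """
    return community_fit >= min_fit


# A well-matched post passes the gate; an off-topic one does not,
# regardless of how high its viral score is.
well_matched = passes_visibility_gate(0.85)
off_topic = passes_visibility_gate(0.10)
```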

Subreddit-Specific Guidance

r/HydrogenSocieties

Core interests: Green hydrogen technology, fuel cells, renewable energy, industry news, policy
High-fit keywords: hydrogen, fuel cell, electrolyzer, green hydrogen, renewable, infrastructure, energy transition, storage, production, efficiency
Community tone: Technical but accessible, optimistic about hydrogen future, data-driven
Hot topics: Cost reduction, infrastructure buildout, government incentives, major projects

r/SGExams

Core interests: Singapore education system, exam stress, study tips, school experiences
High-fit keywords: O-levels, A-levels, JC, poly, ITE, MOE, PSLE, results, stress, study, grades, tuition, university
Community tone: Supportive, relatable, Singlish acceptable, peer-to-peer advice
Hot topics: Exam stress, grade anxiety, school comparisons, study strategies, university applications

r/conspiracytheories

Core interests: Alternative narratives, questioning official stories, cover-ups, mysteries
High-fit keywords: hidden truth, cover-up, evidence, they don't want you to know, question everything, mainstream media, government, elite
Community tone: Skeptical, investigative, presenting "evidence", connecting dots
Hot topics: Government secrets, corporate cover-ups, unexplained events, alternative history

LLM Grading Instructions

When grading a post/rumour for simulation, follow these steps:

Step 1: Read the Content

Read the full title and body text
Identify the target subreddit context
Note any visual content indicators

Step 2: Score Each Attribute

For each of the 7 attributes, assign a score from 0.0 to 1.0 using the rubrics above.

Step 3: Calculate Viral Score

ViralScore = 0.225 × EmotionalContent
           + 0.22  × Curiosity
           + 0.205 × NarrativeUrgency
           + 0.20  × EngagementPrompting
           + 0.10  × Complexity  (inverted-U)
           + 0.05  × Authenticity

community_fit is scored but excluded from aggregate — used as a visibility constraint.

Step 4: Output JSON Format

{
  "curiosity": 0.75,
  "narrative_urgency": 0.60,
  "community_fit": 0.85,
  "engagement_prompting": 0.80,
  "emotional_content": 0.55,
  "authenticity": 0.70,
  "complexity": 0.45,
  "viral_score": 0.67,
  "justification": {
    "curiosity": "Strong hook with unanswered question about major breakthrough",
    "narrative_urgency": "Moderate urgency due to recent announcement but not crisis",
    "community_fit": "Strong fit for subreddit's core interest in green technology",
    "engagement_prompting": "Asks for community opinions and debate",
    "emotional_content": "Moderately positive sentiment about progress",
    "authenticity": "Written in first person with personal analysis",
    "complexity": "Clear and accessible, good balance"
  }
}
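
As a worked check of the example above: five attributes enter the formula directly, and the complexity term can add at most 0.10. Assuming the transformed complexity value lies in [0, 1], the reported viral_score should fall between the two bounds computed below. Variable names are illustrative.

```python
import math

# Scores copied from the example JSON (complexity and community_fit omitted here).
example = {
    "curiosity": 0.75,
    "narrative_urgency": 0.60,
    "engagement_prompting": 0.80,
    "emotional_content": 0.55,
    "authenticity": 0.70,
    "viral_score": 0.67,
}

# Weighted sum of the five attributes that enter the formula without a transform.
linear_part = (
    0.225 * example["emotional_content"]
    + 0.22 * example["curiosity"]
    + 0.205 * example["narrative_urgency"]
    + 0.20 * example["engagement_prompting"]
    + 0.05 * example["authenticity"]
)

# The complexity term contributes between 0.0 and 0.10, whatever the transform's shape.
lower, upper = linear_part, linear_part + 0.10
assert lower <= example["viral_score"] <= upper
```

The linear part comes to about 0.607, so any viral_score between roughly 0.607 and 0.707 is consistent with these attribute scores.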

Interpreting the Viral Score

Score Range	Interpretation	Expected Simulation Outcome
0.00-0.25	Low viral potential	Minimal engagement, limited spread
0.26-0.40	Below average	Some engagement, stays within niche
0.41-0.55	Moderate	Standard engagement for community
0.56-0.70	Above average	Good engagement, potential for spread
0.71-0.85	High potential	Likely to achieve significant engagement
0.86-1.00	Exceptional	Strong viral candidate, wide spread expected
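
The table above can be applied as a simple lookup. This is a minimal sketch: the function name is hypothetical, and it assumes that a score falling between two listed ranges (for example 0.255) belongs to the higher band.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first five bands; anything above 0.85 is "Exceptional".
_UPPER_BOUNDS = [0.25, 0.40, 0.55, 0.70, 0.85]
_LABELS = [
    "Low viral potential",
    "Below average",
    "Moderate",
    "Above average",
    "High potential",
    "Exceptional",
]


def interpret(viral_score: float) -> str:
    """Return the interpretation label for a viral score in [0, 1]."""
    if not 0.0 <= viral_score <= 1.0:
        raise ValueError(f"viral_score out of range: {viral_score}")
    # bisect_left keeps a score equal to an upper bound inside that band.
    return _LABELS[bisect_left(_UPPER_BOUNDS, viral_score)]


label = interpret(0.67)
```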

Implementation Details

Centralised Weights

All weights are defined in IntrinsicViralityScorer._WEIGHTS in src/rumour/intrinsic_virality.py:

_WEIGHTS = {
    "curiosity": 0.22,
    "narrative_urgency": 0.205,
    "engagement_prompting": 0.20,
    "emotional_content": 0.225,
    "authenticity": 0.05,
    "complexity": 0.10,
}

Score Validation

Parsed LLM rubric output is validated by validate_rubric_scores():

Required fields enforced
Values converted to float
Returns frozen IntrinsicRubricScores dataclass
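
The three behaviours listed above could look roughly like the following. This is a sketch under assumptions: the field list is inferred from the JSON output format, and raising `ValueError` on a missing field is a guess. The real function in the codebase may differ.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class IntrinsicRubricScores:
    # Field names assumed from the JSON output format in this document.
    curiosity: float
    narrative_urgency: float
    emotional_content: float
    authenticity: float
    complexity: float
    engagement_prompting: float
    community_fit: float


def validate_rubric_scores(raw: Mapping[str, Any]) -> IntrinsicRubricScores:
    """Enforce required fields, coerce values to float, return a frozen record."""
    required = [f.name for f in fields(IntrinsicRubricScores)]
    missing = [name for name in required if name not in raw]
    if missing:
        raise ValueError(f"missing rubric fields: {missing}")
    return IntrinsicRubricScores(**{name: float(raw[name]) for name in required})


# String values, as an LLM might emit them, are converted to float.
example = validate_rubric_scores({
    "curiosity": "0.75",
    "narrative_urgency": 0.60,
    "emotional_content": 0.55,
    "authenticity": 0.70,
    "complexity": 0.45,
    "engagement_prompting": 0.80,
    "community_fit": 0.85,
})
```

A frozen dataclass makes the graded scores immutable, so later pipeline stages cannot alter them by accident.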

Usage in Simulation Pipeline

Pre-simulation Grading
- LLM receives rumour content and target subreddit
- Returns JSON with all attribute scores
- Viral score computed automatically
Intrinsic Virality
- Viral score used as base spread probability modifier
- Higher scores = higher initial engagement likelihood
Agent Reactions
- Agents with matching interests more likely to engage
- Emotional content triggers stronger reactions
Platform Mechanics
- Feed ranking incorporates engagement velocity
- Higher viral score = higher initial boost
Outcome Comparison
- Compare simulated engagement with actual Reddit engagement
- Use for model calibration and accuracy assessment

Version History

v2.2 (2026-02): Removed post_timing and visual_richness (no discriminative signal). Updated weights after empirical calibration. community_fit excluded from aggregate (used as visibility constraint).
v2.1 (2026-01-15): Added implementation details, centralised weights module
v2.0 (2026-01-14): Comprehensive rubric based on Reddit virality research
v1.0: Legacy rubric (deprecated)
