Safety

DeepSeek Content Moderation Rules

Name: DeepSeek Content Moderation Rules
Author: Community

Community April 23, 2026

Rules for implementing content moderation when using DeepSeek models in user-facing applications, covering input screening, output filtering, and escalation.

## DeepSeek Content Moderation Rules

### Input Screening
Before sending user input to DeepSeek:
1. Check against a blocklist of prohibited terms and phrases
2. Detect prompt injection patterns:
   - Attempts to override system instructions
   - Requests to reveal system prompts
   - Instructions to ignore safety guidelines
3. Classify each input's risk level as `safe`, `needs_review`, or `blocked`
4. Log all blocked inputs for pattern analysis
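
The screening steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the `BLOCKLIST` entry, the regex heuristics, the `Risk` enum, and the `screen_input` function are all hypothetical names introduced here, and mapping injection matches to `needs_review` rather than `blocked` is an assumption. Regexes alone will not catch every injection attempt.

```python
import logging
import re
from enum import Enum

logger = logging.getLogger("moderation.input")


class Risk(str, Enum):
    SAFE = "safe"
    NEEDS_REVIEW = "needs_review"
    BLOCKED = "blocked"


# Hypothetical placeholder; a real blocklist would be loaded from configuration.
BLOCKLIST = {"example banned phrase"}

# Illustrative heuristics for the three injection patterns listed above.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.I),
    re.compile(r"(reveal|show|print|repeat)\s+(me\s+)?(your|the)\s+system\s+prompt", re.I),
    re.compile(r"(ignore|bypass|disable)\s+(your\s+|the\s+)?safety\s+(guidelines|rules|filters)", re.I),
]


def screen_input(text: str) -> Risk:
    """Classify user input before it is sent to the model."""
    lowered = text.lower()
    if any(term in lowered for term in BLOCKLIST):
        # Step 4: log blocked inputs for later pattern analysis.
        logger.warning("blocked input: blocklist match")
        return Risk.BLOCKED
    if any(pattern.search(text) for pattern in INJECTION_PATTERNS):
        # Assumption: suspected injections go to review instead of a hard block.
        return Risk.NEEDS_REVIEW
    return Risk.SAFE
```

Only inputs classified as `Risk.SAFE` would be forwarded to the model in this sketch; the other two levels feed the escalation protocol.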

### Output Filtering
After receiving DeepSeek responses:
1. Scan for PII patterns (regex-based):
   - Email addresses, phone numbers, SSN/ID numbers
   - Physical addresses, credit card numbers
2. Check for harmful content categories:
   - Violence or self-harm instructions
   - Illegal activity guidance
   - Hate speech or discriminatory content
3. Verify output format matches expected schema
4. Truncate responses that exceed the expected maximum length
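
A rough sketch of steps 1, 3, and 4 is shown below. The patterns are simplified and US-centric, and the names `PII_PATTERNS`, `redact_pii`, `matches_schema`, and `truncate` are hypothetical. It also assumes the expected output is a JSON object with known keys, which the rules do not specify. Harmful-content detection (step 2) usually needs a classifier and is left out of this example.

```python
import json
import re

# Simplified patterns for illustration; real PII detection needs broader coverage.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),  # 13 to 16 digits
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact_pii(text: str) -> tuple[str, list[str]]:
    """Replace PII matches with placeholders and report which kinds were found."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED_{label.upper()}]", text)
        if count:
            found.append(label)
    return text, found


def matches_schema(raw: str, required_keys: set[str]) -> bool:
    """Assumed schema check: output must be a JSON object with the required keys."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and required_keys.issubset(data)


def truncate(text: str, max_chars: int = 4000) -> str:
    """Cut responses that exceed the configured length (default is arbitrary)."""
    return text[:max_chars]
```

For example, `redact_pii("Contact alice@example.com")` would return the text with the address replaced by `[REDACTED_EMAIL]` along with the list `["email"]`.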

### Escalation Protocol
- Auto-block: Known harmful patterns -> immediate block + log
- Review queue: Ambiguous content -> flag for human review within 24h
- Pass-through: Clean content -> deliver to user
- False positive feedback: Allow reviewers to mark false positives to improve filters
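
One possible way to express this routing in code is sketched below. The `Decision` dataclass, the `route` function, and the in-memory `false_positives` list are hypothetical and stand in for whatever queue and storage an application actually uses. The risk strings are assumed to match the levels from input screening.

```python
import logging
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

logger = logging.getLogger("moderation.escalation")

REVIEW_SLA = timedelta(hours=24)  # "within 24h" from the protocol above


@dataclass
class Decision:
    action: str                            # "block", "review", or "deliver"
    review_due: Optional[datetime] = None  # set only for the review queue


def route(risk: str, now: Optional[datetime] = None) -> Decision:
    """Map a risk level to an escalation action."""
    now = now or datetime.now(timezone.utc)
    if risk == "blocked":
        logger.warning("auto-blocked content")  # immediate block + log
        return Decision("block")
    if risk == "needs_review":
        return Decision("review", review_due=now + REVIEW_SLA)
    if risk == "safe":
        return Decision("deliver")
    raise ValueError(f"unknown risk level: {risk!r}")


# Hypothetical in-memory store; reviewer corrections would normally be persisted
# and fed back into blocklists or classifier training data.
false_positives: list[str] = []


def mark_false_positive(item_id: str) -> None:
    false_positives.append(item_id)
```

Raising on unknown risk levels is a deliberate choice in this sketch, so that a new or misspelled level fails loudly instead of being passed through to the user.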

### User Communication
- Never show raw error messages from the API
- Provide helpful alternative suggestions when content is blocked
- Include a feedback mechanism for users to report issues
- Be transparent with users that AI-based content moderation is in place
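
As a small illustration of the first two points, the wrapper below keeps raw API errors in server-side logs and returns fixed, user-friendly text instead. The message wording and the names `safe_reply`, `blocked_reply`, and `call_model` are hypothetical; `call_model` stands in for whatever client function the application uses to reach the model.

```python
import logging
from typing import Callable

logger = logging.getLogger("moderation.ux")

# Hypothetical wording; adapt to the product's tone and languages.
GENERIC_ERROR = "Sorry, something went wrong on our side. Please try again."
BLOCKED_MESSAGE = (
    "This request could not be processed under our content policy. "
    "Try rephrasing it, or use the feedback link if you think this is a mistake."
)


def safe_reply(call_model: Callable[[str], str], prompt: str) -> str:
    """Call the model and never leak the raw exception text to the user."""
    try:
        return call_model(prompt)
    except Exception:
        # Full details stay in the logs for operators.
        logger.exception("model call failed")
        return GENERIC_ERROR


def blocked_reply() -> str:
    """Message shown when moderation blocks content."""
    return BLOCKED_MESSAGE
```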

### Compliance
- Maintain audit logs of all moderation decisions
- Review and update blocklists monthly
- Train moderation classifiers on domain-specific data
- Document moderation policies and make them accessible to users
- Comply with platform-specific content policies (App Store, Google Play, etc.)
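
For the audit log requirement, one option is an append-only stream of JSON lines, sketched below. The field names and the `audit_record` function are hypothetical. Storing a SHA-256 hash of the content instead of the raw text is an assumption made here to avoid copying PII into logs; some compliance regimes may require retaining the original content.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def audit_record(content: str, decision: str, reason: str,
                 now: Optional[datetime] = None) -> str:
    """Build one JSON line describing a moderation decision."""
    now = now or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": now.isoformat(),
            # Assumption: hash only, so the log itself holds no user text.
            "content_sha256": hashlib.sha256(content.encode("utf-8")).hexdigest(),
            "decision": decision,
            "reason": reason,
        },
        sort_keys=True,
    )
```

Each returned string could be appended to a log file or sent to a log pipeline, one decision per line.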
