Custom Guardrails for Claude in Multi-User SaaS Apps

Claude Directory January 12, 2026

In multi-tenant SaaS apps powered by Claude AI, one rogue prompt can breach compliance. Build custom per-tenant guardrails to filter content, enforce policies, and log for GDPR audits.

## The Multi-Tenant Challenge with Claude AI

Building SaaS products on the Claude API is powerful: models like Claude 3.5 Sonnet can drive intelligent features across HR, sales, or engineering workflows. But in multi-user environments, tenant isolation is critical. Without custom guardrails, risks include:

- **Cross-tenant data leakage**: A user's prompt might inadvertently expose another tenant's sensitive data via shared context or logs.
- **Prompt injection attacks**: Malicious inputs that try to override your instructions and produce harmful or non-compliant outputs.
- **Compliance violations**: Frameworks such as GDPR, HIPAA, or SOC 2 call for auditable, policy-controlled processing, and Claude's built-in safety behavior is not tenant-specific.
- **Audit gaps**: Without granular logging, interactions cannot be tied to tenants for forensic reviews.

Enterprise teams need **per-tenant guardrails**: dynamic filters, moderation, and logging tailored to each customer's policies (e.g., stricter filters for finance tenants). This guide walks through implementing them using the Anthropic Python SDK, a Node.js middleware example, and best practices for production.

## Why Not Just Use Anthropic's Built-in Safety?

Claude's safety behavior is built into the model through training and is the same for every caller. The Messages API exposes sampling parameters such as `temperature` and `top_p`, but these control randomness, not safety. It does not document a `safety_settings` request parameter or `HARM_CATEGORY_*` thresholds (see the [Messages API reference](https://docs.anthropic.com/en/api/messages)). Category names such as:

- `HARM_CATEGORY_HARASSMENT`
- `HARM_CATEGORY_HATE_SPEECH`
- `HARM_CATEGORY_SEXUALLY_EXPLICIT`

come from other vendors' APIs. This guide borrows them only as application-level labels that your own middleware interprets. A tenant policy can be expressed as:

```python
# Application-level policy, stored per tenant and enforced by your middleware.
# This is not a Claude API request parameter.
safety_thresholds = {
    "HARM_CATEGORY_HARASSMENT": "BLOCK_MEDIUM_AND_ABOVE",
}
```

The built-in behavior is **global**, not per-tenant. For SaaS:

- Tenants want custom thresholds (e.g., legal teams block all PII).
- Pre- and post-processing is needed for regex-based filters (e.g., block competitor names).
- Logs must pseudonymize data per GDPR Art. 25 (data protection by design and by default).

**Solution**: Wrap the API in a **guardrail middleware** that:

1. Validates tenant config.
2. Filters input prompts.
3. Calls Claude with the tenant's policy in the system prompt.
4. Moderates outputs (using Claude Haiku for speed).
5. Logs securely.

## Step 1: Tenant Config Management

Store per-tenant rules in your DB (e.g., PostgreSQL with Supabase or AWS RDS). Schema:

```sql
CREATE TABLE tenant_guardrails (
  tenant_id UUID PRIMARY KEY,
  allowed_domains TEXT[],            -- e.g., ['hr.example.com']
  blocked_keywords TEXT[],           -- e.g., ['confidential', 'SSN']
  safety_thresholds JSONB,           -- {"HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_LOW_AND_ABOVE"}
  custom_policy TEXT,                -- free-text policy read by output moderation (Step 4)
  log_retention_days INT DEFAULT 90,
  pii_redact BOOLEAN DEFAULT true
);
```

Load dynamically:

```python
import psycopg2  # or asyncpg for async
from psycopg2.extras import RealDictCursor


class GuardrailManager:
    def __init__(self, db_url: str):
        self.conn = psycopg2.connect(db_url)
        self.conn.autocommit = True  # read-only lookups; avoid idle transactions

    def get_config(self, tenant_id: str) -> dict:
        # RealDictCursor returns each row as a dict keyed by column name.
        with self.conn.cursor(cursor_factory=RealDictCursor) as cur:
            cur.execute(
                "SELECT * FROM tenant_guardrails WHERE tenant_id = %s",
                (tenant_id,),
            )
            row = cur.fetchone()
        return dict(row) if row else {}
```

## Step 2: Input Filtering

Pre-process prompts to block violations **before** Claude sees them.

- **Regex for PII/keywords**: Use `re` for emails, SSNs, custom blocks.
- **Domain checks**: Ensure the request origin matches the tenant.
- **Length/rate limits**: Prevent abuse (Claude 3.5 Sonnet accepts up to 200k tokens of context).

Example filter function:

```python
import re
from typing import List


class InputFilter:
    def __init__(self, config: dict):
        keywords = config.get("blocked_keywords") or []
        self.patterns = [
            re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),  # emails
            re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US SSNs
        ]
        # Escape keywords so they match literally, ignoring case.
        self.patterns += [re.compile(re.escape(kw), re.I) for kw in keywords]

    def filter(self, prompt: str, origin_domain: str, tenant_domains: List[str]) -> tuple[bool, str]:
        # Domain check: origin_domain must come from the authenticated request,
        # never from the prompt text itself.
        if tenant_domains and origin_domain not in tenant_domains:
            return False, "Unauthorized domain."
        # Keyword/PII block
        for pattern in self.patterns:
            if pattern.search(prompt):
                return False, "Blocked: sensitive content detected."
        return True, prompt
```

## Step 3: Claude API Call with Tenant Safety

Integrate into a service. The tenant's thresholds travel in the system prompt, and `moderate_output` is defined in Step 4:

```python
import json
import os

from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
guardrail_mgr = GuardrailManager(os.environ["DATABASE_URL"])


def chat_with_guardrails(tenant_id: str, prompt: str, headers: dict) -> dict:
    config = guardrail_mgr.get_config(tenant_id)
    if not config:
        return {"error": "Unknown tenant."}  # fail closed

    # Hypothetical header; in production derive the origin from the auth session.
    origin = headers.get("X-Tenant-Domain", "")
    input_filter = InputFilter(config)
    ok, result = input_filter.filter(prompt, origin, config.get("allowed_domains") or [])
    if not ok:
        return {"error": result}  # or log & return 403

    thresholds = config.get("safety_thresholds") or {
        "HARM_CATEGORY_HARASSMENT": "BLOCK_MEDIUM_AND_ABOVE",
        "HARM_CATEGORY_HATE_SPEECH": "BLOCK_MEDIUM_AND_ABOVE",
    }
    msg = client.messages.create(
        model="claude-3-5-sonnet-20240620",  # substitute a model ID available to your account
        max_tokens=1024,
        system="You are a helpful assistant. Follow this tenant content policy: "
        + json.dumps(thresholds),
        messages=[{"role": "user", "content": result}],
    )
    return moderate_output(msg.content[0].text, config)
```

## Step 4: Output Moderation

Post-process with lightweight Claude Haiku for custom checks (e.g., "Does this contain trade secrets?"):

```python
import json


def moderate_output(output: str, config: dict) -> dict:
    mod_prompt = f"""Review this output for compliance with tenant policy.
Policy: {config.get('custom_policy') or 'General safety'}
Output: {output}
Respond with JSON only: {{"safe": true or false, "reason": "..."}}"""
    mod_msg = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=100,
        system="Strict compliance checker.",
        messages=[{"role": "user", "content": mod_prompt}],
    )
    # Parse the verdict and fail closed on anything unexpected.
    # For production, consider tool use to force a JSON schema.
    try:
        verdict = json.loads(mod_msg.content[0].text)
    except (json.JSONDecodeError, IndexError, AttributeError):
        return {"error": "Moderation failed", "output": None}
    if not isinstance(verdict, dict) or verdict.get("safe") is not True:
        return {"error": "Blocked by tenant policy", "output": None}
    return {"output": output}
```

Haiku is the fastest model in the Claude 3 family, which keeps the extra call practical at scale. Measure the added latency on your own traffic before committing to a budget.

## Step 5: Secure Logging for Compliance

Log to tenant-isolated stores (e.g., S3 per tenant or Elasticsearch with TTL). For GDPR, redact PII and document the lawful basis for processing (for example, consent captured via your ToS).

```python
import hashlib
import hmac
import json
import os
import uuid
from datetime import datetime, timezone

import boto3


class Logger:
    def __init__(self, bucket_prefix: str = "logs"):
        self.s3 = boto3.client("s3")
        self.bucket_prefix = bucket_prefix
        # LOG_HMAC_KEY is a hypothetical env var holding the pseudonymization secret.
        self.key = os.environ["LOG_HMAC_KEY"].encode()

    def _digest(self, text: str) -> str:
        # Keyed hash: stable across processes, unlike the built-in hash().
        return hmac.new(self.key, text.encode(), hashlib.sha256).hexdigest()

    def log_interaction(self, tenant_id: str, prompt: str, output: str, config: dict) -> None:
        event = {
            "tenant_id": tenant_id,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "session_id": str(uuid.uuid4()),
            "input_hash": self._digest(prompt),  # pseudonymized
            "output_hash": self._digest(output),
            "model": "claude-3-5-sonnet",
            "safety_applied": config.get("safety_thresholds"),
        }
        # Only hashes are stored. If you add raw text later, redact it first
        # whenever config["pii_redact"] is set.
        self.s3.put_object(
            Bucket=f"{self.bucket_prefix}-{tenant_id}",
            Key=f"{event['session_id']}.json",
            Body=json.dumps(event),
        )
```

Retention: Use DB TTL or S3 lifecycle policies.

## Full Node.js Middleware Example

For Express/Fastify apps:

```typescript
import express from 'express';
import Anthropic from '@anthropic-ai/sdk';
import { PrismaClient } from '@prisma/client';
import { logInteraction } from './logging'; // hypothetical helper, see Step 5

const app = express();
app.use(express.json());
const prisma = new PrismaClient();
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_KEY });

app.post('/api/claude/:tenantId', async (req, res) => {
  const { tenantId } = req.params;
  const { prompt } = req.body;

  // Fetch config from DB (e.g., Prisma); fail closed for unknown tenants
  const config = await prisma.tenantGuardrails.findUnique({ where: { tenantId } });
  if (!config) {
    return res.status(404).json({ error: 'Unknown tenant' });
  }

  // Input filter (similar to Python)
  const lowered = String(prompt ?? '').toLowerCase();
  if (config.blockedKeywords.some((kw: string) => lowered.includes(kw.toLowerCase()))) {
    return res.status(403).json({ error: 'Blocked keyword' });
  }

  // Tenant thresholds are application policy, passed as system instructions
  const thresholds = config.safetyThresholds ?? {
    HARM_CATEGORY_HARASSMENT: 'BLOCK_MEDIUM_AND_ABOVE',
  };

  const msg = await client.messages.create({
    model: 'claude-3-5-sonnet-20240620',
    max_tokens: 1024,
    system: `Follow this tenant content policy: ${JSON.stringify(thresholds)}`,
    messages: [{ role: 'user', content: prompt }],
  });

  // Content blocks are a union type; only text blocks carry `.text`
  const first = msg.content[0];
  const text = first && first.type === 'text' ? first.text : '';

  // Log
  await logInteraction(tenantId, prompt, text);
  res.json({ output: text });
});
```

## Scaling and Best Practices

- **Performance**: Cache configs (Redis) and use async queues (BullMQ) for moderation and logging.
- **Costs**: Claude 3 Haiku launched at $0.25 per million input tokens and $1.25 per million output tokens (about $0.00025 per 1k input tokens), so check current pricing before budgeting. Batch non-urgent logs.
- **Monitoring**: Track refusal and block rates per tenant in Datadog/Prometheus.
- **Edge Cases**: Handle streaming (`stream: true`) by buffering the response for moderation and withholding it until it passes.
- **Testing**: Unit test filters, and run end-to-end tests against a curated set of adversarial prompts.
- **Integrations**: Hook into n8n/Zapier via webhooks that pass through the same guardrails.

For enterprise data residency (e.g., EU for GDPR), consider accessing Claude through a cloud platform such as Amazon Bedrock or Google Cloud Vertex AI, where region selection and private networking (for example, VPC endpoints) are configured on the cloud side. Verify regional availability for the model you need.

## Measuring Success

- **Metrics**: Example targets are a 99.9% compliance rate and under 500 ms of added latency.
- **ROI**: Avoid fines (GDPR allows up to 4% of global annual turnover or EUR 20 million, whichever is higher) and enable premium tiers with custom rules.

Implement these guardrails to make Claude enterprise-ready in your SaaS. Fork the [GitHub repo](https://github.com/example/claude-guardrails) for starters. Questions? Comment below.
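
As a supplement to the streaming edge case listed under Scaling and Best Practices, here is a minimal sketch of "buffer, moderate, then release". The names `moderated_stream` and `is_safe` are hypothetical and not part of any SDK: `chunks` stands in for the text deltas of a streamed response, and `is_safe` stands in for whatever check you use, such as a wrapper around the Step 4 moderation call. The sketch assumes it is acceptable to delay the first byte until the full response has arrived.

```python
from typing import Callable, Iterable, Iterator

WITHHELD_NOTICE = "[response withheld by tenant policy]"


def moderated_stream(
    chunks: Iterable[str],
    is_safe: Callable[[str], bool],
) -> Iterator[str]:
    """Buffer every chunk, run one moderation check, then release or withhold."""
    buffered = list(chunks)  # drain the upstream stream completely
    full_text = "".join(buffered)
    if not is_safe(full_text):
        # Withhold the model output entirely and emit a single notice instead.
        yield WITHHELD_NOTICE
        return
    # Approved: replay the original chunks in their original order.
    yield from buffered
```

Checking the joined text rather than each chunk matters because a blocked term can straddle a chunk boundary (for example `"con"` followed by `"fidential"`). The trade-off is that the user sees nothing until generation finishes, so this pattern suits tenants whose policy favors strictness over perceived latency.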
