Production rules for handling DeepSeek API rate limits, errors, retries, and cost optimization in application code.
## DeepSeek API Production Rules

### Rate Limiting

- DeepSeek API enforces rate limits per API key
- Implement client-side rate limiting before hitting the API
- Use a token bucket or sliding window algorithm
- Cache responses for identical prompts (TTL: 5-15 minutes)

### Error Handling

Handle these error codes specifically:

- 400: Bad request. Validate the prompt before sending
- 401: Invalid API key. Check the DEEPSEEK_API_KEY env var
- 429: Rate limited. Implement exponential backoff:
  - Start: 1 second
  - Max: 60 seconds
  - Jitter: random 0-500ms
  - Max retries: 5
- 500/502/503: Server error. Retry with backoff (max 3 attempts)
- Timeout: Set reasonable timeouts (30s for chat, 120s for R1 reasoning)

### Cost Optimization

- Use deepseek-chat for simple tasks (cheaper)
- Reserve deepseek-reasoner for complex reasoning
- Set max_tokens to prevent runaway responses
- Use temperature 0.0 for deterministic cache-friendly responses
- Batch similar requests where possible
- Monitor token usage per request and set alerts

### Security

- Never expose API keys in client-side code
- Use environment variables or secrets manager
- Rotate keys periodically
- Log request metadata (not content) for debugging
- Implement request signing for webhook callbacks

### Monitoring

- Track: latency p50/p95/p99, error rate, token usage, cost per request
- Alert on: error rate > 5%, latency p95 > 10s, daily cost > budget
- Dashboard: requests/minute, model distribution, cache hit rate

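
The retry limits listed under Error Handling (429 and 500/502/503) could be sketched roughly as follows. This is a minimal illustration, not a definitive client: `ApiError`, `backoff_delay`, and `call_with_retries` are hypothetical names introduced here, and the doubling factor is an assumption, since the rules say "exponential" without naming a base.

```python
import random
import time

BASE_DELAY_S = 1.0          # "Start: 1 second"
MAX_DELAY_S = 60.0          # "Max: 60 seconds"
MAX_JITTER_S = 0.5          # "Jitter: random 0-500ms"
MAX_RETRIES_429 = 5         # "Max retries: 5"
MAX_ATTEMPTS_5XX = 3        # "max 3 attempts" for 500/502/503


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def backoff_delay(attempt, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    Assumes the delay doubles each time, is capped at MAX_DELAY_S,
    and then has up to MAX_JITTER_S of random jitter added.
    """
    capped = min(BASE_DELAY_S * (2 ** attempt), MAX_DELAY_S)
    return capped + rng() * MAX_JITTER_S


def call_with_retries(send, sleep=time.sleep):
    """Call `send()`; retry on 429 and 500/502/503, re-raise anything else."""
    retries_429 = 0
    failures_5xx = 0
    while True:
        try:
            return send()
        except ApiError as err:
            if err.status == 429:
                if retries_429 >= MAX_RETRIES_429:
                    raise
                sleep(backoff_delay(retries_429))
                retries_429 += 1
            elif err.status in (500, 502, 503):
                failures_5xx += 1
                if failures_5xx >= MAX_ATTEMPTS_5XX:
                    raise
                sleep(backoff_delay(failures_5xx - 1))
            else:
                # 400, 401 and other errors are not retryable.
                raise
```

To use it, pass a zero-argument function that performs one request and raises `ApiError` with the response status on failure. The `sleep` and `rng` parameters exist only so the timing can be replaced in tests.
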
System rules for designing inter-service communication in microservices architectures with DeepSeek Coder, covering sync/async patterns, error handling, and resilience.
System rules for generating content in multiple languages with DeepSeek V3, covering translation quality, cultural adaptation, locale-specific formatting, and quality assurance.
System rules for safe code refactoring with DeepSeek R1, requiring test coverage verification, incremental changes, and behavior preservation checks.
System rules for using DeepSeek V3 to generate clear, maintainable technical documentation including API docs, architecture docs, and onboarding guides.
System rules for DeepSeek Coder to generate optimized database queries, with requirements for EXPLAIN analysis, indexing recommendations, and performance targets.
System rules for using DeepSeek V3 to generate infrastructure code, CI/CD pipelines, and operational runbooks with security and reliability best practices.