    Serverless Bedrock: How I invoke Claude from Lambda in warrantyAI

    Harish Aravindan March 3, 2026


Every week I ship a new piece of warrantyAI, an AI-powered warranty management system I'm building on AWS. This week was Week 8: a 3-agent LangGraph pipeline wired to Bedrock.

Before the agents could do anything, I needed one thing to work cleanly: **invoking Claude from a Lambda function without a server, without a container fleet, without an inference endpoint sitting idle burning money.**

Here's exactly how I did it.

---

## Why serverless + Bedrock is the right combo

Bedrock's `invoke_model` API is synchronous and stateless: it takes a request and returns a response. That's exactly what Lambda is built for. No warm model, no GPU instance, no ECS cluster. You pay per invocation and per token.

For warrantyAI's workload (sporadic document uploads, not a real-time chat product) this matters. My entire system runs under $1.30/day.

---

## The setup: IAM first, always

Before any code, the Lambda execution role needs this policy statement:

```json
{
  "Effect": "Allow",
  "Action": [
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream"
  ],
  "Resource": [
    "arn:aws:bedrock:ap-south-1::foundation-model/anthropic.claude-haiku-4-5-20251001",
    "arn:aws:bedrock:ap-south-1::foundation-model/anthropic.claude-sonnet-4-6"
  ]
}
```

Scope it to specific model ARNs. Not `*`. Ever.
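If the model list changes often, a statement like the one above can be generated instead of hand-edited. The following is a minimal sketch and not code from warrantyAI: `build_bedrock_policy` is a hypothetical helper name, and the region and model IDs are simply the ones used in this post. It builds the foundation-model ARNs and wraps the statement in a standard IAM policy document.

```python
import json


def build_bedrock_policy(region: str, model_ids: list[str]) -> dict:
    """Build a least-privilege IAM policy for invoking specific Bedrock models.

    Hypothetical helper for illustration; not taken from warrantyAI.
    """
    if not model_ids:
        raise ValueError("List at least one model ID; never fall back to '*'.")
    # Foundation-model ARNs leave the account field empty, hence the '::'.
    resources = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for model_id in model_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": resources,
            }
        ],
    }


policy = build_bedrock_policy(
    "ap-south-1",
    ["anthropic.claude-haiku-4-5-20251001", "anthropic.claude-sonnet-4-6"],
)
print(json.dumps(policy, indent=2))
```

Refusing an empty model list keeps the helper from ever emitting a policy with no resource scoping.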
---

## The invoke wrapper

This is the core function I reuse across all 3 agents in warrantyAI:

```python
import json

import boto3

# Create the client once at module load so warm Lambda invocations reuse it.
bedrock = boto3.client("bedrock-runtime", region_name="ap-south-1")

HAIKU = "anthropic.claude-haiku-4-5-20251001"
SONNET = "anthropic.claude-sonnet-4-6"


def invoke_bedrock(prompt: str, model_id: str = HAIKU, max_tokens: int = 512) -> str:
    """
    Invoke a Bedrock Claude model from Lambda.
    Returns the text response as a string.
    """
    response = bedrock.invoke_model(
        modelId=model_id,
        contentType="application/json",
        accept="application/json",
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "messages": [
                {"role": "user", "content": prompt}
            ]
        })
    )
    # response["body"] is a StreamingBody, so read it before parsing.
    body = json.loads(response["body"].read())
    return body["content"][0]["text"].strip()
```

That's it. Stateless, reusable, testable in isolation.

---

## Haiku-first, Sonnet fallback

Haiku is fast and cheap. Sonnet is more accurate and more expensive. In warrantyAI's Classifier agent, I try Haiku first. If it reports a confidence below 0.7, I retry with Sonnet automatically:

```python
def classify_warranty(structured_data: dict) -> dict:
    # build_classify_prompt is the Classifier agent's prompt builder (not shown here).
    prompt = build_classify_prompt(structured_data)

    # Attempt 1: Haiku
    result = invoke_bedrock(prompt, model_id=HAIKU)
    parsed = json.loads(result)

    # Fallback: Sonnet if confidence < 0.7 (a missing confidence counts as 0)
    if parsed.get("confidence", 0) < 0.7:
        result = invoke_bedrock(prompt, model_id=SONNET)
        parsed = json.loads(result)
        parsed["model_used"] = "sonnet"
    else:
        parsed["model_used"] = "haiku"

    return parsed
```

In practice, Haiku handles ~85% of documents. Sonnet kicks in for complex commercial warranties with ambiguous clause structures.

---

## Three things that will burn you

**1. The `body` is a StreamingBody, not a string.** Always call `.read()` before `json.loads()`. Forget this once and you'll spend 20 minutes confused.

```python
# Wrong
body = json.loads(response["body"])

# Right
body = json.loads(response["body"].read())
```

**2. Size limits on Lambda payloads.** Lambda caps synchronous request and response payloads at 6 MB. Bedrock responses are usually tiny, but if you're passing large documents in your prompt, chunk them first. I cap prompts at 4,000 characters in the Reader agent.

**3. Bedrock is regional.** Not all models are available in all regions. `ap-south-1` (Mumbai) supports Haiku and Sonnet. If you get a `ResourceNotFoundException`, check model availability in your region before debugging your code.

---

## Cost reality check

For warrantyAI's workload (roughly 50 documents/day):

| Model | Avg tokens/call | Cost/call | Daily cost |
|---|---|---|---|
| Haiku | ~800 | ~$0.0004 | ~$0.017 |
| Sonnet (15% of calls) | ~800 | ~$0.006 | ~$0.045 |

Total Bedrock cost: roughly $0.06/day for this workload. The rest of my $1.30/day budget goes to Textract, SNS, and S3.

---

## What's next

This pattern is the foundation for the entire warrantyAI pipeline. Next Sunday I'll cover how I wired these invocations into a LangGraph StateGraph: three agents, one shared state dict, no message queues.

Follow along if you're building serverless AI on AWS. I publish every Sunday on LinkedIn.

*This is part of the Serverless Meets AI series: practical AWS patterns from building warrantyAI.*
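A footnote on the 4,000-character prompt cap mentioned under "Three things that will burn you": a cap like that could be enforced with a small splitter. This is a minimal sketch and not the Reader agent's actual code. `chunk_text` is a hypothetical helper, and breaking at a space or newline is an assumption about what a reasonable chunk boundary looks like.

```python
def chunk_text(text: str, limit: int = 4000) -> list[str]:
    """Split text into pieces of at most `limit` characters.

    Prefers to break at a space or newline so words stay intact, and falls
    back to a hard cut when no such break exists inside the window.
    Hypothetical helper for illustration; not taken from warrantyAI.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Find the last space or newline at or before index `limit`.
        cut = max(
            remaining.rfind(" ", 0, limit + 1),
            remaining.rfind("\n", 0, limit + 1),
        )
        if cut <= 0:
            cut = limit  # no break point found: hard cut
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each chunk can then go through `invoke_bedrock` as its own call. How the per-chunk results get merged depends on the agent and is left out here.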

    Tags: aws, serverless, ai, bedrock
