Design Document: BharatSeva AI

1. System Overview

BharatSeva AI is a multi-agent orchestration system built on AWS using Amazon Bedrock Agents with Claude 3.5 Sonnet as the foundation model. The system deploys 9 AI agents (1 Master Orchestrator + 8 Specialist Agents) to help India's informal sector workers navigate government schemes across three domains: PM Vishwakarma (artisan credit), PMFBY (crop insurance), and BOCW (construction worker welfare).

1.1 Design Principles

Voice-First: Prioritize voice interaction over text for accessibility
Proactive Intelligence: Monitor external data sources and trigger alerts automatically
Alternative Verification: Use digital footprints (UPI transactions, GPS, photos) instead of traditional documentation
Anti-Corruption: Educate users that services are free and collect evidence of bribery
Audit-Proof: Maintain immutable 7-year audit trails for regulatory compliance
Offline Resilience: Provide SMS fallback when internet is unavailable

1.2 Architecture Style

Event-Driven: Amazon EventBridge for scheduled polling and event triggers
Serverless: AWS Lambda for all compute, no EC2 instances
Multi-Agent: Amazon Bedrock Agents for domain-specific intelligence
Microservices: Each agent has dedicated action groups (Lambda functions)

2. High-Level Architecture

2.1 System Components

┌─────────────────────────────────────────────────────────────┐
│                    User Entry Points                         │
│  IVR (Amazon Connect) │ WhatsApp │ SMS │ Web Interface      │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│              Master Orchestrator Agent                       │
│         (Amazon Bedrock Agent - Claude 3.5 Sonnet)          │
│  - Intent Classification (Amazon Comprehend)                 │
│  - Language Detection & Translation (Amazon Translate)       │
│  - Conversation State Management (DynamoDB)                  │
└──────────────────────┬──────────────────────────────────────┘
                       │
        ┌──────────────┼──────────────┐
        │              │              │
        ▼              ▼              ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│PM Vishwakarma│  │    PMFBY     │  │     BOCW     │
│ Agent Swarm  │  │ Agent Swarm  │  │ Agent Swarm  │
│  (3 agents)  │  │  (3 agents)  │  │  (2 agents)  │
└──────┬───────┘  └──────┬───────┘  └──────┬───────┘
       │                 │                 │
       └─────────────────┼─────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                  Shared Services Layer                       │
│  - Knowledge Bases (OpenSearch Serverless)                   │
│  - Document Processing (Textract, Rekognition)               │
│  - ML Models (SageMaker Endpoints)                           │
│  - Audit Trail (DynamoDB Streams → S3)                       │
│  - External Integrations (Account Aggregator, Weather APIs)  │
└─────────────────────────────────────────────────────────────┘

2.2 Data Flow

User Input → IVR/WhatsApp/SMS → Amazon Transcribe (voice-to-text)
Language Processing → Amazon Translate (to English) → Amazon Comprehend (intent extraction)
Orchestration → Master Orchestrator routes to specialist agent
Agent Processing → Bedrock Agent invokes action groups (Lambda functions)
External Actions → Lambda calls AWS services, external APIs, ML models
Response Generation → Bedrock Agent generates response → Amazon Translate (to user language)
Response Delivery → Amazon Polly (text-to-speech) → IVR/WhatsApp/SMS
Audit Logging → All state changes → DynamoDB Streams → S3

3. Agent Architecture

3.1 Master Orchestrator Agent

Purpose: Route queries to specialized agents and manage conversation state

Foundation Model: Claude 3.5 Sonnet (anthropic.claude-3-5-sonnet-20241022-v2:0)

Action Groups:

classify_intent (Lambda: orchestrator-intent-classifier)
- Input: User query text (translated to English)
- Processing: Calls Amazon Comprehend Custom Classifier
- Output: Intent category (loan_request, insurance_claim, status_check, etc.) + confidence score
- Routing logic: Maps intent to specialist agent
detect_scheme_domain (Lambda: orchestrator-domain-detector)
- Input: User query text
- Processing: Keyword matching + entity extraction
- Keywords: PM Vishwakarma (["loan", "vishwakarma", "artisan", "credit"]), PMFBY (["crop", "insurance", "fasal", "hailstorm"]), BOCW (["construction", "labor", "bocw", "migrant"])
- Output: Scheme domain + confidence score
manage_conversation_state (Lambda: orchestrator-state-manager)
- Input: Session ID, user message, agent response
- Processing: DynamoDB operations (GetItem, PutItem, UpdateItem)
- State schema: {session_id, user_id, current_domain, conversation_history[], pending_actions[], last_updated, ttl}
- TTL: 24 hours for automatic cleanup
- Output: Updated conversation context
route_to_specialist (Lambda: orchestrator-router)
- Input: Scheme domain, conversation context
- Processing: Invokes target Bedrock Agent via boto3
- Agent mapping: PM_Vishwakarma → agent_id_1, PMFBY → agent_id_2, BOCW → agent_id_3
- Output: Specialist agent response
escalate_to_human (Lambda: orchestrator-human-escalation)
- Input: Session ID, escalation reason
- Processing: Creates ticket in support system, notifies human operator
- Triggers: Negative sentiment detected, agent confidence <50%, user explicitly requests human
- Output: Ticket ID, estimated wait time
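
The state schema and 24-hour TTL used by manage_conversation_state can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `build_state_item` and `append_turn` are hypothetical, and refreshing the TTL on every turn is an assumed policy that the design does not specify. DynamoDB TTL expects the expiry as epoch seconds in a Number attribute, which is why `ttl` is an integer.

```python
import time

SESSION_TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL from the state schema


def build_state_item(session_id, user_id, current_domain, now=None):
    """Build a new conversation-state item matching the documented schema."""
    now = int(time.time()) if now is None else int(now)
    return {
        "session_id": session_id,
        "user_id": user_id,
        "current_domain": current_domain,
        "conversation_history": [],
        "pending_actions": [],
        "last_updated": now,
        "ttl": now + SESSION_TTL_SECONDS,  # epoch seconds, as DynamoDB TTL requires
    }


def append_turn(item, user_message, agent_response, now=None):
    """Record one user/agent turn and push the expiry forward (assumed policy)."""
    now = int(time.time()) if now is None else int(now)
    item["conversation_history"].append(
        {"user": user_message, "agent": agent_response, "at": now}
    )
    item["last_updated"] = now
    item["ttl"] = now + SESSION_TTL_SECONDS
    return item
```

The returned dictionary would be written with PutItem or UpdateItem; the serialization into DynamoDB attribute types is left out of the sketch.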

Knowledge Base: KB1 (All Government Schemes - 1200+ schemes)

Vector embeddings of scheme guidelines, FAQs, eligibility criteria
Updated weekly via S3 sync

Guardrails:

Accuracy threshold: 70% confidence required for routing
Safety filters: Block PII leakage, prevent hallucination of scheme details
Fallback: Ask clarifying questions if confidence <70%
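
The keyword matching in detect_scheme_domain and the 70% routing guardrail can be sketched as follows. The keyword lists are taken from the action group above. The confidence formula (the winning domain's share of all keyword hits) and the function names are assumptions made for illustration, not the deployed logic.

```python
from __future__ import annotations

import re

# Keyword lists copied from the detect_scheme_domain action group.
DOMAIN_KEYWORDS = {
    "PM_Vishwakarma": ["loan", "vishwakarma", "artisan", "credit"],
    "PMFBY": ["crop", "insurance", "fasal", "hailstorm"],
    "BOCW": ["construction", "labor", "bocw", "migrant"],
}
ROUTING_THRESHOLD = 0.70  # guardrail: 70% confidence required for routing


def detect_scheme_domain(query: str) -> tuple[str | None, float]:
    """Return (domain, confidence) using the assumed share-of-hits formula."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    hits = {d: sum(k in words for k in kws) for d, kws in DOMAIN_KEYWORDS.items()}
    total = sum(hits.values())
    if total == 0:
        return None, 0.0
    domain = max(hits, key=hits.get)
    return domain, hits[domain] / total


def route_or_clarify(query: str) -> str:
    """Route to a domain, or return 'clarify' when confidence is below the threshold."""
    domain, confidence = detect_scheme_domain(query)
    if domain is None or confidence < ROUTING_THRESHOLD:
        return "clarify"
    return domain
```

A query that mixes domains, such as "crop loan insurance", scores two hits for PMFBY and one for PM Vishwakarma, so it falls below the threshold and triggers a clarifying question instead of a route.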

3.2 PM Vishwakarma Agent Swarm

3.2.1 Shadow Credit Agent

Purpose: Generate alternative credit scores from UPI transaction history

Foundation Model: Claude 3.5 Sonnet

Action Groups:

initiate_account_aggregator_consent (Lambda: pmv-aa-consent-initiator)
- Input: User Aadhaar number, mobile number
- Processing: Calls Account Aggregator API (NBFC-AA licensed entity)
- Consent flow: Generate consent request → Send OTP → Validate OTP → Receive consent token
- Output: Consent token (valid 12 months), consent ID
fetch_transaction_history (Lambda: pmv-aa-data-fetcher)
- Input: Consent token, date range (12 months)
- Processing: Calls Account Aggregator FIU (Financial Information User) API
- Data retrieved: UPI transaction history (date, amount, payer/payee, transaction ID)
- Storage: Encrypted in S3 with KMS, DynamoDB for metadata
- Output: Transaction dataset (JSON array)
calculate_shadow_credit_score (Lambda: pmv-shadow-credit-calculator)
- Input: Transaction dataset
- Processing: Pandas-based financial analysis
- Metrics calculated:
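  - Average monthly income: Sum of credits / number of months covered by the dataset (12 for a full history)
  - Transaction frequency: Count of transactions / days
  - Customer diversity: Unique payers count
  - Income stability: Derived from the standard deviation of monthly income (lower deviation means higher stability)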
  - Digital footprint age: Days since first transaction
- Scoring algorithm:
```
Shadow_Score = (Income_Stability * 0.4) + 
               (Transaction_Frequency * 0.3) + 
               (Customer_Diversity * 0.2) + 
               (Digital_Footprint * 0.1)
Normalized to 300-900 scale
```
- Thresholds: 700+ (high confidence), 600-699 (moderate), <600 (requires review)
- Output: Shadow Credit Score (integer), contributing factors (JSON)
generate_lending_memo (Lambda: pmv-lending-memo-generator)
- Input: Shadow Credit Score, transaction dataset, user details
- Processing: ReportLab Python library for PDF generation
- Memo contents:
  - Header: BharatSeva AI logo, generation date, applicant name (masked Aadhaar)
  - Shadow Credit Score: Large font display with color coding (green 700+, yellow 600-699, red <600)
  - 6-month income chart: Matplotlib bar chart embedded in PDF
  - Transaction consistency graph: Line chart showing daily transaction count
  - Risk assessment summary: AI-generated text explaining score
  - Digital signature: AWS KMS signature for tamper-proof authenticity
- Storage: S3 with pre-signed URL (7-day expiry)
- Output: PDF URL, memo ID
submit_to_bank (Lambda: pmv-bank-submission)
- Input: Memo ID, bank branch details
- Processing: Amazon SES email with PDF attachment
- Email template: Formal letter to bank manager with lending memo
- SMS notification: Sent to applicant confirming submission
- Follow-up: EventBridge scheduled rule checks for bank response after 7 days
- Output: Submission confirmation, tracking ID

Knowledge Base: KB2 (PM Vishwakarma Policies)

Guardrails: Minimum 6 months transaction history required, reject if <10 transactions per month
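
The weighted scoring formula in calculate_shadow_credit_score can be sketched as follows. The weights, the 300-900 scale and the score bands come from the action group above. Everything else is an assumption for illustration: each component is taken to be pre-normalized to the range 0 to 1 with higher meaning better, income stability is mapped as one minus the coefficient of variation, and the standard library is used in place of pandas to keep the sketch self-contained.

```python
from statistics import mean, pstdev

# Weights from the documented scoring algorithm.
WEIGHTS = {
    "income_stability": 0.4,
    "transaction_frequency": 0.3,
    "customer_diversity": 0.2,
    "digital_footprint": 0.1,
}


def income_stability(monthly_income):
    """Map income variability to [0, 1]; 1.0 means perfectly steady (assumed mapping)."""
    avg = mean(monthly_income)
    if avg <= 0:
        return 0.0
    return max(0.0, 1.0 - pstdev(monthly_income) / avg)


def shadow_score(components):
    """Weighted sum of components clamped to [0, 1], scaled linearly onto 300-900."""
    weighted = sum(
        WEIGHTS[name] * min(max(components[name], 0.0), 1.0) for name in WEIGHTS
    )
    return round(300 + 600 * weighted)


def score_band(score):
    """Apply the documented thresholds: 700+, 600-699, below 600."""
    if score >= 700:
        return "high confidence"
    if score >= 600:
        return "moderate"
    return "requires review"
```

With a perfectly steady income and the other three components at 0.5, the weighted sum is 0.7 and the score is 720, which falls in the high-confidence band.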

3.2.2 Proof of Work Agent

Purpose: Verify artisan trades using computer vision on work photos/videos

Foundation Model: Claude 3.5 Sonnet

Action Groups:

collect_work_media (Lambda: pmv-media-collector)
- Input: WhatsApp message with photo/video attachment
- Processing: Download from WhatsApp Business API, upload to S3
- Validation: File size <50MB, formats (MP4, MOV, AVI, JPEG, PNG)
- Video processing: FFmpeg conversion to MP4 H.264 codec
- Output: S3 object key, media type (photo/video)
analyze_trade_verification (Lambda: pmv-trade-analyzer)
- Input: S3 object key
- Processing:
  - For photos: Amazon Rekognition DetectLabels API
  - For videos: SageMaker endpoint invocation (Custom CNN model)
- SageMaker model details:
  - Endpoint: trade-verification-cnn-endpoint
  - Instance: ml.m5.xlarge (real-time inference)
  - Input: Video frames (5 FPS sampling), resized to 224x224
  - Output: Trade category + confidence score
- Trade categories: Carpentry, Pottery, Blacksmithing, Weaving, Cobbling, Tailoring, Masonry, Plumbing, Electrical, Goldsmithing, Basket Weaving, Doll Making, Toy Making, Fishing Net Making, Locksmithing, Sculpting, Stone Carving, Wood Carving
- Confidence thresholds: 85% or higher approved, 70% to below 85% manual review, <70% rejected
- Output: Trade type, confidence score, detected tools/materials
extract_metadata (Lambda: pmv-metadata-extractor)
- Input: S3 object key
- Processing: EXIF data extraction using Pillow library
- Metadata extracted:
  - GPS coordinates (latitude, longitude)
  - Timestamp (photo/video capture time)
  - Device model (for fraud detection)
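  - Content hash (SHA-256) for exact-duplicate detection; SHA-256 is a cryptographic hash, so catching re-encoded or resized copies would need a separate perceptual hash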
- Validation: Timestamp within last 30 days, GPS within India boundaries
- Output: Metadata JSON
generate_trade_certificate (Lambda: pmv-certificate-generator)
- Input: Trade type, confidence score, metadata, user details
- Processing: ReportLab PDF generation
- Certificate contents:
  - Header: "Digital Proof of Trade - Government of India"
  - Applicant details: Name, masked Aadhaar (XXXX-XXXX-1234)
  - Trade type: Large font with icon
  - Verification statement: "Trade verified by BharatSeva AI with 89% confidence"
  - Video thumbnail grid: 4 key frames showing work activity
  - GPS coordinates: Map image from Amazon Location Service
  - QR code: Contains certificate hash for instant verification
  - Digital signature: AWS KMS signature
- Storage: S3 with permanent retention
- Output: Certificate PDF URL, certificate ID
submit_to_pm_vishwakarma_portal (Lambda: pmv-portal-submission)
- Input: Certificate ID, user details
- Processing: API call to PM Vishwakarma portal (government API)
- Payload: Certificate PDF, applicant Aadhaar, trade type, verification date
- Notification: WhatsApp message to user with certificate copy
- District notification: Email to District Implementation Committee
- Output: Submission confirmation, portal reference number

Knowledge Base: KB2 (PM Vishwakarma Policies)

Guardrails: Minimum 85% confidence for auto-approval, manual review for 70-85%
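
The confidence thresholds and the metadata validation rules (capture time within the last 30 days, GPS within India) can be sketched as follows. The bounding-box coordinates are rough assumed values and the function names are hypothetical; a production check would more likely test against boundary polygons.

```python
from datetime import datetime, timedelta, timezone

# Rough bounding box for India (assumed values, for illustration only).
INDIA_LAT = (6.5, 37.5)
INDIA_LON = (68.0, 97.5)
MAX_MEDIA_AGE = timedelta(days=30)


def classify_confidence(confidence_pct):
    """Apply the guardrail: 85%+ auto-approve, 70% to below 85% manual review."""
    if confidence_pct >= 85:
        return "approved"
    if confidence_pct >= 70:
        return "manual_review"
    return "rejected"


def validate_metadata(lat, lon, captured_at, now=None):
    """Return a list of validation failures; an empty list means the metadata passes."""
    now = now or datetime.now(timezone.utc)
    problems = []
    in_lat = INDIA_LAT[0] <= lat <= INDIA_LAT[1]
    in_lon = INDIA_LON[0] <= lon <= INDIA_LON[1]
    if not (in_lat and in_lon):
        problems.append("gps_outside_india")
    age = now - captured_at
    if age < timedelta(0) or age > MAX_MEDIA_AGE:
        problems.append("timestamp_out_of_range")
    return problems
```

Returning a list of failures rather than a single boolean lets the agent tell the applicant exactly which check to fix before resubmitting.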

3.2.3 Nudge & Escalation Agent

Purpose: Track PM Vishwakarma application status and escalate delays

Foundation Model: Claude 3.5 Sonnet

Action Groups:

poll_application_status (Lambda: pmv-status-poller)
- Trigger: EventBridge scheduled rule (daily at 6 AM IST)
- Input: List of pending applications from DynamoDB
- Processing: API calls to PM Vishwakarma portal for each application
- Status stages: Submitted → Panchayat Verification → District Approval → Bank Review → Loan Disbursement
- Storage: DynamoDB update with new status, timestamp
- Output: Status changes (array of applications with updated status)
calculate_sla_breach (Lambda: pmv-sla-calculator)
- Input: Application ID, current status, status history
- Processing: Calculate days pending at each stage
- SLA thresholds:
  - Day 15 at Gram Panchayat: Level 1 escalation
  - Day 30 at District Committee: Level 2 escalation
  - Day 45 at Bank Review: Level 3 escalation
  - Day 60 overall: Level 4 escalation (Ministry)
- Output: Escalation level, days overdue
execute_escalation (Lambda: pmv-escalation-executor)
- Input: Application ID, escalation level, official contact details
- Processing:
  - Level 1: IVR call to Gram Pradhan (Amazon Connect), SMS to District Officer
  - Level 2: Email to District Collector (Amazon SES), SMS to State Nodal Officer
  - Level 3: IVR call to bank branch manager, email with Shadow Credit Memo
  - Level 4: Email to Ministry of MSME grievance cell
- IVR script: Pre-recorded message in Hindi/local language explaining delay
- Email template: Formal escalation letter with application details
- Output: Escalation confirmation, notification IDs
notify_applicant (Lambda: pmv-applicant-notifier)
- Input: Application ID, escalation level
- Processing: SMS via Amazon Pinpoint
- Message templates:
  - Level 1: "Your application is being escalated to District Officer due to 15-day delay"
  - Level 2: "District Officer has been notified. You should receive update within 48 hours"
  - Approval: "Your application has been approved! Bank interview scheduled for [date]"
- Output: SMS delivery status

Knowledge Base: KB2 (PM Vishwakarma Policies)

Guardrails: Maximum 1 escalation per level per application (prevent spam)
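
The SLA thresholds in calculate_sla_breach and the one-escalation-per-level guardrail can be sketched as follows. The day limits and levels come from the action group above. The sketch assumes that "Gram Panchayat" and "District Committee" correspond to the "Panchayat Verification" and "District Approval" status stages, that stage limits count days spent in the current stage, and that the 60-day overall limit takes precedence. The function names are hypothetical.

```python
# Stage -> (days allowed at that stage, escalation level).
STAGE_SLA = {
    "Panchayat Verification": (15, 1),
    "District Approval": (30, 2),
    "Bank Review": (45, 3),
}
OVERALL_SLA_DAYS, OVERALL_LEVEL = 60, 4


def escalation_level(stage, days_at_stage, days_overall):
    """Return (level, days_overdue), or (0, 0) when no SLA is breached."""
    if days_overall >= OVERALL_SLA_DAYS:
        return OVERALL_LEVEL, days_overall - OVERALL_SLA_DAYS
    limit, level = STAGE_SLA.get(stage, (None, None))
    if limit is not None and days_at_stage >= limit:
        return level, days_at_stage - limit
    return 0, 0


def should_escalate(already_sent, application_id, level):
    """Guardrail: at most one escalation per level per application."""
    key = (application_id, level)
    if level == 0 or key in already_sent:
        return False
    already_sent.add(key)
    return True
```

In practice the `already_sent` set would be persisted, for example as an attribute on the application record, so that the daily poll does not re-send an escalation.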

3.3 PMFBY Agent Swarm

3.3.1 First Responder Agent

Purpose: Proactively detect weather events and initiate crop insurance claims

Foundation Model: Claude 3.5 Sonnet

Action Groups:

monitor_weather_events (Lambda: pmfby-weather-monitor)
- Trigger: EventBridge scheduled rule (every 15 minutes)
- Input: None (polls external APIs)
- APIs integrated:
  - India Meteorological Department (IMD) API: Real-time weather alerts
  - Skymet Weather API: District-level granular data
  - NASA POWER API: Satellite data (rainfall, temperature)
- Event detection: Hailstorm warnings, unseasonal rainfall >50mm, cyclone alerts, flood warnings, drought conditions
- Storage: DynamoDB table weather_events with timestamp, location, event type
- Output: Array of detected weather events
identify_affected_farmers (Lambda: pmfby-farmer-identifier)
- Input: Weather event (location, event type)
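- Processing: Geospatial query over farmer locations stored in DynamoDB (for example via a geohash-based key, since DynamoDB has no native geospatial index)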
- Query: Find farmers within 5km radius of event location
- Filters: Active PMFBY policy holders, crop type matches risk (wheat for hailstorm, rice for flood)
- Amazon Location Service: Radius-based geospatial search
- Typical result: 200-500 farmers per localized event
- Output: Array of affected farmer IDs with contact details
initiate_proactive_outreach (Lambda: pmfby-outreach-initiator)
- Input: Array of affected farmer IDs
- Processing: Amazon Connect for automated IVR calls
- Call timing: Within 30 minutes of event detection
- Call script (local language): "Namaste, I'm PMFBY assistant. There was a hailstorm in your area this morning. Did your crop suffer damage?"
- Voice response: Amazon Transcribe captures yes/no
- Branching: If "yes" → proceed to auto-intimation, If "no" → end call with "Stay safe"
- Output: Call results (answered, damage confirmed, no damage, no answer)
file_auto_intimation (Lambda: pmfby-intimation-filer)
- Input: Farmer ID, event type, damage confirmation
- Processing: API call to insurance company portal
- Payload: Farmer ID, event type, event date, crop type, area affected, GPS coordinates
- Insurance companies: 14 companies (ICICI Lombard, HDFC Ergo, Agriculture Insurance Company, etc.)
- Adapter pattern: Different Lambda functions for each company's API format
- Timestamp: Ensures submission within 72-hour window
- Output: Claim reference number, intimation confirmation
send_follow_up_instructions (Lambda: pmfby-followup-sender)
- Input: Claim reference number, farmer contact
- Processing: WhatsApp message via Business API
- Message content:
  - "Your claim intimation has been filed. Claim ID: CLM-2026-789456"
  - "Within 48 hours, submit photos of damaged crop. Follow these tips..."
  - Link to Surveyor Assistant Agent
- Output: Message delivery status

Knowledge Base: KB3 (PMFBY Insurance - 14 company policies)

Guardrails: Only contact farmers with active policies, respect 72-hour intimation window
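
The 5 km radius filter in identify_affected_farmers and the 72-hour intimation window can be sketched as follows. This is an in-memory illustration only: it swaps the DynamoDB and Amazon Location Service queries for an equirectangular distance approximation, which is adequate over a few kilometres. The record fields (`farmer_id`, `lat`, `lon`, `policy_active`) and function names are hypothetical.

```python
import math
from datetime import datetime, timedelta, timezone

EARTH_RADIUS_KM = 6371.0
RADIUS_KM = 5.0                          # search radius from the action group
INTIMATION_WINDOW = timedelta(hours=72)  # PMFBY intimation deadline


def approx_distance_km(lat1, lon1, lat2, lon2):
    """Equirectangular approximation; accurate enough for short distances."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    x = math.radians(lon2 - lon1) * math.cos(mean_lat)
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.hypot(x, y)


def affected_farmers(event, farmers):
    """Keep farmers with an active policy located within RADIUS_KM of the event."""
    return [
        f["farmer_id"]
        for f in farmers
        if f["policy_active"]
        and approx_distance_km(event["lat"], event["lon"], f["lat"], f["lon"]) <= RADIUS_KM
    ]


def within_intimation_window(event_time, now):
    """True while a claim intimation can still be filed for the event."""
    return timedelta(0) <= now - event_time <= INTIMATION_WINDOW
```

The crop-type filter (wheat for hailstorm, rice for flood) is omitted here and would be one more condition in the list comprehension.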

3.3.2 Surveyor Assistant Agent

Purpose: Guide farmers in submitting high-quality crop damage photos

Foundation Model: Claude 3.5 Sonnet

Action Groups:

accept_photo_submission (Lambda: pmfby-photo-receiver)
- Input: WhatsApp message with photo attachment, claim ID
- Processing: Download photo from WhatsApp Business API
- Validation: File size check, format check (JPEG, PNG, HEIC)
- HEIC conversion: ImageMagick for iPhone photos
- Storage: S3 bucket pmfby-claim-photos with folder structure {claim_id}/{photo_number}.jpg
- Limit: Maximum 10 photos per claim (insurance requirement)
- Output: S3 object key, photo number
validate_photo_quality (Lambda: pmfby-photo-validator)
- Input: S3 object key
- Processing: Amazon Rekognition DetectModerationLabels + custom quality checks
- Quality checks:
  - Resolution: Minimum 1920x1080 (read from the image header, e.g. with Pillow; Rekognition DetectLabels does not return image dimensions)
  - Brightness: 30-90% optimal (calculated from pixel histogram)
  - Blur detection: Rekognition sharpness score (reject if <85%)
  - GPS metadata: Extract from EXIF (required for location validation)
- Content validation:
  - Crop type detection: Rekognition labels (wheat, rice, cotton, etc.)
  - Damage indicators: Custom labels (hail_marks, waterlogging, pest_damage)
  - Reference objects: Detect hand, stick for scale
  - Outdoor setting: Reject indoor photos (check for sky, field labels)
- Output: Validation result (accepted/rejected), rejection reason
provide_interactive_feedback (Lambda: pmfby-feedback-sender)
- Input: Validation result, photo number
- Processing: WhatsApp message generation
- Feedback messages:
  - Accepted: "✅ Photo 1: Accepted"
  - Rejected (blur): "❌ Photo 2: Too blurry. Hold phone steady and retake"
  - Rejected (crop not visible): "❌ Photo 3: Crop not visible. Move closer and retake"
- Guidance messages:
  - "Stand in the middle of damaged area"
  - "Include reference for scale (your hand or a stick)"
  - "Ensure good lighting, avoid shadows"
- Output: Message delivery status

enrich_photo_metadata (Lambda: pmfby-metadata-enricher)
- Input: S3 object key, validation result
- Processing: Create JSON metadata file
- Metadata schema:
{
  "photo_id": "CLM-2026-789456-001",
  "claim_id": "CLM-2026-789456",
  "gps_coordinates": {"latitude": 28.7041, "longitude": 77.1025},
  "timestamp": "2026-01-24T10:30:00Z",
  "quality_score": 92,
  "ai_damage_estimate": 45,
  "rekognition_labels": ["wheat", "hail_damage", "field", "outdoor"],
  "confidence_scores": {"wheat": 98.5, "hail_damage": 87.3}
}
- Storage: S3 alongside photo with .json extension
- Output: Metadata object key

submit_to_insurance_portal (Lambda: pmfby-insurance-uploader)
- Input: Claim ID, array of accepted photo S3 keys
- Processing: Multi-company API integration
- Insurance company adapters: 14 Lambda functions (one per company)
- Upload format: Multipart form data with photos + metadata JSON
- Submission receipt: Generated with photo count, timestamp, claim ID
- Farmer notification: SMS "✅ All photos accepted and submitted. Surveyor will visit within 3 days"
- Output: Submission confirmation, portal reference number

Knowledge Base: KB3 (PMFBY Insurance)

Guardrails: Maximum 10 photos per claim, minimum 3 photos required for submission
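
The quality checks in validate_photo_quality and the 3-to-10 photo guardrail can be sketched as a pure function over already-measured properties. The thresholds come from the action group above. The sketch assumes that brightness and sharpness arrive on a 0-100 scale, that portrait photos are acceptable (so the longer side is compared against 1920 and the shorter against 1080), and that checks stop at the first failure. The function names and rejection codes are hypothetical.

```python
MIN_LONG_SIDE, MIN_SHORT_SIDE = 1920, 1080
BRIGHTNESS_RANGE = (30.0, 90.0)  # optimal brightness band, in percent
MIN_SHARPNESS = 85.0             # reject below this sharpness score
MIN_PHOTOS, MAX_PHOTOS = 3, 10   # guardrail on photos per claim


def validate_photo(width, height, brightness, sharpness, has_gps):
    """Return (accepted, rejection_reason); the first failed check wins."""
    # Orientation-agnostic resolution check (assumption: portrait is allowed).
    if max(width, height) < MIN_LONG_SIDE or min(width, height) < MIN_SHORT_SIDE:
        return False, "resolution_too_low"
    if not BRIGHTNESS_RANGE[0] <= brightness <= BRIGHTNESS_RANGE[1]:
        return False, "poor_lighting"
    if sharpness < MIN_SHARPNESS:
        return False, "too_blurry"
    if not has_gps:
        return False, "missing_gps"
    return True, None


def can_submit(accepted_count):
    """A claim needs between MIN_PHOTOS and MAX_PHOTOS accepted photos."""
    return MIN_PHOTOS <= accepted_count <= MAX_PHOTOS
```

The rejection reason maps directly onto the feedback messages sent by provide_interactive_feedback, for example "too_blurry" to the "Hold phone steady and retake" prompt.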

3.3.3 Claim Tracker Agent

Purpose: Monitor insurance claim status across 14 companies and provide transparency

Foundation Model: Claude 3.5 Sonnet

Action Groups:

poll_insurance_apis (Lambda: pmfby-claim-poller)
- Trigger: EventBridge scheduled rule (every 6 hours, adaptive to every 2 hours for critical stages)
- Input: List of active claims from DynamoDB
- Processing: API calls to 14 insurance company portals
- API formats: REST (10 companies), SOAP (4 companies)
- Authentication: Rotating API keys stored in AWS Secrets Manager
- Fallback: Web scraping using Selenium on Lambda (if API unavailable)
- Status stages: Intimation Received → Surveyor Assigned → Assessment Complete → Claim Approved → Payment Released
- Storage: DynamoDB update with new status, timestamp
- Output: Status changes (array of claims with updated status)
detect_status_changes (Lambda: pmfby-change-detector)
- Input: Current status, previous status (from DynamoDB)
- Processing: Compare status fields, detect transitions
- Change detection: Status field change, surveyor assignment, payment release
- Output: Array of status change events
notify_status_change (Lambda: pmfby-status-notifier)
- Input: Status change event, farmer contact
- Processing: SMS via Amazon Pinpoint
- Message templates:
  - Surveyor assigned: "Surveyor Mr. Rajesh Sharma will visit your farm on 24th Jan. Mobile: +91-XXXX"
  - Assessment complete: "Survey completed. Claim under review by insurance company"
  - Claim approved: "Claim approved! Payment of ₹45,000 will be released within 7-10 days"
  - Payment released: "Payment of ₹45,000 credited to your account ending XXXX"
- Output: SMS delivery status
explain_delays (Lambda: pmfby-delay-explainer)
- Input: Claim ID, current status, days pending
- Processing: Bedrock Agent generates plain-language explanation
- Explanation logic:
  - If status = "Claim Approved" and days >10: "Payment pending because state government subsidy (40% share) not yet released. This is a normal government process, no bribe needed."
  - If status = "Surveyor Assigned" and days >7: "Surveyor visits are delayed due to high claim volume. Your turn is coming soon."
- Transparency: Differentiates legitimate delays vs stuck claims
- Expected timeline: Provides realistic timeframe based on historical data
- Output: Explanation text
file_grievance (Lambda: pmfby-grievance-filer)
- Input: Claim ID, delay reason
- Trigger: Automatic if claim stuck >90 days
- Processing: API call to insurance ombudsman portal
- Grievance payload: Claim details, delay duration, insurance company name
- Ministry notification: Email to Ministry of Agriculture Insurance Cell
- Farmer notification: SMS "We have escalated your delayed claim to the ombudsman. Reference: GRV-2026-XXXX"
- Output: Grievance reference number

Knowledge Base: KB3 (PMFBY Insurance)

Guardrails: Maximum 1 grievance per claim, 90-day minimum before escalation
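
The comparison in detect_status_changes and the grievance guardrail can be sketched as follows. The snapshot format (a mapping from claim ID to status) and the function names are assumptions for illustration; the 90-day limit and the one-grievance rule come from the text above.

```python
GRIEVANCE_AFTER_DAYS = 90  # a claim stuck longer than this triggers a grievance


def detect_status_changes(previous, current):
    """Compare {claim_id: status} snapshots and return one event per changed claim."""
    events = []
    for claim_id, new_status in current.items():
        old_status = previous.get(claim_id)  # None for claims seen for the first time
        if old_status != new_status:
            events.append({"claim_id": claim_id, "from": old_status, "to": new_status})
    return events


def needs_grievance(status, days_pending, grievance_filed):
    """File at most one grievance, and only for unpaid claims stuck more than 90 days."""
    return (
        status != "Payment Released"
        and days_pending > GRIEVANCE_AFTER_DAYS
        and not grievance_filed
    )
```

Each event carries both the old and new status, which is enough for notify_status_change to pick the matching message template.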

3.4 BOCW Agent Swarm

3.4.1 Digital Work-Log Agent

Purpose: GPS and selfie-based attendance tracking for construction workers

Foundation Model: Claude 3.5 Sonnet

Action Groups:

process_checkin (Lambda: bocw-checkin-processor)
- Input: Worker ID, SMS/WhatsApp message "शुरू" (start), selfie, GPS coordinates
- Processing:
  - GPS validation: Amazon Location Service geofencing API
  - Geofence check: Validate GPS within 100m of known construction site
  - Timestamp: Record check-in time in DynamoDB
- Construction site database: DynamoDB table construction_sites with GPS coordinates
- Unregistered sites: If GPS doesn't match, request site photo for validation
- Output: Check-in confirmation, site name
verify_selfie_biometric (Lambda: bocw-selfie-verifier)
- Input: Selfie S3 key, worker Aadhaar photo S3 key
- Processing: Amazon Rekognition CompareFaces API
- Similarity threshold: >95% required for approval
- Liveness detection: Requires a dedicated check such as Amazon Rekognition Face Liveness; DetectFaces quality attributes (brightness, sharpness) alone cannot distinguish a live face from a printed photo
- Fraud prevention: Prevents buddy punching (each worker must submit own selfie)
- Output: Verification result (approved/rejected), similarity score
validate_construction_site (Lambda: bocw-site-validator)
- Input: Site photo S3 key, GPS coordinates
- Processing: Amazon Rekognition DetectLabels API
- Construction indicators: Crane, scaffold, concrete, rebar, cement bags, construction equipment
- Confidence threshold: >80% for at least 3 construction indicators
- Site registration: If validated, add to construction_sites table as "Unregistered Construction Site"
- Output: Validation result, detected labels
process_checkout (Lambda: bocw-checkout-processor)
- Input: Worker ID, SMS/WhatsApp message "समाप्त" (end), selfie, GPS coordinates
- Processing:
  - GPS validation: Same as check-in
  - Selfie verification: Same as check-in
  - Hours calculation: Checkout time - check-in time
  - Minimum hours: 4 hours required to count as valid day
- Storage: DynamoDB update with checkout time, hours worked
- Output: Checkout confirmation, hours worked
generate_90day_certificate (Lambda: bocw-certificate-generator)
- Trigger: Automatic when worker reaches 90 valid work days
- Input: Worker ID
- Processing: Query DynamoDB for all work logs
- Certificate contents:
  - Header: "90-Day Employment Certificate - e-Shram"
  - Worker details: Name, Aadhaar number, e-Shram ID
  - Work summary: 90 unique work dates, total hours worked
  - Site details: Site names and GPS coordinates
  - Verification: Selfie verification hashes (SHA-256)
  - Digital signature: AWS KMS signature
- Submission: API call to e-Shram portal and state BOCW board
- Replaces: Contractor employment certificate requirement
- Output: Certificate PDF URL, submission confirmation

Knowledge Base: KB4 (BOCW State Rules - 28 state boards)

Guardrails: Minimum 4 hours per day, maximum 12 hours per day, 95% selfie similarity required
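
The hours calculation in process_checkout and the 90-day trigger for generate_90day_certificate can be sketched as follows. The 4-hour minimum, 12-hour maximum and 90-day count come from the text above. The sketch assumes that multiple sessions on one date are summed, that hours beyond 12 are capped instead of invalidating the day, and that a day is attributed to the check-in date. The function names are hypothetical.

```python
from datetime import datetime, timedelta

MIN_HOURS, MAX_HOURS = 4.0, 12.0
CERTIFICATE_DAYS = 90


def hours_worked(check_in, check_out):
    """Checkout time minus check-in time, in hours."""
    return (check_out - check_in).total_seconds() / 3600


def valid_work_dates(logs):
    """Return the dates whose (capped) total hours reach MIN_HOURS."""
    totals = {}
    for check_in, check_out in logs:
        hours = hours_worked(check_in, check_out)
        if hours <= 0:
            continue  # ignore malformed logs where checkout precedes check-in
        day = check_in.date()
        totals[day] = min(MAX_HOURS, totals.get(day, 0.0) + hours)
    return {day for day, hours in totals.items() if hours >= MIN_HOURS}


def certificate_due(logs):
    """True once the worker has 90 unique valid work dates."""
    return len(valid_work_dates(logs)) >= CERTIFICATE_DAYS


# Example: an 8-hour day counts, a 3-hour day does not.
day1 = datetime(2026, 1, 5, 9, 0)
example_logs = [
    (day1, day1 + timedelta(hours=8)),
    (day1 + timedelta(days=1), day1 + timedelta(days=1, hours=3)),
]
```

Counting unique dates, not log entries, matches the certificate's "90 unique work dates" summary and prevents repeated check-ins on one day from inflating the total.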

3.4.2 Interstate Bridge Agent

Purpose: Enable benefit portability for migrant construction workers

Foundation Model: Claude 3.5 Sonnet

Action Groups:

detect_migration (Lambda: bocw-migration-detector)
- Trigger: EventBridge scheduled rule (daily check)
- Input: Worker ID, GPS history from DynamoDB
- Processing: Calculate distance between the worker's registered home location and current GPS position
- Migration criteria: Position >500km from home, maintained for at least 7 consecutive days
- Storage: DynamoDB flag migration_detected = true
- Output: Array of workers with detected migration
initiate_eshram_update (Lambda: bocw-eshram-updater)
- Input: Worker ID, new location (state, district)
- Processing: WhatsApp message "You've moved to Maharashtra. Should I update your current location in e-Shram?"
- User confirmation: Wait for yes/no response
- API call: e-Shram API to update current state, current district, expected duration
- Portability tracking: Enables national-level tracking
- Output: Update confirmation
discover_portable_benefits (Lambda: bocw-benefits-discoverer)
- Input: Worker ID, home state, host state
- Processing: Query knowledge base for portable schemes
- Portable schemes:
  - One Nation One Ration Card (ONORC)
  - Ayushman Bharat PMJAY (₹5 lakh health insurance)
  - PM Shram Yogi Maandhan (pension)
  - Maternity benefits
- Service point mapping: Amazon Location Service to find nearest Fair Price Shop, Ayushman hospital, labor department office
- Output: Array of available benefits with service point locations
guide_host_state_registration (Lambda: bocw-registration-guide)
- Input: Worker ID, host state
- Processing: Bedrock Agent generates step-by-step guidance
- Guidance content:
  - "To access Maharashtra BOCW benefits, register at [address]"
  - "Required documents: Aadhaar, e-Shram card, work log certificate"
  - "Dual registration strategy: Maintain home state (for children's education) + host state (for local healthcare)"
- Knowledge base: KB4 contains registration procedures for all 28 states
- Output: Guidance text, registration office address
map_local_resources (Lambda: bocw-resource-mapper)
- Input: Worker GPS coordinates
- Processing: Amazon Location Service PlaceIndex search
- Resources mapped:
  - Nearest government hospital
  - Fair Price Shop with ration availability
  - Labor welfare office
  - Nearest police station (for safety)
  - Low-cost accommodation (PG/hostels near construction sites)
- Output: WhatsApp message with Google Maps links to all resources

Knowledge Base: KB4 (BOCW State Rules)

Guardrails: Minimum 7 consecutive days in new location before migration detection
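
The migration criteria in detect_migration can be sketched as follows, using the haversine great-circle distance. The 500 km and 7-day thresholds come from the text above. The sketch assumes one GPS fix per day, ordered oldest first, and treats "consecutive" as the most recent unbroken run of days away from home. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0
MIGRATION_DISTANCE_KM = 500.0
MIGRATION_MIN_DAYS = 7


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def migration_detected(home, daily_positions):
    """True if the latest run of days spent >500 km from home lasts at least 7 days."""
    streak = 0
    for lat, lon in reversed(daily_positions):
        if haversine_km(home[0], home[1], lat, lon) > MIGRATION_DISTANCE_KM:
            streak += 1
        else:
            break  # the run of days away from home ends here
    return streak >= MIGRATION_MIN_DAYS
```

For example, a worker registered in Delhi who has logged seven daily positions in Mumbai (roughly 1,150 km away) is flagged, while six days or a return home on the latest day is not.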

4. Communication Channels

4.1 Voice Interface (IVR)

Technology: Amazon Connect

Configuration:

Toll-free number: 1800-XXX-XXXX (accessible from any phone, no internet required)
Contact flows: 3 main menus (PM Vishwakarma, PMFBY, BOCW)
Language selection: 22 Indian languages targeted; availability depends on Amazon Transcribe and Polly support for each language
Call recording: Stored in S3, encrypted with KMS
Average call duration: 3-5 minutes (simple queries), 10-15 minutes (application submission)

Contact Flow Design:

Welcome message: "Welcome to BharatSeva AI. All services are completely free."
Language selection: "Press 1 for Hindi, Press 2 for English, Press 3 for Tamil..."
Main menu: "Press 1 for PM Vishwakarma, Press 2 for Crop Insurance, Press 3 for Construction Worker assistance"
Voice input: Amazon Transcribe streaming transcription
Agent routing: Master Orchestrator processes query
Response: Amazon Polly text-to-speech in selected language
Closing message: "Remember, all government services are free. No one should ask for money."

Transcribe Configuration:

Streaming API: Real-time transcription with <1 second latency
Custom vocabulary: 5000+ terms (scheme names, regional crop names, construction terminology)
Accuracy: >90% for clear speech, >75% for noisy environments
Language auto-detection: Enabled for seamless multilingual support

Polly Configuration:

Voices: Kajal (neural engine; Hindi and Indian English, female); Aditi (Hindi, female) and Raveena (Indian English, female) are standard-engine voices
SSML support: Custom pronunciation for scheme names, emphasis on important numbers
Speech marks: Word-level timing for synchronized UI animations

4.2 WhatsApp Business Integration

Technology: WhatsApp Business API

Configuration:

Official WhatsApp Business number: +91-XXXX-XXXXXX
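Encryption: Messages are encrypted in transit between the user and the WhatsApp Business API endpoint; BharatSeva, as the receiving business, processes message content in order to route queries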
Message templates: Pre-approved by WhatsApp for high volume

Supported Interactions:

Text messages: Natural language queries
Photo/video uploads: Trade verification, crop damage photos
Document sharing: PDF certificates, lending memos
Interactive buttons: "Check Status", "Submit Photo", "File Claim"
Location sharing: GPS verification for work logs

Message Flow:

User sends message to WhatsApp number
Webhook: WhatsApp API sends POST request to API Gateway
Lambda: whatsapp-message-handler processes message
Master Orchestrator: Routes to appropriate agent
Response: Lambda sends reply via WhatsApp API
Delivery: WhatsApp delivers message to user
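
The webhook step of this flow can be sketched as a Lambda handler behind an API Gateway proxy integration. The sketch assumes the payload shape used by Meta's WhatsApp Cloud API webhooks (entry, then changes, then value, then messages); other Business API providers may wrap messages differently. The hand-off to the Master Orchestrator is left as a comment, and the helper name `extract_messages` is hypothetical.

```python
import json


def extract_messages(webhook_body):
    """Flatten the nested webhook payload into (sender, type, text) tuples."""
    out = []
    for entry in webhook_body.get("entry", []):
        for change in entry.get("changes", []):
            for msg in change.get("value", {}).get("messages", []):
                is_text = msg.get("type") == "text"
                text = msg.get("text", {}).get("body") if is_text else None
                out.append((msg.get("from"), msg.get("type"), text))
    return out


def lambda_handler(event, context):
    """Parse the POST body, collect messages, and acknowledge with HTTP 200."""
    body = json.loads(event.get("body") or "{}")
    messages = extract_messages(body)
    # Hypothetical hand-off: a real handler would pass each message to the
    # Master Orchestrator here, ideally asynchronously so the webhook returns quickly.
    return {"statusCode": 200, "body": json.dumps({"received": len(messages)})}
```

Non-text messages (photos, videos, locations) come back with `text` set to None, leaving media download to the relevant action group such as collect_work_media.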

4.3 SMS Fallback

Technology: Amazon Pinpoint

Configuration:

SMS delivery: Across all operators (Jio, Airtel, BSNL, Vi)
Delivery rate: 98%+ across India
Character limit: 160 characters per segment for GSM-7 (English) text; 70 characters per segment for Unicode text such as Hindi

Use Cases:

Internet unavailable (rural areas, during disasters)
WhatsApp not installed
User prefers SMS
Critical notifications (claim status, escalations)

Message Templates:

Status update: "Claim CLM-2026-789456 approved. Payment in 7-10 days"
Escalation: "Application escalated to District Officer. Update in 48 hours"
Reminder: "Submit crop damage photos within 24 hours. Claim ID: CLM-2026-789456"
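
Because SMS length limits depend on the encoding, templates are worth checking before they are sent. The sketch below applies the standard rules: GSM-7 text fits 160 characters in a single segment (153 per segment when concatenated), while any character outside the GSM-7 alphabet, including Devanagari and emoji, forces UCS-2 with 70 characters per segment (67 when concatenated). The function name is hypothetical.

```python
import math

# GSM 03.38 basic character set (one septet each) and extension set (two septets each).
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXTENDED = "^{}\\[~]|€"


def sms_segments(text):
    """Return (encoding, segment_count) for a message body."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        single, multi, encoding = 160, 153, "GSM-7"
    else:
        units = len(text.encode("utf-16-be")) // 2  # count UTF-16 code units
        single, multi, encoding = 70, 67, "UCS-2"
    if units <= single:
        return encoding, 1
    return encoding, math.ceil(units / multi)
```

The English status template above fits in one GSM-7 segment, whereas a Hindi message of the same length in characters may need two or more segments, which affects both cost and delivery.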

4.4 Web Interface

Technology: Browser-based chat widget

Configuration:

Hosting: AWS Amplify for static site hosting
Chat widget: Embedded iframe with WebSocket connection
Authentication: JWT tokens (24-hour expiry)

Features:

Real-time chat: WebSocket connection to API Gateway
File upload: Drag-and-drop for photos/videos
Application status dashboard: View all pending applications
Certificate download: Access to generated certificates

Integration Points:

Government portal embeds: PM Vishwakarma portal, PMFBY portal, e-Shram portal
NGO partner websites: Embedded chat widget
Direct access: Standalone web application

5. Machine Learning Models

5.1 Trade Verification CNN

Purpose: Identify artisan trades from video footage

Architecture: Custom CNN with temporal convolution layers

Training:

Training data: 50,000 labeled videos
- Carpentry: 15,000 videos
- Pottery: 12,000 videos
- Blacksmithing: 8,000 videos
- Weaving: 6,000 videos
- Other trades: 9,000 videos
Training infrastructure: SageMaker Training Jobs on ml.p3.2xlarge instances (GPU)
Framework: TensorFlow 2.x
Training duration: ~48 hours
Validation accuracy: >85% for 18 traditional trades

Model Architecture:

Input: Video frames (224x224x3, 5 FPS sampling)
↓
Conv3D Layer 1: 32 filters, 3x3x3 kernel, ReLU activation
MaxPooling3D: 2x2x2
↓
Conv3D Layer 2: 64 filters, 3x3x3 kernel, ReLU activation
MaxPooling3D: 2x2x2
↓
Conv3D Layer 3: 128 filters, 3x3x3 kernel, ReLU activation
GlobalAveragePooling3D
↓
Dense Layer 1: 256 units, ReLU activation, Dropout 0.5
Dense Layer 2: 128 units, ReLU activation, Dropout 0.3
↓
Output: 18 units (trade categories), Softmax activation

Deployment:

SageMaker Endpoint: trade-verification-cnn-endpoint
Instance type: ml.m5.xlarge (real-time inference)
Auto-scaling: 1-5 instances based on invocation rate
Latency: <2 seconds per video (30 frames)

Monitoring:

SageMaker Model Monitor: Drift detection on input data distribution
Retraining trigger: Accuracy drops below 80% on validation set
Model Registry: All versions tracked for rollback capability

5.2 Crop Damage Assessment Model

Purpose: Estimate crop damage percentage from photos

Architecture: ResNet-50 fine-tuned on agricultural imagery

Training:

Training data: 100,000 insurance claim photos with surveyor damage % labels
Data augmentation: Rotation, brightness adjustment, crop, flip
Training infrastructure: SageMaker Training Jobs on ml.p3.2xlarge
Framework: PyTorch
Training duration: ~24 hours
Validation MAE: <8% (mean absolute error on damage percentage)

Model Architecture:

Input: Photo (224x224x3)
↓
ResNet-50 Backbone (pre-trained on ImageNet)
↓
Global Average Pooling
↓
Dense Layer 1: 512 units, ReLU activation, Dropout 0.4
Dense Layer 2: 256 units, ReLU activation, Dropout 0.3
↓
Output: 1 unit (damage percentage 0-100), Linear activation

Deployment:

SageMaker Serverless Inference: Cost-optimized for variable load
Memory: 4096 MB
Max concurrency: 20
Cold start: <10 seconds
Warm inference: <500ms

Output:

Damage percentage: 0-100 (integer)
Error band: ±10 percentage points (a margin derived from validation MAE, not a statistical confidence interval)
Used as reference for surveyors, not final decision

5.3 Deepfake Detection Model

Purpose: Detect manipulated selfies for work log fraud prevention

Architecture: EfficientNet-B0 binary classifier

Training:

Training data:
- Authentic selfies: 50,000 (from e-Shram database)
- Synthetic/manipulated images: 50,000 (generated using StyleGAN, FaceSwap)
Training infrastructure: SageMaker Training Jobs on ml.p3.2xlarge
Framework: TensorFlow 2.x
Training duration: ~12 hours
Validation accuracy: >92%

Model Architecture:

Input: Selfie (224x224x3)
↓
EfficientNet-B0 Backbone (pre-trained on ImageNet)
↓
Global Average Pooling
↓
Dense Layer: 128 units, ReLU activation, Dropout 0.5
↓
Output: 1 unit (deepfake probability), Sigmoid activation

Deployment:

SageMaker Endpoint: deepfake-detection-endpoint
Instance type: ml.t2.medium (low-cost for simple inference)
Latency: <300ms per image

Decision Logic:

Deepfake probability >0.9: Automatic rejection
Deepfake probability >0.7 and ≤0.9: Flag for manual review
Deepfake probability ≤0.7: Approved
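
The decision logic can be expressed as a small function. This sketch checks the stricter threshold first so that a score above 0.9 is rejected rather than merely flagged; treating a score of exactly 0.7 as approved is an assumption, since the rules above do not cover that boundary. The function and label names are illustrative.

```python
REVIEW_THRESHOLD = 0.7
REJECT_THRESHOLD = 0.9


def selfie_decision(deepfake_probability: float) -> str:
    """Map the model's sigmoid output to a work-log decision."""
    if not 0.0 <= deepfake_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # Check the stricter threshold first: any score above 0.9 is also above 0.7.
    if deepfake_probability > REJECT_THRESHOLD:
        return "rejected"
    if deepfake_probability > REVIEW_THRESHOLD:
        return "manual_review"
    return "approved"
```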

6. Data Storage and Management

6.1 DynamoDB Tables

Table 1: conversation_state

Purpose: Store conversation context for Master Orchestrator
Partition key: session_id (String)
Attributes:
- user_id (String): Aadhaar hash or phone number
- current_domain (String): PM_Vishwakarma | PMFBY | BOCW
- conversation_history (List): Last 10 turns [{role, message, timestamp}]
- pending_actions (List): Actions awaiting user input
- last_updated (Number): Unix timestamp
- ttl (Number): 24-hour expiry for automatic cleanup
GSI: user_id-index for querying all sessions by user
Capacity: On-demand (auto-scaling)

Table 2: applications

Purpose: Track all scheme applications across domains
Partition key: application_id (String)
Sort key: scheme_domain (String)
Attributes:
- user_id (String): Aadhaar hash
- status (String): Submitted | Pending | Approved | Rejected
- status_history (List): [{status, timestamp, updated_by}]
- application_data (Map): Scheme-specific data
- created_at (Number): Unix timestamp
- updated_at (Number): Unix timestamp
GSI: user_id-index for querying all applications by user
GSI: status-updated_at-index for querying pending applications
Capacity: On-demand

Table 3: work_logs

Purpose: Store construction worker attendance records
Partition key: worker_id (String)
Sort key: date (String): YYYY-MM-DD format
Attributes:
- checkin_time (Number): Unix timestamp
- checkout_time (Number): Unix timestamp
- checkin_gps (Map): {latitude, longitude}
- checkout_gps (Map): {latitude, longitude}
- site_id (String): Construction site identifier
- hours_worked (Number): Calculated hours
- selfie_s3_keys (List): [checkin_selfie, checkout_selfie]
- verification_status (String): Approved | Rejected
GSI: worker_id-verification_status-index for counting approved days
Capacity: On-demand

Table 4: weather_events

Purpose: Store detected weather events for PMFBY
Partition key: event_id (String)
Attributes:
- event_type (String): Hailstorm | Flood | Drought | Cyclone
- location (Map): {latitude, longitude, district, state}
- severity (String): Low | Medium | High
- detected_at (Number): Unix timestamp
- affected_farmers (List): Array of farmer IDs
- notifications_sent (Boolean)
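GSI: location-detected_at-index for location-based queries (GSI key attributes must be scalar, so the index would key on a top-level String such as district or a geohash copied out of the location Map)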
Capacity: On-demand

Table 5: construction_sites

Purpose: Store registered construction sites for BOCW
Partition key: site_id (String)
Attributes:
- site_name (String)
- gps_coordinates (Map): {latitude, longitude}
- site_type (String): Registered | Unregistered
- registration_date (Number): Unix timestamp
- geofence_radius (Number): Meters (default 100)
Geospatial index: For radius-based queries (requires custom implementation)
Capacity: Provisioned (low traffic)
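
Since radius-based queries need a custom implementation, one simple building block is a great-circle distance check against the site's geofence_radius. The sketch below uses the haversine formula with a mean Earth radius of 6,371 km; the function names and the item layout passed to `within_geofence` are assumptions based on the attributes listed above.

```python
import math

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_geofence(lat: float, lon: float, site: dict) -> bool:
    """True when a check-in falls inside the site's geofence radius."""
    centre = site["gps_coordinates"]
    radius = site.get("geofence_radius", 100)  # metres, default from the table
    return haversine_m(lat, lon, centre["latitude"], centre["longitude"]) <= radius
```

For a table with few sites a linear scan with this check may be enough; a geohash prefix would be one way to narrow candidates if the table grows.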

6.2 S3 Buckets

Bucket 1: bharatseva-audit-trail

Purpose: Immutable audit logs from DynamoDB Streams
Lifecycle policy: Transition to Glacier after 90 days, retain for 7 years
Encryption: SSE-KMS with government-approved key
Versioning: Enabled
Object lock: Enabled (compliance mode)

Bucket 2: bharatseva-user-media

Purpose: Store photos, videos, selfies
Folder structure: {user_id}/{media_type}/{timestamp}_{filename}
Lifecycle policy: Transition to Glacier after 90 days
Encryption: SSE-KMS
Access: Pre-signed URLs with 7-day expiry

Bucket 3: bharatseva-certificates

Purpose: Store generated certificates (trade verification, work logs)
Folder structure: {certificate_type}/{user_id}/{certificate_id}.pdf
Lifecycle policy: Permanent retention (no deletion)
Encryption: SSE-KMS
Access: Pre-signed URLs, reissued on request to cover a 30-day access window (a single SigV4 pre-signed URL is valid for at most 7 days)

Bucket 4: bharatseva-knowledge-base

Purpose: Store documents for Bedrock Knowledge Bases
Folder structure: {knowledge_base_id}/{document_category}/{filename}.pdf
Update mechanism: Weekly Lambda sync from government portals
Encryption: SSE-S3
Versioning: Enabled for document history

Bucket 5: bharatseva-ml-models

Purpose: Store trained ML models and artifacts
Folder structure: {model_name}/{version}/{artifacts}
Lifecycle policy: Retain latest 5 versions, delete older
Encryption: SSE-S3
Access: SageMaker execution role only

6.3 OpenSearch Serverless (Knowledge Bases)

Collection 1: kb-all-schemes

Purpose: Vector embeddings for 1200+ government schemes
Index: schemes-index
Dimensions: 1024 (Amazon Titan Text Embeddings V2 default; the 1536-dimension output belongs to the earlier Titan Embeddings G1 - Text model)
Documents: 10,000+ PDFs (chunked to 500-token segments)
Update frequency: Weekly
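
The 500-token chunking can be illustrated with a short sketch. It splits on whitespace as a stand-in for the embedding model's real tokenizer, which is an assumption; actual token counts will differ, and the source does not say whether chunks overlap, so none is applied here.

```python
def chunk_tokens(tokens: list[str], chunk_size: int = 500) -> list[list[str]]:
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def chunk_document(text: str, chunk_size: int = 500) -> list[str]:
    """Whitespace-tokenize a document and return its chunks as strings."""
    return [" ".join(chunk) for chunk in chunk_tokens(text.split(), chunk_size)]
```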

Collection 2: kb-pm-vishwakarma

Purpose: PM Vishwakarma policies and guidelines
Index: pmv-index
Documents: 500+ PDFs from MSME ministry
Update frequency: Weekly

Collection 3: kb-pmfby

Purpose: PMFBY insurance policies from 14 companies
Index: pmfby-index
Documents: 1000+ PDFs (policy documents, claim procedures)
Update frequency: Weekly

Collection 4: kb-bocw

Purpose: BOCW state rules for 28 states
Index: bocw-index
Documents: 800+ PDFs (state board rules, benefit schemes)
Update frequency: Weekly

7. External Integrations

7.1 Account Aggregator Framework

Purpose: Fetch UPI transaction history for Shadow Credit Score

Integration Type: REST API

Provider: NBFC-AA licensed entities (e.g., OneMoney, CAMS Finserv); Sahamati is the industry alliance for the Account Aggregator ecosystem rather than a licensed AA

API Endpoints:

Consent Request: POST /Consent
- Request: User Aadhaar, mobile, data range (12 months)
- Response: Consent ID, OTP sent to user
Consent Verification: POST /Consent/{consent_id}/verify
- Request: OTP
- Response: Consent token (valid 12 months)
Data Fetch: GET /FI/fetch
- Request: Consent token, FIP (Financial Information Provider) ID
- Response: Transaction data (JSON array)

Authentication: OAuth 2.0 with client credentials

Data Format: FI Data Schema v2.0 (RBI standard)

Compliance: RBI Account Aggregator (NBFC-AA) directions, consent-based access with purpose limitation enforced

7.2 Weather APIs

API 1: India Meteorological Department (IMD)

Endpoint: https://api.imd.gov.in/weather/alerts
Authentication: API key
Polling frequency: Every 15 minutes
Data: Real-time weather alerts (hailstorm, cyclone, flood)
Response format: JSON

API 2: Skymet Weather

Endpoint: https://api.skymetweather.com/v1/district-weather
Authentication: API key
Polling frequency: Every 15 minutes
Data: District-level granular data (rainfall, temperature)
Response format: JSON

API 3: NASA POWER

Endpoint: https://power.larc.nasa.gov/api/temporal/daily/point
Authentication: None (public API)
Polling frequency: Daily
Data: Satellite data (rainfall, temperature, solar radiation)
Response format: JSON

7.3 Government Portals

Portal 1: PM Vishwakarma Portal

Endpoint: https://pmvishwakarma.gov.in/api/v1/
Authentication: API key + digital signature
Operations:
- Submit application: POST /applications
- Check status: GET /applications/{application_id}/status
- Upload certificate: POST /applications/{application_id}/documents
Response format: JSON

Portal 2: PMFBY Insurance Companies (14 APIs)

Example: ICICI Lombard
- Endpoint: https://api.icicilombard.com/pmfby/v1/
- Authentication: API key
- Operations:
  - File intimation: POST /intimations
  - Upload photos: POST /claims/{claim_id}/photos
  - Check status: GET /claims/{claim_id}/status
Adapter pattern: Separate Lambda for each company's API format
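
The adapter pattern mentioned above might look like this in outline: a shared interface, one adapter per insurer, and a registry keyed by insurer name. The class names, the registry key, and the JSON field names (`farmerId`, `eventType`) are hypothetical; only the base URL and the /intimations path come from the example above.

```python
from abc import ABC, abstractmethod


class InsurerAdapter(ABC):
    """Common interface that each insurer-specific adapter implements."""

    @abstractmethod
    def build_intimation(self, claim: dict) -> dict:
        """Translate an internal claim record into an HTTP request description."""


class IciciLombardAdapter(InsurerAdapter):
    base_url = "https://api.icicilombard.com/pmfby/v1"

    def build_intimation(self, claim: dict) -> dict:
        return {
            "method": "POST",
            "url": f"{self.base_url}/intimations",
            # Field names below are placeholders, not the insurer's real schema.
            "json": {"farmerId": claim["farmer_id"], "eventType": claim["event_type"]},
        }


ADAPTERS: dict[str, InsurerAdapter] = {"icici_lombard": IciciLombardAdapter()}


def build_request(insurer: str, claim: dict) -> dict:
    """Look up the adapter for an insurer and build its intimation request."""
    return ADAPTERS[insurer].build_intimation(claim)
```

Keeping request construction separate from sending makes each adapter easy to unit test without network access.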

Portal 3: e-Shram Portal

Endpoint: https://eshram.gov.in/api/v1/
Authentication: API key + Aadhaar authentication
Operations:
- Update location: PUT /workers/{eshram_id}/location
- Submit certificate: POST /workers/{eshram_id}/certificates
- Query benefits: GET /workers/{eshram_id}/benefits
Response format: JSON

7.4 UIDAI Aadhaar Authentication

Purpose: Authenticate users for sensitive operations

Integration Type: UIDAI-compliant authentication

API Endpoints:

OTP Request: POST /aadhaar/otp/request
- Request: Aadhaar number
- Response: Transaction ID, OTP sent to registered mobile
OTP Verification: POST /aadhaar/otp/verify
- Request: Transaction ID, OTP
- Response: Authentication status, user details (name, DOB, gender)
Biometric Authentication: POST /aadhaar/biometric/auth
- Request: Aadhaar number, biometric data (fingerprint/iris)
- Response: Authentication status

Compliance:

Purpose limitation: Only for authentication, not stored as primary ID
Data localization: All data in AWS Mumbai (ap-south-1)
Audit logging: UIDAI-compliant format

8. Security and Compliance

8.1 Data Encryption

At Rest:

DynamoDB: Encryption at rest with AWS-managed keys
S3: SSE-KMS encryption with government-approved keys
Aadhaar numbers: Application-level encryption using AWS KMS before storage
Storage format: SHA-256 hash for primary key, encrypted value for display

In Transit:

TLS 1.3: All API calls, webhooks, database connections
Certificate pinning: For critical external APIs (Account Aggregator, UIDAI)

Key Management:

AWS KMS: Customer-managed keys (CMK) for sensitive data
Key rotation: Automatic annual rotation
Key access: IAM policies with least privilege

8.2 Fraud Prevention

Deepfake Detection:

SageMaker model: Analyzes selfies for manipulation
Detection methods: GAN artifacts, unnatural eye movements, facial inconsistencies
Action: >0.7 probability triggers manual review

GPS Spoofing Detection:

Validation: Location changes against realistic travel speeds (<100 km/h)
Cross-check: Cell tower data from mobile operator (if available)
Action: Suspicious patterns flagged for investigation
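
The travel-speed validation can be sketched as follows: compute the great-circle distance between two consecutive location fixes and divide by the elapsed time. The tuple layout (latitude, longitude, Unix timestamp) and the function names are assumptions; the 100 km/h ceiling is the figure given above.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def implied_speed_kmh(fix_a: tuple, fix_b: tuple) -> float:
    """Speed needed to travel between two (lat, lon, unix_ts) fixes."""
    distance = haversine_km(fix_a[0], fix_a[1], fix_b[0], fix_b[1])
    hours = (fix_b[2] - fix_a[2]) / 3600
    if hours <= 0:
        # Two different places at the same instant is physically impossible.
        return math.inf if distance > 0 else 0.0
    return distance / hours


def is_plausible(fix_a: tuple, fix_b: tuple, max_kmh: float = 100.0) -> bool:
    """True when the move between fixes stays under the speed ceiling."""
    return implied_speed_kmh(fix_a, fix_b) <= max_kmh
```

A failed check is a signal for investigation rather than proof of spoofing, since GPS fixes can jump on their own in poor reception.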

Duplicate Application Check:

DynamoDB query: Same user + scheme within 30 days
Deduplication: Prevent multiple applications for same benefit
Action: Reject duplicate, notify user

Video Reuse Detection:

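Perceptual hashing: Perceptual hash (e.g., average hash or difference hash) of sampled video frames; a cryptographic hash such as SHA-256 only matches byte-identical files and misses re-encoded copies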
Database: Store hashes in DynamoDB
Action: Block resubmission of same video by different users
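
As an illustration of frame-level matching that tolerates re-encoding, the sketch below implements an average hash: each pixel of a downsampled grayscale frame contributes one bit depending on whether it is brighter than the frame's mean, and two frames are compared by Hamming distance. It assumes frames have already been reduced to 8x8 grayscale; the distance cutoff of 5 bits is an illustrative value, not a tuned threshold.

```python
def average_hash(gray_8x8: list[list[int]]) -> int:
    """Return a 64-bit average hash for an 8x8 grayscale frame."""
    pixels = [p for row in gray_8x8 for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        # One bit per pixel: 1 if brighter than the frame's mean.
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(hash_a ^ hash_b).count("1")


def is_probable_reuse(hash_a: int, hash_b: int, max_distance: int = 5) -> bool:
    """Treat frames as the same content when their hashes are close."""
    return hamming_distance(hash_a, hash_b) <= max_distance
```

Because the hash depends on each pixel relative to the mean, a uniform brightness change leaves it unchanged, which an exact file hash would not.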

8.3 Audit and Compliance

CloudTrail:

Logging: Every AWS API call logged
Retention: 7 years (regulatory requirement)
Storage: S3 with object lock (compliance mode)

DynamoDB Streams:

Purpose: Capture all state changes
Processing: Lambda streams to S3 as immutable logs
Format: JSON with timestamp, user_id, action, before/after state
Retention: 7 years

Aadhaar Act Compliance:

Purpose limitation: Only for authentication, not stored as primary ID
User consent: Collected before each Aadhaar use
Data localization: All data in AWS Mumbai (ap-south-1)
Audit format: UIDAI-compliant logging

Data Retention Policy:

User profiles: 5 years
Applications: 7 years (audit requirement)
Audit logs: 7 years
Temporary data: Voice recordings deleted after 30 days

8.4 User Privacy Controls

Right to Erasure:

Lambda function: privacy-data-eraser
Actions: Delete from DynamoDB, S3, anonymize audit logs
Timeline: Within 30 days of request
Exceptions: Audit logs retained in anonymized form

Data Access Request:

Lambda function: privacy-data-exporter
Output: JSON file with all user data
Delivery: Secure download link (7-day expiry)

Consent Management:

Granular permissions: Aadhaar, Account Aggregator, Location, Photos
Consent storage: DynamoDB with timestamp
Revocation: User can revoke consent anytime

8.5 Access Control

IAM Roles:

Lambda execution roles: Least privilege (only required services)
Bedrock Agent roles: Access to action groups, knowledge bases
Human operators: Read-only access to audit logs

RBAC (Role-Based Access Control):

Admin: Full access to all resources
Operator: Read-only access to applications, ability to escalate
Auditor: Read-only access to audit logs
Developer: Access to non-production environments only

API Authentication:

External APIs: API keys stored in AWS Secrets Manager
Rotation: Automatic 90-day rotation
Monitoring: CloudWatch alarms for failed authentication attempts

9. Workflow Orchestration

9.1 AWS Step Functions

Purpose: Orchestrate complex multi-agent workflows

State Machine 1: shadow-credit-workflow

Purpose: End-to-end Shadow Credit Score generation
Steps:
1. Initiate Consent: Lambda pmv-aa-consent-initiator
2. Wait for OTP: Wait state (max 5 minutes)
3. Verify Consent: Lambda pmv-aa-consent-verifier
4. Fetch Transactions: Lambda pmv-aa-data-fetcher
5. Calculate Score: Lambda pmv-shadow-credit-calculator
6. Generate Memo: Lambda pmv-lending-memo-generator
7. Submit to Bank: Lambda pmv-bank-submission
Error handling: Exponential backoff retry (max 3 attempts)
Timeout: 30 minutes total
State storage: DynamoDB with TTL-based cleanup after 90 days

State Machine 2: crop-insurance-claim-workflow

Purpose: End-to-end PMFBY claim processing
Steps:
1. Detect Weather Event: Lambda pmfby-weather-monitor
2. Identify Farmers: Lambda pmfby-farmer-identifier
3. Initiate Outreach: Lambda pmfby-outreach-initiator (parallel for multiple farmers)
4. File Intimation: Lambda pmfby-intimation-filer
5. Send Instructions: Lambda pmfby-followup-sender
6. Wait for Photos: Wait state (max 48 hours)
7. Validate Photos: Lambda pmfby-photo-validator (parallel for multiple photos)
8. Submit to Insurance: Lambda pmfby-insurance-uploader
Error handling: Retry with exponential backoff
Timeout: 72 hours total
Parallel execution: Up to 500 farmers per weather event

State Machine 3: interstate-migration-workflow

Purpose: BOCW benefit portability
Steps:
1. Detect Migration: Lambda bocw-migration-detector
2. Notify Worker: Lambda bocw-migration-notifier
3. Wait for Confirmation: Wait state (max 7 days)
4. Update e-Shram: Lambda bocw-eshram-updater
5. Discover Benefits: Lambda bocw-benefits-discoverer
6. Guide Registration: Lambda bocw-registration-guide
7. Map Resources: Lambda bocw-resource-mapper
Error handling: Retry with exponential backoff
Timeout: 14 days total

9.2 EventBridge Scheduled Rules

Rule 1: weather-monitoring-schedule

Schedule: Every 15 minutes
Target: Lambda pmfby-weather-monitor
Purpose: Poll weather APIs for adverse events

Rule 2: application-status-polling-schedule

Schedule: Daily at 6 AM IST (cron: 30 0 * * ? *, evaluated in UTC)
Target: Lambda pmv-status-poller
Purpose: Check PM Vishwakarma application status

Rule 3: claim-status-polling-schedule

Schedule: Every 6 hours (cron: 0 */6 * * ? *)
Target: Lambda pmfby-claim-poller
Purpose: Check PMFBY claim status across 14 insurance companies

Rule 4: migration-detection-schedule

Schedule: Daily at 8 AM IST (cron: 30 2 * * ? *, evaluated in UTC)
Target: Lambda bocw-migration-detector
Purpose: Detect interstate migration for construction workers

Rule 5: knowledge-base-sync-schedule

Schedule: Weekly on Sunday at 2 AM IST (cron: 30 20 ? * SAT *, evaluated in UTC, which is Saturday 20:30 UTC)
Target: Lambda kb-sync-orchestrator
Purpose: Sync government documents to S3 for Knowledge Bases

9.3 Error Handling Strategy

Retry Policy:

Exponential backoff: 1s, 2s, 4s (delay doubles after each failed attempt)
Maximum retry attempts: 3
Jitter: Random delay (0-1s) to prevent thundering herd
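
The retry policy can be sketched as a delay schedule: a base delay that doubles on each attempt, plus a random jitter of up to one second. The function name and the injectable `rng` parameter (used to make the schedule testable) are assumptions.

```python
import random
from typing import Callable


def backoff_delays(max_attempts: int = 3, base_seconds: float = 1.0,
                   max_jitter: float = 1.0,
                   rng: Callable[[], float] = random.random) -> list[float]:
    """Return the wait before each retry: base * 2^attempt plus random jitter."""
    return [
        base_seconds * (2 ** attempt) + rng() * max_jitter
        for attempt in range(max_attempts)
    ]


# With jitter disabled the schedule is the plain doubling sequence.
print(backoff_delays(3, rng=lambda: 0.0))
```

The jitter spreads out clients that failed at the same moment so they do not all retry in lockstep.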

Fallback Actions:

API failure: Switch to backup API or web scraping
ML model failure: Use rule-based fallback logic
External service unavailable: Queue request for later processing

Human Escalation:

Trigger: All retries exhausted
Action: Create ticket in support system, notify human operator
Notification: SMS to beneficiary with ticket ID and estimated resolution time

Circuit Breaker:

Purpose: Prevent cascading failures
Threshold: 50% error rate over 5 minutes
Action: Open circuit, return cached response or error message
Recovery: Automatic retry after 5 minutes
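
A minimal in-memory sketch of the circuit breaker described above: it tracks call outcomes over a 5-minute sliding window, opens when the error rate reaches 50%, and allows traffic again after 5 minutes. The `min_calls` guard (so a single early failure does not trip the breaker) and the injectable clock are assumptions added for the sketch; a Lambda-based system would need shared state, which is not shown.

```python
import time
from collections import deque
from typing import Callable


class CircuitBreaker:
    def __init__(self, error_threshold: float = 0.5, window_seconds: float = 300,
                 recovery_seconds: float = 300, min_calls: int = 10,
                 clock: Callable[[], float] = time.monotonic):
        self.error_threshold = error_threshold
        self.window_seconds = window_seconds
        self.recovery_seconds = recovery_seconds
        self.min_calls = min_calls
        self.clock = clock
        self.calls: deque = deque()  # (timestamp, succeeded) pairs
        self.opened_at: float | None = None

    def allow_request(self) -> bool:
        """False while the circuit is open; closes again after the recovery period."""
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.recovery_seconds:
            self.opened_at = None
            self.calls.clear()
            return True
        return False

    def record(self, succeeded: bool) -> None:
        """Record a call outcome and open the circuit if the error rate is too high."""
        now = self.clock()
        self.calls.append((now, succeeded))
        # Drop outcomes that have aged out of the sliding window.
        while self.calls and now - self.calls[0][0] > self.window_seconds:
            self.calls.popleft()
        failures = sum(1 for _, ok in self.calls if not ok)
        if len(self.calls) >= self.min_calls and failures / len(self.calls) >= self.error_threshold:
            self.opened_at = now
```

Callers would check `allow_request()` before each external call and fall back to a cached response or error message when it returns False.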

10. Performance and Scalability

10.1 Performance Targets

Latency:

Voice transcription: <3 seconds (95th percentile)
Agent response generation: <5 seconds (95th percentile)
Photo validation: <2 seconds per photo
ML model inference: <2 seconds (trade verification), <500ms (crop damage)
End-to-end IVR call: <30 seconds for simple queries

Throughput:

Concurrent voice calls: 10,000 (Amazon Connect capacity)
SMS volume: 100,000 per hour (Amazon Pinpoint capacity)
WhatsApp messages: 50,000 per hour (Business API limit)
API requests: 10,000 requests per second (API Gateway limit)

Availability:

System uptime: 99.9% measured monthly
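Planned maintenance: <4 hours per month (excluded from the uptime calculation; 99.9% allows only about 43 minutes of unplanned downtime in a 30-day month)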
Disaster recovery: RTO 4 hours, RPO 1 hour

10.2 Scalability Strategy

DynamoDB:

Capacity mode: On-demand (auto-scaling)
Scaling: Automatic based on traffic
Partition key design: High cardinality (session_id, application_id, worker_id)

Lambda:

Concurrency: Reserved concurrency for critical functions (1000 per function; requires raising the account concurrency quota above the 1,000 default)
Provisioned concurrency: For latency-sensitive functions (Master Orchestrator)
Timeout: 15 minutes maximum (Step Functions for longer workflows)

SageMaker Endpoints:

Auto-scaling: 1-5 instances based on invocation rate
Scaling policy: Target tracking (70% CPU utilization)
Cold start mitigation: Provisioned concurrency for critical models

API Gateway:

Throttling: 10,000 requests per second per account
Burst: 5,000 requests
Caching: Enabled for read-heavy endpoints (TTL 5 minutes)

S3:

Request rate: 5,500 GET/HEAD per second per prefix
Prefix strategy: Date-based partitioning ({year}/{month}/{day}/)
Transfer acceleration: Enabled for large file uploads

10.3 Monitoring and Observability

CloudWatch Metrics:

Lambda: Invocation count, duration, error rate, throttles
DynamoDB: Read/write capacity, throttled requests, latency
SageMaker: Invocation count, model latency, 4xx/5xx errors
API Gateway: Request count, latency, 4xx/5xx errors
Amazon Connect: Call volume, average handle time, abandonment rate

CloudWatch Alarms:

Lambda error rate >5%: SNS notification to ops team
DynamoDB throttled requests >10: Auto-scaling trigger
SageMaker model latency >5s: SNS notification
API Gateway 5xx errors >1%: SNS notification
Amazon Connect abandonment rate >10%: SNS notification

CloudWatch Logs:

Lambda: All function logs with structured JSON
API Gateway: Access logs with request/response details
Step Functions: Execution history with state transitions
Retention: 30 days (cost optimization)

X-Ray Tracing:

Enabled for: Lambda, API Gateway, DynamoDB
Sampling rate: 10% of requests (cost optimization)
Use case: Trace end-to-end request flow, identify bottlenecks

Custom Dashboards:

Dashboard 1: System health (error rates, latency, availability)
Dashboard 2: Business metrics (applications submitted, claims filed, certificates generated)
Dashboard 3: Cost optimization (Lambda invocations, DynamoDB capacity, S3 storage)

11. Cost Optimization

11.1 Cost Breakdown (Estimated Monthly)

Compute:

Lambda: ~$5,000 (10M invocations, 512MB average, 3s average duration)
Bedrock Agents: ~$15,000 (Claude 3.5 Sonnet, 50M tokens input, 10M tokens output)
SageMaker Endpoints: ~$2,000 (ml.m5.xlarge 24/7, serverless inference)

Storage:

DynamoDB: ~$1,000 (on-demand, 100GB storage, 10M read/write per month)
S3: ~$500 (1TB storage, 10M GET requests, 1M PUT requests)
OpenSearch Serverless: ~$1,500 (4 OCUs for indexing, 4 OCUs for search)

Communication:

Amazon Connect: ~$1,800 (100,000 minutes, $0.018 per minute)
Amazon Transcribe: ~$2,400 (100,000 minutes, $0.024 per minute)
Amazon Polly: ~$160 (10M characters, $16 per 1M characters for neural voices)
Amazon Pinpoint SMS: ~$3,225 (500,000 SMS, $0.00645 per SMS in India)
WhatsApp Business API: ~$1,000 (100,000 messages, $0.01 per message)

AI/ML:

Amazon Rekognition: ~$100 (100,000 images, $0.001 per image)
Amazon Comprehend: ~$10 (10M characters = 100,000 units of 100 characters, $0.0001 per unit)
Amazon Translate: ~$150 (10M characters, $0.000015 per character)

Total Estimated Monthly Cost: ~$34,000

Cost per Beneficiary Interaction: ~$0.34 (assuming 100,000 interactions per month)

11.2 Cost Optimization Strategies

Lambda:

Right-sizing: Use Lambda Power Tuning to optimize memory allocation
Provisioned concurrency: Only for latency-sensitive functions
Code optimization: Reduce cold starts, minimize dependencies

DynamoDB:

On-demand mode: For unpredictable traffic patterns
TTL: Automatic cleanup of expired conversation state (24 hours)
Compression: Store large attributes (conversation_history) as compressed JSON

S3:

Lifecycle policies: Transition to Glacier after 90 days (80% cost reduction)
Intelligent-Tiering: For unpredictable access patterns
Compression: Store photos/videos in compressed formats

SageMaker:

Serverless inference: For variable load (crop damage model)
Spot instances: For training jobs (70% cost reduction)
Model optimization: Quantization, pruning for faster inference

Bedrock:

Prompt optimization: Reduce token usage with concise prompts
Caching: Cache common responses in DynamoDB (TTL 1 hour)
Batch processing: Group multiple queries when possible

Communication:

SMS optimization: Use 160-character limit efficiently
WhatsApp preference: Encourage WhatsApp over SMS (lower cost)
IVR optimization: Reduce average handle time with better prompts

12. Deployment Architecture

12.1 AWS Region Strategy

Primary Region: ap-south-1 (Mumbai)

Reason: Data localization requirement (Aadhaar Act compliance)
All user data must reside in India

Disaster Recovery Region: ap-south-2 (Hyderabad)

Purpose: Backup for critical services
Replication: S3 cross-region replication, DynamoDB global tables
Failover: Manual (RTO 4 hours)

12.2 Environment Strategy

Development Environment:

Purpose: Feature development and testing
Resources: Scaled-down versions (1/10th of production)
Data: Synthetic test data only
Access: Developers only

Staging Environment:

Purpose: Pre-production testing
Resources: Same configuration as production
Data: Anonymized production data
Access: Developers, QA, stakeholders

Production Environment:

Purpose: Live system serving beneficiaries
Resources: Full-scale with auto-scaling
Data: Real user data (encrypted)
Access: Ops team only (read-only for most)

12.3 CI/CD Pipeline

Source Control: GitHub

Repository structure: Monorepo with folders per service
Branching strategy: GitFlow (main, develop, feature branches)

Build Pipeline (GitHub Actions):

Lint: ESLint for JavaScript, Pylint for Python
Test: Unit tests (Jest, pytest), integration tests
Build: Package Lambda functions, Docker images for SageMaker
Security Scan: Snyk for dependency vulnerabilities
Artifact Storage: S3 bucket bharatseva-artifacts

Deployment Pipeline (AWS CodePipeline):

Source: GitHub webhook triggers pipeline
Build: CodeBuild runs tests and packages artifacts
Deploy to Staging: CloudFormation stack update
Integration Tests: Automated tests against staging
Manual Approval: Stakeholder approval required
Deploy to Production: CloudFormation stack update
Smoke Tests: Automated health checks

Infrastructure as Code: AWS CloudFormation

Templates: Separate stacks for each service
Parameters: Environment-specific (dev, staging, prod)
Drift detection: Daily checks for manual changes

12.4 Rollback Strategy

Lambda Functions:

Versioning: All functions versioned
Aliases: prod alias points to stable version
Rollback: Update alias to previous version (instant)

SageMaker Models:

Model Registry: All models versioned
Endpoint update: Blue/green deployment
Rollback: Update endpoint to previous model version

Database Schema:

DynamoDB: Backward-compatible schema changes only
Migration: Lambda function for data migration
Rollback: Restore from point-in-time backup (max 5 minutes data loss)

API Gateway:

Stages: Separate stages for each environment
Canary deployment: 10% traffic to new version, monitor for 1 hour
Rollback: Revert stage to previous deployment

13. Testing Strategy

13.1 Unit Testing

Lambda Functions:

Framework: Jest (JavaScript), pytest (Python)
Coverage target: >80%
Mocking: AWS SDK calls mocked with aws-sdk-mock
Test cases: Happy path, error handling, edge cases

Bedrock Agent Action Groups:

Framework: pytest with moto for AWS mocking
Test cases: Valid inputs, invalid inputs, API failures
Assertions: Response format, error messages, side effects

13.2 Integration Testing

API Testing:

Framework: Postman collections, Newman for automation
Test cases: End-to-end workflows (Shadow Credit, Claim Filing, Work Log)
Environment: Staging environment with test data
Assertions: Response codes, response body, database state

Agent Testing:

Framework: Custom Python scripts
Test cases: Multi-turn conversations, domain switching, error recovery
Environment: Staging Bedrock Agents
Assertions: Intent classification accuracy, response relevance

13.3 ML Model Testing

Model Accuracy Testing:

Framework: pytest with scikit-learn metrics
Test cases: Validation set (20% of training data)
Metrics: Accuracy, precision, recall, F1-score (classification), MAE (regression)
Threshold: >85% accuracy for trade verification, <8% MAE for crop damage

Model Bias Testing:

Framework: Fairlearn library
Test cases: Performance across demographic groups (gender, region, crop type)
Metrics: Demographic parity, equalized odds
Threshold: <10% disparity across groups
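
To show what the demographic parity metric measures, the sketch below computes per-group selection rates and their largest gap in plain Python. Fairlearn ships its own implementation of this metric; this version exists only to make the arithmetic visible, and the function names and sample data are illustrative.

```python
def selection_rates(predictions: list[int], groups: list[str]) -> dict[str, float]:
    """Fraction of positive predictions within each demographic group."""
    totals: dict[str, int] = {}
    positives: dict[str, int] = {}
    for pred, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(predictions: list[int], groups: list[str]) -> float:
    """Largest gap in selection rate between any two groups."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


preds = [1, 1, 0, 0, 1, 0, 0, 0]
grps = ["a", "a", "a", "a", "b", "b", "b", "b"]
disparity = demographic_parity_difference(preds, grps)
print(disparity, "exceeds threshold" if disparity >= 0.10 else "within threshold")
```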

Model Robustness Testing:

Framework: Adversarial Robustness Toolbox (ART)
Test cases: Adversarial examples, noisy inputs
Metrics: Accuracy under perturbation
Threshold: >70% accuracy with 10% noise

13.4 Load Testing

IVR Load Testing:

Tool: Amazon Connect load testing tool
Scenario: 10,000 concurrent calls
Metrics: Call success rate, average handle time, abandonment rate
Threshold: >99% success rate, <30s handle time, <5% abandonment

API Load Testing:

Tool: Apache JMeter
Scenario: 10,000 requests per second for 10 minutes
Metrics: Response time, error rate, throughput
Threshold: <5s response time (95th percentile), <1% error rate

Database Load Testing:

Tool: DynamoDB load testing script
Scenario: 10,000 read/write per second
Metrics: Latency, throttled requests
Threshold: <100ms latency (95th percentile), 0 throttled requests

13.5 Security Testing

Penetration Testing:

Frequency: Quarterly
Scope: API endpoints, authentication, authorization
Tools: OWASP ZAP, Burp Suite
Findings: Documented and prioritized for remediation

Vulnerability Scanning:

Frequency: Weekly
Scope: Dependencies, Docker images, Lambda functions
Tools: Snyk, AWS Inspector
Findings: Auto-remediation for critical vulnerabilities

Compliance Auditing:

Frequency: Annual
Scope: Aadhaar Act compliance, data retention, encryption
Auditor: Third-party security firm
Findings: Documented and remediated within 30 days

14. Disaster Recovery and Business Continuity

14.1 Backup Strategy

DynamoDB:

Point-in-time recovery: Enabled (restore to any point in last 35 days)
On-demand backups: Daily automated backups
Retention: 35 days
Cross-region replication: Global tables to ap-south-2 (Hyderabad)

S3:

Versioning: Enabled for all buckets
Cross-region replication: Critical buckets replicated to ap-south-2
Lifecycle policies: Transition to Glacier after 90 days
Object lock: Enabled for audit trail (compliance mode)

Lambda Functions:

Version control: All code in GitHub
Deployment artifacts: Stored in S3 with versioning
Rollback: Instant via alias update

SageMaker Models:

Model Registry: All models versioned
Artifacts: Stored in S3 with versioning
Rollback: Update endpoint to previous model version

14.2 Disaster Recovery Plan

RTO (Recovery Time Objective): 4 hours
RPO (Recovery Point Objective): 1 hour

Disaster Scenarios:

Scenario 1: Region Failure (ap-south-1 unavailable)

Detection: CloudWatch alarms, health checks
Action:
1. Activate DR region (ap-south-2)
2. Update Route 53 DNS to point to DR region
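3. Serve DynamoDB reads and writes from the global table replicas in ap-south-2 (already replicated, no restore required)
4. Switch S3 access to the buckets populated by cross-region replication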
5. Deploy Lambda functions from artifacts
6. Update external integrations (webhooks, API endpoints)
Timeline: 4 hours
Data loss: <1 hour (RPO)

Scenario 2: Data Corruption

Detection: Data validation checks, user reports
Action:
1. Identify affected tables/buckets
2. Restore from point-in-time backup (DynamoDB)
3. Restore from versioned objects (S3)
4. Validate data integrity
Timeline: 2 hours
Data loss: Minimal (point-in-time recovery)

Scenario 3: Security Breach

Detection: CloudTrail anomaly detection, security alerts
Action:
1. Isolate affected resources (security groups, IAM policies)
2. Rotate all credentials (API keys, passwords)
3. Audit access logs (CloudTrail, VPC Flow Logs)
4. Restore from clean backup if necessary
5. Notify affected users
Timeline: 1 hour (isolation), 4 hours (full recovery)

14.3 Business Continuity

Critical Services (must remain operational):

IVR system (Amazon Connect)
Master Orchestrator Agent
SMS notifications (Amazon Pinpoint)
Aadhaar authentication

Degraded Mode Operations:

If Bedrock unavailable: Use rule-based fallback logic
If ML models unavailable: Manual review for trade verification, crop damage
If external APIs unavailable: Queue requests for later processing
If WhatsApp unavailable: Fall back to SMS
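
The degraded-mode rules above share one pattern: try the preferred path, then fall back to the next one. A minimal sketch of that pattern follows, using the WhatsApp to SMS case. The handler functions and their return strings are hypothetical placeholders; real handlers would call the messaging services.

```python
from typing import Callable, Sequence


def first_available(handlers: Sequence[tuple[str, Callable[[str], str]]],
                    payload: str) -> tuple[str, str]:
    """Try handlers in order; return (name, result) of the first success."""
    errors = []
    for name, handler in handlers:
        try:
            return name, handler(payload)
        except Exception as exc:  # sketch: any failure counts as "unavailable"
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all handlers failed: " + "; ".join(errors))


def send_whatsapp(message: str) -> str:
    """Hypothetical WhatsApp sender, hard-coded to simulate an outage."""
    raise ConnectionError("WhatsApp unavailable")


def send_sms(message: str) -> str:
    """Hypothetical SMS sender that always succeeds."""
    return f"SMS sent: {message}"


if __name__ == "__main__":
    channels = [("whatsapp", send_whatsapp), ("sms", send_sms)]
    print(first_available(channels, "Claim received"))
```

The same helper could order a Bedrock call ahead of a rule-based responder. Queueing for external API outages is a different behaviour and would need its own handling.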

Communication Plan:

Internal: Slack channel for ops team, PagerDuty for on-call
External: Status page for beneficiaries, SMS notifications for critical outages
Stakeholders: Email updates every 2 hours during incident

15. Future Enhancements

15.1 Phase 2 Features (6-12 months)

Multilingual Knowledge Bases:

Current: English-only knowledge bases with runtime translation
Enhancement: Native language embeddings for 22 Indian languages
Benefit: Improved semantic search accuracy for regional languages

Blockchain Integration:

Current: SHA-256 hashes for certificates
Enhancement: Store certificate hashes on blockchain (Hyperledger Fabric)
Benefit: Immutable verification, prevent tampering
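
For context, the current SHA-256 approach amounts to hashing the certificate bytes and comparing against a stored digest. The sketch below illustrates this with the Python standard library. The function names are hypothetical, and where the reference hash is stored (database today, ledger in the proposed enhancement) is outside its scope.

```python
import hashlib
import hmac


def certificate_hash(certificate_bytes: bytes) -> str:
    """Return the SHA-256 digest of the certificate as lowercase hex."""
    return hashlib.sha256(certificate_bytes).hexdigest()


def verify_certificate(certificate_bytes: bytes, expected_hash: str) -> bool:
    """Check the certificate against a previously stored hex digest."""
    actual = certificate_hash(certificate_bytes)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_hash.lower())


if __name__ == "__main__":
    original = b"Certificate: trade verification, holder ID 0001"
    stored = certificate_hash(original)
    print(verify_certificate(original, stored))            # unchanged document
    print(verify_certificate(original + b" ", stored))     # tampered document
```

Any change to the certificate bytes, even a single trailing space, produces a different digest, which is the property the blockchain enhancement would anchor in an append-only ledger.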

Predictive Analytics:

Current: Reactive (respond to user queries)
Enhancement: Proactive (predict user needs based on patterns)
Example: "You're eligible for a PM Vishwakarma loan based on your transaction history"

Voice Biometrics:

Current: Aadhaar OTP for authentication
Enhancement: Voice biometric authentication (Amazon Connect Voice ID)
Benefit: Faster authentication, better user experience

Offline Mobile App:

Current: IVR, WhatsApp, SMS (require connectivity)
Enhancement: Progressive Web App (PWA) with offline capabilities
Benefit: Work in areas with intermittent connectivity

15.2 Phase 3 Features (12-24 months)

Computer Vision for Surveyor Assistance:

Current: Photo validation only
Enhancement: Real-time AR guidance for surveyors (mobile app)
Example: Overlay damage percentage estimate on live camera feed

Natural Language Generation for Reports:

Current: Template-based reports
Enhancement: AI-generated narrative reports (Bedrock Claude)
Example: "Based on your transaction history, you have a strong credit profile..."

Multi-Modal Interaction:

Current: Voice, text, photos
Enhancement: Video calls with AI avatar (Amazon IVS + Bedrock)
Benefit: More engaging user experience

Integration with More Schemes:

Current: 3 schemes (PM Vishwakarma, PMFBY, BOCW)
Enhancement: 10+ schemes (PM-KISAN, Ayushman Bharat, MGNREGA, etc.)
Benefit: One-stop solution for all government services

AI-Powered Grievance Resolution:

Current: Manual escalation to human operators
Enhancement: AI-powered grievance analysis and resolution
Example: Automatically identify root cause, suggest resolution

15.3 Research and Innovation

Federated Learning for Privacy:

Current: Centralized ML models
Enhancement: Federated learning (train on-device, aggregate updates)
Benefit: Enhanced privacy, no raw data leaves device

Explainable AI:

Current: Black-box ML models
Enhancement: SHAP/LIME for model interpretability
Benefit: Transparency, trust, regulatory compliance

Low-Resource Language Support:

Current: 22 Indian languages (major languages)
Enhancement: 100+ languages (tribal languages, dialects)
Benefit: Reach underserved communities

Edge Computing:

Current: Cloud-based processing
Enhancement: Edge processing on mobile devices (AWS IoT Greengrass)
Benefit: Lower latency, offline capabilities

16. Risks and Mitigations

16.1 Technical Risks

Risk 1: Bedrock Agent Hallucination

Description: Agent generates incorrect information about schemes
Impact: High (misinformation to beneficiaries)
Probability: Medium
Mitigation:
- Guardrails: Reject responses that score below the 70% confidence threshold
- Knowledge bases: Ground responses in official documents
- Human review: Flag responses for manual review if confidence <80%
- Testing: Regular accuracy testing with ground truth dataset
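
One way to read the two thresholds in this mitigation list is as a three-way routing rule. The sketch below assumes that responses under 70% are rejected, responses from 70% up to 80% go to manual review, and the rest are delivered. That combination is an interpretation, and how the confidence score itself is produced is not specified here.

```python
from enum import Enum


class Disposition(Enum):
    REJECT = "reject"
    HUMAN_REVIEW = "human_review"
    DELIVER = "deliver"


REJECT_BELOW = 0.70  # guardrail threshold from the mitigation list
REVIEW_BELOW = 0.80  # manual-review threshold from the mitigation list


def route_response(confidence: float) -> Disposition:
    """Map a confidence score in [0, 1] to a handling decision."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < REJECT_BELOW:
        return Disposition.REJECT
    if confidence < REVIEW_BELOW:
        return Disposition.HUMAN_REVIEW
    return Disposition.DELIVER


if __name__ == "__main__":
    for score in (0.65, 0.75, 0.92):
        print(score, route_response(score).value)
```

Keeping both thresholds as named constants makes it straightforward to tune them against the ground truth dataset used for accuracy testing.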

Risk 2: ML Model Drift

Description: Model accuracy degrades over time due to data distribution changes
Impact: Medium (incorrect trade verification, crop damage estimates)
Probability: High
Mitigation:
- Monitoring: SageMaker Model Monitor for drift detection
- Retraining: Automatic retraining trigger if accuracy <80%
- A/B testing: Test new models on 10% traffic before full rollout
- Fallback: Manual review if model confidence <85%
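
The 10% A/B split can be made deterministic so that a given user always sees the same model. The sketch below hashes a user identifier into a bucket in [0, 1). It is an illustration only: the variant names and the use of user_id as the hashing key are assumptions, and a SageMaker endpoint can also split traffic natively through production variant weights.

```python
import hashlib


def assign_variant(user_id: str, candidate_share: float = 0.10) -> str:
    """Deterministically assign a user to the 'candidate' or 'current' model."""
    if not 0.0 <= candidate_share <= 1.0:
        raise ValueError("candidate_share must be between 0 and 1")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale to [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2 ** 64
    return "candidate" if bucket < candidate_share else "current"


if __name__ == "__main__":
    sample = [assign_variant(f"user-{i}") for i in range(10_000)]
    share = sample.count("candidate") / len(sample)
    print(f"candidate share: {share:.3f}")
```

Hash-based assignment needs no stored state, and raising candidate_share later only moves users from the current model to the candidate, never the other way.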

Risk 3: External API Failures

Description: Government portals, weather APIs, Account Aggregator unavailable
Impact: High (core functionality blocked)
Probability: Medium
Mitigation:
- Retry logic: Exponential backoff with max 3 attempts
- Fallback: Web scraping for government portals, cached weather data
- Circuit breaker: Prevent cascading failures
- Queue: Store requests for later processing when API recovers
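
The retry and queue mitigations could be combined as in the sketch below. It assumes a base delay of 1 second that doubles after each failure, which the list above does not specify. The function names and the in-memory deque are placeholders for whatever durable queue is used, and the circuit breaker is left out for brevity.

```python
import time
from collections import deque
from typing import Callable, Deque, Optional, TypeVar

T = TypeVar("T")


def call_with_backoff(func: Callable[[], T], max_attempts: int = 3,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, waiting base_delay * 2**attempt seconds after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # attempts exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


def fetch_or_queue(request_id: str, func: Callable[[], T],
                   pending: Deque[str],
                   sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Return the API result, or park the request ID for later replay."""
    try:
        return call_with_backoff(func, sleep=sleep)
    except Exception:
        pending.append(request_id)
        return None


if __name__ == "__main__":
    def portal_down() -> str:
        raise TimeoutError("government portal unavailable")

    backlog: Deque[str] = deque()
    print(fetch_or_queue("req-1", portal_down, backlog, sleep=lambda s: None))
    print(list(backlog))
```

The sleep parameter is injectable so that tests can run without real delays. With the defaults, a request waits 1 second and then 2 seconds before it is queued.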

Risk 4: DDoS Attack

Description: Malicious traffic overwhelms system
Impact: High (service unavailable)
Probability: Low
Mitigation:
- AWS Shield: Standard DDoS protection (free)
- AWS WAF: Rate limiting, IP blocking
- CloudFront: CDN for static content, absorb traffic spikes
- Auto-scaling: Lambda, DynamoDB scale automatically

16.2 Operational Risks

Risk 5: Data Breach

Description: Unauthorized access to user data (Aadhaar, financial data)
Impact: Critical (legal liability, loss of trust)
Probability: Low
Mitigation:
- Encryption: At rest (KMS), in transit (TLS 1.3)
- Access control: IAM policies with least privilege
- Monitoring: CloudTrail for audit, GuardDuty for threat detection
- Compliance: Regular security audits, penetration testing

Risk 6: Cost Overrun

Description: Unexpected traffic spike leads to high AWS bills
Impact: Medium (budget exceeded)
Probability: Medium
Mitigation:
- Budgets: AWS Budgets with alerts at 80%, 100%, 120%
- Throttling: API Gateway rate limiting
- Concurrency caps: Reserved concurrency on Lambda functions to limit maximum parallel executions
- Cost optimization: Regular review of usage patterns

Risk 7: Key Personnel Departure

Description: Loss of critical team members (ML engineers, DevOps)
Impact: Medium (delayed development, knowledge loss)
Probability: Medium
Mitigation:
- Documentation: Comprehensive design docs, runbooks
- Knowledge sharing: Regular team meetings, pair programming
- Cross-training: Multiple team members trained on each component
- Vendor support: AWS Professional Services for critical issues

16.3 Regulatory Risks

Risk 8: Aadhaar Act Non-Compliance

Description: Violation of Aadhaar Act (purpose limitation, data localization)
Impact: Critical (legal penalties, system shutdown)
Probability: Low
Mitigation:
- Legal review: Regular compliance audits by legal team
- Data localization: All data stored in Indian AWS regions (primary in ap-south-1 Mumbai, DR replicas in ap-south-2 Hyderabad)
- Purpose limitation: Only use Aadhaar for authentication
- Audit trail: UIDAI-compliant logging

Risk 9: Insurance Regulatory Changes

Description: IRDAI changes PMFBY claim procedures
Impact: Medium (system updates required)
Probability: Medium
Mitigation:
- Monitoring: Daily scraping of government gazette
- Flexibility: Configurable workflows (Step Functions)
- Knowledge base: Weekly sync of policy documents
- Stakeholder engagement: Regular meetings with insurance companies

Risk 10: AI Regulation

Description: New AI regulations (e.g., EU AI Act equivalent in India)
Impact: Medium (compliance requirements, system changes)
Probability: Medium
Mitigation:
- Explainability: Implement SHAP/LIME for model interpretability
- Human oversight: Manual review for critical decisions
- Transparency: Disclose AI usage to beneficiaries
- Monitoring: Track AI regulation developments

17. Success Metrics

17.1 User Adoption Metrics

Metric 1: Active Users

Definition: Unique users interacting with system per month
Target: 100,000 users by Month 6, 500,000 users by Month 12
Measurement: DynamoDB query on user_id (distinct count)

Metric 2: Channel Distribution

Definition: Percentage of interactions by channel (IVR, WhatsApp, SMS, Web)
Target: IVR 50%, WhatsApp 30%, SMS 15%, Web 5%
Measurement: CloudWatch metrics by channel

Metric 3: Language Distribution

Definition: Percentage of interactions by language
Target: Hindi 40%, English 20%, Regional languages 40%
Measurement: Amazon Transcribe language detection logs

17.2 Operational Metrics

Metric 4: Application Submission Rate

Definition: Number of applications submitted per month
Target: 10,000 applications by Month 6, 50,000 applications by Month 12
Measurement: DynamoDB query on applications table

Metric 5: Application Approval Rate

Definition: Percentage of applications approved
Target: >70% approval rate
Measurement: DynamoDB query on applications table (status = Approved)

Metric 6: Average Processing Time

Definition: Days from application submission to approval
Target: <30 days (PM Vishwakarma), <7 days (PMFBY), <90 days (BOCW)
Measurement: DynamoDB query on status_history timestamps
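
As an illustration of the measurement, the sketch below derives processing time from a status_history list. The item shape (dictionaries with "status" and ISO 8601 "timestamp" keys) and the "Submitted" status name are assumptions. Only the status_history attribute and the "Approved" status appear in this document.

```python
from datetime import datetime
from typing import Optional


def processing_days(status_history: list[dict]) -> Optional[float]:
    """Days between the 'Submitted' and 'Approved' entries, or None if either is missing."""
    times = {
        entry["status"]: datetime.fromisoformat(entry["timestamp"])
        for entry in status_history
    }
    if "Submitted" not in times or "Approved" not in times:
        return None  # application not yet approved (or malformed history)
    return (times["Approved"] - times["Submitted"]).total_seconds() / 86400


def average_processing_days(histories: list[list[dict]]) -> Optional[float]:
    """Average over approved applications only; None if there are none."""
    durations = [d for d in map(processing_days, histories) if d is not None]
    return sum(durations) / len(durations) if durations else None


if __name__ == "__main__":
    history = [
        {"status": "Submitted", "timestamp": "2026-01-01T00:00:00"},
        {"status": "Approved", "timestamp": "2026-01-11T12:00:00"},
    ]
    print(processing_days(history))
```

Applications still in progress are excluded from the average, so the metric should be read alongside the count of pending applications.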

Metric 7: Escalation Rate

Definition: Percentage of applications requiring escalation
Target: <20% escalation rate
Measurement: DynamoDB query on applications table (escalation_level > 0)

17.3 Quality Metrics

Metric 8: Agent Accuracy

Definition: Percentage of agent responses rated as accurate by users
Target: >90% accuracy
Measurement: User feedback (thumbs up/down after each interaction)

Metric 9: ML Model Accuracy

Definition: Accuracy of trade verification, crop damage models
Target: >85% accuracy (trade verification), <8% MAE (crop damage)
Measurement: SageMaker Model Monitor on validation set
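
For reference, MAE is the mean of the absolute differences between predicted and actual values. The sketch below assumes crop damage is expressed in percentage points, so a target of "<8% MAE" means the estimate is off by fewer than 8 points on average. The sample values are invented for illustration.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float],
                        actual: Sequence[float]) -> float:
    """Average absolute difference between paired predictions and ground truth."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one pair is required")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    # Illustrative damage estimates in percentage points (not real data).
    model_estimates = [40, 55, 10, 80]
    surveyor_values = [35, 60, 20, 80]
    print(mean_absolute_error(model_estimates, surveyor_values))
```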

Metric 10: First Contact Resolution

Definition: Percentage of queries resolved in first interaction
Target: >80% first contact resolution
Measurement: Conversation state analysis (no follow-up within 24 hours)
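
The 24-hour rule can be turned into a computation as sketched below. It makes two assumptions beyond the definition above: interactions are available as (user_id, timestamp) pairs, and a contact that itself arrives within 24 hours of the same user's previous contact is treated as a follow-up rather than a new first contact.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def first_contact_resolution_rate(
        interactions: list[tuple[str, datetime]],
        window: timedelta = timedelta(hours=24)) -> float:
    """Percentage of first contacts with no follow-up from the same user within the window."""
    by_user: dict[str, list[datetime]] = defaultdict(list)
    for user_id, timestamp in interactions:
        by_user[user_id].append(timestamp)

    first_contacts = resolved = 0
    for timestamps in by_user.values():
        timestamps.sort()
        for i, ts in enumerate(timestamps):
            if i > 0 and ts - timestamps[i - 1] <= window:
                continue  # this contact is a follow-up, not a first contact
            first_contacts += 1
            has_follow_up = (i + 1 < len(timestamps)
                             and timestamps[i + 1] - ts <= window)
            if not has_follow_up:
                resolved += 1
    if first_contacts == 0:
        raise ValueError("no interactions supplied")
    return 100.0 * resolved / first_contacts


if __name__ == "__main__":
    t0 = datetime(2026, 1, 1, 9, 0)
    log = [("a", t0), ("a", t0 + timedelta(hours=2)), ("b", t0)]
    print(first_contact_resolution_rate(log))
```

This proxy cannot tell a follow-up on the same query from an unrelated new query within 24 hours, so it will tend to understate the true resolution rate.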

17.4 User Satisfaction Metrics

Metric 11: Net Promoter Score (NPS)

Definition: Likelihood of users recommending system to others
Target: NPS >50 (excellent)
Measurement: Post-interaction survey (0-10 scale)
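
NPS is conventionally computed as the percentage of promoters (ratings of 9 or 10) minus the percentage of detractors (ratings of 0 to 6), giving a value between -100 and 100. A minimal sketch, with invented sample ratings:

```python
from typing import Sequence


def net_promoter_score(ratings: Sequence[int]) -> float:
    """Percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not 0 <= r <= 10 for r in ratings):
        raise ValueError("ratings must be on the 0-10 scale")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


if __name__ == "__main__":
    # Illustrative survey responses (not real data).
    survey = [10, 9, 8, 7, 6, 3, 10, 9, 9, 10]
    print(net_promoter_score(survey))
```

In this sample, 6 promoters and 2 detractors out of 10 responses give a score of 40, which would fall short of the >50 target.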

Metric 12: User Satisfaction (CSAT)

Definition: User satisfaction with interaction
Target: CSAT >4.0 out of 5.0
Measurement: Post-interaction survey (1-5 scale)

Metric 13: Task Completion Rate

Definition: Percentage of users who complete intended task
Target: >85% task completion
Measurement: Workflow completion in Step Functions

17.5 Business Impact Metrics

Metric 14: Cost per Interaction

Definition: Total system cost / number of interactions
Target: <$0.50 per interaction
Measurement: Total cost from AWS Cost Explorer divided by the interaction count from CloudWatch metrics

Metric 15: Time Saved per User

Definition: Time saved compared to manual process
Target: >2 hours saved per application
Measurement: User survey, comparison with manual process

Metric 16: Corruption Reduction

Definition: Number of bribery reports collected
Target: 100+ reports per month (indicates awareness)
Measurement: DynamoDB query on anti-corruption evidence table

Metric 17: Financial Inclusion

Definition: Number of users accessing credit without traditional documentation
Target: 10,000 Shadow Credit Scores generated by Month 12
Measurement: DynamoDB query on Shadow Credit Score table

18. Implementation Roadmap

18.1 Phase 1: Foundation (Months 1-3)

Month 1: Infrastructure Setup

Week 1-2: AWS account setup, IAM roles, VPC configuration
Week 3-4: DynamoDB tables, S3 buckets, CloudFormation templates
Deliverables: Infrastructure as Code, CI/CD pipeline

Month 2: Core Services

Week 1-2: Master Orchestrator Agent, conversation state management
Week 3-4: Voice interface (Amazon Connect, Transcribe, Polly)
Deliverables: Working IVR system, basic routing

Month 3: First Domain (PM Vishwakarma)

Week 1-2: Shadow Credit Agent (Account Aggregator integration)
Week 3-4: Proof of Work Agent (Rekognition, SageMaker model training)
Deliverables: End-to-end PM Vishwakarma workflow

18.2 Phase 2: Expansion (Months 4-6)

Month 4: PMFBY Domain

Week 1-2: First Responder Agent (weather monitoring, proactive outreach)
Week 3-4: Surveyor Assistant Agent (photo validation, insurance submission)
Deliverables: End-to-end PMFBY workflow

Month 5: BOCW Domain

Week 1-2: Digital Work-Log Agent (GPS tracking, selfie verification)
Week 3-4: Interstate Bridge Agent (migration detection, benefit portability)
Deliverables: End-to-end BOCW workflow

Month 6: Communication Channels

Week 1-2: WhatsApp Business API integration
Week 3-4: SMS fallback, web interface
Deliverables: Multi-channel access

18.3 Phase 3: Optimization (Months 7-9)

Month 7: ML Model Refinement

Week 1-2: Trade verification model retraining (more data)
Week 3-4: Crop damage model fine-tuning, deepfake detection
Deliverables: Improved model accuracy (>90%)

Month 8: Knowledge Base Enhancement

Week 1-2: Expand to 1200+ schemes, multilingual support
Week 3-4: Weekly sync automation, semantic search optimization
Deliverables: Comprehensive knowledge bases

Month 9: Security and Compliance

Week 1-2: Penetration testing, vulnerability remediation
Week 3-4: Aadhaar Act compliance audit, data retention policies
Deliverables: Security audit report, compliance certification

18.4 Phase 4: Scale and Launch (Months 10-12)

Month 10: Load Testing and Optimization

Week 1-2: Load testing (10,000 concurrent users)
Week 3-4: Performance optimization, cost optimization
Deliverables: System ready for scale

Month 11: Pilot Launch

Week 1-2: Pilot in 3 districts (1 per scheme domain)
Week 3-4: User feedback collection, bug fixes
Deliverables: Pilot report, user testimonials

Month 12: Full Launch

Week 1-2: National rollout, marketing campaign
Week 3-4: Monitoring, support, continuous improvement
Deliverables: Live system serving 100,000+ users

19. Team Structure and Roles

19.1 Core Team

Product Manager (1)

Responsibilities: Product vision, roadmap, stakeholder management
Skills: Government schemes domain knowledge, user research

Technical Architect (1)

Responsibilities: System design, technology decisions, code reviews
Skills: AWS architecture, AI/ML, distributed systems

Backend Engineers (3)

Responsibilities: Lambda functions, API integrations, database design
Skills: Python, Node.js, AWS services (Lambda, DynamoDB, Step Functions)

ML Engineers (2)

Responsibilities: Model training, deployment, monitoring
Skills: TensorFlow, PyTorch, SageMaker, computer vision, NLP

AI/Agent Engineers (2)

Responsibilities: Bedrock Agent configuration, prompt engineering, knowledge bases
Skills: Amazon Bedrock, Claude, RAG, prompt optimization

DevOps Engineer (1)

Responsibilities: CI/CD, infrastructure, monitoring, security
Skills: CloudFormation, GitHub Actions, CloudWatch, security best practices

QA Engineer (1)

Responsibilities: Testing strategy, automation, quality assurance
Skills: Jest, pytest, Postman, load testing, security testing

UX Designer (1)

Responsibilities: Voice interface design, conversation flows, accessibility
Skills: Voice UI design, user research, accessibility standards

19.2 Extended Team

Domain Experts (3)

PM Vishwakarma expert: Artisan credit, trade verification
PMFBY expert: Crop insurance, claim procedures
BOCW expert: Construction worker welfare, interstate portability

Legal Counsel (1)

Responsibilities: Aadhaar Act compliance, data privacy, regulatory compliance
Skills: Indian data protection laws, government regulations

Security Consultant (1)

Responsibilities: Security audits, penetration testing, compliance
Skills: AWS security, OWASP, vulnerability assessment

Data Analyst (1)

Responsibilities: Metrics tracking, reporting, insights
Skills: SQL, Python, data visualization (Tableau, QuickSight)

19.3 Support Team

Customer Support Agents (5)

Responsibilities: Handle escalations, user support, feedback collection
Skills: Government schemes knowledge, multilingual (Hindi, English, regional languages)

Operations Manager (1)

Responsibilities: System monitoring, incident response, SLA management
Skills: AWS operations, incident management, on-call rotation

20. Appendices

20.1 Glossary of AWS Services

Amazon Bedrock: Managed service for foundation models (Claude, Titan)
Amazon Bedrock Agents: Orchestration framework for multi-step AI workflows
Amazon Bedrock Knowledge Bases: RAG (Retrieval-Augmented Generation) for grounding agents
Amazon Connect: Cloud-based contact center (IVR)
Amazon Transcribe: Speech-to-text service (22 Indian languages)
Amazon Polly: Text-to-speech service (neural voices)
Amazon Comprehend: NLP service (intent classification, entity extraction)
Amazon Translate: Neural machine translation
Amazon Rekognition: Computer vision (image/video analysis, face comparison)
Amazon Textract: Intelligent OCR (document data extraction)
Amazon Location Service: Geospatial services (maps, geocoding, routing)
Amazon SageMaker: ML platform (training, deployment, monitoring)
AWS Lambda: Serverless compute
Amazon DynamoDB: NoSQL database
Amazon S3: Object storage
Amazon OpenSearch Serverless: Managed search and analytics
AWS Step Functions: Workflow orchestration
Amazon EventBridge: Event bus for scheduled rules and event-driven architecture
Amazon Pinpoint: SMS and push notifications
Amazon SES: Email service
AWS KMS: Key management service (encryption)
AWS CloudTrail: API call logging
Amazon CloudWatch: Monitoring and observability
AWS X-Ray: Distributed tracing
AWS Secrets Manager: Secrets storage and rotation
AWS IAM: Identity and access management
Amazon API Gateway: API management
AWS CloudFormation: Infrastructure as Code
AWS CodePipeline: CI/CD pipeline
AWS CodeBuild: Build service

20.2 Acronyms

AI: Artificial Intelligence
API: Application Programming Interface
AWS: Amazon Web Services
BOCW: Building and Other Construction Workers
CIBIL: Credit Information Bureau (India) Limited
CNN: Convolutional Neural Network
CSAT: Customer Satisfaction Score
DDoS: Distributed Denial of Service
FI: Financial Information
FIP: Financial Information Provider
FIU: Financial Information User
GAN: Generative Adversarial Network
GPS: Global Positioning System
IAM: Identity and Access Management
IMD: India Meteorological Department
IRDAI: Insurance Regulatory and Development Authority of India
IVR: Interactive Voice Response
JWT: JSON Web Token
KMS: Key Management Service
MAE: Mean Absolute Error
ML: Machine Learning
MSME: Micro, Small, and Medium Enterprises
NBFC-AA: Non-Banking Financial Company - Account Aggregator
NLP: Natural Language Processing
NPS: Net Promoter Score
OCR: Optical Character Recognition
ONORC: One Nation One Ration Card
OTP: One-Time Password
PII: Personally Identifiable Information
PMFBY: Pradhan Mantri Fasal Bima Yojana (crop insurance)
PMJAY: Pradhan Mantri Jan Arogya Yojana (health insurance)
PWA: Progressive Web App
RAG: Retrieval-Augmented Generation
RBAC: Role-Based Access Control
REST: Representational State Transfer
RPO: Recovery Point Objective
RTO: Recovery Time Objective
S3: Simple Storage Service
SES: Simple Email Service
SHAP: SHapley Additive exPlanations
SLA: Service Level Agreement
SMS: Short Message Service
SOAP: Simple Object Access Protocol
SSML: Speech Synthesis Markup Language
TLS: Transport Layer Security
TTL: Time To Live
UIDAI: Unique Identification Authority of India
UPI: Unified Payments Interface
VPC: Virtual Private Cloud
WAF: Web Application Firewall

20.3 References

Government Schemes:
- PM Vishwakarma: https://pmvishwakarma.gov.in
- PMFBY: https://pmfby.gov.in
- e-Shram: https://eshram.gov.in

AWS Documentation:
- Amazon Bedrock: https://docs.aws.amazon.com/bedrock
- Amazon Connect: https://docs.aws.amazon.com/connect
- Amazon SageMaker: https://docs.aws.amazon.com/sagemaker

Regulatory:
- Aadhaar Act 2016: https://uidai.gov.in/legal-framework/acts.html
- Account Aggregator Framework: https://sahamati.org.in

Technical Standards:
- FI Data Schema: https://api.rebit.org.in
- UIDAI Authentication API: https://uidai.gov.in/ecosystem/authentication-devices-documents/about-aadhaar-paperless-offline-e-kyc.html

Document Version: 1.0
Last Updated: January 24, 2026
Author: BharatSeva AI Design Team
Status: Draft for Review