# Design Document: BharatSeva AI
## 1. System Overview
BharatSeva AI is a multi-agent orchestration system built on AWS using Amazon Bedrock Agents with Claude 3.5 Sonnet as the foundation model. The system deploys 9 AI agents (1 Master Orchestrator + 8 Specialist Agents) to assist India's informal sector workers in navigating government schemes across three domains: PM Vishwakarma (artisan credit), PMFBY (crop insurance), and BOCW (construction worker welfare).
### 1.1 Design Principles
- **Voice-First**: Prioritize voice interaction over text for accessibility
- **Proactive Intelligence**: Monitor external data sources and trigger alerts automatically
- **Alternative Verification**: Use digital footprints (UPI transactions, GPS, photos) instead of traditional documentation
- **Anti-Corruption**: Educate users that services are free and collect evidence of bribery
- **Audit-Proof**: Maintain immutable 7-year audit trails for regulatory compliance
- **Offline Resilience**: Provide SMS fallback when internet is unavailable
### 1.2 Architecture Style
- **Event-Driven**: AWS EventBridge for scheduled polling and event triggers
- **Serverless**: AWS Lambda for all application compute, no EC2 instances to manage (ML inference runs on managed SageMaker endpoints)
- **Multi-Agent**: Amazon Bedrock Agents for domain-specific intelligence
- **Microservices**: Each agent has dedicated action groups (Lambda functions)
## 2. High-Level Architecture
### 2.1 System Components
```
┌─────────────────────────────────────────────────────────────┐
│                      User Entry Points                      │
│    IVR (Amazon Connect) │ WhatsApp │ SMS │ Web Interface    │
└─────────────────────────────┬───────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                  Master Orchestrator Agent                  │
│         (Amazon Bedrock Agent - Claude 3.5 Sonnet)          │
│  - Intent Classification (Amazon Comprehend)                │
│  - Language Detection & Translation (Amazon Translate)      │
│  - Conversation State Management (DynamoDB)                 │
└─────────────────────────────┬───────────────────────────────┘
                              │
            ┌─────────────────┼─────────────────┐
            │                 │                 │
            ▼                 ▼                 ▼
     ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
     │PM Vishwakarma│  │    PMFBY     │  │     BOCW     │
     │ Agent Swarm  │  │ Agent Swarm  │  │ Agent Swarm  │
     │  (3 agents)  │  │  (3 agents)  │  │  (2 agents)  │
     └──────┬───────┘  └──────┬───────┘  └──────┬───────┘
            │                 │                 │
            └─────────────────┼─────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                    Shared Services Layer                    │
│  - Knowledge Bases (OpenSearch Serverless)                  │
│  - Document Processing (Textract, Rekognition)              │
│  - ML Models (SageMaker Endpoints)                          │
│  - Audit Trail (DynamoDB Streams → S3)                      │
│  - External Integrations (Account Aggregator, Weather APIs) │
└─────────────────────────────────────────────────────────────┘
```
### 2.2 Data Flow
1. **User Input** → IVR/WhatsApp/SMS → Amazon Transcribe (voice-to-text)
2. **Language Processing** → Amazon Translate (to English) → Amazon Comprehend (intent extraction)
3. **Orchestration** → Master Orchestrator routes to specialist agent
4. **Agent Processing** → Bedrock Agent invokes action groups (Lambda functions)
5. **External Actions** → Lambda calls AWS services, external APIs, ML models
6. **Response Generation** → Bedrock Agent generates response → Amazon Translate (to user language)
7. **Response Delivery** → Amazon Polly (text-to-speech) → IVR/WhatsApp/SMS
8. **Audit Logging** → All state changes → DynamoDB Streams → S3
## 3. Agent Architecture
### 3.1 Master Orchestrator Agent
**Purpose**: Route queries to specialized agents and manage conversation state
**Foundation Model**: Claude 3.5 Sonnet (`anthropic.claude-3-5-sonnet-20241022-v2:0`)
**Action Groups**:
1. **classify_intent** (Lambda: `orchestrator-intent-classifier`)
- Input: User query text (translated to English)
- Processing: Calls Amazon Comprehend Custom Classifier
- Output: Intent category (loan_request, insurance_claim, status_check, etc.) + confidence score
- Routing logic: Maps intent to specialist agent
2. **detect_scheme_domain** (Lambda: `orchestrator-domain-detector`)
- Input: User query text
- Processing: Keyword matching + entity extraction
- Keywords: PM Vishwakarma (["loan", "vishwakarma", "artisan", "credit"]), PMFBY (["crop", "insurance", "fasal", "hailstorm"]), BOCW (["construction", "labor", "bocw", "migrant"])
- Output: Scheme domain + confidence score
3. **manage_conversation_state** (Lambda: `orchestrator-state-manager`)
- Input: Session ID, user message, agent response
- Processing: DynamoDB operations (GetItem, PutItem, UpdateItem)
- State schema: `{session_id, user_id, current_domain, conversation_history[], pending_actions[], last_updated, ttl}`
- TTL: 24 hours for automatic cleanup
- Output: Updated conversation context
4. **route_to_specialist** (Lambda: `orchestrator-router`)
- Input: Scheme domain, conversation context
- Processing: Invokes target Bedrock Agent via boto3
- Agent mapping: PM_Vishwakarma → agent_id_1, PMFBY → agent_id_2, BOCW → agent_id_3
- Output: Specialist agent response
5. **escalate_to_human** (Lambda: `orchestrator-human-escalation`)
- Input: Session ID, escalation reason
- Processing: Creates ticket in support system, notifies human operator
- Triggers: Negative sentiment detected, agent confidence <50%, user explicitly requests human
- Output: Ticket ID, estimated wait time
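
The state schema and 24-hour TTL used by `manage_conversation_state` can be sketched as plain dictionaries before any DynamoDB call is made. This is a minimal illustration, not the actual Lambda: the helper names `new_session_state` and `append_turn` are hypothetical, and refreshing the TTL on every turn is an assumption (the design only states that items expire after 24 hours). DynamoDB TTL attributes are stored as epoch seconds, which is why `ttl` is an integer.

```python
import time

SESSION_TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL from the state schema


def new_session_state(session_id, user_id, now=None):
    """Build an empty conversation-state item using the documented schema fields."""
    now = int(time.time()) if now is None else int(now)
    return {
        "session_id": session_id,
        "user_id": user_id,
        "current_domain": None,
        "conversation_history": [],
        "pending_actions": [],
        "last_updated": now,
        "ttl": now + SESSION_TTL_SECONDS,  # epoch seconds, as DynamoDB TTL expects
    }


def append_turn(state, user_message, agent_response, now=None):
    """Return a copy of the state with one turn appended.

    Assumption: the TTL is pushed forward on every turn, so a session
    expires 24 hours after its last activity rather than after creation.
    """
    now = int(time.time()) if now is None else int(now)
    updated = dict(state)
    updated["conversation_history"] = state["conversation_history"] + [
        {"user": user_message, "agent": agent_response, "at": now}
    ]
    updated["last_updated"] = now
    updated["ttl"] = now + SESSION_TTL_SECONDS
    return updated
```

The resulting dictionary could then be written with a DynamoDB `PutItem` or `UpdateItem` call; keeping the item construction separate makes the TTL arithmetic easy to test in isolation.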
**Knowledge Base**: KB1 (All Government Schemes - 1200+ schemes)
- Vector embeddings of scheme guidelines, FAQs, eligibility criteria
- Updated weekly via S3 sync
**Guardrails**:
- Accuracy threshold: 70% confidence required for routing
- Safety filters: Block PII leakage, prevent hallucination of scheme details
- Fallback: Ask clarifying questions if confidence <70%
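
The interaction between keyword-based domain detection, the 70% routing guardrail, and the 50% escalation trigger can be illustrated with a short sketch. The keyword lists come from `detect_scheme_domain` above; the confidence formula (the winning domain's share of all keyword hits) and the function `decide_action` are assumptions made for illustration, since the design does not specify how the confidence score is computed.

```python
DOMAIN_KEYWORDS = {
    "PM_Vishwakarma": ["loan", "vishwakarma", "artisan", "credit"],
    "PMFBY": ["crop", "insurance", "fasal", "hailstorm"],
    "BOCW": ["construction", "labor", "bocw", "migrant"],
}
ROUTING_THRESHOLD = 0.70     # guardrail: minimum confidence to route
ESCALATION_THRESHOLD = 0.50  # escalate_to_human trigger


def detect_scheme_domain(query):
    """Return (domain, confidence) from keyword hits.

    Confidence is the winning domain's share of all keyword hits,
    an illustrative choice rather than a documented formula.
    """
    words = [w.strip(".,?!") for w in query.lower().split()]
    hits = {
        domain: sum(1 for w in words if w in keywords)
        for domain, keywords in DOMAIN_KEYWORDS.items()
    }
    total = sum(hits.values())
    if total == 0:
        return None, 0.0
    best = max(hits, key=hits.get)
    return best, hits[best] / total


def decide_action(domain, confidence):
    """Map a confidence score to route, clarify, or escalate."""
    if domain is not None and confidence >= ROUTING_THRESHOLD:
        return "route"
    if confidence < ESCALATION_THRESHOLD:
        return "escalate"
    return "clarify"
```

For example, "I need a loan for my artisan shop" matches only PM Vishwakarma keywords and is routed, while "crop loan insurance" splits its hits across two domains and falls into the clarifying-question band between 50% and 70%.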
### 3.2 PM Vishwakarma Agent Swarm
#### 3.2.1 Shadow Credit Agent
**Purpose**: Generate alternative credit scores from UPI transaction history
**Foundation Model**: Claude 3.5 Sonnet
**Action Groups**:
1. **initiate_account_aggregator_consent** (Lambda: `pmv-aa-consent-initiator`)
- Input: User Aadhaar number, mobile number
- Processing: Calls Account Aggregator API (NBFC-AA licensed entity)
- Consent flow: Generate consent request → Send OTP → Validate OTP → Receive consent token
- Output: Consent token (valid 12 months), consent ID
2. **fetch_transaction_history** (Lambda: `pmv-aa-data-fetcher`)
- Input: Consent token, date range (12 months)
- Processing: Calls Account Aggregator FIU (Financial Information User) API
- Data retrieved: UPI transaction history (date, amount, payer/payee, transaction ID)
- Storage: Encrypted in S3 with KMS, DynamoDB for metadata
- Output: Transaction dataset (JSON array)
3. **calculate_shadow_credit_score** (Lambda: `pmv-shadow-credit-calculator`)
- Input: Transaction dataset
- Processing: Pandas-based financial analysis
- Metrics calculated:
- Average monthly income: Sum of credits / 12 months
- Transaction frequency: Count of transactions / days
- Customer diversity: Unique payers count
- Income stability: Standard deviation of monthly income (a lower deviation indicates more stable income)
- Digital footprint age: Days since first transaction
- Scoring algorithm:
```
Shadow_Score = (Income_Stability * 0.4) +
(Transaction_Frequency * 0.3) +
(Customer_Diversity * 0.2) +
(Digital_Footprint * 0.1)
Normalized to 300-900 scale
```
- Thresholds: 700+ (high confidence), 600-699 (moderate), <600 (requires review)
- Output: Shadow Credit Score (integer), contributing factors (JSON)
4. **generate_lending_memo** (Lambda: `pmv-lending-memo-generator`)
- Input: Shadow Credit Score, transaction dataset, user details
- Processing: ReportLab Python library for PDF generation
- Memo contents:
- Header: BharatSeva AI logo, generation date, applicant name (masked Aadhaar)
- Shadow Credit Score: Large font display with color coding (green ≥700, yellow 600-699, red <600)
- 6-month income chart: Matplotlib bar chart embedded in PDF
- Transaction consistency graph: Line chart showing daily transaction count
- Risk assessment summary: AI-generated text explaining score
- Digital signature: AWS KMS signature for tamper-proof authenticity
- Storage: S3 with pre-signed URL (7-day expiry)
- Output: PDF URL, memo ID
5. **submit_to_bank** (Lambda: `pmv-bank-submission`)
- Input: Memo ID, bank branch details
- Processing: Amazon SES email with PDF attachment
- Email template: Formal letter to bank manager with lending memo
- SMS notification: Sent to applicant confirming submission
- Follow-up: EventBridge scheduled rule checks for bank response after 7 days
- Output: Submission confirmation, tracking ID
**Knowledge Base**: KB2 (PM Vishwakarma Policies)
**Guardrails**: Minimum 6 months transaction history required, reject if <10 transactions per month
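
The weighted scoring formula in `calculate_shadow_credit_score` leaves open how each component is scaled before weighting. The sketch below is one possible reading, using only the standard library instead of Pandas so that it stays self-contained. Everything about the normalization is an assumption: each component is clamped to the range 0 to 1, income stability is taken as one minus the coefficient of variation, and the caps (`tx_cap`, `payer_cap`, `footprint_cap`) are hypothetical tuning parameters. Only the weights, the 300-900 scale, and the thresholds come from the design.

```python
from statistics import mean, pstdev

WEIGHTS = {
    "income_stability": 0.4,
    "transaction_frequency": 0.3,
    "customer_diversity": 0.2,
    "digital_footprint": 0.1,
}


def _clamp01(x):
    return max(0.0, min(1.0, x))


def shadow_credit_score(monthly_income, tx_per_day, unique_payers, footprint_days,
                        tx_cap=5.0, payer_cap=50, footprint_cap=365):
    """Return (score, components) on the 300-900 scale.

    The caps and the stability formula are illustrative assumptions.
    """
    avg = mean(monthly_income)
    # Coefficient of variation: lower spread relative to the mean is more stable.
    cv = pstdev(monthly_income) / avg if avg > 0 else 1.0
    components = {
        "income_stability": _clamp01(1.0 - cv),
        "transaction_frequency": _clamp01(tx_per_day / tx_cap),
        "customer_diversity": _clamp01(unique_payers / payer_cap),
        "digital_footprint": _clamp01(footprint_days / footprint_cap),
    }
    weighted = sum(WEIGHTS[name] * value for name, value in components.items())
    return round(300 + 600 * weighted), components


def confidence_band(score):
    """Apply the documented thresholds to a score."""
    if score >= 700:
        return "high confidence"
    if score >= 600:
        return "moderate"
    return "requires review"
```

Returning the per-component values alongside the score matches the "contributing factors (JSON)" output and makes it possible to explain a low score to the applicant.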
#### 3.2.2 Proof of Work Agent
**Purpose**: Verify artisan trades using computer vision on work photos/videos
**Foundation Model**: Claude 3.5 Sonnet
**Action Groups**:
1. **collect_work_media** (Lambda: `pmv-media-collector`)
- Input: WhatsApp message with photo/video attachment
- Processing: Download from WhatsApp Business API, upload to S3
- Validation: File size <50MB, formats (MP4, MOV, AVI, JPEG, PNG)
- Video processing: FFmpeg conversion to MP4 H.264 codec
- Output: S3 object key, media type (photo/video)
2. **analyze_trade_verification** (Lambda: `pmv-trade-analyzer`)
- Input: S3 object key
- Processing:
- For photos: Amazon Rekognition DetectLabels API
- For videos: SageMaker endpoint invocation (Custom CNN model)
- SageMaker model details:
- Endpoint: `trade-verification-cnn-endpoint`
- Instance: ml.m5.xlarge (real-time inference)
- Input: Video frames (5 FPS sampling), resized to 224x224
- Output: Trade category + confidence score
- Trade categories: Carpentry, Pottery, Blacksmithing, Weaving, Cobbling, Tailoring, Masonry, Plumbing, Electrical, Goldsmithing, Basket Weaving, Doll Making, Toy Making, Fishing Net Making, Locksmithing, Sculpting, Stone Carving, Wood Carving
- Confidence thresholds: >85% approved, 70-85% manual review, <70% rejected
- Output: Trade type, confidence score, detected tools/materials
3. **extract_metadata** (Lambda: `pmv-metadata-extractor`)
- Input: S3 object key
- Processing: EXIF data extraction using Pillow library
- Metadata extracted:
- GPS coordinates (latitude, longitude)
- Timestamp (photo/video capture time)
- Device model (for fraud detection)
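- Content hash (SHA-256, for exact-duplicate detection; a cryptographic hash will not match re-encoded or resized copies, which would require a perceptual hash)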
- Validation: Timestamp within last 30 days, GPS within India boundaries
- Output: Metadata JSON
4. **generate_trade_certificate** (Lambda: `pmv-certificate-generator`)
- Input: Trade type, confidence score, metadata, user details
- Processing: ReportLab PDF generation
- Certificate contents:
- Header: "Digital Proof of Trade - Government of India"
- Applicant details: Name, masked Aadhaar (XXXX-XXXX-1234)
- Trade type: Large font with icon
- Verification statement: "Trade verified by BharatSeva AI with 89% confidence"
- Video thumbnail grid: 4 key frames showing work activity
- GPS coordinates: Map image from Amazon Location Service
- QR code: Contains certificate hash for instant verification
- Digital signature: AWS KMS signature
- Storage: S3 with permanent retention
- Output: Certificate PDF URL, certificate ID
5. **submit_to_pm_vishwakarma_portal** (Lambda: `pmv-portal-submission`)
- Input: Certificate ID, user details
- Processing: API call to PM Vishwakarma portal (government API)
- Payload: Certificate PDF, applicant Aadhaar, trade type, verification date
- Notification: WhatsApp message to user with certificate copy
- District notification: Email to District Implementation Committee
- Output: Submission confirmation, portal reference number
**Knowledge Base**: KB2 (PM Vishwakarma Policies)
**Guardrails**: Minimum 85% confidence for auto-approval, manual review for 70-85%
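
The confidence thresholds and metadata validation rules for this agent reduce to a few comparisons, sketched below. The function names are hypothetical, and the India bounding box is a rough rectangular approximation chosen for illustration; a production check would more likely use actual boundary polygons. Treating exactly 85% as manual review follows the ">85% approved" wording.

```python
from datetime import datetime, timedelta, timezone

# Rough bounding box for India; an approximation for illustration only.
INDIA_LAT = (6.5, 37.5)
INDIA_LON = (68.0, 97.5)
MAX_MEDIA_AGE = timedelta(days=30)


def review_decision(confidence_pct):
    """Apply the documented thresholds: >85 approved, 70-85 manual review."""
    if confidence_pct > 85:
        return "approved"
    if confidence_pct >= 70:
        return "manual_review"
    return "rejected"


def validate_metadata(lat, lon, captured_at, now=None):
    """Return a list of validation problems (empty when the metadata passes)."""
    now = now or datetime.now(timezone.utc)
    problems = []
    in_lat = INDIA_LAT[0] <= lat <= INDIA_LAT[1]
    in_lon = INDIA_LON[0] <= lon <= INDIA_LON[1]
    if not (in_lat and in_lon):
        problems.append("gps_outside_india")
    age = now - captured_at
    # Reject media older than 30 days or stamped in the future.
    if age > MAX_MEDIA_AGE or age < timedelta(0):
        problems.append("timestamp_out_of_range")
    return problems
```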
#### 3.2.3 Nudge & Escalation Agent
**Purpose**: Track PM Vishwakarma application status and escalate delays
**Foundation Model**: Claude 3.5 Sonnet
**Action Groups**:
1. **poll_application_status** (Lambda: `pmv-status-poller`)
- Trigger: EventBridge scheduled rule (daily at 6 AM IST)
- Input: List of pending applications from DynamoDB
- Processing: API calls to PM Vishwakarma portal for each application
- Status stages: Submitted → Panchayat Verification → District Approval → Bank Review → Loan Disbursement
- Storage: DynamoDB update with new status, timestamp
- Output: Status changes (array of applications with updated status)
2. **calculate_sla_breach** (Lambda: `pmv-sla-calculator`)
- Input: Application ID, current status, status history
- Processing: Calculate days pending at each stage
- SLA thresholds:
- Day 15 at Gram Panchayat: Level 1 escalation
- Day 30 at District Committee: Level 2 escalation
- Day 45 at Bank Review: Level 3 escalation
- Day 60 overall: Level 4 escalation (Ministry)
- Output: Escalation level, days overdue
3. **execute_escalation** (Lambda: `pmv-escalation-executor`)
- Input: Application ID, escalation level, official contact details
- Processing:
- Level 1: IVR call to Gram Pradhan (Amazon Connect), SMS to District Officer
- Level 2: Email to District Collector (Amazon SES), SMS to State Nodal Officer
- Level 3: IVR call to bank branch manager, email with Shadow Credit Memo
- Level 4: Email to Ministry of MSME grievance cell
- IVR script: Pre-recorded message in Hindi/local language explaining delay
- Email template: Formal escalation letter with application details
- Output: Escalation confirmation, notification IDs
4. **notify_applicant** (Lambda: `pmv-applicant-notifier`)
- Input: Application ID, escalation level
- Processing: SMS via Amazon Pinpoint
- Message templates:
- Level 1: "Your application is being escalated to District Officer due to 15-day delay"
- Level 2: "District Officer has been notified. You should receive an update within 48 hours"
- Approval: "Your application has been approved! Bank interview scheduled for [date]"
- Output: SMS delivery status
**Knowledge Base**: KB2 (PM Vishwakarma Policies)
**Guardrails**: Maximum 1 escalation per level per application (prevent spam)
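
The SLA thresholds and the one-escalation-per-level guardrail can be combined in a single function, sketched here. Two points are assumptions rather than documented behavior: the per-stage thresholds are compared against days spent in the current stage (the design says "days pending at each stage"), and the stage names are mapped from the status stages listed for `poll_application_status`. When both a stage breach and the 60-day overall breach apply, the higher level wins.

```python
# stage -> (day threshold, escalation level); stage names follow the status stages
STAGE_SLA = {
    "Panchayat Verification": (15, 1),
    "District Approval": (30, 2),
    "Bank Review": (45, 3),
}
OVERALL_SLA = (60, 4)  # Level 4 (Ministry) after 60 days overall


def calculate_sla_breach(stage, days_in_stage, days_overall,
                         already_escalated=frozenset()):
    """Return (escalation_level, days_overdue); level 0 means no new escalation."""
    candidates = []
    if days_overall >= OVERALL_SLA[0]:
        candidates.append((OVERALL_SLA[1], days_overall - OVERALL_SLA[0]))
    if stage in STAGE_SLA:
        threshold, level = STAGE_SLA[stage]
        if days_in_stage >= threshold:
            candidates.append((level, days_in_stage - threshold))
    # Guardrail: at most one escalation per level per application.
    candidates = [c for c in candidates if c[0] not in already_escalated]
    return max(candidates) if candidates else (0, 0)
```

Passing the set of levels already used for an application (for example, read from its DynamoDB record) keeps the daily poller from re-sending the same escalation.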
### 3.3 PMFBY Agent Swarm
#### 3.3.1 First Responder Agent
**Purpose**: Proactively detect weather events and initiate crop insurance claims
**Foundation Model**: Claude 3.5 Sonnet
**Action Groups**:
1. **monitor_weather_events** (Lambda: `pmfby-weather-monitor`)
- Trigger: EventBridge scheduled rule (every 15 minutes)
- Input: None (polls external APIs)
- APIs integrated:
- India Meteorological Department (IMD) API: Real-time weather alerts
- Skymet Weather API: District-level granular data
- NASA POWER API: Satellite data (rainfall, temperature)
- Event detection: Hailstorm warnings, unseasonal rainfall >50mm, cyclone alerts, flood warnings, drought conditions
- Storage: DynamoDB table `weather_events` with timestamp, location, event type
- Output: Array of detected weather events
2. **identify_affected_farmers** (Lambda: `pmfby-farmer-identifier`)
- Input: Weather event (location, event type)
- Processing: DynamoDB geo-index query
- Query: Find farmers within 5km radius of event location
- Filters: Active PMFBY policy holders, crop type matches risk (wheat for hailstorm, rice for flood)
- Amazon Location Service: Radius-based geospatial search
- Typical result: 200-500 farmers per localized event
- Output: Array of affected farmer IDs with contact details
3. **initiate_proactive_outreach** (Lambda: `pmfby-outreach-initiator`)
- Input: Array of affected farmer IDs
- Processing: Amazon Connect for automated IVR calls
- Call timing: Within 30 minutes of event detection
- Call script (local language): "Namaste, I'm the PMFBY assistant. There was a hailstorm in your area this morning. Did your crop suffer damage?"
- Voice response: Amazon Transcribe captures yes/no
- Branching: If "yes" → proceed to auto-intimation, If "no" → end call with "Stay safe"
- Output: Call results (answered, damage confirmed, no damage, no answer)
4. **file_auto_intimation** (Lambda: `pmfby-intimation-filer`)
- Input: Farmer ID, event type, damage confirmation
- Processing: API call to insurance company portal
- Payload: Farmer ID, event type, event date, crop type, area affected, GPS coordinates
- Insurance companies: 14 companies (ICICI Lombard, HDFC Ergo, Agriculture Insurance Company, etc.)
- Adapter pattern: Different Lambda functions for each company's API format
- Timestamp: Ensures submission within 72-hour window
- Output: Claim reference number, intimation confirmation
5. **send_follow_up_instructions** (Lambda: `pmfby-followup-sender`)
- Input: Claim reference number, farmer contact
- Processing: WhatsApp message via Business API
- Message content:
- "Your claim intimation has been filed. Claim ID: CLM-2026-789456"
- "Within 48 hours, submit photos of damaged crop. Follow these tips..."
- Link to Surveyor Assistant Agent
- Output: Message delivery status
**Knowledge Base**: KB3 (PMFBY Insurance - 14 company policies)
**Guardrails**: Only contact farmers with active policies, respect 72-hour intimation window
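
The 5 km radius filter and the 72-hour intimation window can be expressed without any AWS dependency, which is useful for unit-testing the selection logic. The sketch below uses the haversine formula in place of the DynamoDB geo-index or Amazon Location Service query named above; the record fields (`farmer_id`, `lat`, `lon`, `policy_active`) and function names are hypothetical.

```python
import math
from datetime import datetime, timedelta, timezone

EARTH_RADIUS_KM = 6371.0
INTIMATION_WINDOW = timedelta(hours=72)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def affected_farmers(event, farmers, radius_km=5.0):
    """Return IDs of active policy holders within radius_km of the event."""
    return [
        f["farmer_id"]
        for f in farmers
        if f["policy_active"]
        and haversine_km(event["lat"], event["lon"], f["lat"], f["lon"]) <= radius_km
    ]


def within_intimation_window(event_time, now=None):
    """True while a claim intimation can still be filed for the event."""
    now = now or datetime.now(timezone.utc)
    return timedelta(0) <= now - event_time <= INTIMATION_WINDOW
```

A linear scan like this only suits small candidate sets; the indexed geospatial query described in the design would do the coarse filtering first.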
#### 3.3.2 Surveyor Assistant Agent
**Purpose**: Guide farmers in submitting high-quality crop damage photos
**Foundation Model**: Claude 3.5 Sonnet
**Action Groups**:
1. **accept_photo_submission** (Lambda: `pmfby-photo-receiver`)
- Input: WhatsApp message with photo attachment, claim ID
- Processing: Download photo from WhatsApp Business API
- Validation: File size check, format check (JPEG, PNG, HEIC)
- HEIC conversion: ImageMagick for iPhone photos
- Storage: S3 bucket `pmfby-claim-photos` with folder structure `{claim_id}/{photo_number}.jpg`
- Limit: Maximum 10 photos per claim (insurance requirement)
- Output: S3 object key, photo number
2. **validate_photo_quality** (Lambda: `pmfby-photo-validator`)
- Input: S3 object key
- Processing: Amazon Rekognition DetectLabels (labels and image quality properties) and DetectModerationLabels + custom quality checks
- Quality checks:
- Resolution: Minimum 1920x1080 (read from the image header, for example with Pillow; the Rekognition DetectLabels response does not include pixel dimensions)
- Brightness: 30-90% optimal (calculated from pixel histogram)
- Blur detection: Rekognition sharpness score (reject if <85%)
- GPS metadata: Extract from EXIF (required for location validation)
- Content validation:
- Crop type detection: Rekognition labels (wheat, rice, cotton, etc.)
- Damage indicators: Custom labels (hail_marks, waterlogging, pest_damage)
- Reference objects: Detect hand, stick for scale
- Outdoor setting: Reject indoor photos (check for sky, field labels)
- Output: Validation result (accepted/rejected), rejection reason
3. **provide_interactive_feedback** (Lambda: `pmfby-feedback-sender`)
- Input: Validation result, photo number
- Processing: WhatsApp message generation
- Feedback messages:
- Accepted: "✅ Photo 1: Accepted"
- Rejected (blur): "❌ Photo 2: Too blurry. Hold phone steady and retake"
- Rejected (crop not visible): "❌ Photo 3: Crop not visible. Move closer and retake"
- Guidance messages:
- "Stand in the middle of damaged area"
- "Include reference for scale (your hand or a stick)"
- "Ensure good lighting, avoid shadows"
- Output: Message delivery status
4. **enrich_photo_metadata** (Lambda: `pmfby-metadata-enricher`)
- Input: S3 object key, validation result
- Processing: Create JSON metadata file
- Metadata schema:
```json
{
"photo_id": "CLM-2026-789456-001",
"claim_id": "CLM-2026-789456",
"gps_coordinates": {"latitude": 28.7041, "longitude": 77.1025},
"timestamp": "2026-01-24T10:30:00Z",
"quality_score": 92,
"ai_damage_estimate": 45,
"rekognition_labels": ["wheat", "hail_damage", "field", "outdoor"],
"confidence_scores": {"wheat": 98.5, "hail_damage": 87.3}
}
```
- Storage: S3 alongside photo with `.json` extension
- Output: Metadata object key
5. **submit_to_insurance_portal** (Lambda: `pmfby-insurance-uploader`)
- Input: Claim ID, array of accepted photo S3 keys
- Processing: Multi-company API integration
- Insurance company adapters: 14 Lambda functions (one per company)
- Upload format: Multipart form data with photos + metadata JSON
- Submission receipt: Generated with photo count, timestamp, claim ID
- Farmer notification: SMS "✅ All photos accepted and submitted. Surveyor will visit within 3 days"
- Output: Submission confirmation, portal reference number
**Knowledge Base**: KB3 (PMFBY Insurance)
**Guardrails**: Maximum 10 photos per claim, minimum 3 photos required for submission
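
Once the measurements are available (dimensions from the image header, brightness and sharpness from the image analysis step, GPS from EXIF), the accept/reject decision is a set of threshold checks. The sketch below takes those measurements as plain arguments; the function names and rejection codes are hypothetical, and accepting portrait orientation by comparing the long and short sides is an assumption.

```python
MIN_LONG_SIDE, MIN_SHORT_SIDE = 1920, 1080
BRIGHTNESS_RANGE = (30, 90)   # percent, from the quality checks
MIN_SHARPNESS = 85            # percent
MIN_PHOTOS, MAX_PHOTOS = 3, 10


def validate_photo(width, height, brightness, sharpness, has_gps):
    """Return (accepted, reasons) for one photo."""
    reasons = []
    # Assumption: portrait photos are acceptable, so compare long/short sides.
    if max(width, height) < MIN_LONG_SIDE or min(width, height) < MIN_SHORT_SIDE:
        reasons.append("resolution_too_low")
    if not BRIGHTNESS_RANGE[0] <= brightness <= BRIGHTNESS_RANGE[1]:
        reasons.append("poor_lighting")
    if sharpness < MIN_SHARPNESS:
        reasons.append("too_blurry")
    if not has_gps:
        reasons.append("gps_missing")
    return len(reasons) == 0, reasons


def photo_key(claim_id, photo_number):
    """S3 key following the documented {claim_id}/{photo_number}.jpg layout."""
    return f"{claim_id}/{photo_number}.jpg"


def ready_to_submit(accepted_count):
    """Guardrail: between 3 and 10 accepted photos per claim."""
    return MIN_PHOTOS <= accepted_count <= MAX_PHOTOS
```

Returning all rejection reasons at once, rather than stopping at the first, lets the feedback message tell the farmer everything to fix in a single retake.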
#### 3.3.3 Claim Tracker Agent
**Purpose**: Monitor insurance claim status across 14 companies and provide transparency
**Foundation Model**: Claude 3.5 Sonnet
**Action Groups**:
1. **poll_insurance_apis** (Lambda: `pmfby-claim-poller`)
- Trigger: EventBridge scheduled rule (every 6 hours, adaptive to every 2 hours for critical stages)
- Input: List of active claims from DynamoDB
- Processing: API calls to 14 insurance company portals
- API formats: REST (10 companies), SOAP (4 companies)
- Authentication: Rotating API keys stored in AWS Secrets Manager
- Fallback: Web scraping using Selenium on Lambda (if API unavailable)
- Status stages: Intimation Received → Surveyor Assigned → Assessment Complete → Claim Approved → Payment Released
- Storage: DynamoDB update with new status, timestamp
- Output: Status changes (array of claims with updated status)
2. **detect_status_changes** (Lambda: `pmfby-change-detector`)
- Input: Current status, previous status (from DynamoDB)
- Processing: Compare status fields, detect transitions
- Change detection: Status field change, surveyor assignment, payment release
- Output: Array of status change events
3. **notify_status_change** (Lambda: `pmfby-status-notifier`)
- Input: Status change event, farmer contact
- Processing: SMS via Amazon Pinpoint
- Message templates:
- Surveyor assigned: "Surveyor Mr. Rajesh Sharma will visit your farm on 24th Jan. Mobile: +91-XXXX"
- Assessment complete: "Survey completed. Claim under review by insurance company"
- Claim approved: "Claim approved! Payment of ₹45,000 will be released within 7-10 days"
- Payment released: "Payment of ₹45,000 credited to your account ending XXXX"
- Output: SMS delivery status
4. **explain_delays** (Lambda: `pmfby-delay-explainer`)
- Input: Claim ID, current status, days pending
- Processing: Bedrock Agent generates plain-language explanation
- Explanation logic:
- If status = "Claim Approved" and days >10: "Payment pending because state government subsidy (40% share) not yet released. This is a normal government process, no bribe needed."
- If status = "Surveyor Assigned" and days >7: "Surveyor visits are delayed due to high claim volume. Your turn is coming soon."
- Transparency: Differentiates legitimate delays vs stuck claims
- Expected timeline: Provides realistic timeframe based on historical data
- Output: Explanation text
5. **file_grievance** (Lambda: `pmfby-grievance-filer`)
- Input: Claim ID, delay reason
- Trigger: Automatic if claim stuck >90 days
- Processing: API call to insurance ombudsman portal
- Grievance payload: Claim details, delay duration, insurance company name
- Ministry notification: Email to Ministry of Agriculture Insurance Cell
- Farmer notification: SMS "We have escalated your delayed claim to the ombudsman. Reference: GRV-2026-XXXX"
- Output: Grievance reference number
**Knowledge Base**: KB3 (PMFBY Insurance)
**Guardrails**: Maximum 1 grievance per claim, 90-day minimum before escalation
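
The change detection, adaptive polling, and grievance guardrail can be sketched as three small functions operating on claim-to-status mappings. Which stages count as "critical" for the 2-hour polling interval is not stated in the design, so `CRITICAL_STAGES` below is an assumption, as are the function names.

```python
# Assumed set of stages polled every 2 hours; not specified in the design.
CRITICAL_STAGES = {"Surveyor Assigned", "Claim Approved"}
GRIEVANCE_AFTER_DAYS = 90


def detect_status_changes(previous, current):
    """Compare two {claim_id: status} snapshots and list the transitions.

    Claims missing from the previous snapshot are reported with from=None.
    """
    changes = []
    for claim_id in sorted(current):
        old, new = previous.get(claim_id), current[claim_id]
        if old != new:
            changes.append({"claim_id": claim_id, "from": old, "to": new})
    return changes


def poll_interval_hours(status):
    """6-hour default, tightened to 2 hours for critical stages."""
    return 2 if status in CRITICAL_STAGES else 6


def should_file_grievance(days_stuck, already_filed):
    """Guardrail: one grievance per claim, only after more than 90 days."""
    return days_stuck > GRIEVANCE_AFTER_DAYS and not already_filed
```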
### 3.4 BOCW Agent Swarm
#### 3.4.1 Digital Work-Log Agent
**Purpose**: GPS and selfie-based attendance tracking for construction workers
**Foundation Model**: Claude 3.5 Sonnet
**Action Groups**:
1. **process_checkin** (Lambda: `bocw-checkin-processor`)
- Input: Worker ID, SMS/WhatsApp message "शुरू" (start), selfie, GPS coordinates
- Processing:
- GPS validation: Amazon Location Service geofencing API
- Geofence check: Validate GPS within 100m of known construction site
- Timestamp: Record check-in time in DynamoDB
- Construction site database: DynamoDB table `construction_sites` with GPS coordinates
- Unregistered sites: If GPS doesn't match, request site photo for validation
- Output: Check-in confirmation, site name
2. **verify_selfie_biometric** (Lambda: `bocw-selfie-verifier`)
- Input: Selfie S3 key, worker Aadhaar photo S3 key
- Processing: Amazon Rekognition CompareFaces API
- Similarity threshold: >95% required for approval
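- Liveness detection: Rekognition DetectFaces quality attributes (brightness, sharpness) as a basic screen; DetectFaces does not by itself distinguish a live face from a printed photo, so a dedicated check such as Rekognition Face Liveness is needed for spoof detection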
- Fraud prevention: Prevents buddy punching (each worker must submit own selfie)
- Output: Verification result (approved/rejected), similarity score
3. **validate_construction_site** (Lambda: `bocw-site-validator`)
- Input: Site photo S3 key, GPS coordinates
- Processing: Amazon Rekognition DetectLabels API
- Construction indicators: Crane, scaffold, concrete, rebar, cement bags, construction equipment
- Confidence threshold: >80% for at least 3 construction indicators
- Site registration: If validated, add to `construction_sites` table as "Unregistered Construction Site"
- Output: Validation result, detected labels
4. **process_checkout** (Lambda: `bocw-checkout-processor`)
- Input: Worker ID, SMS/WhatsApp message "समाप्त" (end), selfie, GPS coordinates
- Processing:
- GPS validation: Same as check-in
- Selfie verification: Same as check-in
- Hours calculation: Checkout time - check-in time
- Minimum hours: 4 hours required to count as valid day
- Storage: DynamoDB update with checkout time, hours worked
- Output: Checkout confirmation, hours worked
5. **generate_90day_certificate** (Lambda: `bocw-certificate-generator`)
- Trigger: Automatic when worker reaches 90 valid work days
- Input: Worker ID
- Processing: Query DynamoDB for all work logs
- Certificate contents:
- Header: "90-Day Employment Certificate - e-Shram"
- Worker details: Name, Aadhaar number, e-Shram ID
- Work summary: 90 unique work dates, total hours worked
- Site details: Site names and GPS coordinates
- Verification: Selfie verification hashes (SHA-256)
- Digital signature: AWS KMS signature
- Submission: API call to e-Shram portal and state BOCW board
- Replaces: Contractor employment certificate requirement
- Output: Certificate PDF URL, submission confirmation
**Knowledge Base**: KB4 (BOCW State Rules - 28 state boards)
**Guardrails**: Minimum 4 hours per day, maximum 12 hours per day, 95% selfie similarity required
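
The hours calculation and the 90-day certificate trigger can be sketched from check-in/check-out pairs alone. The function names are hypothetical, and treating a shift longer than 12 hours as invalid (rather than capping it at 12) is an assumption, since the guardrail only states the maximum.

```python
from datetime import datetime, timedelta

MIN_HOURS, MAX_HOURS = 4, 12
CERTIFICATE_DAYS = 90


def hours_worked(checkin, checkout):
    """Hours between check-in and check-out as a float."""
    return (checkout - checkin) / timedelta(hours=1)


def is_valid_day(hours):
    """Assumption: shifts outside the 4-12 hour range do not count."""
    return MIN_HOURS <= hours <= MAX_HOURS


def valid_work_dates(logs):
    """Unique dates with a valid shift; logs holds (checkin, checkout) pairs."""
    return {
        checkin.date()
        for checkin, checkout in logs
        if is_valid_day(hours_worked(checkin, checkout))
    }


def certificate_due(logs):
    """True once the worker has 90 unique valid work dates."""
    return len(valid_work_dates(logs)) >= CERTIFICATE_DAYS


shift = (datetime(2026, 1, 24, 8, 0), datetime(2026, 1, 24, 17, 30))
print(hours_worked(*shift))  # 9.5
```

Counting unique dates in a set means two valid shifts on the same day still contribute one work day, which matches the "90 unique work dates" wording of the certificate.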
#### 3.4.2 Interstate Bridge Agent
**Purpose**: Enable benefit portability for migrant construction workers
**Foundation Model**: Claude 3.5 Sonnet
**Action Groups**:
1. **detect_migration** (Lambda: `bocw-migration-detector`)
- Trigger: EventBridge scheduled rule (daily check)
- Input: Worker ID, GPS history from DynamoDB
- Processing: Calculate distance between home state and current GPS
- Migration criteria: GPS change >500km from home maintained for >7 consecutive days
- Storage: DynamoDB flag `migration_detected = true`
- Output: Array of workers with detected migration
2. **initiate_eshram_update** (Lambda: `bocw-eshram-updater`)
- Input: Worker ID, new location (state, district)
- Processing: WhatsApp message "You've moved to Maharashtra. Should I update your current location in e-Shram?"
- User confirmation: Wait for yes/no response
- API call: e-Shram API to update current state, current district, expected duration
- Portability tracking: Enables national-level tracking
- Output: Update confirmation
3. **discover_portable_benefits** (Lambda: `bocw-benefits-discoverer`)
- Input: Worker ID, home state, host state
- Processing: Query knowledge base for portable schemes
- Portable schemes:
- One Nation One Ration Card (ONORC)
- Ayushman Bharat PMJAY (₹5 lakh health insurance)
- PM Shram Yogi Maandhan (pension)
- Maternity benefits
- Service point mapping: Amazon Location Service to find nearest Fair Price Shop, Ayushman hospital, labor department office
- Output: Array of available benefits with service point locations
4. **guide_host_state_registration** (Lambda: `bocw-registration-guide`)
- Input: Worker ID, host state
- Processing: Bedrock Agent generates step-by-step guidance
- Guidance content:
- "To access Maharashtra BOCW benefits, register at [address]"
- "Required documents: Aadhaar, e-Shram card, work log certificate"
- "Dual registration strategy: Maintain home state (for children's education) + host state (for local healthcare)"
- Knowledge base: KB4 contains registration procedures for all 28 states
- Output: Guidance text, registration office address
5. **map_local_resources** (Lambda: `bocw-resource-mapper`)
- Input: Worker GPS coordinates
- Processing: Amazon Location Service PlaceIndex search
- Resources mapped:
- Nearest government hospital
- Fair Price Shop with ration availability
- Labor welfare office
- Nearest police station (for safety)
- Low-cost accommodation (PG/hostels near construction sites)
- Output: WhatsApp message with Google Maps links to all resources
**Knowledge Base**: KB4 (BOCW State Rules)
**Guardrails**: Minimum 7 consecutive days in new location before migration detection
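
The migration criterion (more than 500 km from home, sustained over consecutive days) can be sketched as a streak counter over daily positions. The design gives both ">7 consecutive days" and "minimum 7 consecutive days"; this sketch follows the guardrail and triggers at 7 or more, which is a judgment call. The function names and the one-position-per-day input format are assumptions.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    dlam = math.radians(b[1] - a[1])
    h = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def migration_detected(home, daily_positions, min_km=500.0, min_days=7):
    """True if the most recent run of days away from home is long enough.

    daily_positions holds one (lat, lon) per consecutive day, oldest first.
    A day back within min_km of home resets the streak.
    """
    streak = 0
    for position in daily_positions:
        streak = streak + 1 if haversine_km(home, position) > min_km else 0
    return streak >= min_days
```

Only the trailing streak matters here, so a worker who returned home yesterday is not flagged even after a long earlier absence.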
## 4. Communication Channels
### 4.1 Voice Interface (IVR)
**Technology**: Amazon Connect
**Configuration**:
- Toll-free number: 1800-XXX-XXXX (accessible from any phone, no internet required)
- Contact flows: 3 main menus (PM Vishwakarma, PMFBY, BOCW)
- Language selection: 22 Indian languages via Amazon Transcribe and Polly
- Call recording: Stored in S3, encrypted with KMS
- Average call duration: 3-5 minutes (simple queries), 10-15 minutes (application submission)
**Contact Flow Design**:
1. Welcome message: "Welcome to BharatSeva AI. All services are completely free."
2. Language selection: "Press 1 for Hindi, Press 2 for English, Press 3 for Tamil..."
3. Main menu: "Press 1 for PM Vishwakarma, Press 2 for Crop Insurance, Press 3 for Construction Worker assistance"
4. Voice input: Amazon Transcribe streaming transcription
5. Agent routing: Master Orchestrator processes query
6. Response: Amazon Polly text-to-speech in selected language
7. Closing message: "Remember, all government services are free. No one should ask for money."
**Transcribe Configuration**:
- Streaming API: Real-time transcription with <1 second latency
- Custom vocabulary: 5000+ terms (scheme names, regional crop names, construction terminology)
- Accuracy: >90% for clear speech, >75% for noisy environments
- Language auto-detection: Enabled for seamless multilingual support
**Polly Configuration**:
- Voices: Kajal (neural; Hindi and Indian English, female); Aditi (Hindi, female) and Raveena (Indian English, female) are standard voices
- SSML support: Custom pronunciation for scheme names, emphasis on important numbers
- Speech marks: Word-level timing for synchronized UI animations
### 4.2 WhatsApp Business Integration
**Technology**: WhatsApp Business API
**Configuration**:
- Official WhatsApp Business number: +91-XXXX-XXXXXX
- Encryption: Messages are encrypted in transit; BharatSeva, as the receiving business endpoint, receives and processes message content in order to handle queries and media
- Message templates: Pre-approved by WhatsApp for high volume
**Supported Interactions**:
- Text messages: Natural language queries
- Photo/video uploads: Trade verification, crop damage photos
- Document sharing: PDF certificates, lending memos
- Interactive buttons: "Check Status", "Submit Photo", "File Claim"
- Location sharing: GPS verification for work logs
**Message Flow**:
1. User sends message to WhatsApp number
2. Webhook: WhatsApp API sends POST request to API Gateway
3. Lambda: `whatsapp-message-handler` processes message
4. Master Orchestrator: Routes to appropriate agent
5. Response: Lambda sends reply via WhatsApp API
6. Delivery: WhatsApp delivers message to user
### 4.3 SMS Fallback
**Technology**: Amazon Pinpoint
**Configuration**:
- SMS delivery: Across all operators (Jio, Airtel, BSNL, Vi)
- Delivery rate: 98%+ across India
- Character limit: 160 characters per segment for GSM-7 text; Hindi (Devanagari) messages are sent as UCS-2, which limits a single segment to 70 characters
**Use Cases**:
- Internet unavailable (rural areas, during disasters)
- WhatsApp not installed
- User prefers SMS
- Critical notifications (claim status, escalations)
**Message Templates**:
- Status update: "Claim CLM-2026-789456 approved. Payment in 7-10 days"
- Escalation: "Application escalated to District Officer. Update in 48 hours"
- Reminder: "Submit crop damage photos within 24 hours. Claim ID: CLM-2026-789456"
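A template can be checked against SMS segment limits before it is sent. The sketch below is illustrative only: the function name `sms_segments` is hypothetical, and treating pure ASCII as GSM-7 is a simplification (the real GSM-7 alphabet differs from ASCII for a few characters). The segment sizes (160/153 for GSM-7, 70/67 for UCS-2) are the standard SMS limits.

```python
# Standard SMS segment sizes: single message vs. each part of a concatenated message.
GSM7_SINGLE, GSM7_MULTI = 160, 153
UCS2_SINGLE, UCS2_MULTI = 70, 67


def sms_segments(text: str) -> tuple[str, int]:
    """Return (encoding, number_of_segments) for an SMS body.

    Simplification (assumption): pure-ASCII text is treated as GSM-7.
    Anything else, including Devanagari, is counted as UCS-2.
    """
    if text.isascii():
        encoding, single, multi = "GSM-7", GSM7_SINGLE, GSM7_MULTI
        units = len(text)
    else:
        encoding, single, multi = "UCS-2", UCS2_SINGLE, UCS2_MULTI
        # UCS-2/UTF-16 length is counted in 16-bit code units.
        units = len(text.encode("utf-16-be")) // 2

    if units <= single:
        return encoding, 1
    # Ceiling division over the per-part size of a concatenated SMS.
    return encoding, -(-units // multi)
```

Running this over every template at build time would catch a Hindi message that silently spills into a second billed segment.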
### 4.4 Web Interface
**Technology**: Browser-based chat widget
**Configuration**:
- Hosting: AWS Amplify for static site hosting
- Chat widget: Embedded iframe with WebSocket connection
- Authentication: JWT tokens (24-hour expiry)
**Features**:
- Real-time chat: WebSocket connection to API Gateway
- File upload: Drag-and-drop for photos/videos
- Application status dashboard: View all pending applications
- Certificate download: Access to generated certificates
**Integration Points**:
- Government portal embeds: PM Vishwakarma portal, PMFBY portal, e-Shram portal
- NGO partner websites: Embedded chat widget
- Direct access: Standalone web application
## 5. Machine Learning Models
### 5.1 Trade Verification CNN
**Purpose**: Identify artisan trades from video footage
**Architecture**: Custom CNN with temporal convolution layers
**Training**:
- Training data: 50,000 labeled videos
  - Carpentry: 15,000 videos
  - Pottery: 12,000 videos
  - Blacksmithing: 8,000 videos
  - Weaving: 6,000 videos
  - Other trades: 9,000 videos
- Training infrastructure: SageMaker Training Jobs on ml.p3.2xlarge instances (GPU)
- Framework: TensorFlow 2.x
- Training duration: ~48 hours
- Validation accuracy: >85% for 18 traditional trades
**Model Architecture**:
```
Input: Video frames (224x224x3, 5 FPS sampling)
↓
Conv3D Layer 1: 32 filters, 3x3x3 kernel, ReLU activation
MaxPooling3D: 2x2x2
↓
Conv3D Layer 2: 64 filters, 3x3x3 kernel, ReLU activation
MaxPooling3D: 2x2x2
↓
Conv3D Layer 3: 128 filters, 3x3x3 kernel, ReLU activation
GlobalAveragePooling3D
↓
Dense Layer 1: 256 units, ReLU activation, Dropout 0.5
Dense Layer 2: 128 units, ReLU activation, Dropout 0.3
↓
Output: 18 units (trade categories), Softmax activation
```
**Deployment**:
- SageMaker Endpoint: `trade-verification-cnn-endpoint`
- Instance type: ml.m5.xlarge (real-time inference)
- Auto-scaling: 1-5 instances based on invocation rate
- Latency: <2 seconds per video (30 frames)
**Monitoring**:
- SageMaker Model Monitor: Drift detection on input data distribution
- Retraining trigger: Accuracy drops below 80% on validation set
- Model Registry: All versions tracked for rollback capability
### 5.2 Crop Damage Assessment Model
**Purpose**: Estimate crop damage percentage from photos
**Architecture**: ResNet-50 fine-tuned on agricultural imagery
**Training**:
- Training data: 100,000 insurance claim photos with surveyor damage % labels
- Data augmentation: Rotation, brightness adjustment, crop, flip
- Training infrastructure: SageMaker Training Jobs on ml.p3.2xlarge
- Framework: PyTorch
- Training duration: ~24 hours
- Validation MAE: <8% (mean absolute error on damage percentage)
**Model Architecture**:
```
Input: Photo (224x224x3)
↓
ResNet-50 Backbone (pre-trained on ImageNet)
↓
Global Average Pooling
↓
Dense Layer 1: 512 units, ReLU activation, Dropout 0.4
Dense Layer 2: 256 units, ReLU activation, Dropout 0.3
↓
Output: 1 unit (damage percentage 0-100), Linear activation
```
**Deployment**:
- SageMaker Serverless Inference: Cost-optimized for variable load
- Memory: 4096 MB
- Max concurrency: 20
- Cold start: <10 seconds
- Warm inference: <500ms
**Output**:
- Damage percentage: 0-100 (integer)
- Reported uncertainty: ±10 percentage points (a margin derived from the validation MAE, not a statistical confidence interval)
- Used as reference for surveyors, not final decision
### 5.3 Deepfake Detection Model
**Purpose**: Detect manipulated selfies for work log fraud prevention
**Architecture**: EfficientNet-B0 binary classifier
**Training**:
- Training data:
  - Authentic selfies: 50,000 (from e-Shram database)
  - Synthetic/manipulated images: 50,000 (generated using StyleGAN, FaceSwap)
- Training infrastructure: SageMaker Training Jobs on ml.p3.2xlarge
- Framework: TensorFlow 2.x
- Training duration: ~12 hours
- Validation accuracy: >92%
**Model Architecture**:
```
Input: Selfie (224x224x3)
↓
EfficientNet-B0 Backbone (pre-trained on ImageNet)
↓
Global Average Pooling
↓
Dense Layer: 128 units, ReLU activation, Dropout 0.5
↓
Output: 1 unit (deepfake probability), Sigmoid activation
```
**Deployment**:
- SageMaker Endpoint: `deepfake-detection-endpoint`
- Instance type: ml.t2.medium (low-cost for simple inference)
- Latency: <300ms per image
**Decision Logic**:
- Deepfake probability >0.9: Automatic rejection
- Deepfake probability 0.7-0.9 (inclusive): Flag for manual review
- Deepfake probability <0.7: Approved
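The thresholds can be expressed as a small function that checks the stricter rule first, so a probability above 0.9 is never merely flagged. This is a minimal sketch: the function name and return labels are hypothetical, and treating a probability of exactly 0.7 as manual review is an assumption.

```python
REJECT_THRESHOLD = 0.9   # above this: automatic rejection
REVIEW_THRESHOLD = 0.7   # from this value up to 0.9: manual review


def classify_selfie(deepfake_probability: float) -> str:
    """Map the model's sigmoid output to a verification decision."""
    if not 0.0 <= deepfake_probability <= 1.0:
        raise ValueError("probability must be within [0, 1]")
    # Check the stricter threshold first so the ranges cannot overlap.
    if deepfake_probability > REJECT_THRESHOLD:
        return "rejected"
    if deepfake_probability >= REVIEW_THRESHOLD:
        return "manual_review"
    return "approved"
```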
## 6. Data Storage and Management
### 6.1 DynamoDB Tables
**Table 1: `conversation_state`**
- Purpose: Store conversation context for Master Orchestrator
- Partition key: `session_id` (String)
- Attributes:
- `user_id` (String): Aadhaar hash or phone number
- `current_domain` (String): PM_Vishwakarma | PMFBY | BOCW
- `conversation_history` (List): Last 10 turns [{role, message, timestamp}]
- `pending_actions` (List): Actions awaiting user input
- `last_updated` (Number): Unix timestamp
- `ttl` (Number): Unix epoch timestamp (seconds) 24 hours in the future; DynamoDB TTL removes the item after it passes
- GSI: `user_id-index` for querying all sessions by user
- Capacity: On-demand (auto-scaling)
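DynamoDB TTL expects an absolute epoch timestamp rather than a duration, so the 24-hour expiry has to be computed when the item is written. The sketch below builds a `conversation_state` item as a plain dictionary and trims history to the last 10 turns. The helper names are hypothetical, and refreshing `ttl` on every turn is an assumption about the intended behavior.

```python
import time
from typing import Optional

SESSION_TTL_SECONDS = 24 * 60 * 60  # 24-hour expiry
MAX_TURNS = 10                      # conversation_history keeps the last 10 turns


def build_session_item(session_id: str, user_id: str, now: Optional[int] = None) -> dict:
    """Create a new conversation_state item with an absolute TTL timestamp."""
    now = int(time.time()) if now is None else now
    return {
        "session_id": session_id,
        "user_id": user_id,
        "conversation_history": [],
        "pending_actions": [],
        "last_updated": now,
        "ttl": now + SESSION_TTL_SECONDS,
    }


def append_turn(item: dict, role: str, message: str, now: int) -> dict:
    """Append one turn, keep only the most recent turns, and push the TTL forward."""
    history = item["conversation_history"] + [
        {"role": role, "message": message, "timestamp": now}
    ]
    item["conversation_history"] = history[-MAX_TURNS:]
    item["last_updated"] = now
    item["ttl"] = now + SESSION_TTL_SECONDS  # assumption: expiry slides with activity
    return item
```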
**Table 2: `applications`**
- Purpose: Track all scheme applications across domains
- Partition key: `application_id` (String)
- Sort key: `scheme_domain` (String)
- Attributes:
- `user_id` (String): Aadhaar hash
- `status` (String): Submitted | Pending | Approved | Rejected
- `status_history` (List): [{status, timestamp, updated_by}]
- `application_data` (Map): Scheme-specific data
- `created_at` (Number): Unix timestamp
- `updated_at` (Number): Unix timestamp
- GSI: `user_id-index` for querying all applications by user
- GSI: `status-updated_at-index` for querying pending applications
- Capacity: On-demand
**Table 3: `work_logs`**
- Purpose: Store construction worker attendance records
- Partition key: `worker_id` (String)
- Sort key: `date` (String): YYYY-MM-DD format
- Attributes:
- `checkin_time` (Number): Unix timestamp
- `checkout_time` (Number): Unix timestamp
- `checkin_gps` (Map): {latitude, longitude}
- `checkout_gps` (Map): {latitude, longitude}
- `site_id` (String): Construction site identifier
- `hours_worked` (Number): Calculated hours
- `selfie_s3_keys` (List): [checkin_selfie, checkout_selfie]
- `verification_status` (String): Approved | Rejected
- GSI: `worker_id-verification_status-index` for counting approved days
- Capacity: On-demand
**Table 4: `weather_events`**
- Purpose: Store detected weather events for PMFBY
- Partition key: `event_id` (String)
- Attributes:
- `event_type` (String): Hailstorm | Flood | Drought | Cyclone
- `location` (Map): {latitude, longitude, district, state}
- `severity` (String): Low | Medium | High
- `detected_at` (Number): Unix timestamp
- `affected_farmers` (List): Array of farmer IDs
- `notifications_sent` (Boolean)
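- GSI: `location-detected_at-index` for location-based queries (keyed on a scalar attribute such as district or geohash, since a Map attribute cannot serve as a GSI key)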
- Capacity: On-demand
**Table 5: `construction_sites`**
- Purpose: Store registered construction sites for BOCW
- Partition key: `site_id` (String)
- Attributes:
- `site_name` (String)
- `gps_coordinates` (Map): {latitude, longitude}
- `site_type` (String): Registered | Unregistered
- `registration_date` (Number): Unix timestamp
- `geofence_radius` (Number): Meters (default 100)
- Geospatial index: For radius-based queries (requires custom implementation)
- Capacity: Provisioned (low traffic)
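Because DynamoDB has no native radius query, the geofence check has to run in application code. One possible approach, sketched here, is to fetch the site record and compare the check-in point against `gps_coordinates` with the haversine formula. The function names are hypothetical, and the dictionaries mirror the `{latitude, longitude}` Map shape described above.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_geofence(checkin: dict, site: dict, radius_m: float = 100.0) -> bool:
    """True if the check-in point lies within radius_m of the site center."""
    distance = haversine_m(
        checkin["latitude"], checkin["longitude"],
        site["latitude"], site["longitude"],
    )
    return distance <= radius_m
```

For a large number of sites, a coarse prefilter (for example a geohash prefix) would narrow the candidates before this exact check is applied.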
### 6.2 S3 Buckets
**Bucket 1: `bharatseva-audit-trail`**
- Purpose: Immutable audit logs from DynamoDB Streams
- Lifecycle policy: Transition to Glacier after 90 days, retain for 7 years
- Encryption: SSE-KMS with government-approved key
- Versioning: Enabled
- Object lock: Enabled (compliance mode)
**Bucket 2: `bharatseva-user-media`**
- Purpose: Store photos, videos, selfies
- Folder structure: `{user_id}/{media_type}/{timestamp}_{filename}`
- Lifecycle policy: Transition to Glacier after 90 days
- Encryption: SSE-KMS
- Access: Pre-signed URLs with 7-day expiry
**Bucket 3: `bharatseva-certificates`**
- Purpose: Store generated certificates (trade verification, work logs)
- Folder structure: `{certificate_type}/{user_id}/{certificate_id}.pdf`
- Lifecycle policy: Permanent retention (no deletion)
- Encryption: SSE-KMS
- Access: Pre-signed URLs regenerated on demand (SigV4 pre-signed URLs expire after at most 7 days)
**Bucket 4: `bharatseva-knowledge-base`**
- Purpose: Store documents for Bedrock Knowledge Bases
- Folder structure: `{knowledge_base_id}/{document_category}/{filename}.pdf`
- Update mechanism: Weekly Lambda sync from government portals
- Encryption: SSE-S3
- Versioning: Enabled for document history
**Bucket 5: `bharatseva-ml-models`**
- Purpose: Store trained ML models and artifacts
- Folder structure: `{model_name}/{version}/{artifacts}`
- Lifecycle policy: Retain latest 5 versions, delete older
- Encryption: SSE-S3
- Access: SageMaker execution role only
### 6.3 OpenSearch Serverless (Knowledge Bases)
**Collection 1: `kb-all-schemes`**
- Purpose: Vector embeddings for 1200+ government schemes
- Index: `schemes-index`
- Dimensions: 1024 (default for Amazon Titan Text Embeddings V2; the 1536-dimension output belongs to Titan Embeddings G1 - Text)
- Documents: 10,000+ PDFs (chunked to 500-token segments)
- Update frequency: Weekly
**Collection 2: `kb-pm-vishwakarma`**
- Purpose: PM Vishwakarma policies and guidelines
- Index: `pmv-index`
- Documents: 500+ PDFs from MSME ministry
- Update frequency: Weekly
**Collection 3: `kb-pmfby`**
- Purpose: PMFBY insurance policies from 14 companies
- Index: `pmfby-index`
- Documents: 1000+ PDFs (policy documents, claim procedures)
- Update frequency: Weekly
**Collection 4: `kb-bocw`**
- Purpose: BOCW state rules for 28 states
- Index: `bocw-index`
- Documents: 800+ PDFs (state board rules, benefit schemes)
- Update frequency: Weekly
## 7. External Integrations
### 7.1 Account Aggregator Framework
**Purpose**: Fetch UPI transaction history for Shadow Credit Score
**Integration Type**: REST API
**Provider**: NBFC-AA licensed entities (e.g., OneMoney, CAMS Finserv); Sahamati is the ecosystem's industry alliance rather than an Account Aggregator itself
**API Endpoints**:
1. **Consent Request**: `POST /Consent`
- Request: User Aadhaar, mobile, data range (12 months)
- Response: Consent ID, OTP sent to user
2. **Consent Verification**: `POST /Consent/{consent_id}/verify`
- Request: OTP
- Response: Consent token (valid 12 months)
3. **Data Fetch**: `GET /FI/fetch`
- Request: Consent token, FIP (Financial Information Provider) ID
- Response: Transaction data (JSON array)
**Authentication**: OAuth 2.0 with client credentials
**Data Format**: FI Data Schema v2.0 (RBI standard)
**Compliance**: RBI Account Aggregator (NBFC-AA) framework; explicit user consent and purpose limitation enforced
### 7.2 Weather APIs
**API 1: India Meteorological Department (IMD)**
- Endpoint: `https://api.imd.gov.in/weather/alerts`
- Authentication: API key
- Polling frequency: Every 15 minutes
- Data: Real-time weather alerts (hailstorm, cyclone, flood)
- Response format: JSON
**API 2: Skymet Weather**
- Endpoint: `https://api.skymetweather.com/v1/district-weather`
- Authentication: API key
- Polling frequency: Every 15 minutes
- Data: District-level granular data (rainfall, temperature)
- Response format: JSON
**API 3: NASA POWER**
- Endpoint: `https://power.larc.nasa.gov/api/temporal/daily/point`
- Authentication: None (public API)
- Polling frequency: Daily
- Data: Satellite data (rainfall, temperature, solar radiation)
- Response format: JSON
### 7.3 Government Portals
**Portal 1: PM Vishwakarma Portal**
- Endpoint: `https://pmvishwakarma.gov.in/api/v1/`
- Authentication: API key + digital signature
- Operations:
- Submit application: `POST /applications`
- Check status: `GET /applications/{application_id}/status`
- Upload certificate: `POST /applications/{application_id}/documents`
- Response format: JSON
**Portal 2: PMFBY Insurance Companies (14 APIs)**
- Example: ICICI Lombard
- Endpoint: `https://api.icicilombard.com/pmfby/v1/`
- Authentication: API key
- Operations:
- File intimation: `POST /intimations`
- Upload photos: `POST /claims/{claim_id}/photos`
- Check status: `GET /claims/{claim_id}/status`
- Adapter pattern: Separate Lambda for each company's API format
**Portal 3: e-Shram Portal**
- Endpoint: `https://eshram.gov.in/api/v1/`
- Authentication: API key + Aadhaar authentication
- Operations:
- Update location: `PUT /workers/{eshram_id}/location`
- Submit certificate: `POST /workers/{eshram_id}/certificates`
- Query benefits: `GET /workers/{eshram_id}/benefits`
- Response format: JSON
### 7.4 UIDAI Aadhaar Authentication
**Purpose**: Authenticate users for sensitive operations
**Integration Type**: UIDAI-compliant authentication
**API Endpoints**:
1. **OTP Request**: `POST /aadhaar/otp/request`
- Request: Aadhaar number
- Response: Transaction ID, OTP sent to registered mobile
2. **OTP Verification**: `POST /aadhaar/otp/verify`
- Request: Transaction ID, OTP
- Response: Authentication status, user details (name, DOB, gender)
3. **Biometric Authentication**: `POST /aadhaar/biometric/auth`
- Request: Aadhaar number, biometric data (fingerprint/iris)
- Response: Authentication status
**Compliance**:
- Purpose limitation: Only for authentication, not stored as primary ID
- Data localization: All data in AWS Mumbai (ap-south-1)
- Audit logging: UIDAI-compliant format
## 8. Security and Compliance
### 8.1 Data Encryption
**At Rest**:
- DynamoDB: Encryption at rest with AWS-managed keys
- S3: SSE-KMS encryption with government-approved keys
- Aadhaar numbers: Application-level encryption using AWS KMS before storage
- Storage format: SHA-256 hash for primary key, encrypted value for display
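The storage format can be sketched as two derived values: a hash used as the lookup key and a masked form for display. The function names below are hypothetical. The optional `pepper` argument is an assumption that goes beyond the plain SHA-256 described above: because a 12-digit number space is small enough to enumerate, a keyed HMAC-SHA256 (with the secret held in KMS or Secrets Manager) is shown as a possible hardening. KMS encryption of the full value is not shown.

```python
import hashlib
import hmac
from typing import Optional


def normalize_aadhaar(raw: str) -> str:
    """Strip separators and validate that exactly 12 ASCII digits remain."""
    digits = "".join(ch for ch in raw if ch in "0123456789")
    if len(digits) != 12:
        raise ValueError("Aadhaar number must contain exactly 12 digits")
    return digits


def lookup_key(raw: str, pepper: Optional[bytes] = None) -> str:
    """Hex digest used as the primary key.

    Without a pepper this is plain SHA-256, as described in the list above.
    With a pepper (assumption) it is HMAC-SHA256, which resists enumeration.
    """
    digits = normalize_aadhaar(raw).encode("ascii")
    if pepper is None:
        return hashlib.sha256(digits).hexdigest()
    return hmac.new(pepper, digits, hashlib.sha256).hexdigest()


def masked_display(raw: str) -> str:
    """Show only the last four digits, e.g. XXXX-XXXX-9012."""
    return "XXXX-XXXX-" + normalize_aadhaar(raw)[-4:]
```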
**In Transit**:
- TLS 1.3: All API calls, webhooks, database connections
- Certificate pinning: For critical external APIs (Account Aggregator, UIDAI)
**Key Management**:
- AWS KMS: Customer-managed keys (CMK) for sensitive data
- Key rotation: Automatic annual rotation
- Key access: IAM policies with least privilege
### 8.2 Fraud Prevention
**Deepfake Detection**:
- SageMaker model: Analyzes selfies for manipulation
- Detection methods: GAN artifacts, unnatural eye movements, facial inconsistencies
- Action: >0.7 probability triggers manual review
**GPS Spoofing Detection**:
- Validation: Location changes against realistic travel speeds (<100 km/h)
- Cross-check: Cell tower data from mobile operator (if available)
- Action: Suspicious patterns flagged for investigation
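The travel-speed validation can be sketched as a comparison of two consecutive location pings: the implied speed is the great-circle distance divided by the elapsed time. The `Ping` type and function names are hypothetical, and treating non-increasing timestamps as suspicious is an assumption.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 100.0  # realistic travel speed limit from the validation rule


@dataclass
class Ping:
    lat: float
    lon: float
    timestamp: int  # Unix seconds


def distance_km(a: Ping, b: Ping) -> float:
    """Haversine great-circle distance between two pings, in kilometers."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def is_plausible_move(prev: Ping, curr: Ping, max_speed_kmh: float = MAX_SPEED_KMH) -> bool:
    """False when the implied speed between two pings exceeds the limit."""
    hours = (curr.timestamp - prev.timestamp) / 3600
    if hours <= 0:
        # Assumption: out-of-order or simultaneous pings are flagged.
        return False
    return distance_km(prev, curr) / hours <= max_speed_kmh
```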
**Duplicate Application Check**:
- DynamoDB query: Same user + scheme within 30 days
- Deduplication: Prevent multiple applications for same benefit
- Action: Reject duplicate, notify user
**Video Reuse Detection**:
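- Perceptual hashing: Perceptual hash (e.g., pHash or dHash) of sampled video frames, so re-encoded or resized copies still match; a cryptographic hash such as SHA-256 only detects byte-identical files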
- Database: Store hashes in DynamoDB
- Action: Block resubmission of same video by different users
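A perceptual hash differs from a cryptographic digest in that visually similar frames produce hashes that are close in Hamming distance. The sketch below uses a simple average hash as one illustrative option. It assumes each frame has already been decoded, converted to grayscale, and downscaled to an 8x8 grid by an image library (not shown); the function names and the distance threshold of 5 bits are hypothetical.

```python
def average_hash(gray: list[list[int]]) -> int:
    """64-bit average hash of an 8x8 grayscale grid (values 0-255).

    Each bit is 1 when the pixel is brighter than the grid mean.
    """
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def is_probable_reuse(h1: int, h2: int, max_distance: int = 5) -> bool:
    """Treat two frames as the same content when their hashes nearly match."""
    return hamming(h1, h2) <= max_distance
```

A uniform brightness change leaves the hash unchanged, which is why a re-encoded copy of the same video can still be matched against stored hashes.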
### 8.3 Audit and Compliance
**CloudTrail**:
- Logging: Every AWS API call logged
- Retention: 7 years (regulatory requirement)
- Storage: S3 with object lock (compliance mode)
**DynamoDB Streams**:
- Purpose: Capture all state changes
- Processing: Lambda streams to S3 as immutable logs
- Format: JSON with timestamp, user_id, action, before/after state
- Retention: 7 years
**Aadhaar Act Compliance**:
- Purpose limitation: Only for authentication, not stored as primary ID
- User consent: Collected before each Aadhaar use
- Data localization: All data in AWS Mumbai (ap-south-1)
- Audit format: UIDAI-compliant logging
**Data Retention Policy**:
- User profiles: 5 years
- Applications: 7 years (audit requirement)
- Audit logs: 7 years
- Temporary data: Voice recordings deleted after 30 days
### 8.4 User Privacy Controls
**Right to Erasure**:
- Lambda function: `privacy-data-eraser`
- Actions: Delete from DynamoDB, S3, anonymize audit logs
- Timeline: Within 30 days of request
- Exceptions: Audit logs retained in anonymized form
**Data Access Request**:
- Lambda function: `privacy-data-exporter`
- Output: JSON file with all user data
- Delivery: Secure download link (7-day expiry)
**Consent Management**:
- Granular permissions: Aadhaar, Account Aggregator, Location, Photos
- Consent storage: DynamoDB with timestamp
- Revocation: User can revoke consent anytime
### 8.5 Access Control
**IAM Roles**:
- Lambda execution roles: Least privilege (only required services)
- Bedrock Agent roles: Access to action groups, knowledge bases
- Human operators: Read-only access to audit logs
**RBAC (Role-Based Access Control)**:
- Admin: Full access to all resources
- Operator: Read-only access to applications, ability to escalate
- Auditor: Read-only access to audit logs
- Developer: Access to non-production environments only
**API Authentication**:
- External APIs: API keys stored in AWS Secrets Manager
- Rotation: Automatic 90-day rotation
- Monitoring: CloudWatch alarms for failed authentication attempts
## 9. Workflow Orchestration
### 9.1 AWS Step Functions
**Purpose**: Orchestrate complex multi-agent workflows
**State Machine 1: `shadow-credit-workflow`**
- Purpose: End-to-end Shadow Credit Score generation
- Steps:
1. **Initiate Consent**: Lambda `pmv-aa-consent-initiator`
2. **Wait for OTP**: Wait state (max 5 minutes)
3. **Verify Consent**: Lambda `pmv-aa-consent-verifier`
4. **Fetch Transactions**: Lambda `pmv-aa-data-fetcher`
5. **Calculate Score**: Lambda `pmv-shadow-credit-calculator`
6. **Generate Memo**: Lambda `pmv-lending-memo-generator`
7. **Submit to Bank**: Lambda `pmv-bank-submission`
- Error handling: Exponential backoff retry (max 3 attempts)
- Timeout: 30 minutes total
- State storage: DynamoDB with TTL-based cleanup after 90 days
**State Machine 2: `crop-insurance-claim-workflow`**
- Purpose: End-to-end PMFBY claim processing
- Steps:
1. **Detect Weather Event**: Lambda `pmfby-weather-monitor`
2. **Identify Farmers**: Lambda `pmfby-farmer-identifier`
3. **Initiate Outreach**: Lambda `pmfby-outreach-initiator` (parallel for multiple farmers)
4. **File Intimation**: Lambda `pmfby-intimation-filer`
5. **Send Instructions**: Lambda `pmfby-followup-sender`
6. **Wait for Photos**: Wait state (max 48 hours)
7. **Validate Photos**: Lambda `pmfby-photo-validator` (parallel for multiple photos)
8. **Submit to Insurance**: Lambda `pmfby-insurance-uploader`
- Error handling: Retry with exponential backoff
- Timeout: 72 hours total
- Parallel execution: Up to 500 farmers per weather event
**State Machine 3: `interstate-migration-workflow`**
- Purpose: BOCW benefit portability
- Steps:
1. **Detect Migration**: Lambda `bocw-migration-detector`
2. **Notify Worker**: Lambda `bocw-migration-notifier`
3. **Wait for Confirmation**: Wait state (max 7 days)
4. **Update e-Shram**: Lambda `bocw-eshram-updater`
5. **Discover Benefits**: Lambda `bocw-benefits-discoverer`
6. **Guide Registration**: Lambda `bocw-registration-guide`
7. **Map Resources**: Lambda `bocw-resource-mapper`
- Error handling: Retry with exponential backoff
- Timeout: 14 days total
### 9.2 EventBridge Scheduled Rules
**Rule 1: `weather-monitoring-schedule`**
- Schedule: Every 15 minutes
- Target: Lambda `pmfby-weather-monitor`
- Purpose: Poll weather APIs for adverse events
**Rule 2: `application-status-polling-schedule`**
- Schedule: Daily at 6 AM IST (cron: 30 0 * * ? *; EventBridge cron expressions are evaluated in UTC)
- Target: Lambda `pmv-status-poller`
- Purpose: Check PM Vishwakarma application status
**Rule 3: `claim-status-polling-schedule`**
- Schedule: Every 6 hours (cron: 0 */6 * * ? *)
- Target: Lambda `pmfby-claim-poller`
- Purpose: Check PMFBY claim status across 14 insurance companies
**Rule 4: `migration-detection-schedule`**
- Schedule: Daily at 8 AM IST (cron: 30 2 * * ? *; evaluated in UTC)
- Target: Lambda `bocw-migration-detector`
- Purpose: Detect interstate migration for construction workers
**Rule 5: `knowledge-base-sync-schedule`**
- Schedule: Weekly on Sunday at 2 AM IST (cron: 30 20 ? * SAT *; Saturday 20:30 UTC)
- Target: Lambda `kb-sync-orchestrator`
- Purpose: Sync government documents to S3 for Knowledge Bases
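EventBridge rule cron expressions are evaluated in UTC, and IST is UTC+5:30, so every IST wall-clock time shifts by five and a half hours and may move to the previous day. The helper below is a hypothetical sketch that derives the UTC expression from an IST time so the offset is not computed by hand.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

IST = timezone(timedelta(hours=5, minutes=30))
DAYS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]


def ist_to_utc_cron(hour: int, minute: int = 0, weekday: Optional[str] = None) -> str:
    """Build an EventBridge cron expression (UTC) for a daily or weekly IST time."""
    # 2024-01-01 is a Monday, so adding the weekday index lands on the right day.
    local = datetime(2024, 1, 1, hour, minute, tzinfo=IST)
    if weekday is not None:
        local += timedelta(days=DAYS.index(weekday))
    utc = local.astimezone(timezone.utc)
    if weekday is None:
        return f"cron({utc.minute} {utc.hour} * * ? *)"
    # The UTC weekday can differ from the IST weekday after the shift.
    return f"cron({utc.minute} {utc.hour} ? * {DAYS[utc.weekday()]} *)"
```

For example, `ist_to_utc_cron(6)` yields `cron(30 0 * * ? *)`, and a Sunday 2 AM IST schedule lands on Saturday in UTC.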
### 9.3 Error Handling Strategy
**Retry Policy**:
- Exponential backoff: 1s, 2s, 4s (delay doubles before each retry)
- Maximum retry attempts: 3
- Jitter: Random delay (0-1s) to prevent thundering herd
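The retry policy can be sketched as a wrapper that doubles the delay before each retry and adds up to one second of random jitter. The function name and parameters are hypothetical, and reading "maximum attempts: 3" as three retries after the initial call is an assumption.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_jitter: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call fn, retrying on any exception with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: let the caller escalate
            # 1s, 2s, 4s ... plus a random 0 to max_jitter seconds
            sleep(base_delay * 2 ** attempt + rng() * max_jitter)
    raise RuntimeError("unreachable")
```

Injecting `sleep` and `rng` keeps the wrapper testable without real delays. In Step Functions the same policy would normally be declared in the state's retry configuration rather than in Lambda code.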
**Fallback Actions**:
- API failure: Switch to backup API or web scraping
- ML model failure: Use rule-based fallback logic
- External service unavailable: Queue request for later processing
**Human Escalation**:
- Trigger: All retries exhausted
- Action: Create ticket in support system, notify human operator
- Notification: SMS to beneficiary with ticket ID and estimated resolution time
**Circuit Breaker**:
- Purpose: Prevent cascading failures
- Threshold: 50% error rate over 5 minutes
- Action: Open circuit, return cached response or error message
- Recovery: Automatic retry after 5 minutes
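A circuit breaker with these parameters can be sketched as a sliding window of call outcomes. The class below is illustrative, not the system's implementation: `min_calls` is an added assumption so that a single early failure does not open the circuit, and recovery is simplified to a full reset after the cooldown rather than a half-open probe state.

```python
import time
from collections import deque
from typing import Callable, Optional


class CircuitBreaker:
    def __init__(
        self,
        threshold: float = 0.5,       # 50% error rate
        window_s: float = 300.0,      # measured over 5 minutes
        cooldown_s: float = 300.0,    # retry automatically after 5 minutes
        min_calls: int = 10,          # assumption: minimum sample size
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = threshold
        self._window_s = window_s
        self._cooldown_s = cooldown_s
        self._min_calls = min_calls
        self._clock = clock
        self._calls: deque = deque()  # (timestamp, success) pairs
        self._opened_at: Optional[float] = None

    def record(self, success: bool) -> None:
        """Record one call outcome and open the circuit if the error rate is too high."""
        now = self._clock()
        self._calls.append((now, success))
        while self._calls and now - self._calls[0][0] > self._window_s:
            self._calls.popleft()
        failures = sum(1 for _, ok in self._calls if not ok)
        if len(self._calls) >= self._min_calls and failures / len(self._calls) >= self._threshold:
            self._opened_at = now

    def allow(self) -> bool:
        """True if a call may proceed; False while the circuit is open."""
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self._cooldown_s:
            self._opened_at = None
            self._calls.clear()
            return True
        return False
```

When `allow()` returns False, the caller would return the cached response or error message instead of invoking the failing dependency.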
## 10. Performance and Scalability
### 10.1 Performance Targets
**Latency**:
- Voice transcription: <3 seconds (95th percentile)
- Agent response generation: <5 seconds (95th percentile)
- Photo validation: <2 seconds per photo
- ML model inference: <2 seconds (trade verification), <500ms (crop damage)
- End-to-end IVR call: <30 seconds for simple queries
**Throughput**:
- Concurrent voice calls: 10,000 (Amazon Connect capacity)
- SMS volume: 100,000 per hour (Amazon Pinpoint capacity)
- WhatsApp messages: 50,000 per hour (Business API limit)
- API requests: 10,000 requests per second (API Gateway limit)
**Availability**:
- System uptime: 99.9% measured monthly
- Planned maintenance: <4 hours per month
- Disaster recovery: RTO 4 hours, RPO 1 hour
### 10.2 Scalability Strategy
**DynamoDB**:
- Capacity mode: On-demand (auto-scaling)
- Scaling: Automatic based on traffic
- Partition key design: High cardinality (session_id, application_id, worker_id)
**Lambda**:
- Concurrency: Reserved concurrency for critical functions (1000 per function; requires raising the default account-level concurrency quota of 1,000)
- Provisioned concurrency: For latency-sensitive functions (Master Orchestrator)
- Timeout: 15 minutes maximum (Step Functions for longer workflows)
**SageMaker Endpoints**:
- Auto-scaling: 1-5 instances based on invocation rate
- Scaling policy: Target tracking (70% CPU utilization)
- Cold start mitigation: Provisioned concurrency for critical models
**API Gateway**:
- Throttling: 10,000 requests per second per account
- Burst: 5,000 requests
- Caching: Enabled for read-heavy endpoints (TTL 5 minutes)
**S3**:
- Request rate: 5,500 GET/HEAD per second per prefix
- Prefix strategy: Date-based partitioning (`{year}/{month}/{day}/`)
- Transfer acceleration: Enabled for large file uploads
### 10.3 Monitoring and Observability
**CloudWatch Metrics**:
- Lambda: Invocation count, duration, error rate, throttles
- DynamoDB: Read/write capacity, throttled requests, latency
- SageMaker: Invocation count, model latency, 4xx/5xx errors
- API Gateway: Request count, latency, 4xx/5xx errors
- Amazon Connect: Call volume, average handle time, abandonment rate
**CloudWatch Alarms**:
- Lambda error rate >5%: SNS notification to ops team
- DynamoDB throttled requests >10: Auto-scaling trigger
- SageMaker model latency >5s: SNS notification
- API Gateway 5xx errors >1%: SNS notification
- Amazon Connect abandonment rate >10%: SNS notification
**CloudWatch Logs**:
- Lambda: All function logs with structured JSON
- API Gateway: Access logs with request/response details
- Step Functions: Execution history with state transitions
- Retention: 30 days (cost optimization)
**X-Ray Tracing**:
- Enabled for: Lambda, API Gateway, DynamoDB
- Sampling rate: 10% of requests (cost optimization)
- Use case: Trace end-to-end request flow, identify bottlenecks
**Custom Dashboards**:
- Dashboard 1: System health (error rates, latency, availability)
- Dashboard 2: Business metrics (applications submitted, claims filed, certificates generated)
- Dashboard 3: Cost optimization (Lambda invocations, DynamoDB capacity, S3 storage)
## 11. Cost Optimization
### 11.1 Cost Breakdown (Estimated Monthly)
**Compute**:
- Lambda: ~$5,000 (10M invocations, 512MB average, 3s average duration)
- Bedrock Agents: ~$15,000 (Claude 3.5 Sonnet, 50M tokens input, 10M tokens output)
- SageMaker Endpoints: ~$2,000 (ml.m5.xlarge 24/7, serverless inference)
**Storage**:
- DynamoDB: ~$1,000 (on-demand, 100GB storage, 10M read/write per month)
- S3: ~$500 (1TB storage, 10M GET requests, 1M PUT requests)
- OpenSearch Serverless: ~$1,500 (4 OCUs for indexing, 4 OCUs for search)
**Communication**:
- Amazon Connect: ~$3,000 (100,000 minutes, $0.018 per minute)
- Amazon Transcribe: ~$1,500 (100,000 minutes, $0.024 per minute)
- Amazon Polly: ~$500 (10M characters, $16 per 1M characters for neural voices)
- Amazon Pinpoint SMS: ~$2,000 (500,000 SMS, $0.00645 per SMS in India)
- WhatsApp Business API: ~$1,000 (100,000 messages, $0.01 per message)
**AI/ML**:
- Amazon Rekognition: ~$1,000 (100,000 images, $0.001 per image)
- Amazon Comprehend: ~$500 (10M characters, $0.0001 per unit)
- Amazon Translate: ~$300 (10M characters, $0.000015 per character)
**Total Estimated Monthly Cost**: ~$35,000
**Cost per Beneficiary Interaction**: ~$0.35 (assuming 100,000 interactions per month)
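The total and the per-interaction figure follow directly from the line items above. The short script below reproduces the arithmetic using the rounded monthly estimates exactly as listed; the dictionary keys are just labels.

```python
# Rounded monthly estimates (USD) copied from the breakdown above.
MONTHLY_COST_USD = {
    "Lambda": 5_000,
    "Bedrock Agents": 15_000,
    "SageMaker Endpoints": 2_000,
    "DynamoDB": 1_000,
    "S3": 500,
    "OpenSearch Serverless": 1_500,
    "Amazon Connect": 3_000,
    "Amazon Transcribe": 1_500,
    "Amazon Polly": 500,
    "Amazon Pinpoint SMS": 2_000,
    "WhatsApp Business API": 1_000,
    "Amazon Rekognition": 1_000,
    "Amazon Comprehend": 500,
    "Amazon Translate": 300,
}
INTERACTIONS_PER_MONTH = 100_000

total = sum(MONTHLY_COST_USD.values())
cost_per_interaction = total / INTERACTIONS_PER_MONTH

print(f"Total: ${total:,} per month")
print(f"Per interaction: ${cost_per_interaction:.3f}")
```

The line items sum to $34,800, which rounds to the ~$35,000 total, and dividing by 100,000 interactions gives roughly $0.35 each.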
### 11.2 Cost Optimization Strategies
**Lambda**:
- Right-sizing: Use Lambda Power Tuning to optimize memory allocation
- Provisioned concurrency: Only for latency-sensitive functions
- Code optimization: Reduce cold starts, minimize dependencies
**DynamoDB**:
- On-demand mode: For unpredictable traffic patterns
- TTL: Automatic cleanup of expired conversation state (24 hours)
- Compression: Store large attributes (conversation_history) as compressed JSON
**S3**:
- Lifecycle policies: Transition to Glacier after 90 days (80% cost reduction)
- Intelligent-Tiering: For unpredictable access patterns
- Compression: Store photos/videos in compressed formats
**SageMaker**:
- Serverless inference: For variable load (crop damage model)
- Spot instances: Managed Spot Training for training jobs (up to ~70% cost reduction)
- Model optimization: Quantization, pruning for faster inference
**Bedrock**:
- Prompt optimization: Reduce token usage with concise prompts
- Caching: Cache common responses in DynamoDB (TTL 1 hour)
- Batch processing: Group multiple queries when possible
**Communication**:
- SMS optimization: Use 160-character limit efficiently
- WhatsApp preference: Encourage WhatsApp over SMS (lower cost)
- IVR optimization: Reduce average handle time with better prompts
## 12. Deployment Architecture
### 12.1 AWS Region Strategy
**Primary Region**: ap-south-1 (Mumbai)
- Reason: Data localization requirement (Aadhaar Act compliance)
- All user data must reside in India
**Disaster Recovery Region**: ap-south-2 (Hyderabad)
- Purpose: Backup for critical services
- Replication: S3 cross-region replication, DynamoDB global tables
- Failover: Manual (RTO 4 hours)
### 12.2 Environment Strategy
**Development Environment**:
- Purpose: Feature development and testing
- Resources: Scaled-down versions (1/10th of production)
- Data: Synthetic test data only
- Access: Developers only
**Staging Environment**:
- Purpose: Pre-production testing
- Resources: Same configuration as production
- Data: Anonymized production data
- Access: Developers, QA, stakeholders
**Production Environment**:
- Purpose: Live system serving beneficiaries
- Resources: Full-scale with auto-scaling
- Data: Real user data (encrypted)
- Access: Ops team only (read-only for most)
### 12.3 CI/CD Pipeline
**Source Control**: GitHub
- Repository structure: Monorepo with folders per service
- Branching strategy: GitFlow (main, develop, feature branches)
**Build Pipeline** (GitHub Actions):
1. **Lint**: ESLint for JavaScript, Pylint for Python
2. **Test**: Unit tests (Jest, pytest), integration tests
3. **Build**: Package Lambda functions, Docker images for SageMaker
4. **Security Scan**: Snyk for dependency vulnerabilities
5. **Artifact Storage**: S3 bucket `bharatseva-artifacts`
**Deployment Pipeline** (AWS CodePipeline):
1. **Source**: GitHub webhook triggers pipeline
2. **Build**: CodeBuild runs tests and packages artifacts
3. **Deploy to Staging**: CloudFormation stack update
4. **Integration Tests**: Automated tests against staging
5. **Manual Approval**: Stakeholder approval required
6. **Deploy to Production**: CloudFormation stack update
7. **Smoke Tests**: Automated health checks
**Infrastructure as Code**: AWS CloudFormation
- Templates: Separate stacks for each service
- Parameters: Environment-specific (dev, staging, prod)
- Drift detection: Daily checks for manual changes
### 12.4 Rollback Strategy
**Lambda Functions**:
- Versioning: All functions versioned
- Aliases: `prod` alias points to stable version
- Rollback: Update alias to previous version (instant)
**SageMaker Models**:
- Model Registry: All models versioned
- Endpoint update: Blue/green deployment
- Rollback: Update endpoint to previous model version
**Database Schema**:
- DynamoDB: Backward-compatible schema changes only
- Migration: Lambda function for data migration
- Rollback: Restore from point-in-time backup (max 5 minutes data loss)
**API Gateway**:
- Stages: Separate stages for each environment
- Canary deployment: 10% traffic to new version, monitor for 1 hour
- Rollback: Revert stage to previous deployment
## 13. Testing Strategy
### 13.1 Unit Testing
**Lambda Functions**:
- Framework: Jest (JavaScript), pytest (Python)
- Coverage target: >80%
- Mocking: AWS SDK calls mocked with aws-sdk-mock
- Test cases: Happy path, error handling, edge cases
**Bedrock Agent Action Groups**:
- Framework: pytest with moto for AWS mocking
- Test cases: Valid inputs, invalid inputs, API failures
- Assertions: Response format, error messages, side effects
### 13.2 Integration Testing
**API Testing**:
- Framework: Postman collections, Newman for automation
- Test cases: End-to-end workflows (Shadow Credit, Claim Filing, Work Log)
- Environment: Staging environment with test data
- Assertions: Response codes, response body, database state
**Agent Testing**:
- Framework: Custom Python scripts
- Test cases: Multi-turn conversations, domain switching, error recovery
- Environment: Staging Bedrock Agents
- Assertions: Intent classification accuracy, response relevance
### 13.3 ML Model Testing
**Model Accuracy Testing**:
- Framework: pytest with scikit-learn metrics
- Test cases: Held-out validation set (20% of the labeled data, excluded from training)
- Metrics: Accuracy, precision, recall, F1-score (classification), MAE (regression)
- Threshold: >85% accuracy for trade verification, <8% MAE for crop damage
**Model Bias Testing**:
- Framework: Fairlearn library
- Test cases: Performance across demographic groups (gender, region, crop type)
- Metrics: Demographic parity, equalized odds
- Threshold: <10% disparity across groups
**Model Robustness Testing**:
- Framework: Adversarial Robustness Toolbox (ART)
- Test cases: Adversarial examples, noisy inputs
- Metrics: Accuracy under perturbation
- Threshold: >70% accuracy with 10% noise
### 13.4 Load Testing
**IVR Load Testing**:
- Tool: Amazon Connect load testing tool
- Scenario: 10,000 concurrent calls
- Metrics: Call success rate, average handle time, abandonment rate
- Threshold: >99% success rate, <30s handle time, <5% abandonment
**API Load Testing**:
- Tool: Apache JMeter
- Scenario: 10,000 requests per second for 10 minutes
- Metrics: Response time, error rate, throughput
- Threshold: <5s response time (95th percentile), <1% error rate
**Database Load Testing**:
- Tool: DynamoDB load testing script
- Scenario: 10,000 read/write per second
- Metrics: Latency, throttled requests
- Threshold: <100ms latency (95th percentile), 0 throttled requests
### 13.5 Security Testing
**Penetration Testing**:
- Frequency: Quarterly
- Scope: API endpoints, authentication, authorization
- Tools: OWASP ZAP, Burp Suite
- Findings: Documented and prioritized for remediation
**Vulnerability Scanning**:
- Frequency: Weekly
- Scope: Dependencies, Docker images, Lambda functions
- Tools: Snyk, Amazon Inspector
- Findings: Auto-remediation for critical vulnerabilities
**Compliance Auditing**:
- Frequency: Annual
- Scope: Aadhaar Act compliance, data retention, encryption
- Auditor: Third-party security firm
- Findings: Documented and remediated within 30 days
## 14. Disaster Recovery and Business Continuity
### 14.1 Backup Strategy
**DynamoDB**:
- Point-in-time recovery: Enabled (restore to any point in last 35 days)
- On-demand backups: Daily automated backups
- Retention: 35 days
- Cross-region replication: Global tables to ap-south-2 (Hyderabad)
**S3**:
- Versioning: Enabled for all buckets
- Cross-region replication: Critical buckets replicated to ap-south-2
- Lifecycle policies: Transition to Glacier after 90 days
- Object lock: Enabled for audit trail (compliance mode)
**Lambda Functions**:
- Version control: All code in GitHub
- Deployment artifacts: Stored in S3 with versioning
- Rollback: Instant via alias update
**SageMaker Models**:
- Model Registry: All models versioned
- Artifacts: Stored in S3 with versioning
- Rollback: Update endpoint to previous model version
### 14.2 Disaster Recovery Plan
**RTO (Recovery Time Objective)**: 4 hours
**RPO (Recovery Point Objective)**: 1 hour
**Disaster Scenarios**:
**Scenario 1: Region Failure (ap-south-1 unavailable)**
- Detection: CloudWatch alarms, health checks
- Action:
1. Activate DR region (ap-south-2)
2. Update Route 53 DNS to point to DR region
3. Redirect DynamoDB traffic to the ap-south-2 global table replica
4. Serve S3 data from the cross-region replicated buckets in ap-south-2
5. Deploy Lambda functions from artifacts
6. Update external integrations (webhooks, API endpoints)
- Timeline: 4 hours
- Data loss: <1 hour (RPO)
**Scenario 2: Data Corruption**
- Detection: Data validation checks, user reports
- Action:
1. Identify affected tables/buckets
2. Restore from point-in-time backup (DynamoDB)
3. Restore from versioned objects (S3)
4. Validate data integrity
- Timeline: 2 hours
- Data loss: Minimal (point-in-time recovery)
**Scenario 3: Security Breach**
- Detection: CloudTrail anomaly detection, security alerts
- Action:
1. Isolate affected resources (security groups, IAM policies)
2. Rotate all credentials (API keys, passwords)
3. Audit access logs (CloudTrail, VPC Flow Logs)
4. Restore from clean backup if necessary
5. Notify affected users
- Timeline: 1 hour (isolation), 4 hours (full recovery)
### 14.3 Business Continuity
**Critical Services** (must remain operational):
- IVR system (Amazon Connect)
- Master Orchestrator Agent
- SMS notifications (Amazon Pinpoint)
- Aadhaar authentication
**Degraded Mode Operations**:
- If Bedrock unavailable: Use rule-based fallback logic
- If ML models unavailable: Manual review for trade verification, crop damage
- If external APIs unavailable: Queue requests for later processing
- If WhatsApp unavailable: Fall back to SMS
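The degraded-mode rules above amount to trying handlers in priority order and falling through on failure. The sketch below shows that pattern in isolation. Both handlers are hypothetical stand-ins: `bedrock_agent` simulates an outage rather than calling Bedrock, and the canned reply in `rule_based_fallback` is invented for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Handler = Callable[[str], str]


def answer_with_fallback(query: str, handlers: Sequence[Tuple[str, Handler]]) -> Tuple[str, str]:
    """Try each handler in priority order; return (handler_name, response)."""
    failures: List[str] = []
    for name, handler in handlers:
        try:
            return name, handler(query)
        except Exception as exc:  # any failure moves on to the next handler
            failures.append(f"{name}: {exc}")
    raise RuntimeError("all handlers failed: " + "; ".join(failures))


def bedrock_agent(query: str) -> str:
    """Hypothetical primary handler; here it simulates a Bedrock outage."""
    raise ConnectionError("Bedrock unavailable")


def rule_based_fallback(query: str) -> str:
    """Hypothetical rule-based fallback with a canned reply."""
    if "status" in query.lower():
        return "Reply STATUS followed by your application number to check progress."
    return "Our assistant is busy. Your request has been queued."
```

Returning the handler name alongside the response lets the caller log which path served the request, which is useful when measuring how often the system runs degraded.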
**Communication Plan**:
- Internal: Slack channel for ops team, PagerDuty for on-call
- External: Status page for beneficiaries, SMS notifications for critical outages
- Stakeholders: Email updates every 2 hours during incident
## 15. Future Enhancements
### 15.1 Phase 2 Features (6-12 months)
**Multilingual Knowledge Bases**:
- Current: English-only knowledge bases with runtime translation
- Enhancement: Native language embeddings for 22 Indian languages
- Benefit: Improved semantic search accuracy for regional languages
**Blockchain Integration**:
- Current: SHA-256 hashes for certificates
- Enhancement: Store certificate hashes on blockchain (Hyperledger Fabric)
- Benefit: Immutable verification, prevent tampering
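For context, the current SHA-256 approach can be sketched in a few lines. This is an assumed illustration rather than the system's actual code: the function names are hypothetical, and where the expected hash is stored (a database today, a ledger in the proposed enhancement) is left out.

```python
import hashlib
import hmac


def certificate_hash(certificate_bytes: bytes) -> str:
    """Hex-encoded SHA-256 digest of the certificate file contents."""
    return hashlib.sha256(certificate_bytes).hexdigest()


def verify_certificate(certificate_bytes: bytes, expected_hash: str) -> bool:
    """True when the file still matches the hash recorded at issuance."""
    actual = certificate_hash(certificate_bytes)
    # compare_digest avoids leaking the match position through timing
    return hmac.compare_digest(actual, expected_hash.lower())
```

Any change to the certificate bytes, even a single character, produces a different digest, so verification fails for a tampered file.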
**Predictive Analytics**:
- Current: Reactive (respond to user queries)
- Enhancement: Proactive (predict user needs based on patterns)
- Example: "You're eligible for PM Vishwakarma loan based on your transaction history"
**Voice Biometrics**:
- Current: Aadhaar OTP for authentication
- Enhancement: Voice biometric authentication (Amazon Connect Voice ID)
- Benefit: Faster authentication, better user experience
**Offline Mobile App**:
- Current: IVR, WhatsApp, SMS (require connectivity)
- Enhancement: Progressive Web App (PWA) with offline capabilities
- Benefit: Work in areas with intermittent connectivity
### 15.2 Phase 3 Features (12-24 months)
**Computer Vision for Surveyor Assistance**:
- Current: Photo validation only
- Enhancement: Real-time AR guidance for surveyors (mobile app)
- Example: Overlay damage percentage estimate on live camera feed
**Natural Language Generation for Reports**:
- Current: Template-based reports
- Enhancement: AI-generated narrative reports (Bedrock Claude)
- Example: "Based on your transaction history, you have a strong credit profile..."
**Multi-Modal Interaction**:
- Current: Voice, text, photos
- Enhancement: Video calls with AI avatar (Amazon IVS + Bedrock)
- Benefit: More engaging user experience
**Integration with More Schemes**:
- Current: 3 schemes (PM Vishwakarma, PMFBY, BOCW)
- Enhancement: 10+ schemes (PM-KISAN, Ayushman Bharat, MGNREGA, etc.)
- Benefit: One-stop solution for all government services
**AI-Powered Grievance Resolution**:
- Current: Manual escalation to human operators
- Enhancement: AI-powered grievance analysis and resolution
- Example: Automatically identify root cause, suggest resolution
### 15.3 Research and Innovation
**Federated Learning for Privacy**:
- Current: Centralized ML models
- Enhancement: Federated learning (train on-device, aggregate updates)
- Benefit: Enhanced privacy, no raw data leaves device
**Explainable AI**:
- Current: Black-box ML models
- Enhancement: SHAP/LIME for model interpretability
- Benefit: Transparency, trust, regulatory compliance
**Low-Resource Language Support**:
- Current: 22 Indian languages (major languages)
- Enhancement: 100+ languages (tribal languages, dialects)
- Benefit: Reach underserved communities
**Edge Computing**:
- Current: Cloud-based processing
- Enhancement: Edge processing on mobile devices (AWS IoT Greengrass)
- Benefit: Lower latency, offline capabilities
## 16. Risks and Mitigations
### 16.1 Technical Risks
**Risk 1: Bedrock Agent Hallucination**
- Description: Agent generates incorrect information about schemes
- Impact: High (misinformation to beneficiaries)
- Probability: Medium
- Mitigation:
- Guardrails: Reject responses that fall below a 70% confidence threshold
- Knowledge bases: Ground responses in official documents
- Human review: Flag responses for manual review if confidence <80%
- Testing: Regular accuracy testing with ground truth dataset
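The two thresholds in this mitigation imply a three-way routing decision. The sketch below treats the 70% guardrail and the 80% review threshold as cut-offs on a single confidence score in the range 0 to 1. That is an assumption: the design does not specify how the score is produced, and the names here are hypothetical.

```python
from enum import Enum

REJECT_BELOW = 0.70   # guardrail threshold from the mitigation list
REVIEW_BELOW = 0.80   # human-review threshold from the mitigation list


class Route(Enum):
    REJECT = "reject"
    HUMAN_REVIEW = "human_review"
    DELIVER = "deliver"


def route_response(confidence: float) -> Route:
    """Decide what happens to an agent response given its confidence score."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < REJECT_BELOW:
        return Route.REJECT
    if confidence < REVIEW_BELOW:
        return Route.HUMAN_REVIEW
    return Route.DELIVER
```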
**Risk 2: ML Model Drift**
- Description: Model accuracy degrades over time due to data distribution changes
- Impact: Medium (incorrect trade verification, crop damage estimates)
- Probability: High
- Mitigation:
- Monitoring: SageMaker Model Monitor for drift detection
- Retraining: Automatic retraining trigger if accuracy <80%
- A/B testing: Test new models on 10% traffic before full rollout
- Fallback: Manual review if model confidence <85%
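A minimal sketch of the drift mitigations follows, assuming the monitored accuracy arrives as a fraction and that A/B assignment should be sticky per user. Hash-based bucketing is one common way to get a stable 10% split; it is a swapped-in technique for illustration, not necessarily what SageMaker endpoint variants would do, and all function names are hypothetical.

```python
import hashlib

RETRAIN_BELOW = 0.80        # retraining trigger from the mitigation list
MANUAL_REVIEW_BELOW = 0.85  # per-prediction confidence fallback


def should_retrain(monitored_accuracy: float) -> bool:
    """True when monitored accuracy has dropped under the retraining trigger."""
    return monitored_accuracy < RETRAIN_BELOW


def ab_bucket(user_id: str, candidate_share: float = 0.10) -> str:
    """Deterministically assign a user to the candidate or production model."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    position = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return "candidate" if position < candidate_share else "production"


def needs_manual_review(model_confidence: float) -> bool:
    """True when a prediction is too uncertain to act on automatically."""
    return model_confidence < MANUAL_REVIEW_BELOW
```

Because the bucket depends only on the user ID, a given user sees the same model across sessions, which keeps A/B comparisons clean.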
**Risk 3: External API Failures**
- Description: Government portals, weather APIs, Account Aggregator unavailable
- Impact: High (core functionality blocked)
- Probability: Medium
- Mitigation:
- Retry logic: Exponential backoff with max 3 attempts
- Fallback: Web scraping for government portals, cached weather data
- Circuit breaker: Prevent cascading failures
- Queue: Store requests for later processing when API recovers
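The retry policy above (exponential backoff, at most 3 attempts) can be sketched as a small wrapper. The 1-second base delay and the function name are assumptions, and the circuit breaker and queue are omitted. The `sleep` parameter exists so tests can run without real waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with delays of base_delay * 2**n."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
            attempt += 1
```

With the defaults, a call that keeps failing waits 1 second, then 2 seconds, and raises after the third attempt; the caller can then enqueue the request for later processing.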
**Risk 4: DDoS Attack**
- Description: Malicious traffic overwhelms system
- Impact: High (service unavailable)
- Probability: Low
- Mitigation:
- AWS Shield: Standard DDoS protection (free)
- AWS WAF: Rate limiting, IP blocking
- CloudFront: CDN for static content, absorb traffic spikes
- Auto-scaling: Lambda, DynamoDB scale automatically
### 16.2 Operational Risks
**Risk 5: Data Breach**
- Description: Unauthorized access to user data (Aadhaar, financial data)
- Impact: Critical (legal liability, loss of trust)
- Probability: Low
- Mitigation:
- Encryption: At rest (KMS), in transit (TLS 1.3)
- Access control: IAM policies with least privilege
- Monitoring: CloudTrail for audit, GuardDuty for threat detection
- Compliance: Regular security audits, penetration testing
**Risk 6: Cost Overrun**
- Description: Unexpected traffic spike leads to high AWS bills
- Impact: Medium (budget exceeded)
- Probability: Medium
- Mitigation:
- Budgets: AWS Budgets with alerts at 80%, 100%, 120%
- Throttling: API Gateway rate limiting
- Concurrency limits: Reserved concurrency on Lambda functions to cap scaling and bound spend
- Cost optimization: Regular review of usage patterns
**Risk 7: Key Personnel Departure**
- Description: Loss of critical team members (ML engineers, DevOps)
- Impact: Medium (delayed development, knowledge loss)
- Probability: Medium
- Mitigation:
- Documentation: Comprehensive design docs, runbooks
- Knowledge sharing: Regular team meetings, pair programming
- Cross-training: Multiple team members trained on each component
- Vendor support: AWS Professional Services for critical issues
### 16.3 Regulatory Risks
**Risk 8: Aadhaar Act Non-Compliance**
- Description: Violation of Aadhaar Act (purpose limitation, data localization)
- Impact: Critical (legal penalties, system shutdown)
- Probability: Low
- Mitigation:
- Legal review: Regular compliance audits by legal team
- Data localization: All data stored in Indian regions only (ap-south-1 Mumbai as primary, ap-south-2 Hyderabad for disaster recovery)
- Purpose limitation: Only use Aadhaar for authentication
- Audit trail: UIDAI-compliant logging
**Risk 9: Insurance Regulatory Changes**
- Description: IRDAI changes PMFBY claim procedures
- Impact: Medium (system updates required)
- Probability: Medium
- Mitigation:
- Monitoring: Daily scraping of government gazette
- Flexibility: Configurable workflows (Step Functions)
- Knowledge base: Weekly sync of policy documents
- Stakeholder engagement: Regular meetings with insurance companies
**Risk 10: AI Regulation**
- Description: New AI regulations (e.g., EU AI Act equivalent in India)
- Impact: Medium (compliance requirements, system changes)
- Probability: Medium
- Mitigation:
- Explainability: Implement SHAP/LIME for model interpretability
- Human oversight: Manual review for critical decisions
- Transparency: Disclose AI usage to beneficiaries
- Monitoring: Track AI regulation developments
## 17. Success Metrics
### 17.1 User Adoption Metrics
**Metric 1: Active Users**
- Definition: Unique users interacting with system per month
- Target: 100,000 users by Month 6, 500,000 users by Month 12
- Measurement: DynamoDB query on `user_id` (distinct count)
**Metric 2: Channel Distribution**
- Definition: Percentage of interactions by channel (IVR, WhatsApp, SMS, Web)
- Target: IVR 50%, WhatsApp 30%, SMS 15%, Web 5%
- Measurement: CloudWatch metrics by channel
**Metric 3: Language Distribution**
- Definition: Percentage of interactions by language
- Target: Hindi 40%, English 20%, Regional languages 40%
- Measurement: Amazon Transcribe language detection logs
### 17.2 Operational Metrics
**Metric 4: Application Submission Rate**
- Definition: Number of applications submitted per month
- Target: 10,000 applications by Month 6, 50,000 applications by Month 12
- Measurement: DynamoDB query on `applications` table
**Metric 5: Application Approval Rate**
- Definition: Percentage of applications approved
- Target: >70% approval rate
- Measurement: DynamoDB query on `applications` table (status = Approved)
**Metric 6: Average Processing Time**
- Definition: Days from application submission to approval
- Target: <30 days (PM Vishwakarma), <7 days (PMFBY), <90 days (BOCW)
- Measurement: DynamoDB query on `status_history` timestamps
**Metric 7: Escalation Rate**
- Definition: Percentage of applications requiring escalation
- Target: <20% escalation rate
- Measurement: DynamoDB query on `applications` table (escalation_level > 0)
### 17.3 Quality Metrics
**Metric 8: Agent Accuracy**
- Definition: Percentage of agent responses rated as accurate by users
- Target: >90% accuracy
- Measurement: User feedback (thumbs up/down after each interaction)
**Metric 9: ML Model Accuracy**
- Definition: Accuracy of trade verification, crop damage models
- Target: >85% accuracy (trade verification), <8% MAE (crop damage)
- Measurement: SageMaker Model Monitor on validation set
**Metric 10: First Contact Resolution**
- Definition: Percentage of queries resolved in first interaction
- Target: >80% first contact resolution
- Measurement: Conversation state analysis (no follow-up within 24 hours)
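One way to compute first contact resolution from interaction logs is sketched below. It assumes each log entry is a `(user_id, timestamp)` pair and that any contact within 24 hours of the same user's previous contact counts as a follow-up rather than a new query. Both the input shape and the function name are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple


def first_contact_resolution(
    interactions: Iterable[Tuple[str, datetime]],
    window: timedelta = timedelta(hours=24),
) -> float:
    """Share of queries with no follow-up from the same user inside the window."""
    by_user: Dict[str, List[datetime]] = defaultdict(list)
    for user_id, timestamp in interactions:
        by_user[user_id].append(timestamp)

    queries = resolved = 0
    for timestamps in by_user.values():
        timestamps.sort()
        for i, current in enumerate(timestamps):
            # A contact close to the previous one is a follow-up, not a new query.
            if i > 0 and current - timestamps[i - 1] <= window:
                continue
            queries += 1
            has_follow_up = i + 1 < len(timestamps) and timestamps[i + 1] - current <= window
            if not has_follow_up:
                resolved += 1

    if queries == 0:
        raise ValueError("no interactions supplied")
    return resolved / queries
```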
### 17.4 User Satisfaction Metrics
**Metric 11: Net Promoter Score (NPS)**
- Definition: Likelihood of users recommending system to others
- Target: NPS >50 (excellent)
- Measurement: Post-interaction survey (0-10 scale)
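NPS is the percentage of promoters (scores 9 to 10) minus the percentage of detractors (scores 0 to 6), giving a value between -100 and 100. A minimal sketch of the calculation, with a hypothetical function name:

```python
from typing import Sequence


def net_promoter_score(scores: Sequence[int]) -> float:
    """NPS = % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("no survey responses supplied")
    if any(not 0 <= s <= 10 for s in scores):
        raise ValueError("scores must be on the 0-10 scale")
    promoters = sum(s >= 9 for s in scores)
    detractors = sum(s <= 6 for s in scores)
    return 100 * (promoters - detractors) / len(scores)
```

Passives (scores 7 to 8) count toward the total but toward neither group, so hitting the >50 target requires promoters to outnumber detractors by more than half of all respondents.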
**Metric 12: User Satisfaction (CSAT)**
- Definition: User satisfaction with interaction
- Target: CSAT >4.0 out of 5.0
- Measurement: Post-interaction survey (1-5 scale)
**Metric 13: Task Completion Rate**
- Definition: Percentage of users who complete intended task
- Target: >85% task completion
- Measurement: Workflow completion in Step Functions
### 17.5 Business Impact Metrics
**Metric 14: Cost per Interaction**
- Definition: Total system cost / number of interactions
- Target: <$0.50 per interaction
- Measurement: Monthly cost from AWS Cost Explorer divided by the interaction count from CloudWatch metrics
**Metric 15: Time Saved per User**
- Definition: Time saved compared to manual process
- Target: >2 hours saved per application
- Measurement: User survey, comparison with manual process
**Metric 16: Corruption Reduction**
- Definition: Number of bribery reports collected
- Target: 100+ reports per month (indicates awareness)
- Measurement: DynamoDB query on anti-corruption evidence table
**Metric 17: Financial Inclusion**
- Definition: Number of users accessing credit without traditional documentation
- Target: 10,000 Shadow Credit Scores generated by Month 12
- Measurement: DynamoDB query on Shadow Credit Score table
## 18. Implementation Roadmap
### 18.1 Phase 1: Foundation (Months 1-3)
**Month 1: Infrastructure Setup**
- Week 1-2: AWS account setup, IAM roles, VPC configuration
- Week 3-4: DynamoDB tables, S3 buckets, CloudFormation templates
- Deliverables: Infrastructure as Code, CI/CD pipeline
**Month 2: Core Services**
- Week 1-2: Master Orchestrator Agent, conversation state management
- Week 3-4: Voice interface (Amazon Connect, Transcribe, Polly)
- Deliverables: Working IVR system, basic routing
**Month 3: First Domain (PM Vishwakarma)**
- Week 1-2: Shadow Credit Agent (Account Aggregator integration)
- Week 3-4: Proof of Work Agent (Rekognition, SageMaker model training)
- Deliverables: End-to-end PM Vishwakarma workflow
### 18.2 Phase 2: Expansion (Months 4-6)
**Month 4: PMFBY Domain**
- Week 1-2: First Responder Agent (weather monitoring, proactive outreach)
- Week 3-4: Surveyor Assistant Agent (photo validation, insurance submission)
- Deliverables: End-to-end PMFBY workflow
**Month 5: BOCW Domain**
- Week 1-2: Digital Work-Log Agent (GPS tracking, selfie verification)
- Week 3-4: Interstate Bridge Agent (migration detection, benefit portability)
- Deliverables: End-to-end BOCW workflow
**Month 6: Communication Channels**
- Week 1-2: WhatsApp Business API integration
- Week 3-4: SMS fallback, web interface
- Deliverables: Multi-channel access
### 18.3 Phase 3: Optimization (Months 7-9)
**Month 7: ML Model Refinement**
- Week 1-2: Trade verification model retraining (more data)
- Week 3-4: Crop damage model fine-tuning, deepfake detection
- Deliverables: Improved model accuracy (>90%)
**Month 8: Knowledge Base Enhancement**
- Week 1-2: Expand to 1200+ schemes, multilingual support
- Week 3-4: Weekly sync automation, semantic search optimization
- Deliverables: Comprehensive knowledge bases
**Month 9: Security and Compliance**
- Week 1-2: Penetration testing, vulnerability remediation
- Week 3-4: Aadhaar Act compliance audit, data retention policies
- Deliverables: Security audit report, compliance certification
### 18.4 Phase 4: Scale and Launch (Months 10-12)
**Month 10: Load Testing and Optimization**
- Week 1-2: Load testing (10,000 concurrent users)
- Week 3-4: Performance optimization, cost optimization
- Deliverables: System ready for scale
**Month 11: Pilot Launch**
- Week 1-2: Pilot in 3 districts (1 per scheme domain)
- Week 3-4: User feedback collection, bug fixes
- Deliverables: Pilot report, user testimonials
**Month 12: Full Launch**
- Week 1-2: National rollout, marketing campaign
- Week 3-4: Monitoring, support, continuous improvement
- Deliverables: Live system serving 100,000+ users
## 19. Team Structure and Roles
### 19.1 Core Team
**Product Manager** (1)
- Responsibilities: Product vision, roadmap, stakeholder management
- Skills: Government schemes domain knowledge, user research
**Technical Architect** (1)
- Responsibilities: System design, technology decisions, code reviews
- Skills: AWS architecture, AI/ML, distributed systems
**Backend Engineers** (3)
- Responsibilities: Lambda functions, API integrations, database design
- Skills: Python, Node.js, AWS services (Lambda, DynamoDB, Step Functions)
**ML Engineers** (2)
- Responsibilities: Model training, deployment, monitoring
- Skills: TensorFlow, PyTorch, SageMaker, computer vision, NLP
**AI/Agent Engineers** (2)
- Responsibilities: Bedrock Agent configuration, prompt engineering, knowledge bases
- Skills: Amazon Bedrock, Claude, RAG, prompt optimization
**DevOps Engineer** (1)
- Responsibilities: CI/CD, infrastructure, monitoring, security
- Skills: CloudFormation, GitHub Actions, CloudWatch, security best practices
**QA Engineer** (1)
- Responsibilities: Testing strategy, automation, quality assurance
- Skills: Jest, pytest, Postman, load testing, security testing
**UX Designer** (1)
- Responsibilities: Voice interface design, conversation flows, accessibility
- Skills: Voice UI design, user research, accessibility standards
### 19.2 Extended Team
**Domain Experts** (3)
- PM Vishwakarma expert: Artisan credit, trade verification
- PMFBY expert: Crop insurance, claim procedures
- BOCW expert: Construction worker welfare, interstate portability
**Legal Counsel** (1)
- Responsibilities: Aadhaar Act compliance, data privacy, regulatory compliance
- Skills: Indian data protection laws, government regulations
**Security Consultant** (1)
- Responsibilities: Security audits, penetration testing, compliance
- Skills: AWS security, OWASP, vulnerability assessment
**Data Analyst** (1)
- Responsibilities: Metrics tracking, reporting, insights
- Skills: SQL, Python, data visualization (Tableau, QuickSight)
### 19.3 Support Team
**Customer Support Agents** (5)
- Responsibilities: Handle escalations, user support, feedback collection
- Skills: Government schemes knowledge, multilingual (Hindi, English, regional languages)
**Operations Manager** (1)
- Responsibilities: System monitoring, incident response, SLA management
- Skills: AWS operations, incident management, on-call rotation
## 20. Appendices
### 20.1 Glossary of AWS Services
- **Amazon Bedrock**: Managed service for foundation models (Claude, Titan)
- **Amazon Bedrock Agents**: Orchestration framework for multi-step AI workflows
- **Amazon Bedrock Knowledge Bases**: RAG (Retrieval-Augmented Generation) for grounding agents
- **Amazon Connect**: Cloud-based contact center (IVR)
- **Amazon Transcribe**: Speech-to-text service (22 Indian languages)
- **Amazon Polly**: Text-to-speech service (neural voices)
- **Amazon Comprehend**: NLP service (intent classification, entity extraction)
- **Amazon Translate**: Neural machine translation
- **Amazon Rekognition**: Computer vision (image/video analysis, face comparison)
- **Amazon Textract**: Intelligent OCR (document data extraction)
- **Amazon Location Service**: Geospatial services (maps, geocoding, routing)
- **Amazon SageMaker**: ML platform (training, deployment, monitoring)
- **AWS Lambda**: Serverless compute
- **Amazon DynamoDB**: NoSQL database
- **Amazon S3**: Object storage
- **Amazon OpenSearch Serverless**: Managed search and analytics
- **AWS Step Functions**: Workflow orchestration
- **Amazon EventBridge**: Event bus for scheduled rules and event-driven architecture
- **Amazon Pinpoint**: SMS and push notifications
- **Amazon SES**: Email service
- **AWS KMS**: Key management service (encryption)
- **AWS CloudTrail**: API call logging
- **Amazon CloudWatch**: Monitoring and observability
- **AWS X-Ray**: Distributed tracing
- **AWS Secrets Manager**: Secrets storage and rotation
- **AWS IAM**: Identity and access management
- **Amazon API Gateway**: API management
- **AWS CloudFormation**: Infrastructure as Code
- **AWS CodePipeline**: CI/CD pipeline
- **AWS CodeBuild**: Build service
### 20.2 Acronyms
- **AI**: Artificial Intelligence
- **API**: Application Programming Interface
- **AWS**: Amazon Web Services
- **BOCW**: Building and Other Construction Workers
- **CIBIL**: Credit Information Bureau (India) Limited
- **CNN**: Convolutional Neural Network
- **CSAT**: Customer Satisfaction Score
- **DDoS**: Distributed Denial of Service
- **FI**: Financial Information
- **FIP**: Financial Information Provider
- **FIU**: Financial Information User
- **GAN**: Generative Adversarial Network
- **GPS**: Global Positioning System
- **IAM**: Identity and Access Management
- **IMD**: India Meteorological Department
- **IRDAI**: Insurance Regulatory and Development Authority of India
- **IVR**: Interactive Voice Response
- **JWT**: JSON Web Token
- **KMS**: Key Management Service
- **MAE**: Mean Absolute Error
- **ML**: Machine Learning
- **MSME**: Micro, Small, and Medium Enterprises
- **NBFC-AA**: Non-Banking Financial Company - Account Aggregator
- **NLP**: Natural Language Processing
- **NPS**: Net Promoter Score
- **OCR**: Optical Character Recognition
- **ONORC**: One Nation One Ration Card
- **OTP**: One-Time Password
- **PII**: Personally Identifiable Information
- **PMFBY**: Pradhan Mantri Fasal Bima Yojana (crop insurance)
- **PMJAY**: Pradhan Mantri Jan Arogya Yojana (health insurance)
- **PWA**: Progressive Web App
- **RAG**: Retrieval-Augmented Generation
- **RBAC**: Role-Based Access Control
- **REST**: Representational State Transfer
- **RPO**: Recovery Point Objective
- **RTO**: Recovery Time Objective
- **S3**: Simple Storage Service
- **SES**: Simple Email Service
- **SHAP**: SHapley Additive exPlanations
- **SLA**: Service Level Agreement
- **SMS**: Short Message Service
- **SOAP**: Simple Object Access Protocol
- **SSML**: Speech Synthesis Markup Language
- **TLS**: Transport Layer Security
- **TTL**: Time To Live
- **UIDAI**: Unique Identification Authority of India
- **UPI**: Unified Payments Interface
- **VPC**: Virtual Private Cloud
- **WAF**: Web Application Firewall
### 20.3 References
1. **Government Schemes**:
- PM Vishwakarma: https://pmvishwakarma.gov.in
- PMFBY: https://pmfby.gov.in
- e-Shram: https://eshram.gov.in
2. **AWS Documentation**:
- Amazon Bedrock: https://docs.aws.amazon.com/bedrock
- Amazon Connect: https://docs.aws.amazon.com/connect
- Amazon SageMaker: https://docs.aws.amazon.com/sagemaker
3. **Regulatory**:
- Aadhaar Act 2016: https://uidai.gov.in/legal-framework/acts.html
- Account Aggregator Framework: https://sahamati.org.in
4. **Technical Standards**:
- FI Data Schema: https://api.rebit.org.in
- UIDAI Authentication API: https://uidai.gov.in/ecosystem/authentication-devices-documents/about-aadhaar-paperless-offline-e-kyc.html
---
**Document Version**: 1.0
**Last Updated**: January 24, 2026
**Author**: BharatSeva AI Design Team
**Status**: Draft for Review