# AccessGuard AI - Technical Requirements Document (TRD)

**Version:** 1.0.0
**Date:** March 17, 2026
**Status:** Draft

---

## 1. Executive Summary

AccessGuard AI is an insider threat detection system combining AI-powered behavior analysis with real-time alerts. The system consists of three primary components:

1. **Chrome Extension** - Browser-side monitoring and real-time alerts
2. **Backend Detection Engine** - Sigma rules + ML-based anomaly detection
3. **Web Dashboard** - React-based analytics and investigation interface

All frontend components (extension + dashboard) will implement the **Dark Neumorphic Design System** for consistent, modern UI/UX.

---

## 2. System Architecture

### 2.1 High-Level Architecture

```
┌─────────────────────┐
│  Chrome Extension   │ ──────┐
│  (Browser Monitor)  │       │
└─────────────────────┘       │
                              │ WebSocket/REST
┌─────────────────────┐       │
│   Log Collectors    │ ──────┤
│   (Windows/Linux)   │       │
└─────────────────────┘       │
                              ▼
                     ┌──────────────────┐
                     │   Backend API    │
                     │    (FastAPI)     │
                     └──────────────────┘
                              │
                    ┌─────────┴─────────┐
                    │                   │
            ┌───────▼──────┐     ┌──────▼──────┐
            │ Rules Engine │     │  ML Engine  │
            │   (Sigma)    │     │  (Anomaly)  │
            └───────┬──────┘     └──────┬──────┘
                    │                   │
                    └─────────┬─────────┘
                              │
                    ┌─────────▼─────────┐
                    │    Alert Queue    │
                    │    (RabbitMQ)     │
                    └─────────┬─────────┘
                              │
                    ┌─────────▼─────────┐
                    │    PostgreSQL     │
                    │     InfluxDB      │
                    └───────────────────┘
                              │
                    ┌─────────▼─────────┐
                    │   Web Dashboard   │
                    │      (React)      │
                    └───────────────────┘
```

### 2.2 Technology Stack

**Frontend:**
- React 18+ with TypeScript
- Tailwind CSS (configured with Dark Neumorphic theme)
- Recharts/D3.js for visualizations
- WebSocket client for real-time updates
- Chrome Extension Manifest V3

**Backend:**
- Python 3.11+
- FastAPI (REST API)
- Celery (async task processing)
- RabbitMQ (message queue)
- PostgreSQL (relational data)
- InfluxDB (time-series metrics)

**ML/Detection:**
- Sigma rules (YAML-based detection)
- scikit-learn (behavioral baselining)
- pandas/numpy (data processing)

**Infrastructure:**
- Docker + Docker Compose
- AWS S3 (log archival)
- Nginx (reverse proxy)
- Redis (caching)

---

## 3. Functional Requirements

### 3.1 Chrome Extension

- **FR-EXT-001:** Monitor browser downloads in real-time
- **FR-EXT-002:** Capture clipboard copy events (with content hashing)
- **FR-EXT-003:** Track authentication attempts (login forms)
- **FR-EXT-004:** Log browsing activity (URL patterns, not full URLs)
- **FR-EXT-005:** Display real-time alerts via browser notifications
- **FR-EXT-006:** Block suspicious file downloads (configurable)
- **FR-EXT-007:** Warn users before dangerous actions (e.g., bulk downloads)
- **FR-EXT-008:** Send telemetry to backend API via REST
- **FR-EXT-009:** Implement Dark Neumorphic UI for popup/settings
- **FR-EXT-010:** Support offline queuing (sync when online)

### 3.2 Backend Detection Engine

- **FR-BE-001:** Ingest logs from multiple sources (Windows Event Log, Syslog, Chrome Extension)
- **FR-BE-002:** Normalize logs to common schema
- **FR-BE-003:** Execute 40+ Sigma detection rules
- **FR-BE-004:** Train ML models on user behavior (4+ weeks baseline)
- **FR-BE-005:** Compute anomaly scores for each event
- **FR-BE-006:** Aggregate related alerts (deduplication)
- **FR-BE-007:** Calculate user risk scores (0-100)
- **FR-BE-008:** Generate compliance audit trails
- **FR-BE-009:** Send notifications (Slack, email, webhook)
- **FR-BE-010:** Archive raw logs to S3 (encrypted, 30-90 day retention)

### 3.3 Web Dashboard

- **FR-DASH-001:** Executive summary page (KPIs, trends, top 5 users)
- **FR-DASH-002:** Real-time alert queue (sortable, filterable)
- **FR-DASH-003:** User timeline view (complete activity history)
- **FR-DASH-004:** Risk ranking page (scored user list)
- **FR-DASH-005:** Alert investigation drill-down (context, related events)
- **FR-DASH-006:** Compliance reporting (SOX, HIPAA, GDPR templates)
- **FR-DASH-007:** Rule management interface (enable/disable, tune thresholds)
- **FR-DASH-008:** User search and filtering
- **FR-DASH-009:** WebSocket-based real-time updates
- **FR-DASH-010:** Implement Dark Neumorphic Design System throughout
- **FR-DASH-011:** Role-based access control (Admin, Analyst, Viewer)
- **FR-DASH-012:** Export reports (PDF, CSV)

---

## 4. Non-Functional Requirements

### 4.1 Performance

- **NFR-PERF-001:** Alert latency < 5 seconds (from event to dashboard)
- **NFR-PERF-002:** Dashboard page load < 2 seconds
- **NFR-PERF-003:** Support 500 concurrent users
- **NFR-PERF-004:** Process 10,000 events/second
- **NFR-PERF-005:** Database queries < 500ms (95th percentile)

### 4.2 Reliability

- **NFR-REL-001:** System availability 99.5% (production)
- **NFR-REL-002:** Zero data loss (message queue persistence)
- **NFR-REL-003:** Graceful degradation (if ML engine fails, rules still work)
- **NFR-REL-004:** Automatic retry for failed log ingestion

### 4.3 Security

- **NFR-SEC-001:** All data encrypted at rest (AES-256)
- **NFR-SEC-002:** TLS 1.3 for all network communication
- **NFR-SEC-003:** JWT-based authentication (15-minute expiry)
- **NFR-SEC-004:** Role-based access control (RBAC)
- **NFR-SEC-005:** Audit logging for all admin actions
- **NFR-SEC-006:** PII masking in logs (email, SSN, credit cards)

### 4.4 Compliance

- **NFR-COMP-001:** GDPR-compliant data retention (configurable)
- **NFR-COMP-002:** SOX audit trail (immutable logs)
- **NFR-COMP-003:** HIPAA-compliant encryption and access controls
- **NFR-COMP-004:** Data deletion API (right to be forgotten)

### 4.5 Usability

- **NFR-UX-001:** Dashboard responsive (desktop, tablet)
- **NFR-UX-002:** False positive rate < 10% by Month 3
- **NFR-UX-003:** Consistent Dark Neumorphic UI across all interfaces
- **NFR-UX-004:** Accessibility (WCAG 2.1 AA target)
- **NFR-UX-005:** Onboarding tutorial for new users

---

## 5. Design System Implementation

### 5.1 Dark Neumorphic Theme Application

All frontend components must implement the design system from `style.json`:

**Color Palette:**
- Primary Background: `#0f0f1e`
- Secondary Background: `#1a1a2e`
- Accent Colors: Magenta (`#ff006e`), Cyan (`#00d4ff`), Orange (`#ff6b35`)
- Text: White (`#ffffff`) with secondary gray tones

**Component Styling:**
- Cards: Neumorphic shadows with glass effect
- Buttons: Gradient backgrounds with glow effects
- Inputs: Subtle borders with focus glow
- Charts: Dark backgrounds with neon accent colors

**Typography:**
- Primary Font: Inter, Segoe UI, Roboto
- Monospace: Fira Code (for logs/code)
- Font Sizes: 0.75rem - 3rem scale

**Effects:**
- Glow shadows on interactive elements
- Smooth transitions (300ms cubic-bezier)
- Backdrop blur for overlays
- Neon text shadows for emphasis

### 5.2 Component Library

Build reusable React components:

- `<Card>` - Neumorphic card container
- `<Button>` - Primary, secondary, ghost variants
- `<MetricCard>` - KPI display with gradient background
- `<ChartContainer>` - Wrapper for charts with dark theme
- `<Badge>` - Status indicators (magenta, cyan, success)
- `<ProgressRing>` - Circular progress with glow
- `<Input>` - Form inputs with focus effects
- `<Sidebar>` - Navigation sidebar
- `<Navbar>` - Top navigation bar

---

## 6. Data Models

### 6.1 PostgreSQL Schema

**users**

```sql
CREATE TABLE users (
    id                UUID PRIMARY KEY,
    username          VARCHAR(255) UNIQUE,
    email             VARCHAR(255),
    department        VARCHAR(100),
    role              VARCHAR(50),
    risk_score        INTEGER CHECK (risk_score BETWEEN 0 AND 100),
    baseline_computed BOOLEAN,
    created_at        TIMESTAMP,
    updated_at        TIMESTAMP
);
```

**events**

```sql
CREATE TABLE events (
    id              UUID PRIMARY KEY,
    user_id         UUID REFERENCES users (id),  -- FK target assumed to be users
    event_type      VARCHAR(100),
    source          VARCHAR(50),  -- 'windows', 'chrome', 'linux'
    timestamp       TIMESTAMP,
    raw_data        JSONB,
    normalized_data JSONB,
    risk_score      INTEGER,
    created_at      TIMESTAMP
);
CREATE INDEX ON events (user_id, timestamp);
CREATE INDEX ON events (event_type, timestamp);
```

**alerts**

```sql
CREATE TABLE alerts (
    id          UUID PRIMARY KEY,
    user_id     UUID REFERENCES users (id),  -- FK target assumed to be users
    alert_type  VARCHAR(100),
    severity    VARCHAR(20),  -- 'critical', 'high', 'medium', 'low'
    title       VARCHAR(255),
    description TEXT,
    rule_id     VARCHAR(100),
    event_ids   UUID[],       -- related events
    status      VARCHAR(20),  -- 'new', 'investigating', 'resolved', 'false_positive'
    assigned_to UUID REFERENCES users (id),  -- nullable; FK target assumed to be users
    created_at  TIMESTAMP,
    updated_at  TIMESTAMP,
    resolved_at TIMESTAMP     -- nullable
);
CREATE INDEX ON alerts (status, created_at);
CREATE INDEX ON alerts (user_id, created_at);
```

**detection_rules**

```sql
CREATE TABLE detection_rules (
    id                   UUID PRIMARY KEY,
    name                 VARCHAR(255),
    rule_type            VARCHAR(50),  -- 'sigma', 'ml'
    enabled              BOOLEAN,
    sigma_yaml           TEXT,         -- nullable
    threshold            FLOAT,        -- nullable
    false_positive_count INTEGER,
    true_positive_count  INTEGER,
    created_at           TIMESTAMP,
    updated_at           TIMESTAMP
);
```

### 6.2 InfluxDB Measurements

**user_activity_metrics**
- Fields: login_count, file_access_count, download_count, failed_auth_count
- Tags: user_id, department, time_bucket (hourly)

**system_metrics**
- Fields: events_processed, alerts_generated, processing_latency_ms
- Tags: component (rules_engine, ml_engine, api)

---

## 7. API Specifications

### 7.1 REST API Endpoints

**Authentication:**

```
POST /api/v1/auth/login
POST /api/v1/auth/logout
POST /api/v1/auth/refresh
```

**Alerts:**

```
GET   /api/v1/alerts               # List alerts (paginated, filtered)
GET   /api/v1/alerts/{id}          # Get alert details
PATCH /api/v1/alerts/{id}          # Update alert status
POST  /api/v1/alerts/{id}/comment  # Add investigation comment
```

**Users:**

```
GET /api/v1/users                # List users with risk scores
GET /api/v1/users/{id}           # Get user profile
GET /api/v1/users/{id}/timeline  # Get user activity timeline
GET /api/v1/users/{id}/risk      # Get risk score breakdown
```

**Events:**

```
POST /api/v1/events  # Ingest event (from Chrome extension)
GET  /api/v1/events  # Query events (admin only)
```

**Rules:**

```
GET   /api/v1/rules       # List detection rules
PATCH /api/v1/rules/{id}  # Update rule (enable/disable, tune)
POST  /api/v1/rules/test  # Test rule against sample data
```

**Dashboard:**

```
GET /api/v1/dashboard/summary  # Executive summary KPIs
GET /api/v1/dashboard/trends   # Alert trends (time-series)
```

### 7.2 WebSocket Events

**Client → Server:**

```
subscribe_alerts     # Subscribe to real-time alerts
subscribe_user:{id}  # Subscribe to user activity updates
```

**Server → Client:**

```
new_alert          # New alert created
alert_updated      # Alert status changed
user_risk_updated  # User risk score changed
```

---

## 8. Detection Rules

### 8.1 Tier 1 Rules (Critical - Immediate Response)

**RULE-001: Privilege Escalation**
- Trigger: Non-admin user executes admin command
- Severity: Critical
- Response: Block + immediate alert

**RULE-002: Mass Data Exfiltration**
- Trigger: 10+ sensitive files accessed in 1 hour
- Severity: Critical
- Response: Alert + notify CISO

**RULE-003: Lateral Movement**
- Trigger: Unauthorized access to admin shares
- Severity: Critical
- Response: Alert + block network access

### 8.2 Tier 2 Rules (High - Daily Review)

**RULE-004: Off-Hours Admin Activity**
- Trigger: Admin activity outside 9am-6pm
- Severity: High
- Response: Alert

**RULE-005: Credential Misuse**
- Trigger: 3+ failed logins → success
- Severity: High
- Response: Alert + require MFA

**RULE-006: Suspicious Downloads**
- Trigger: Download of source code, databases, trade secrets
- Severity: High
- Response: Alert + log file hash

### 8.3 Tier 3 Rules (Medium - Weekly Review)

**RULE-007: Role Misuse**
- Trigger: User accesses files outside department
- Severity: Medium
- Response: Log + weekly report

**RULE-008: Unusual Login Pattern**
- Trigger: Login from new location/device
- Severity: Medium
- Response: Alert + require verification

**RULE-009: Hacking Tool Execution**
- Trigger: Mimikatz, psexec, nmap detected
- Severity: Medium
- Response: Alert + quarantine process

**RULE-010: Account Enumeration**
- Trigger: Bulk AD queries by non-admin
- Severity: Medium
- Response: Alert

---

## 9. ML Models

### 9.1 Behavioral Baselining

**Model:** Isolation Forest (anomaly detection)

**Features:**
- Login times (hour of day, day of week)
- File access patterns (count, file types, directories)
- Download frequency and volume
- Failed authentication rate
- Network access patterns

**Training:**
- Minimum 4 weeks of data per user
- Retrain weekly with new data
- Separate models per department

**Output:**
- Anomaly score (0-1, higher = more anomalous)
- Contributes 30% to overall risk score

### 9.2 Risk Scoring Algorithm

```
risk_score = (
    0.40 * sigma_rule_score +      # Sigma rule matches
    0.30 * ml_anomaly_score +      # ML anomaly detection
    0.20 * historical_incidents +  # Past incidents
    0.10 * privilege_level         # User privilege level
)
# Normalized to 0-100 scale
```

---

## 10. Deployment Architecture

### 10.1 Production Environment

**Components:**
- 2x API servers (load balanced)
- 1x PostgreSQL (primary + read replica)
- 1x InfluxDB
- 1x RabbitMQ cluster (3 nodes)
- 1x Redis (caching)
- 1x Nginx (reverse proxy + SSL termination)

**Scaling:**
- Horizontal: Add API servers behind load balancer
- Vertical: Increase database resources
- Queue: Add RabbitMQ nodes for high throughput

### 10.2 Development Environment

**Docker Compose:**

```yaml
services:
  - api (FastAPI)
  - postgres
  - influxdb
  - rabbitmq
  - redis
  - frontend (React dev server)
```

---

## 11. Testing Strategy

### 11.1 Unit Tests

- Backend: pytest (80% coverage target)
- Frontend: Jest + React Testing Library (70% coverage)
- Detection rules: Test against known malicious patterns

### 11.2 Integration Tests

- API endpoint tests (all CRUD operations)
- WebSocket connection tests
- Database migration tests

### 11.3 End-to-End Tests

- User login → view alerts → investigate → resolve
- Chrome extension → event ingestion → alert generation
- Rule trigger → notification delivery

### 11.4 Performance Tests

- Load testing: 10,000 events/second
- Stress testing: 1,000 concurrent dashboard users
- Latency testing: Alert generation < 5 seconds

### 11.5 Security Tests

- Penetration testing (external consultant)
- SQL injection tests
- XSS/CSRF tests
- Authentication bypass attempts

---

## 12. Monitoring & Observability

### 12.1 Metrics

**System Metrics:**
- Events processed per second
- Alert generation rate
- API response times (p50, p95, p99)
- Database query performance
- Queue depth (RabbitMQ)

**Business Metrics:**
- False positive rate
- True positive rate
- Mean time to detect (MTTD)
- Mean time to respond (MTTR)
- User adoption rate (Chrome extension)

### 12.2 Logging

**Log Levels:**
- ERROR: System failures, exceptions
- WARN: Degraded performance, retries
- INFO: Normal operations, audit events
- DEBUG: Detailed troubleshooting (dev only)

**Log Aggregation:**
- Centralized logging (ELK stack or CloudWatch)
- Structured JSON logs
- Correlation IDs for request tracing

### 12.3 Alerting

**System Alerts:**
- API server down
- Database connection failures
- Queue backlog > 10,000 messages
- Disk space < 10%

**Business Alerts:**
- False positive rate > 15%
- No events received in 5 minutes
- Critical alert not acknowledged in 10 minutes

---

## 13. Security Considerations

### 13.1 Data Privacy

- PII masking in logs (email, SSN, credit cards)
- Configurable data retention (30-90 days)
- Data deletion API (GDPR compliance)
- Employee transparency (notify users of monitoring)

### 13.2 Access Control

- JWT-based authentication
- Role-based access control (Admin, Analyst, Viewer)
- API rate limiting (100 requests/minute per user)
- Audit logging for all admin actions

### 13.3 Encryption

- TLS 1.3 for all network traffic
- AES-256 for data at rest (S3, database)
- Encrypted backups
- Secure key management (AWS KMS or HashiCorp Vault)

---

## 14. Compliance Requirements

### 14.1 GDPR

- Data retention policies
- Right to be forgotten (data deletion API)
- Data portability (export user data)
- Consent management (opt-in for monitoring)

### 14.2 SOX

- Immutable audit trails
- Access control logging
- Change management tracking
- Quarterly compliance reports

### 14.3 HIPAA

- Encrypted data storage and transmission
- Access control and authentication
- Audit logging
- Business associate agreements (if applicable)

---

## 15. Success Criteria

### 15.1 Technical Metrics

- ✓ Alert latency < 5 seconds
- ✓ False positive rate < 10% by Month 3
- ✓ 95%+ Chrome extension adoption
- ✓ 99.5%+ system availability
- ✓ 0 false negatives on test scenarios

### 15.2 Business Metrics

- ✓ Insider threat detected within 10 minutes
- ✓ Investigation time reduced by 80%
- ✓ Compliance audit findings drop 50%
- ✓ Prevention of 1-2 insider incidents per year

### 15.3 User Satisfaction

- ✓ IT security team rating: 7/10 or higher
- ✓ <5 support tickets per month
- ✓ Monthly tuning sessions with IT team

---

## 16. Risks & Mitigation

| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|-----------|
| False positive explosion | High | Medium | Weekly rule tuning, start with high-confidence rules only |
| Small team burnout | High | Medium | Ruthless prioritization, consider contractors |
| Privacy violations | Critical | Low | Early legal review, clear retention policies |
| Log ingestion bottleneck | High | Medium | Use message queue, load test early, plan for Kafka |
| Integration complexity | Medium | High | Start simple (direct logs), build connectors incrementally |
| ML model drift | Medium | Medium | Retrain weekly, monitor anomaly score distribution |
| Chrome extension adoption | High | Medium | Gradual rollout, user training, executive sponsorship |

---

## 17. Dependencies

### 17.1 External Dependencies

- Legal team approval (data collection, retention)
- IT security team buy-in
- Executive sponsorship
- Budget approval ($500-700k)

### 17.2 Technical Dependencies

- Windows Event Log access
- Active Directory integration
- Chrome browser deployment (managed)
- AWS account or on-premises infrastructure

---

## 18. Assumptions

1. Organization has <500 users
2. Windows-based infrastructure (primary)
3. Chrome is the standard browser
4. IT team available for weekly tuning
5. Legal approval for employee monitoring
6. Budget available for 3-person team + infrastructure

---

## 19. Out of Scope

The following are explicitly out of scope for v1.0:

- Mobile device monitoring (iOS, Android)
- Email content analysis
- Network packet inspection
- Endpoint DLP (data loss prevention)
- Integration with SIEM (Splunk, QRadar) - planned for v2.0
- Multi-tenancy support
- On-premises deployment (cloud-only for v1.0)

---

## 20. Glossary

- **Sigma Rules:** Open-source detection rule format (YAML-based)
- **Behavioral Baselining:** ML technique to learn "normal" user behavior
- **Risk Score:** 0-100 metric indicating user threat level
- **Neumorphic Design:** UI style combining flat design with subtle shadows
- **Lateral Movement:** Attacker moving between systems after initial compromise
- **Privilege Escalation:** Gaining higher access than authorized
- **Data Exfiltration:** Unauthorized data transfer outside organization

---

## 21. Approval

| Role | Name | Signature | Date |
|------|------|-----------|------|
| Project Sponsor | [Name] | _________ | _____ |
| Technical Lead | [Name] | _________ | _____ |
| Security Lead | [Name] | _________ | _____ |
| Legal Counsel | [Name] | _________ | _____ |

---

**Document Control:**
- Version: 1.0.0
- Last Updated: March 17, 2026
- Next Review: April 17, 2026
- Owner: [Technical Lead Name]
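
As an illustration of the weighted formula in Section 9.2, the following is a minimal Python sketch, not the project's implementation. The function name `compute_risk_score` and the `WEIGHTS` table are hypothetical, and the sketch assumes every input has already been scaled to the range 0-1, which this document does not specify for the non-ML components.

```python
# Weights taken from Section 9.2; they sum to 1.0.
WEIGHTS = {
    "sigma_rule_score": 0.40,      # Sigma rule matches
    "ml_anomaly_score": 0.30,      # ML anomaly detection
    "historical_incidents": 0.20,  # Past incidents
    "privilege_level": 0.10,       # User privilege level
}


def compute_risk_score(
    sigma_rule_score: float,
    ml_anomaly_score: float,
    historical_incidents: float,
    privilege_level: float,
) -> int:
    """Combine four component scores into a 0-100 risk score.

    Assumption: each component is pre-scaled to [0, 1].
    """
    components = {
        "sigma_rule_score": sigma_rule_score,
        "ml_anomaly_score": ml_anomaly_score,
        "historical_incidents": historical_incidents,
        "privilege_level": privilege_level,
    }
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    # Weighted sum in [0, 1], then normalized to the 0-100 scale.
    weighted = sum(WEIGHTS[name] * value for name, value in components.items())
    return round(weighted * 100)


if __name__ == "__main__":
    # Full Sigma match, moderate ML anomaly, no history, low privilege.
    print(compute_risk_score(1.0, 0.5, 0.0, 0.0))  # prints 55
```

Rejecting out-of-range inputs rather than clamping them is a design choice in this sketch: it surfaces a mis-scaled component early instead of silently distorting the score.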
