# Intelligent Research Assistant - Technical Documentation
## **System Architecture Overview**
The Intelligent Research Assistant is a comprehensive AI-powered research platform built with a modular, scalable architecture. It combines document processing, vector search, multi-agent orchestration, fine-tuning capabilities, RLHF (Reinforcement Learning from Human Feedback), and enterprise-grade security into a unified system.
```
┌─────────────────────────────────────────────────────────────────┐
│                 INTELLIGENT RESEARCH ASSISTANT                  │
├─────────────────────────────────────────────────────────────────┤
│  CORE COMPONENTS                                                │
│  ├── Document Processing Pipeline                               │
│  ├── Vector Database (Qdrant)                                   │
│  ├── Multi-Agent Orchestration                                  │
│  ├── Fine-Tuning Framework                                      │
│  ├── RLHF Pipeline                                              │
│  └── Security & Compliance                                      │
├─────────────────────────────────────────────────────────────────┤
│  WEB INTERFACE (Flask)                                          │
│  ├── REST API Endpoints                                         │
│  ├── File Upload & Processing                                   │
│  ├── Chat Interface                                             │
│  └── Admin Dashboard                                            │
├─────────────────────────────────────────────────────────────────┤
│  AI/ML STACK                                                    │
│  ├── Language Models (OpenAI, Ollama, Hugging Face)             │
│  ├── Embeddings (Sentence-Transformers)                         │
│  ├── Fine-Tuning (LoRA/QLoRA)                                   │
│  └── RLHF (PPO, Reward Models)                                  │
├─────────────────────────────────────────────────────────────────┤
│  SECURITY & COMPLIANCE                                          │
│  ├── Role-Based Access Control (RBAC)                           │
│  ├── Secure Secrets Management                                  │
│  ├── PII Redaction & Privacy                                    │
│  ├── Rate Limiting & Abuse Detection                            │
│  └── Data Retention & GDPR Compliance                           │
└─────────────────────────────────────────────────────────────────┘
```
---
## **Complete File Structure & Purpose**
### **Root Level Files**
```
Intelligent-Research-Assistant-/
├── app.py                # Main Flask application entry point
├── main.py               # Alternative entry point
├── requirements.txt      # Python dependencies
├── README.md             # User-facing documentation
├── TECHNICAL_README.md   # This technical documentation
├── logging_config.py     # Logging configuration
├── .gitignore            # Git ignore rules
└── uploads/              # File upload directory
```
### **Core Source Code (`src/`)**
```
src/
├── __init__.py                   # Package initialization
├── pipeline/
│   └── pipeline.py               # Document processing pipeline
├── services/
│   ├── __init__.py
│   ├── chat_service.py           # Chat orchestration service
│   ├── search_service.py         # Vector search service
│   ├── document_service.py       # Document management
│   ├── embedding_service.py      # Embedding generation
│   ├── llm_service.py            # LLM integration service
│   ├── memory_service.py         # Conversation memory
│   └── metrics_service.py        # Metrics and monitoring
├── models/
│   ├── __init__.py
│   ├── chat.py                   # Chat data models (Pydantic)
│   └── search.py                 # Search data models (Pydantic)
├── agents/
│   ├── __init__.py
│   ├── base_agent.py             # Base agent class
│   ├── planner_agent.py          # Task planning agent
│   ├── research_agent.py         # Information retrieval agent
│   ├── reasoner_agent.py         # Analysis and reasoning agent
│   ├── executor_agent.py         # Action execution agent
│   └── agent_orchestrator.py     # Multi-agent coordination
├── finetuning/
│   ├── __init__.py
│   ├── gpu_config.py             # GPU detection and optimization
│   ├── dataset_preparation.py    # Dataset creation and formatting
│   ├── model_finetuning.py       # LoRA/QLoRA fine-tuning
│   ├── evaluation.py             # Model evaluation metrics
│   └── model_registry.py         # Model versioning and tracking
├── rlhf/
│   ├── __init__.py
│   ├── feedback_collection.py    # Human feedback collection
│   ├── reward_model.py           # Reward model training
│   ├── policy_optimization.py    # PPO policy optimization
│   ├── evaluation.py             # RLHF evaluation metrics
│   └── integration.py            # Production integration
└── security/
    ├── __init__.py
    ├── rbac.py                   # Role-Based Access Control
    ├── secrets.py                # Secure Secrets Management
    ├── pii_redaction.py          # PII Detection & Redaction
    ├── rate_limiting.py          # Rate Limiting & Abuse Detection
    └── data_retention.py         # Data Retention & Opt-out
```
---
## **Complete Tech Stack**
### **Backend Framework**
- **Flask 3.0+**: Main web framework for API endpoints and web interface
- **Loguru**: Advanced logging with structured output
- **Pydantic 2.0+**: Data validation and serialization
### **AI/ML Stack**
- **Transformers (Hugging Face)**: Pre-trained language models
- **Sentence-Transformers**: Text embedding generation
- **PyTorch**: Deep learning framework
- **PEFT**: Parameter-Efficient Fine-Tuning (LoRA/QLoRA)
- **Accelerate**: Distributed training and optimization
- **BitsAndBytes**: Quantization for memory efficiency
- **TRL**: Transformer Reinforcement Learning library (PPO)
### **Vector Database**
- **Qdrant**: High-performance vector database
- **Qdrant Client**: Python client for database operations
### **Document Processing**
- **PyMuPDF (fitz)**: PDF text extraction and processing
- **Tiktoken**: Tokenization for language models
### **Data Management**
- **Datasets (Hugging Face)**: Dataset handling and processing
- **Pandas**: Data manipulation and analysis
- **NumPy**: Numerical computing
### **Model Evaluation**
- **Evaluate**: Hugging Face evaluation metrics
- **Scikit-learn**: Machine learning utilities
- **ROUGE Score**: Text generation evaluation
- **BERTScore**: Semantic similarity evaluation
- **NLTK**: Natural language processing
### **Experiment Tracking**
- **MLflow**: Model lifecycle management
- **Weights & Biases (W&B)**: Experiment tracking and visualization
### **Security & Compliance**
- **PyJWT**: JWT authentication and authorization
- **Redis**: Rate limiting, caching, and session management
- **Boto3**: AWS KMS integration for secrets management
- **Hvac**: HashiCorp Vault integration
- **Cryptography**: Cryptographic operations and encryption
### **Development & Testing**
- **Flasgger**: Swagger/OpenAPI documentation
- **Requests**: HTTP client for API calls
---
## **Execution Flow & How It Works**
### **1. Application Startup (`app.py`)**
```
# Initialize core components
- Load logging configuration
- Initialize Qdrant vector database
- Create document collection
- Initialize multi-agent orchestrator
- Initialize security components (RBAC, rate limiting, etc.)
- Start Flask web server
```
### **2. Document Upload & Processing Flow**
```
User Upload → Security Check → Flask Route → Pipeline Processing   → Vector Storage
    ↓             ↓                ↓             ↓                       ↓
PDF File      Rate Limiting    /upload API   Text Extraction         Qdrant Storage
    ↓             ↓                ↓             ↓                       ↓
Validation    PII Redaction    File Save     Chunking + Embeddings   Metadata
```
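The validation step at the start of this flow can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: `validate_pdf_upload` and the 20 MB limit are assumptions made for the example.

```python
from pathlib import Path

MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # assumed limit, not taken from the project
PDF_MAGIC = b"%PDF-"                 # header that a well-formed PDF starts with


def validate_pdf_upload(filename: str, data: bytes) -> list[str]:
    """Return a list of problems; an empty list means the upload looks acceptable."""
    problems = []
    if Path(filename).suffix.lower() != ".pdf":
        problems.append("extension is not .pdf")
    if not data.startswith(PDF_MAGIC):
        problems.append("missing %PDF- header")
    if len(data) > MAX_UPLOAD_BYTES:
        problems.append("file exceeds size limit")
    return problems
```

Checking the file header as well as the extension catches renamed files before they reach text extraction.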
### **3. Chat & RAG Pipeline Flow**
```
User Query → Security Check → Chat Service → Search Service → LLM Service → Response
    ↓            ↓                ↓              ↓                ↓             ↓
/chat API    RBAC Check       Query Parse    Vector Search    Context +     Formatted
    ↓            ↓                ↓              ↓            Generation    Response
Validation   Rate Limit       Memory Add     Similarity       Prompt        Metadata
```
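The hand-off from vector search to the LLM ("Context + Generation" above) might look like the sketch below. The `Chunk` fields and the prompt wording are assumptions for illustration; the actual prompt used by `llm_service.py` is not shown in this document.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str
    page: int
    text: str
    score: float  # similarity score returned by the vector search


def build_prompt(query: str, chunks: list[Chunk], max_chunks: int = 3) -> str:
    """Number the best-scoring chunks so the model can cite them as [1], [2], ..."""
    top = sorted(chunks, key=lambda c: c.score, reverse=True)[:max_chunks]
    context = "\n".join(
        f"[{i}] ({c.doc_id}, p.{c.page}) {c.text}" for i, c in enumerate(top, start=1)
    )
    return (
        "Answer using only the context below and cite sources as [n].\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

Numbering the chunks in the prompt is one simple way to support source attribution: the citation markers in the answer map back to document ID and page.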
### **4. Multi-Agent Orchestration Flow**
```
User Request → Security Check → Agent Orchestrator → Planner   → Research → Reasoner   → Executor
    ↓              ↓                ↓                    ↓           ↓          ↓            ↓
/chat API      Authentication   Task Decomposition   Tools       Vector     Analysis     Actions
    ↓              ↓                ↓                    ↓       Search     Logic        Execute
Validation     Authorization    Workflow Creation    Selection   Results    Validation   Logging
```
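The Planner → Research → Reasoner → Executor hand-off can be sketched as a chain of functions over a shared state dictionary. All function names and state keys here are hypothetical; the real agents in `src/agents/` are classes whose interfaces are not documented in this section.

```python
from typing import Callable

# Each agent takes the shared state dict and returns it with its own additions.
Agent = Callable[[dict], dict]


def planner(state: dict) -> dict:
    return {**state, "plan": ["search", "analyze", "respond"]}


def researcher(state: dict) -> dict:
    return {**state, "evidence": [f"result for: {state['request']}"]}


def reasoner(state: dict) -> dict:
    return {**state, "answer": f"{len(state['evidence'])} source(s) analyzed"}


def executor(state: dict) -> dict:
    return {**state, "done": True}


def run_workflow(request: str, agents: list[Agent]) -> dict:
    """Pass the state through each agent in order and record which ones ran."""
    state = {"request": request, "trace": []}
    for agent in agents:
        state = agent(state)
        state["trace"].append(agent.__name__)
    return state


result = run_workflow("summarize paper", [planner, researcher, reasoner, executor])
```

The `trace` list stands in for the workflow history that the orchestrator exposes; a real orchestrator would also handle agent failures and retries.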
### **5. Fine-Tuning Pipeline Flow**
```
Documents  → Dataset Prep  → Model Loading → LoRA/QLoRA → Training      → Evaluation
    ↓            ↓               ↓               ↓            ↓               ↓
Raw Text     Alpaca Format   Base Model      Adapters     Training Loss   Metrics
    ↓            ↓               ↓               ↓            ↓               ↓
Extraction   Instruction     GPU Config      Training     Validation      Registry
```
### **6. RLHF Pipeline Flow**
```
Human Feedback → Reward Model → PPO Training → Policy Alignment → Evaluation
    ↓                ↓              ↓              ↓                  ↓
Collection       Preference     Policy Opt     KL Divergence      Metrics
    ↓                ↓              ↓              ↓                  ↓
CLI/Web          Pairwise       PPO Loss       Stability          Comparison
```
### **7. Security & Compliance Flow**
```
Request    → Rate Limiting → Authentication → Authorization    → PII Redaction → Processing
    ↓            ↓               ↓                ↓                  ↓               ↓
API Call     Request Count   JWT Verify       Permission Check   Data Masking    Business Logic
    ↓            ↓               ↓                ↓                  ↓               ↓
Validation   Abuse Check     User Context     Role Check         Log Redaction   Response
```
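The authorization step ("Permission Check" / "Role Check") reduces to a lookup from role to allowed operations. The sketch below uses the four default roles this document lists (admin, researcher, user, guest); the permission names and their assignment to roles are assumptions, and the real mapping lives in `src/security/rbac.py`.

```python
# Hypothetical role-to-permission table for illustration only.
ROLE_PERMISSIONS = {
    "admin": {"upload", "search", "chat", "delete", "admin"},
    "researcher": {"upload", "search", "chat"},
    "user": {"search", "chat"},
    "guest": {"search"},
}


def is_allowed(role: str, permission: str) -> bool:
    """Unknown roles get no permissions (deny by default)."""
    return permission in ROLE_PERMISSIONS.get(role, set())
```

Denying by default means a typo in a role name fails closed instead of granting access.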
---
## **Core Features & Capabilities**
### **Document Processing**
- **PDF Text Extraction**: Per-page extraction with metadata preservation
- **Smart Chunking**: Overlapping chunks with paragraph boundary detection
- **Edge Case Handling**: Empty page detection and error logging
- **Metadata Preservation**: Document ID, page numbers, timestamps
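
One way to combine overlapping chunks with paragraph boundaries is to pack whole paragraphs into each chunk and repeat the last paragraph at the start of the next. This is a sketch under assumed parameters (`max_chars`, `overlap` counted in paragraphs), not the logic in `pipeline.py`.

```python
def chunk_paragraphs(text: str, max_chars: int = 500, overlap: int = 1) -> list[str]:
    """Pack whole paragraphs into chunks; repeat the last `overlap` paragraphs in the next chunk."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    for para in paragraphs:
        candidate = "\n\n".join(current + [para])
        if current and len(candidate) > max_chars:
            # Close the current chunk and carry the overlap into the next one.
            chunks.append("\n\n".join(current))
            current = current[-overlap:] if overlap else []
        current.append(para)
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

A single paragraph longer than `max_chars` is kept whole here, so a production version would also need a fallback split on sentences or tokens.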
### **Vector Search & Retrieval**
- **Semantic Search**: Similarity-based document retrieval
- **Metadata Filtering**: Search by document, page, or custom filters
- **Batch Processing**: Efficient bulk operations
- **Collection Management**: Create, delete, and monitor collections
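
Semantic search with a metadata filter can be illustrated in plain Python with cosine similarity. In the actual system Qdrant performs this search; the in-memory `points` list and the `search` signature below are stand-ins for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, points, limit=5, doc_id=None):
    """points: list of (vector, payload) pairs; optionally restrict to one document."""
    hits = [
        (cosine(query_vec, vec), payload)
        for vec, payload in points
        if doc_id is None or payload.get("doc_id") == doc_id
    ]
    return sorted(hits, key=lambda h: h[0], reverse=True)[:limit]
```

Applying the filter before scoring avoids computing similarities for points that can never be returned.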
### **Chat & RAG System**
- **9-Step RAG Pipeline**: Complete retrieval-augmented generation
- **Multi-LLM Support**: OpenAI, Ollama, Hugging Face models
- **Conversation Memory**: Context preservation across turns
- **Source Attribution**: Automatic citation and reference tracking
### **Multi-Agent Orchestration**
- **Planner Agent**: Task decomposition and tool selection
- **Research Agent**: Information retrieval and API calls
- **Reasoner Agent**: Analysis, validation, and content generation
- **Executor Agent**: Side effects and external operations
- **Agent Orchestrator**: Workflow coordination and management
### **Fine-Tuning Framework**
- **GPU Optimization**: MPS, CUDA, CPU detection and configuration
- **LoRA/QLoRA**: Parameter-efficient fine-tuning
- **Dataset Preparation**: Alpaca/ShareGPT format conversion
- **Comprehensive Evaluation**: BLEU, ROUGE, BERTScore, perplexity
- **Model Registry**: MLflow and W&B integration
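
For the dataset preparation step, Alpaca-style records use the keys `instruction`, `input`, and `output`. The helper below is a sketch of that conversion; `to_alpaca` and the sample pair are illustrative and not taken from `dataset_preparation.py`.

```python
import json


def to_alpaca(question: str, answer: str, context: str = "") -> dict:
    """Map a Q&A pair to the instruction/input/output keys used by Alpaca-style datasets."""
    return {
        "instruction": question.strip(),
        "input": context.strip(),
        "output": answer.strip(),
    }


# Example data invented for the sketch; one JSON object per line (JSONL).
pairs = [("What is RAG? ", "Retrieval-augmented generation.", "")]
lines = [json.dumps(to_alpaca(q, a, c)) for q, a, c in pairs]
```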
### **RLHF Pipeline**
- **Human Feedback Collection**: CLI and web interfaces
- **Reward Model Training**: Pairwise preference learning
- **PPO Implementation**: Policy optimization with stability tricks
- **Production Integration**: A/B testing and live feedback
- **Evaluation Metrics**: Factuality, helpfulness, coherence
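
Pairwise preference learning for a reward model is commonly trained with the loss -log σ(r_chosen − r_rejected), which is small when the preferred response scores higher. The scalar version below is a sketch of that formula only; whether `reward_model.py` uses exactly this loss is an assumption.

```python
import math


def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """-log(sigmoid(r_chosen - r_rejected)), computed as softplus(-margin)."""
    margin = reward_chosen - reward_rejected
    # softplus(-margin) = log(1 + exp(-margin)); branch so exp() never overflows.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When both rewards are equal the loss is log 2, and it approaches zero as the chosen response's reward pulls ahead.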
### **Security & Compliance**
- **Role-Based Access Control (RBAC)**: Granular user permissions and role management
- **JWT Authentication**: Secure token-based authentication with expiration
- **Secure Secrets Management**: AWS KMS and HashiCorp Vault integration
- **PII Redaction**: Automatic detection and redaction of sensitive information
- **Rate Limiting**: Multi-window rate limiting with abuse detection
- **Input Validation**: Comprehensive data validation and sanitization
- **Security Headers**: CORS, XSS protection, content type options
- **File Upload Security**: Malicious file detection and validation
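
Pattern-based PII redaction can be sketched with regular expressions. Only two illustrative patterns are shown, and both the patterns and the `[LABEL]` replacement format are assumptions; they are not the project's own pattern set.

```python
import re

# Two illustrative patterns; the project describes 12+ predefined ones.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each match with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Regex-only detection misses free-form identifiers such as names, which is presumably where confidence-based detection comes in.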
### **Monitoring & Analytics**
- **Comprehensive Metrics**: Request tracking, response times, errors
- **Performance Monitoring**: Token usage, embedding generation
- **Session Analytics**: User behavior and interaction patterns
- **Error Tracking**: Detailed logging and debugging information
- **Security Monitoring**: Rate limiting, abuse detection, and access logs
---
## **API Endpoints & Usage**
### **Core Endpoints**
```
# Document Management
POST   /upload                   # Upload and process documents
GET    /documents/{doc_id}       # Get document details
DELETE /documents/{doc_id}       # Delete document

# Search & Retrieval
POST   /search                   # Vector similarity search
GET    /collection-stats         # Database statistics
GET    /collection-info          # Collection metadata

# Chat & RAG
POST   /chat                     # Main chat interface
GET    /model-info               # LLM configuration

# Multi-Agent System
GET    /agents                   # Agent status and metrics
POST   /agents/{type}/activate   # Activate specific agent
GET    /workflows                # Workflow history
GET    /capabilities             # Available agent capabilities

# Security & Compliance
POST   /auth/login               # User authentication
GET    /auth/profile             # User profile and permissions
POST   /auth/logout              # User logout
GET    /security/rate-limit      # Rate limit information
POST   /security/opt-out         # User opt-out requests
GET    /security/data-summary    # Data retention summary

# Admin & Monitoring
GET    /admin/health             # System health check
GET    /metrics                  # Performance metrics
GET    /admin/security           # Security status and alerts
```
### **Example API Usage**
```bash
# Upload document (with authentication)
# The form field and file name were garbled in the source; "file=@document.pdf" is a placeholder.
curl -X POST -F "file=@document.pdf" \
  -H "Authorization: Bearer <jwt_token>" \
  http://localhost:8008/upload

# Chat with RAG (with rate limiting)
curl -X POST http://localhost:8008/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <jwt_token>" \
  -d '{"query": "What is machine learning?", "context": "research"}'

# Search documents (with PII redaction)
curl -X POST http://localhost:8008/search \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <jwt_token>" \
  -d '{"query": "artificial intelligence", "limit": 5}'

# User authentication
curl -X POST http://localhost:8008/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "user", "password": "password"}'
```
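The same chat call can be made from Python. The sketch below builds the request with the standard library and leaves sending as a separate step; `build_chat_request` is a hypothetical helper, and the URL, body fields, and headers mirror the curl example above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8008"  # host and port taken from the curl examples


def build_chat_request(query: str, token: str, context: str = "research") -> urllib.request.Request:
    """Build (but do not send) the POST /chat request shown in the curl example."""
    body = json.dumps({"query": query, "context": context}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {token}"},
    )


req = build_chat_request("What is machine learning?", token="<jwt_token>")
# To send it against a running server: urllib.request.urlopen(req, timeout=30)
```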
---
## **Configuration & Setup**
### **Environment Variables**
```bash
# Database Configuration
QDRANT_HOST=localhost
QDRANT_PORT=6333
# LLM Configuration
OPENAI_API_KEY=your_openai_key
OLLAMA_BASE_URL=http://localhost:11434
# Model Configuration
EMBEDDING_MODEL=all-MiniLM-L6-v2
LLM_MODEL=microsoft/DialoGPT-small
# Fine-tuning Configuration
WANDB_API_KEY=your_wandb_key
MLFLOW_TRACKING_URI=your_mlflow_uri
# Security Configuration
JWT_SECRET=your_jwt_secret_key
REDIS_URL=redis://localhost:6379
AWS_KMS_KEY_ID=your_kms_key_id
VAULT_URL=http://localhost:8200
VAULT_TOKEN=your_vault_token
```
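A small loader can read these variables once at startup and fail fast when a required secret is missing. The `Settings` class and the choice to require only `JWT_SECRET` are assumptions for this sketch; the defaults mirror the values listed above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    qdrant_host: str
    qdrant_port: int
    embedding_model: str
    jwt_secret: str


def load_settings(env=os.environ) -> Settings:
    """Read settings from the environment; raise if the JWT secret is missing."""
    secret = env.get("JWT_SECRET")
    if not secret:
        raise RuntimeError("JWT_SECRET must be set")
    return Settings(
        qdrant_host=env.get("QDRANT_HOST", "localhost"),
        qdrant_port=int(env.get("QDRANT_PORT", "6333")),
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        jwt_secret=secret,
    )
```

Passing the mapping as a parameter keeps the loader easy to test without touching the real process environment.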
### **GPU Configuration**
```
# Automatic GPU Detection
- Apple Silicon MPS (Metal Performance Shaders)
- NVIDIA CUDA
- CPU Fallback
# Memory Optimization
- 4-bit quantization (QLoRA)
- Gradient checkpointing
- Mixed precision training
```
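Device detection with a CPU fallback can be sketched as follows. `pick_device` is a hypothetical helper, not the function in `gpu_config.py`; it relies only on `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and returns `"cpu"` when PyTorch is not installed.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu", preferring a GPU backend when PyTorch reports one."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed, so there is no GPU backend to query
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # attribute is absent in older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```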
### **Security Configuration**
```
# RBAC Configuration
- Default roles: admin, researcher, user, guest
- Granular permissions for all operations
- JWT token expiration and refresh
# Rate Limiting Configuration
- Per-minute, per-hour, per-day limits
- Configurable penalty durations
- Abuse detection thresholds
# PII Redaction Configuration
- 12+ predefined PII patterns
- Custom pattern support
- Confidence-based detection
```
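Multi-window rate limiting can be illustrated with an in-memory limiter that checks every window before admitting a request. The limits below are invented for the example, and the in-memory store is a stand-in: the stack lists Redis for rate limiting, which a production limiter would use so that counts are shared across instances.

```python
import time
from collections import defaultdict, deque

# Assumed limits for illustration: (window in seconds, max requests).
WINDOWS = [(60, 5), (3600, 100), (86400, 1000)]


class MultiWindowLimiter:
    def __init__(self, windows=WINDOWS):
        self.windows = windows
        self.history = defaultdict(deque)  # user id -> timestamps of admitted requests

    def allow(self, user: str, now=None) -> bool:
        """Admit the request only if every window is still under its limit."""
        now = time.time() if now is None else now
        stamps = self.history[user]
        longest = max(window for window, _ in self.windows)
        while stamps and now - stamps[0] >= longest:
            stamps.popleft()  # drop entries older than the longest window
        for window, limit in self.windows:
            if sum(1 for t in stamps if now - t < window) >= limit:
                return False
        stamps.append(now)
        return True
```

Rejected requests are not recorded here; abuse detection would likely count them separately to trigger penalties.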
---
## **Performance & Scalability**
### **Optimization Features**
- **Batch Processing**: Efficient bulk operations
- **Memory Management**: GPU memory optimization
- **Caching**: Model and embedding caching
- **Async Operations**: Non-blocking API calls
- **Connection Pooling**: Database connection management
- **Rate Limiting**: Request throttling and abuse prevention
- **Security Overhead**: Minimal performance impact from security features
### **Scalability Considerations**
- **Horizontal Scaling**: Stateless API design
- **Load Balancing**: Multiple instance support
- **Database Sharding**: Qdrant cluster support
- **Model Serving**: Separate inference servers
- **Queue Management**: Background task processing
- **Security Scaling**: Distributed rate limiting and session management
---
## **Testing & Quality Assurance**
### **Test Coverage**
- **Unit Tests**: Individual component testing
- **Integration Tests**: End-to-end workflow testing
- **Performance Tests**: Load and stress testing
- **Model Tests**: Fine-tuning and RLHF validation
- **Security Tests**: Authentication, authorization, and penetration testing
### **Quality Metrics**
- **Code Coverage**: Comprehensive test coverage
- **Performance Benchmarks**: Response time measurements
- **Accuracy Metrics**: Model evaluation scores
- **Error Rates**: System reliability monitoring
- **Security Metrics**: Rate limiting effectiveness, abuse detection accuracy
---
## **Security & Privacy**
### **Security Features**
- **Role-Based Access Control (RBAC)**: Granular user permissions and role management
- **JWT Authentication**: Secure token-based authentication with expiration
- **Secure Secrets Management**: AWS KMS and HashiCorp Vault integration
- **PII Redaction**: Automatic detection and redaction of sensitive information
- **Rate Limiting**: Multi-window rate limiting with abuse detection
- **Input Validation**: Comprehensive data validation and sanitization
- **Security Headers**: CORS, XSS protection, content type options
- **File Upload Security**: Malicious file detection and validation
### **Privacy & Compliance**
- **GDPR Compliance**: Complete data retention policy framework
- **Data Anonymization**: User data protection and anonymization
- **Audit Logging**: Comprehensive access and usage tracking
- **Data Retention**: Configurable data lifecycle management
- **User Opt-out**: Complete opt-out mechanisms for data collection
- **Right to be Forgotten**: Data deletion and user rights management
- **Privacy by Design**: Built-in privacy protection throughout the system
### **Security Monitoring**
- **Real-time Monitoring**: Live security event monitoring
- **Abuse Detection**: Automated detection of malicious activities
- **Rate Limit Monitoring**: Request pattern analysis
- **Access Logging**: Detailed authentication and authorization logs
- **Security Alerts**: Automated alerting for security incidents
---
## **Deployment & Production**
### **Deployment Options**
- **Docker**: Containerized deployment with security hardening
- **Kubernetes**: Orchestrated scaling with security policies
- **Cloud Platforms**: AWS, GCP, Azure support with managed security
- **On-Premise**: Self-hosted deployment with enterprise security
### **Production Checklist**
- [ ] Environment configuration and secrets management
- [ ] Database setup and migration with encryption
- [ ] SSL/TLS certificate configuration
- [ ] Security hardening and firewall configuration
- [ ] Monitoring and alerting setup
- [ ] Backup and recovery procedures
- [ ] Performance optimization and load testing
- [ ] Security audit and penetration testing
- [ ] Compliance validation (GDPR, SOC2, etc.)
- [ ] Incident response plan and procedures
### **Security Hardening**
- [ ] JWT secret rotation and management
- [ ] Rate limiting configuration and tuning
- [ ] PII redaction pattern validation
- [ ] RBAC role and permission audit
- [ ] Secrets management integration
- [ ] Security monitoring and alerting
- [ ] Regular security updates and patches
---
## **Development Guidelines**
### **Code Standards**
- **PEP 8**: Python style guide compliance
- **Type Hints**: Comprehensive type annotations
- **Documentation**: Inline and API documentation
- **Error Handling**: Robust exception management
- **Logging**: Structured logging throughout
- **Security**: Security-first development practices
### **Security Guidelines**
- **Input Validation**: Always validate and sanitize user input
- **Authentication**: Implement proper authentication for all endpoints
- **Authorization**: Check permissions before performing operations
- **Data Protection**: Encrypt sensitive data and use secure storage
- **Logging**: Log security events without exposing sensitive information
- **Testing**: Include security testing in development workflow
### **Contributing**
- **Git Workflow**: Feature branch development with security review
- **Code Review**: Peer review process with security focus
- **Testing**: Automated test execution including security tests
- **Documentation**: Updated technical docs with security considerations
- **Performance**: Benchmark validation and security impact assessment
---
## **Future Roadmap**
### **Planned Enhancements**
- **Advanced RLHF**: More sophisticated reward modeling
- **Multi-Modal Support**: Image and video processing
- **Real-time Collaboration**: Multi-user editing with security
- **Advanced Analytics**: Business intelligence features
- **Mobile Support**: Native mobile applications with secure authentication
### **Security Enhancements**
- **Zero-Trust Architecture**: Advanced security model implementation
- **Advanced Threat Detection**: Machine learning-based threat detection
- **Compliance Automation**: Automated compliance checking and reporting
- **Privacy-Preserving ML**: Federated learning and differential privacy
- **Blockchain Integration**: Decentralized identity and audit trails
### **Research Integration**
- **Academic Paper Processing**: Specialized research tools
- **Citation Management**: Automated reference handling
- **Collaborative Research**: Team-based workflows with security
- **Publication Support**: Manuscript preparation tools
---
## **Support & Resources**
### **Documentation**
- **User Guide**: `README.md`
- **Technical Docs**: `TECHNICAL_README.md`
- **API Reference**: Swagger documentation
- **Code Examples**: Sample implementations
- **Security Guide**: Security best practices and configuration
### **Community**
- **GitHub Issues**: Bug reports and feature requests
- **Discussions**: Community forums
- **Contributions**: Open source development
- **Feedback**: User experience improvements
- **Security**: Security vulnerability reporting
---
**The Intelligent Research Assistant is an AI research platform that combines document processing, intelligent search, multi-agent orchestration, fine-tuning, RLHF, and enterprise-grade security, designed for production deployment and continuous improvement with compliance and privacy protection.**