# Intelligent Document Query Platform — GitHub-ready Low-Level Design (LLD)
> This repository contains a ready-to-use Low-Level Design (LLD) for the Intelligent Document Query Platform. It is organized so you can drop each file into a GitHub repo and iterate from there.
---
## Repository layout (suggested)
```
intelligent-doc-query-lld/
├─ README.md                    # High-level summary + how to use this LLD
├─ ARCHITECTURE.md              # Detailed architecture diagrams (mermaid + explanation)
├─ docs/
│  ├─ sequence_diagrams.md      # Mermaid diagrams for sequences
│  └─ flow_notes.md             # Additional flow explanations
├─ terraform/
│  ├─ main.tf
│  ├─ variables.tf
│  ├─ outputs.tf
│  └─ modules/
├─ backend/
│  ├─ lambda_handlers/
│  │  ├─ presign_upload.py
│  │  ├─ document_processor.py
│  │  └─ ask_question.py
│  ├─ requirements.txt
│  └─ graphql/
│     └─ schema.graphql
├─ frontend/
│  ├─ README_FRONTEND.md
│  ├─ src/
│  │  ├─ App.tsx
│  │  ├─ components/
│  │  └─ graphql/
│  │     └─ queries.ts
│  └─ package.json
├─ db/
│  ├─ schema.sql
│  └─ migrations/
├─ .github/
│  └─ workflows/
│     └─ deploy.yml
└─ LLD.md                       # This low-level design in markdown (detailed)
```
---
# LLD
## 1. Overview
This repo contains the Low-Level Design (LLD) for the Intelligent Document Query Platform. The design is cloud-native, serverless, and event-driven. The main components are:
* React+TypeScript SPA frontend
* Serverless Python backend (AWS Lambda) behind API Gateway (GraphQL)
* PostgreSQL (RDS) with `pgvector` for vector similarity search
* S3 for document storage
* Terraform for IaC
* GitHub Actions for CI/CD
---
## 2. Architecture (ARCHITECTURE.md)
Include this mermaid diagram in `ARCHITECTURE.md` or your README to render visually on GitHub:
```mermaid
flowchart LR
    subgraph User
        U[User browser]
    end
    subgraph CDN
        CF[CloudFront] --> S3Static["S3 (React static files)"]
    end
    U --> CF
    U --> APIGW["API Gateway (GraphQL)"]
    APIGW --> LAMBDA[Lambda functions]
    LAMBDA -->|reads/writes| RDS[("Postgres + pgvector")]
    LAMBDA -->|reads/writes| S3[Documents bucket]
    LAMBDA --> LLM[LLM Provider API]
    S3 -->|Event| ProcessingLambda[Processing Lambda]
    ProcessingLambda --> RDS
```
Add a short explanation of each box and security considerations (VPC, IAM roles, KMS for secrets).
---
## 3. Database Schema (`db/schema.sql`)
```sql
-- PostgreSQL schema for documents and chunks
CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE EXTENSION IF NOT EXISTS vector; -- pgvector

CREATE TABLE IF NOT EXISTS documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id VARCHAR(255),
    file_name VARCHAR(1024),
    s3_key VARCHAR(2048) UNIQUE,
    status VARCHAR(32) CHECK (status IN ('UPLOADED','PROCESSING','READY','ERROR')) DEFAULT 'UPLOADED',
    created_at TIMESTAMP WITH TIME ZONE DEFAULT now(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT now()
);

CREATE TABLE IF NOT EXISTS document_chunks (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    document_id UUID REFERENCES documents(id) ON DELETE CASCADE,
    chunk_text TEXT,
    embedding vector(384),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT now()
);

-- Example index for ivfflat (tune lists depending on dataset)
-- CREATE INDEX document_chunks_embedding_idx ON document_chunks USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

-- Optional table for sessions / metadata
CREATE TABLE IF NOT EXISTS sessions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id VARCHAR(255),
    metadata JSONB,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT now()
);
```
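Note: the `vector(384)` dimension must match the output dimension of the embedding model you use (for example 384, 512, or 1536). Vectors of a different length cannot be stored in this column, so switching models later means altering the column and re-embedding existing chunks.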
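
Since the column dimension and the embedding model have to agree, it can help to check embeddings before they reach the database. The following is a minimal sketch, assuming embeddings arrive as plain Python sequences of numbers; `EMBEDDING_DIM` and `to_pgvector_literal` are hypothetical names, not part of the original design. It renders a vector in pgvector's text form (`[x1,x2,...]`), which can be passed as a query parameter and cast with `::vector`.

```python
from typing import Sequence

# Assumption: must equal the N in the schema's vector(N) column.
EMBEDDING_DIM = 384


def to_pgvector_literal(embedding: Sequence[float], dim: int = EMBEDDING_DIM) -> str:
    """Validate the length of an embedding and render it as '[x1,x2,...]'."""
    if len(embedding) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(embedding)}")
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"
```

Failing early in the Lambda gives a clearer error than a rejected insert, and keeps a model change from silently producing rows that cannot be compared.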
---
## 4. GraphQL schema (`backend/graphql/schema.graphql`)
```graphql
type PresignedUrlResponse {
  uploadUrl: String!
  s3Key: String!
}

type Document {
  id: ID!
  fileName: String!
  status: String!
}

type Answer {
  text: String!
  sources: [String!]
}

type Mutation {
  generatePresignedUploadUrl(fileName: String!, fileType: String!): PresignedUrlResponse!
  askQuestion(documentId: ID!, question: String!): Answer!
}

type Query {
  getDocumentStatus(documentId: ID!): Document
}

# Subscription (note: serverless subscriptions may use Websockets or AppSync)
type Subscription {
  answerStream(documentId: ID!, question: String!): String
}
```
---
## 5. Backend skeleton (Python Lambda handlers)
**`backend/lambda_handlers/presign_upload.py`**
```python
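import json
import os

import boto3

s3 = boto3.client('s3')
BUCKET = os.environ.get('DOC_BUCKET')


def handler(event, context):
    # API Gateway can pass body=None, so fall back to an empty JSON object
    body = json.loads(event.get('body') or '{}')
    file_name = body.get('fileName')
    if not file_name:
        return {
            'statusCode': 400,
            'body': json.dumps({'error': 'fileName is required'}),
            'headers': {'Content-Type': 'application/json'},
        }
    file_type = body.get('fileType', 'application/octet-stream')
    s3_key = f"uploads/{file_name}"

    # The client must send the same Content-Type header when it PUTs the file
    presigned = s3.generate_presigned_url(
        'put_object',
        Params={'Bucket': BUCKET, 'Key': s3_key, 'ContentType': file_type},
        ExpiresIn=900,  # URL stays valid for 15 minutes
    )

    response = {'uploadUrl': presigned, 's3Key': s3_key}
    return {
        'statusCode': 200,
        'body': json.dumps(response),
        'headers': {'Content-Type': 'application/json'},
    }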
```
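
The handler above writes every upload to `uploads/{file_name}`, so two uploads with the same file name map to the same key, while `documents.s3_key` is declared `UNIQUE`. One possible way to avoid collisions is sketched below; `build_s3_key` and the `uploads/<user>/<uuid>/<name>` layout are assumptions for illustration, and `user_id` is assumed to come from an authenticated request context rather than from user input.

```python
import re
import uuid


def build_s3_key(user_id: str, file_name: str) -> str:
    """Return a collision-resistant key like 'uploads/<user>/<uuid>/<safe-name>'."""
    # Keep only the last path component, then replace characters outside a safe set
    base_name = file_name.replace("\\", "/").split("/")[-1]
    safe_name = re.sub(r"[^A-Za-z0-9._-]", "_", base_name) or "file"
    return f"uploads/{user_id}/{uuid.uuid4()}/{safe_name}"
```

The random UUID segment keeps keys unique per upload, and the original file name can still be stored unchanged in `documents.file_name`.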
**`backend/lambda_handlers/document_processor.py`**
```python
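# This Lambda is triggered by S3 put events
import os
from urllib.parse import unquote_plus

import boto3
import psycopg2  # needed once step 6 is implemented

s3 = boto3.client('s3')


def handler(event, context):
    for record in event.get('Records', []):
        # 1. Get the bucket and S3 key (keys arrive URL-encoded in S3 events)
        bucket = record['s3']['bucket']['name']
        s3_key = unquote_plus(record['s3']['object']['key'])
        # 2. Download file to Lambda's writable /tmp directory
        local_path = os.path.join('/tmp', os.path.basename(s3_key))
        s3.download_file(bucket, s3_key, local_path)
        # The remaining steps are still pseudocode:
        # 3. Extract text (use textract, pdfminer, etc.)
        # 4. Chunk text (e.g., 500 tokens or 1000 chars)
        # 5. Create embeddings (sentence-transformers or remote provider)
        # 6. Insert into document_chunks table
        # 7. Update documents.status = 'READY'
    return {'status': 'ok'}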
```
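
Step 4 of the processor (chunking) can be sketched as a character-based sliding window. This is a minimal illustration using the 1000-character size mentioned in the comments; the `overlap` parameter and the `chunk_text` name are assumptions added here so that sentences cut at a boundary also appear at the start of the next chunk.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into windows of at most chunk_size chars that share `overlap` chars."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size].strip()
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

A token-based splitter would track the embedding model's input limit more closely; the character version is only the simplest starting point.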
**`backend/lambda_handlers/ask_question.py`**
```python
# Handles the askQuestion GraphQL mutation
# 1. Generate embedding for the question
# 2. Query pgvector top-k similar chunks
# 3. Build prompt + call LLM (streaming if supported)
# 4. Return Answer object (and optionally push stream messages)
```
Add `requirements.txt` with libraries like `boto3`, `psycopg2-binary`, `requests`, `sentence-transformers` (if used), and any GraphQL handler library you prefer.
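
Steps 2 and 3 of `ask_question.py` (top-k retrieval and prompt construction) could look roughly like the sketch below. It assumes the `document_chunks` table from the schema above and cosine distance, matching the `vector_cosine_ops` index; `TOP_K_SQL`, `build_top_k_params`, `build_prompt`, the default `k`, and the prompt wording are all hypothetical choices, not part of the original design.

```python
from typing import List, Sequence, Tuple

# `<=>` is pgvector's cosine-distance operator (smaller means more similar)
TOP_K_SQL = """
SELECT chunk_text, embedding <=> %s::vector AS distance
FROM document_chunks
WHERE document_id = %s
ORDER BY distance
LIMIT %s
"""


def build_top_k_params(
    question_embedding: Sequence[float], document_id: str, k: int = 5
) -> Tuple[str, str, int]:
    """Return the parameters for TOP_K_SQL, with the embedding in pgvector text form."""
    literal = "[" + ",".join(str(float(x)) for x in question_embedding) + "]"
    return (literal, document_id, k)


def build_prompt(question: str, chunks: List[str]) -> str:
    """Number the retrieved chunks and place them before the question."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```

With `psycopg2`, the query would be run as `cur.execute(TOP_K_SQL, build_top_k_params(embedding, document_id))`, and the returned `chunk_text` values passed to `build_prompt`. Numbering the chunks gives the LLM something to cite, which could feed the `sources` field of `Answer`.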
---
## 6. Frontend skeleton
**`frontend/src/App.tsx`** (TypeScript + React sketch)
```tsx
import React from 'react'
import { ApolloClient, InMemoryCache, ApolloProvider } from '@apollo/client'

const client = new ApolloClient({
  uri: process.env.REACT_APP_GRAPHQL_URL,
  cache: new InMemoryCache(),
})

export default function App() {
  return (
    <ApolloProvider client={client}>
      <div className="container mx-auto p-4">
        <h1 className="text-2xl">Intelligent Document Query</h1>
        {/* Upload flow and Ask widget components go here */}
      </div>
    </ApolloProvider>
  )
}
```
Provide `frontend/README_FRONTEND.md` with instructions to run (`npm install`, `npm start`) and environment variables like `REACT_APP_GRAPHQL_URL`.
---
## 7. Sequence diagrams (docs/sequence_diagrams.md)
Add the two sequence diagrams from the LLD in mermaid format so GitHub can render them. The upload sequence is shown here; the ask-question sequence is in the appendix.
```mermaid
sequenceDiagram
participant User
participant Frontend
participant API
participant S3
participant ProcessingLambda
participant RDS
User->>Frontend: Selects file
Frontend->>API: generatePresignedUploadUrl()
API->>S3: generate presigned URL
Frontend->>S3: upload using URL
S3->>ProcessingLambda: S3 event
ProcessingLambda->>RDS: store chunks and embeddings
```
---
## 8. Terraform (terraform/main.tf) — minimal example
```hcl
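provider "aws" {
  region = var.region
}

# Buckets are private by default; the inline `acl` argument is deprecated in AWS provider v4+
resource "aws_s3_bucket" "static_site" {
  bucket = var.static_bucket_name
}

# Trust policy that lets the Lambda service assume the execution role
data "aws_iam_policy_document" "lambda_assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }
  }
}
resource "aws_iam_role" "lambda_exec" {
  name               = "lambda_exec_role"
  assume_role_policy = data.aws_iam_policy_document.lambda_assume.json
}
# More resources: api gateway, lambda functions, rds, cloudfront, vpc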
```
Add `variables.tf` and `outputs.tf` and split complex resources into `modules/` for maintainability.
---
## 9. GitHub Actions (`.github/workflows/deploy.yml`)
```yaml
name: CI/CD
on:
  push:
    branches: [ main ]
jobs:
  build-and-test-backend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          pip install -r backend/requirements.txt
      - name: Run backend tests
        # `|| true` keeps the job green while no tests exist; remove it once tests are added
        run: |
          cd backend && pytest || true
  build-and-test-frontend:
    runs-on: ubuntu-latest
    needs: build-and-test-backend
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Install & Test
        run: |
          cd frontend
          npm ci
          npm run build --if-present
  deploy-infrastructure:
    runs-on: ubuntu-latest
    needs: build-and-test-frontend
    steps:
      - uses: actions/checkout@v4
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
      - name: Terraform init & apply
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          cd terraform
          terraform init
          terraform apply -auto-approve
  deploy-application:
    runs-on: ubuntu-latest
    needs: deploy-infrastructure
    steps:
      - uses: actions/checkout@v4
      # NOTE: each job starts on a fresh runner, so frontend/build must be rebuilt here
      # or passed from the build job as an artifact before this sync can work
      - name: Deploy frontend to S3
        run: |
          aws s3 sync frontend/build s3://${{ secrets.STATIC_BUCKET_NAME }} --delete
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      - name: Deploy Lambdas
        run: |
          echo "Package and deploy lambda functions (use terraform or aws cli / sam)"
```
Adjust tests and packaging commands to match your project specifics.
---
## 10. Security & Operations Notes
* Use IAM least privilege for Lambda & processing roles.
* Store secrets in AWS Secrets Manager / Parameter Store and fetch at runtime.
* Use CloudWatch Logs + structured logging for observability.
* Place RDS in a VPC behind restrictive security groups. Ensure the Lambda functions have the required network access (VPC-attached functions, optionally connecting through RDS Proxy for connection pooling).
* Use KMS for encrypting S3 and RDS snapshots as needed.
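
The structured-logging note above can be illustrated with the standard library alone. This is a minimal sketch, assuming one JSON object per log line written to stdout (which Lambda forwards to CloudWatch Logs); `JsonFormatter`, `get_logger`, and the `context` field name are hypothetical choices for illustration.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via `extra={"context": {...}}` are merged into the line
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, default=str)


def get_logger(name: str) -> logging.Logger:
    """Return a logger that emits JSON lines to stdout."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid duplicate handlers on warm invocations
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A handler could then call `get_logger("document_processor").info("chunks stored", extra={"context": {"document_id": doc_id}})`, which makes fields such as `document_id` filterable in log queries.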
---
## 11. Implementation checklist (tasks to convert LLD -> code)
1. Create GitHub repo and push this layout.
2. Implement Terraform modules for core infra.
3. Implement Lambda functions with packaging (SAM or Terraform zip deploy).
4. Configure API Gateway (GraphQL) and subscriptions (websocket or AppSync).
5. Implement frontend upload + ask flows with GraphQL calls.
6. Test full pipeline with a small PDF and an LLM provider.
---
## 12. CHANGELOG / CONTRIBUTING
Add a CONTRIBUTING.md and CODE_OF_CONDUCT.md if you expect external collaborators.
---
### Appendix: Quick reference snippets
**SQL: ivfflat index creation**
```sql
-- Requires pgvector >= x.x
CREATE INDEX IF NOT EXISTS idx_doc_chunks_embedding ON document_chunks USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
```
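
The `lists = 100` value is only a placeholder. The pgvector documentation suggests starting from `rows / 1000` for up to 1M rows and `sqrt(rows)` above that, and building the index after the table already holds data. A small helper along those lines (the name `suggest_ivfflat_lists` is hypothetical) might be:

```python
import math


def suggest_ivfflat_lists(row_count: int) -> int:
    """Starting point for `lists`: rows/1000 up to 1M rows, sqrt(rows) above."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return int(math.sqrt(row_count))
```

Treat the result as a first guess and tune it against recall and latency on your own data.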
**Mermaid: Ask question sequence**
```mermaid
sequenceDiagram
participant User
participant Frontend
participant API
participant RDS
participant LLM
User->>Frontend: Ask question
Frontend->>API: askQuestion()
API->>RDS: find top-k chunks (pgvector)
RDS-->>API: returns chunks
API->>LLM: send prompt
LLM-->>API: stream response
API-->>Frontend: forward stream
```
---
# End of LLD