# 🔥 FIRE — Freedom Intelligent Routing Engine
### Presentation Pitch
---
## 1. The Problem
Freedom Finance's support desk receives thousands of inbound tickets during off-hours. No human dispatcher is online, so tickets pile up unrouted while SLA timers run. When the morning shift arrives, it faces a cold, unsorted queue.
Three compounding challenges:
| Challenge | Pain |
|---|---|
| **Mixed languages** | RU / KZ / ENG tickets in the same queue |
| **Unequal urgency** | Spam sits next to suspected fraud |
| **Rule-laden routing** | VIP tiers, language skills, seniority, geography, load balance — impossible to keep consistent manually |
**FIRE automates all of it**, end-to-end, with a target SLA of **< 10 seconds per ticket**.
---
## 2. What We Built
> **FIRE** is a full-stack, AI-powered ticket ingestion, enrichment, and smart-routing platform that reads raw CSV exports, classifies every ticket with an LLM, geo-codes client addresses, and assigns each ticket to the best-fit manager using a validated cascade of business rules.
---
## 3. Architecture at a Glance
```
CSV upload / real-time form
│
▼
┌─────────────────────────────────────────────────┐
│ FastAPI (Python 3.12) │
│ ┌─────────┐ ┌──────────┐ ┌───────────────┐ │
│ │ /ingest │ │ /report │ │ /analytics │ │
│ └────┬────┘ └────┬─────┘ └───────┬───────┘ │
│ │ │ │ │
│ enqueue real-time NL → SQL │
│ job (< 10s) → chart JSON │
└───────┼────────────┼────────────────┼───────────┘
│ │ │
▼ ▼ ▼
ARQ Worker Gemini 2.5 Gemini 2.5 Flash
(Redis) Flash + metrics catalogue
│
▼
┌────────────────────────────────────┐
│ PostgreSQL 16 │
│ ticket ─── assignment ─── manager │
│ + pgvector (RAG) │
└────────────────────────────────────┘
│
▼
Next.js 15 Admin UI + Telegram Bot (aiogram 3)
```
**Infrastructure stack**
| Layer | Technology |
|---|---|
| API | FastAPI + Uvicorn |
| Background jobs | ARQ (async Redis Queue) + uvloop |
| Database | PostgreSQL 16 w/ pgvector extension |
| Cache / broker | Redis 7 |
| AI | Gemini 2.5 Flash |
| Geocoding | Yandex Maps API v1 (primary) → Nominatim/OSM (fallback) |
| Frontend | Next.js 15, Recharts, Tailwind CSS |
| Telegram | aiogram 3, MemoryStorage FSM |
| Containerisation | Docker Compose (5 services) |
---
## 4. The AI Pipeline — NLP Module
Each ticket flows through a multi-stage enrichment chain:
### 4.1 LLM Analysis (Gemini 2.5 Flash)
A single Gemini call must return a **strict JSON object** with a fixed set of fields and no free-form output:
```json
{
"type": "Неработоспособность приложения",
"priority": 6,
"tonality": "Негативный",
"language": "RU",
"summary": "Клиент не может войти в аккаунт: пароль отклонён, код в email не приходит. Рекомендация: принудительный сброс пароля через CRM."
}
```
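A minimal sketch of how such a reply could be checked before it is trusted, assuming the five fields shown above and the seven categories this section lists. The function name `parse_llm_response` and the rejection policy are illustrative assumptions, not the project's actual code; tonality is left unchecked because the full set of allowed values is not given here.

```python
import json

# The seven ticket categories listed in this section.
ALLOWED_TYPES = {
    "Спам", "Консультация", "Смена данных",
    "Неработоспособность приложения", "Жалоба",
    "Претензия", "Мошеннические действия",
}
ALLOWED_LANGUAGES = {"RU", "KZ", "ENG"}
REQUIRED_KEYS = {"type", "priority", "tonality", "language", "summary"}


def parse_llm_response(raw: str) -> dict:
    """Parse one LLM reply and raise ValueError on any schema deviation."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if data["type"] not in ALLOWED_TYPES:
        raise ValueError(f"unknown type: {data['type']!r}")
    priority = data["priority"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(priority, bool) or not isinstance(priority, int):
        raise ValueError("priority must be an integer")
    if not 1 <= priority <= 10:
        raise ValueError("priority must be between 1 and 10")
    if data["language"] not in ALLOWED_LANGUAGES:
        raise ValueError(f"unknown language: {data['language']!r}")
    return data
```

Rejecting a malformed reply outright, rather than patching it, keeps an invented category or out-of-range priority from reaching the routing step.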
**7 ticket categories** with precise prompt-level rules:
| Type | Base Priority |
|---|---|
| Спам | 1 |
| Консультация | 2–3 |
| Смена данных | 3–4 |
| Неработоспособность приложения | 4–6 |
| Жалоба | 5–7 |
| Претензия | 7–8 |
| Мошеннические действия | 8–10 |
**Priority modifiers** (each +1, hard cap at 10):
- Client is VIP / Priority segment
- Explicit urgency words (`срочно`, `немедленно`, `требую`)
- Confirmed financial loss
- Legal / regulatory threat (`суд`, `прокуратура`, `Нацбанк`)
- Repeated complaint
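The modifier arithmetic can be restated as a short sketch. In FIRE these rules are expressed in the prompt, so the function below is only an illustrative restatement; the name `apply_modifiers` and its keyword arguments are assumptions.

```python
def apply_modifiers(
    base_priority: int,
    *,
    vip_segment: bool = False,
    urgency_words: bool = False,
    financial_loss: bool = False,
    legal_threat: bool = False,
    repeated_complaint: bool = False,
) -> int:
    """Add +1 per triggered modifier and clamp the result at 10."""
    bonus = sum(
        [vip_segment, urgency_words, financial_loss, legal_threat, repeated_complaint]
    )
    return min(10, base_priority + bonus)


# A "Претензия" at base 8 with a legal threat lands on 9.
print(apply_modifiers(8, legal_threat=True))
# A fraud ticket at base 9 with three modifiers is capped at 10.
print(apply_modifiers(9, vip_segment=True, urgency_words=True, financial_loss=True))
```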
### 4.2 RAG — Few-Shot Grounding via pgvector
This is the system's accuracy insurance policy.
**The problem a cold LLM has:** Freedom Finance has domain-specific vocabulary, product names, and complaint patterns that a general LLM can confuse. Edge cases — fraud vs. complaint, consultation vs. data-change — are where mis-routing costs the most.
**The solution — Retrieval-Augmented Generation with pgvector:**
```
Incoming ticket description
│
▼
gemini-embedding-001 → 768-dim vector
│
▼
SELECT ... FROM rag_example
ORDER BY embedding <=> CAST(:emb AS vector)
LIMIT 6 ← IVFFlat cosine index
│
▼
Diversity filter: max 1 example per category → top 3
│
▼
Inject as few-shot block into Gemini prompt
```
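The diversity step in the flow above can be sketched in plain Python. This assumes the six nearest rows have already been fetched from `rag_example`; the `Neighbour` type and the `diversify` function are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class Neighbour(NamedTuple):
    description: str
    category: str
    distance: float  # cosine distance, smaller means more similar


def diversify(neighbours: list[Neighbour], keep: int = 3) -> list[Neighbour]:
    """Keep the closest example of each category, up to `keep` examples."""
    seen_categories: set[str] = set()
    picked: list[Neighbour] = []
    for item in sorted(neighbours, key=lambda n: n.distance):
        if item.category in seen_categories:
            continue  # this category is already represented
        seen_categories.add(item.category)
        picked.append(item)
        if len(picked) == keep:
            break
    return picked


rows = [
    Neighbour("login fails", "Неработоспособность приложения", 0.10),
    Neighbour("app crashes", "Неработоспособность приложения", 0.12),
    Neighbour("refund or court", "Претензия", 0.30),
    Neighbour("bad service", "Жалоба", 0.35),
    Neighbour("change phone", "Смена данных", 0.40),
    Neighbour("password reset", "Неработоспособность приложения", 0.45),
]
for example in diversify(rows):
    print(example.category, example.distance)
```

Fetching six rows and keeping three leaves headroom: if the nearest neighbours all share one category, the filter can still reach further down the list for contrasting examples.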
**Prompt injection example:**
```
Domain examples — verified correct classifications:
[1] Ticket: "Не могу войти в аккаунт, пароль не принимается и SMS не приходит"
Classification: {"type": "Неработоспособность приложения", "priority": 5, ...}
[2] Ticket: "Верните мои деньги, иначе подам в суд"
Classification: {"type": "Претензия", "priority": 9, ...}
[3] Ticket: "Хочу изменить номер телефона"
Classification: {"type": "Смена данных", "priority": 3, ...}
Now classify the following ticket in the same JSON format.
```
**How the corpus is built and maintained:**
| Step | Detail |
|---|---|
| `rag_seed_data.py` | Hand-curated golden tickets — one authoritative file, each example has `category`, `priority`, `tonality`, `language`, `note` (never shown to LLM) |
| `seed_rag_examples.py` | Idempotent script: hashes each description, embeds new ones via `gemini-embedding-001`, upserts to `rag_example` table |
| `seed_rag_on_startup()` | Runs automatically at every API startup — any new golden example in the codebase is embedded and live within seconds, zero manual DB work |
| pgvector IVFFlat index | `CREATE INDEX ... USING ivfflat (embedding vector_cosine_ops)` — sub-millisecond similarity search |
| Diversity guard | Fetches top-6 by cosine distance, then keeps at most 1 per category. The few-shot block is never dominated by a single class, giving Gemini broad context for every edge case |
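The idempotent seeding step can be illustrated with an in-memory stand-in for the `rag_example` table. This is a sketch under stated assumptions: the dictionary replaces the real database, `embed` is a stub for the embedding call, and stripping whitespace before hashing is an assumed normalisation.

```python
import hashlib
from typing import Callable


def description_hash(description: str) -> str:
    """SHA-256 of the description; whitespace stripping is an assumption."""
    return hashlib.sha256(description.strip().encode("utf-8")).hexdigest()


def seed_examples(
    golden: list[dict],
    table: dict[str, dict],
    embed: Callable[[str], list[float]],
) -> int:
    """Embed and store only examples whose hash is not in the table yet."""
    inserted = 0
    for example in golden:
        key = description_hash(example["description"])
        if key in table:
            continue  # already seeded, skip the embedding call
        table[key] = {**example, "embedding": embed(example["description"])}
        inserted += 1
    return inserted


calls: list[str] = []


def fake_embed(text: str) -> list[float]:
    """Stub embedding: records the call and returns a 768-dim zero vector."""
    calls.append(text)
    return [0.0] * 768


golden_tickets = [
    {"description": "Хочу изменить номер телефона", "category": "Смена данных"},
    {"description": "Верните мои деньги, иначе подам в суд", "category": "Претензия"},
]
store: dict[str, dict] = {}
print(seed_examples(golden_tickets, store, fake_embed))  # first run embeds both
print(seed_examples(golden_tickets, store, fake_embed))  # second run embeds none
```

Keying on the hash is what makes it safe to run the seeder at every startup: unchanged examples cost one lookup and no embedding call.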
**Why this matters for accuracy:**
- A ticket saying *"не могу войти в приложение"* retrieves past tickets already labelled `Неработоспособность приложения` → LLM is anchored to the correct classification
- A fraud-adjacent complaint retrieves both a `Мошеннические действия` example and a `Жалоба` example → the subtle distinction is shown, not described
- Priority calibration improves: if similar past tickets carried priority 8+, the model follows suit
### 4.3 Fallback Sentiment (local)
A `SentimentAnalyzer` (HuggingFace transformer) provides a confidence-scored tonality signal. If its confidence is below 0.8, Gemini's tonality classification overrides the local model's result, so each ticket keeps whichever signal is more reliable.
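The override rule fits in one function. The sketch below assumes both models emit the same label vocabulary; `resolve_tonality` is a hypothetical name, and the label `"Нейтральный"` is an assumed value used only for the example.

```python
def resolve_tonality(
    local_label: str,
    local_confidence: float,
    llm_label: str,
    threshold: float = 0.8,
) -> str:
    """Prefer the local model unless its confidence falls below the threshold."""
    if local_confidence < threshold:
        return llm_label
    return local_label


print(resolve_tonality("Нейтральный", 0.55, "Негативный"))  # LLM label wins
print(resolve_tonality("Негативный", 0.93, "Нейтральный"))  # local label wins
```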
### 4.4 Language Detection
`LanguageDetector` uses `langdetect` / `langid` with a KZ-aware override list. Detection falls back to **RU** whenever the result is ambiguous, as specified in TZ §3.1.
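A minimal sketch of the fallback logic, kept free of third-party imports. The raw detector output is passed in as `detected` (an ISO 639-1 code, or `None` when detection is ambiguous). The Kazakh-letter check stands in for the override list and is an assumption, as is the function name `normalise_language`.

```python
from typing import Optional

# Cyrillic letters used in Kazakh but not in Russian: ә ғ қ ң ө ұ ү һ і
# (written as escapes so they cannot be confused with Latin look-alikes).
KZ_SPECIFIC_LETTERS = set("\u04d9\u0493\u049b\u04a3\u04e9\u04b1\u04af\u04bb\u0456")


def normalise_language(text: str, detected: Optional[str]) -> str:
    """Map raw detector output to RU / KZ / ENG, defaulting to RU."""
    if any(ch in KZ_SPECIFIC_LETTERS for ch in text.lower()):
        return "KZ"  # override: Kazakh-only letters are a strong signal
    if detected == "kk":
        return "KZ"
    if detected == "en":
        return "ENG"
    return "RU"  # ambiguous or any other result falls back to RU


print(normalise_language("Қолданба жұмыс істемейді", "ru"))
print(normalise_language("I cannot log in", "en"))
print(normalise_language("12345", None))
```

The override runs first because short Kazakh messages are easy for a general-purpose detector to mistake for Russian, given the shared alphabet.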
### 4.5 Geo-Normalisation
Address components (Country → Region → City → Street → House) are assembled into a single string and geocoded via a **two-provider waterfall**:
```
Yandex Maps API v1 (primary, no rate-limit)
│ 403 / unavailable
▼
Nominatim / OpenStreetMap (1 req/s rate-limited, retry×3)
```
Output: `(lat, lon)` stored on the ticket. Haversine distance then selects the **nearest office city**. Foreign / unresolved addresses are detected and dispatched 50/50 to Астана / Алматы per TZ §3.2.1.
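The distance step and the 50/50 fallback can be sketched as follows. The office coordinates are approximate and illustrative, only two offices are listed, and alternating on a ticket index is an assumed way of achieving the 50/50 split.

```python
import math
from typing import Optional

# Approximate city-centre coordinates, for illustration only.
OFFICES = {
    "Астана": (51.17, 71.45),
    "Алматы": (43.24, 76.89),
}


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def rank_offices(coords: Optional[tuple[float, float]], ticket_index: int) -> list[str]:
    """Return office cities in the order they should be tried."""
    if coords is None:
        # Foreign or unresolved address: alternate the two hub cities.
        hubs = ["Астана", "Алматы"]
        first = hubs[ticket_index % 2]
        second = hubs[(ticket_index + 1) % 2]
        return [first, second]
    lat, lon = coords
    return sorted(OFFICES, key=lambda city: haversine_km(lat, lon, *OFFICES[city]))


print(rank_offices((43.30, 77.00), ticket_index=0))  # a point near Алматы
print(rank_offices(None, ticket_index=0))
print(rank_offices(None, ticket_index=1))
```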
---
## 5. The Routing Engine — Business Rules
All routing logic lives in a **single shared module** (`core/routing.py` + `core/assignment.py`) imported by both the real-time `/report` endpoint and the ARQ batch worker. There is exactly **one implementation** of every business rule.
### 5.1 Full Cascade (TZ §3.2 — Exact Implementation)
```
┌─────────────────────────────────┐
│ Ticket arrives (AI-enriched) │
│ category / language / segment │
└────────────────┬────────────────┘
│
┌──────────────────────▼──────────────────────┐
│ STEP 1 — Geographic Resolution │
│ │
│ country == Kazakhstan? │
│ YES → geocode → Haversine sort offices │
│ NO → [Астана idx%2, Алматы idx%2+1] │
│ Unknown address? → same 50/50 fallback │
└──────────────────────┬──────────────────────┘
│ ordered city list
┌──────────────────────▼──────────────────────┐
│ STEP 2 — Hard-Skill Filter (for each city) │
│ │
│ Rule 1: category == "Смена данных" │
│ → drop all managers where │
│ position != Chief │
│ │
│ Rule 2: segment in {VIP, Priority} │
│ → drop managers without VIP skill │
│ │
│ Rule 3: language == KZ │
│ → drop managers without KZ skill │
│ │
│ Rule 4: language == ENG │
│ → drop managers without ENG skill │
│ │
│ Pool empty? → next city in Haversine order │
└──────────────────────┬──────────────────────┘
│ qualified pool
┌──────────────────────▼──────────────────────┐
│ STEP 3 — Weighted Scoring │
│ Score every manager in the pool │
│ → rank, take Top-3 │
└──────────────────────┬──────────────────────┘
│ top-3 candidates
┌──────────────────────▼──────────────────────┐
│ STEP 4 — Per-Office Round-Robin │
│ SELECT FOR UPDATE office.rr_index │
│ winner = candidates[rr_index % 3] │
│ rr_index += 1 (atomic, concurrency-safe) │
└──────────────────────┬──────────────────────┘
│
┌────────────▼────────────┐
│ Assignment written │
│ manager.current_load +1 │
│ Telegram notification │
└──────────────────────────┘
```
### 5.2 Hard-Filter Worked Examples
| Ticket | Segment | Language | Category | Allowed managers |
|---|---|---|---|---|
| "Хочу сменить номер" | Mass | RU | Смена данных | **Chief only** (any skills) |
| "Мои акции заблокированы" | VIP | RU | Жалоба | **Chief or Lead or Specialist** — but **must have VIP skill** |
| "Қолданба жұмыс істемейді" | Mass | KZ | Неработоспособность | Any position — **must have KZ skill** |
| "I cannot log in" | Priority | ENG | Неработоспособность | **Must have VIP skill AND ENG skill** |
| "Подозреваю мошенников" | VIP | RU | Мошеннические действия | **Must have VIP skill** (priority 10) |
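The rows above can be reproduced with a small filter function. This is a sketch: the `Manager` dataclass and the signature of `apply_hard_filters` are assumptions made for illustration, with skills modelled as a set of the strings `"VIP"`, `"KZ"` and `"ENG"`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Manager:
    name: str
    position: str  # "Specialist", "Lead" or "Chief"
    skills: frozenset = frozenset()


def apply_hard_filters(
    managers: list[Manager], *, category: str, segment: str, language: str
) -> list[Manager]:
    """Drop every manager who fails a mandatory rule; order is preserved."""
    pool = list(managers)
    if category == "Смена данных":
        pool = [m for m in pool if m.position == "Chief"]
    if segment in {"VIP", "Priority"}:
        pool = [m for m in pool if "VIP" in m.skills]
    if language in {"KZ", "ENG"}:
        pool = [m for m in pool if language in m.skills]
    return pool


team = [
    Manager("A", "Chief"),
    Manager("B", "Lead", frozenset({"VIP"})),
    Manager("C", "Specialist", frozenset({"KZ"})),
    Manager("D", "Specialist", frozenset({"VIP", "ENG"})),
]
# "I cannot log in", Priority segment, ENG: needs both VIP and ENG skills.
print([m.name for m in apply_hard_filters(
    team, category="Неработоспособность приложения", segment="Priority", language="ENG"
)])
```

An empty result is a valid outcome here; it is the signal for the caller to move on to the next city.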
### 5.3 Scoring Model (v2)
Applied inside the already-filtered pool so correctness rules are never traded off against performance:
$$score = W_{load} \cdot e^{-\text{load} / \lambda} + W_{skill} \cdot \text{skill\_depth} + W_{pos} \cdot \text{position\_affinity}$$
| Component | Weight | Formula | Effect |
|---|---|---|---|
| **Load penalty** | 2.0 | $e^{-\text{load}/3}$ | load=0 → 1.0 · 2.0 = **2.0** points; load=10 → ≈0.036 · 2.0 ≈ **0.07** points. Exponential soft-cap: busy managers lose the race without being excluded |
| **Skill depth** | 0.5 | count of matching skills | Rewards specialists over generalists for skill-heavy tickets. Max +1.5 pts |
| **Position affinity** | 0.4 | 0.5 − 0.3·\|tier_distance\| | Perfect tier match → +0.2; one step away → +0.08; two steps → −0.04. Steers easy tickets to Specialists, hard ones to Chiefs |
**Ideal position** by ticket attributes:
```
category == "Смена данных" → Chief
segment in {VIP, Priority} → Chief
priority ≥ 8 → Chief
5 ≤ priority < 8 → Lead
otherwise → Specialist
```
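The formula and the ideal-position rules combine into a few lines of Python. The weights and the decay constant come from the table above; the numeric tier encoding (Specialist 0, Lead 1, Chief 2) and the function names are assumptions.

```python
import math

W_LOAD, W_SKILL, W_POS = 2.0, 0.5, 0.4
LAMBDA = 3.0
TIER = {"Specialist": 0, "Lead": 1, "Chief": 2}  # assumed ordering


def ideal_position(category: str, segment: str, priority: int) -> str:
    """Pick the target tier from ticket attributes, first match wins."""
    if category == "Смена данных" or segment in {"VIP", "Priority"} or priority >= 8:
        return "Chief"
    if priority >= 5:
        return "Lead"
    return "Specialist"


def score(load: int, matching_skills: int, position: str, ideal: str) -> float:
    """Weighted sum of load decay, skill depth and position affinity."""
    load_term = W_LOAD * math.exp(-load / LAMBDA)
    skill_term = W_SKILL * matching_skills
    affinity = 0.5 - 0.3 * abs(TIER[position] - TIER[ideal])
    return load_term + skill_term + W_POS * affinity


target = ideal_position("Жалоба", "Mass", 6)
print(target)
print(round(score(0, 1, "Lead", target), 3))        # idle, perfect tier match
print(round(score(10, 1, "Specialist", target), 3))  # busy, one tier away
```

Because the load term decays exponentially, an idle manager outranks a heavily loaded one even when the busy manager has the better tier match.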
### 5.4 City Cascade Fallback
If the nearest city has **zero** qualified managers after hard-filtering, the engine silently advances to the next city in Haversine-sorted order:
```
Кокпекты (nearest, 0 VIP managers)
→ Усть-Каменогорск (2nd closest, 1 VIP manager) ✓ assigned
```
Every ticket is therefore routed as long as any office has a qualified manager, with no hard-coded list of fallback cities. A ticket that exhausts every city is marked `unassigned` rather than silently dropped.
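The cascade amounts to a loop over the distance-ordered city list. In the sketch below, `route_with_cascade` and the `is_qualified` callback are hypothetical names, and managers are plain dictionaries for brevity.

```python
from typing import Callable, Optional


def route_with_cascade(
    ordered_cities: list[str],
    managers_by_city: dict[str, list[dict]],
    is_qualified: Callable[[dict], bool],
) -> tuple[Optional[str], list[dict]]:
    """Return the first city with a non-empty qualified pool, plus that pool."""
    for city in ordered_cities:
        pool = [m for m in managers_by_city.get(city, []) if is_qualified(m)]
        if pool:
            return city, pool
    return None, []  # no office can take the ticket


managers = {
    "Кокпекты": [{"name": "A", "skills": {"KZ"}}],
    "Усть-Каменогорск": [{"name": "B", "skills": {"VIP"}}],
}
city, pool = route_with_cascade(
    ["Кокпекты", "Усть-Каменогорск"],
    managers,
    is_qualified=lambda m: "VIP" in m["skills"],
)
print(city, [m["name"] for m in pool])
```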
### 5.5 Concurrency Safety
With 8 ARQ workers processing tickets simultaneously, naive round-robin would produce race conditions. FIRE solves this at the database level:
```sql
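-- Per-office atomic counter (no application-level locking).
-- Both statements must run in one transaction so the row lock is held.
BEGIN;
SELECT rr_index FROM office WHERE city = $1 FOR UPDATE;
-- application reads rr_index and picks the winner, then increments
UPDATE office SET rr_index = rr_index + 1 WHERE city = $1;
COMMIT;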
```
`rr_index` lives on the `office` row. PostgreSQL row-level locking serialises concurrent workers on that row, so even under full concurrency (8 workers during burst traffic) no two tickets read the same counter value and managers are picked in strict rotation.
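The effect of the row lock can be demonstrated without a database. The sketch below swaps in a `threading.Lock` for `SELECT ... FOR UPDATE`; it is an in-process analogue, not the project's code, and `OfficeCounter` is a hypothetical class.

```python
import threading
from collections import Counter


class OfficeCounter:
    """In-memory stand-in for one office row and its rr_index column."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # plays the role of the row lock
        self.rr_index = 0

    def pick(self, candidates: list[str]) -> str:
        """Read the counter, choose a winner and increment, all under the lock."""
        with self._lock:
            winner = candidates[self.rr_index % len(candidates)]
            self.rr_index += 1
        return winner


def simulate(workers: int = 8, picks_per_worker: int = 30) -> Counter:
    """Run concurrent workers against one office and count the winners."""
    office = OfficeCounter()
    candidates = ["manager_a", "manager_b", "manager_c"]
    winners: list[str] = []

    def work() -> None:
        for _ in range(picks_per_worker):
            winners.append(office.pick(candidates))

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return Counter(winners)


print(simulate())
```

Taking the index modulo `len(candidates)` rather than a fixed 3 keeps the rotation valid when the filtered pool holds fewer than three managers.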
---
## 6. Processing Modes
### 6.1 Batch Ingestion (ARQ Worker)
```
POST /ingest/tickets (CSV)
│
├─ Parse + validate CSV rows
├─ Bulk-insert raw tickets to DB (status = "pending")
├─ Enqueue one ARQ job per ticket (Redis)
└─ Return batch_id immediately
ARQ Worker (Redis queue, max_jobs=8, uvloop):
For each job:
├─ Gemini analysis (geo runs in parallel via thread pool)
├─ find_best_manager()
├─ INSERT assignment + UPDATE manager.current_load
├─ Notify manager via Telegram
└─ Update ticket status → "assigned" | "unassigned"
GET /ingest/batch/{batch_id}/export → full JSON dump
```
Throughput target: **8 concurrent jobs**, each < 10 s → processes 30 tickets in ~40 s.
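The ~40 s figure follows if every job is assumed to use its full 10 s budget and the 30 tickets are processed in waves of 8:

$$T \approx \left\lceil \frac{30}{8} \right\rceil \times 10\,\text{s} = 4 \times 10\,\text{s} = 40\,\text{s}$$

This is an upper bound under that assumption; jobs that finish sooner free their slot earlier and shorten the total.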
### 6.2 Real-Time Single Ticket
```
POST /report (multipart/form-data)
├─ File upload validation + storage
├─ Geocoding ←── runs in parallel with ──→ Gemini analysis
├─ find_best_manager()
└─ Full ReportResponse in < 10 s
```
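The parallel step can be sketched with `asyncio`. Both workers below are stubs that only sleep: `geocode_blocking` stands in for a blocking geocoder call pushed to a thread, `analyse` for the Gemini request. All names, return values and the 10 s timeout wrapper are illustrative assumptions.

```python
import asyncio
import time


def geocode_blocking(address: str) -> tuple[float, float]:
    """Stub for a blocking geocoder call; returns fixed coordinates."""
    time.sleep(0.05)
    return (43.24, 76.89)


async def analyse(description: str) -> dict:
    """Stub for the LLM call; returns a fixed classification."""
    await asyncio.sleep(0.05)
    return {"type": "Консультация", "priority": 2, "language": "RU"}


async def enrich(address: str, description: str) -> dict:
    """Run geocoding and analysis concurrently within a 10 s budget."""
    coords, analysis = await asyncio.wait_for(
        asyncio.gather(
            asyncio.to_thread(geocode_blocking, address),  # thread pool
            analyse(description),
        ),
        timeout=10.0,
    )
    return {"coords": coords, **analysis}


started = time.perf_counter()
result = asyncio.run(enrich("Алматы", "Как открыть счёт?"))
elapsed = time.perf_counter() - started
print(result, f"{elapsed:.2f}s")
```

Running the two calls concurrently means the request waits for the slower of the two rather than their sum, which matters when both depend on external services.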
---
## 7. Database Schema
```sql
ticket
id (UUID v7) client_guid description segment
category sentiment priority language summary
lat lon city status
address_country/region/city_raw/street/house
ingest_batch_id processing_timings candidates_considered
manager
id full_name position office_city
skills[] current_load
manager_tg_id manager_chat_id
office
id name city address lat lon
rr_index ← per-office Round-Robin counter (atomic, SELECT FOR UPDATE)
assignment
id ticket_id → ticket manager_id → manager
assigned_at round_robin_index
rag_example ← RAG few-shot corpus
id description description_hash (unique SHA-256)
category priority tonality language
embedding vector(768) ← gemini-embedding-001
[IVFFlat cosine index — pgvector]
saved_chart ← Star Task: persisted NL-to-chart results
id title metric chart_type data (JSON) query created_at
```
Key design decisions:
- **`rr_index` on `office` row** — `SELECT FOR UPDATE` guarantees collision-free round-robin under 8 concurrent workers
- **`candidates_considered` on `ticket`** — JSON audit trail of every evaluated manager (score, load, skills) for every routing decision
- **`rag_example.embedding vector(768)`** — pgvector IVFFlat index allows sub-millisecond cosine similarity queries at scale
---
## 8. Frontend — Admin Panel
Built with **Next.js 15 (App Router)**, dark-first design system.
### Pages
| Route | Purpose |
|---|---|
| `/` | Marketing landing page (Navigation, Hero, Features, Process, FAQ, CTA) |
| `/report` | Live ticket submission form with real-time assignment result |
| `/admin/overview` | Summary KPIs (total tickets, assigned %, avg priority…) |
| `/admin/tickets` | Paginated ticket table with full AI analysis detail |
| `/admin/assignments` | Assignment ledger — ticket ↔ manager links |
| `/admin/managers` | Manager roster with skills and live load |
| `/admin/offices` | Office locations |
| `/admin/ingest` | CSV bulk upload UI (business_units, managers, tickets) |
| `/admin/analytics` | **⭐ Star Task** — NL-to-chart AI assistant |
| `/admin/classification-test` | Test the LLM classifier on arbitrary text |
### Star Task — AI Analytics Assistant
A user types a natural-language question in Russian (or any language):
> *"Покажи распределение типов обращений по городам"*
Flow:
1. Frontend POSTs to `POST /analytics/query`
2. Gemini dispatches to a **metrics catalogue** (12 pre-defined aggregations) and selects chart type
3. Backend runs the corresponding SQL aggregation
4. Returns `{ metric, chart_type, title, data, x_key, series_keys }` — fully serialised chart spec
5. Frontend renders via **Recharts** (bar / pie / line / area)
6. Chart is auto-saved to `saved_chart` table and appears in the "Saved Charts" gallery
**12 built-in metrics:**
- tickets by category, city, category×city, sentiment, language, segment, status, priority
- tickets over time (line), manager load, tickets by city×segment, + custom SQL fallback
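A catalogue-based dispatch can be sketched as a lookup that refuses anything outside a fixed set. The metric keys and SQL strings below are hypothetical; only the table and column names come from the schema in section 7, and three of the twelve metrics are shown.

```python
# Hypothetical catalogue entries: key -> fixed, parameter-free SQL.
METRICS = {
    "tickets_by_category": "SELECT category, COUNT(*) AS n FROM ticket GROUP BY category",
    "tickets_by_city": "SELECT city, COUNT(*) AS n FROM ticket GROUP BY city",
    "manager_load": "SELECT full_name, current_load FROM manager ORDER BY current_load DESC",
}
CHART_TYPES = {"bar", "pie", "line", "area"}


def resolve_query(llm_choice: dict) -> dict:
    """Turn the LLM's selection into a vetted query, or raise ValueError."""
    metric = llm_choice.get("metric")
    chart_type = llm_choice.get("chart_type")
    if metric not in METRICS:
        raise ValueError(f"unknown metric: {metric!r}")
    if chart_type not in CHART_TYPES:
        raise ValueError(f"unsupported chart type: {chart_type!r}")
    return {"metric": metric, "chart_type": chart_type, "sql": METRICS[metric]}


plan = resolve_query({"metric": "manager_load", "chart_type": "bar"})
print(plan["sql"])
```

Since the model only ever selects a key, no model-generated text reaches the database on this path, and a made-up column or table name has nowhere to appear.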
---
## 9. Telegram Bot — Manager Notifications
Built with **aiogram 3**, FSM-based.
### Features
| Command / Flow | Description |
|---|---|
| `/register` | Manager links their Telegram account to their DB record via `@username` → receives their profile card |
| `/survey` | Guided FSM survey — manager fills in a structured form (routed to backend) |
| Assignment push | When a ticket is assigned, the backend calls the manager's `chat_id` and sends a rich notification with ticket type, priority, summary, client segment, and recommended action |
Self-healing: `TelegramConflictError` triggers `os._exit(1)` — Docker restarts the bot and the 30-second session expiry resolves the conflict cleanly.
---
## 10. Infrastructure & Ops
```yaml
services:
web: FastAPI + Uvicorn (hot-reload in dev)
worker: ARQ consumer (8 concurrent jobs, uvloop, 200s graceful drain)
db: pgvector/pgvector:pg16
redis: redis:7-alpine
seed_fire_data: one-shot seeder (CSV → DB on first run)
```
- **Volumes**: `postgres-data`, `redis-data`, `uploads-data` (attachment files)
- **Health-checks**: `pg_isready` before web and worker start
- **Env-driven**: all secrets (API keys, DB URL, bot token) in `src/.env`
- **Alembic** migrations run automatically on startup via `run_alembic_upgrade()`
---
## 11. Coverage of TZ Criteria
| Criteria | Weight | How FIRE delivers |
|---|---|---|
| **Routing logic** | 40% | Exact TZ cascade implemented: Chief-only for Смена данных, VIP skill for VIP/Priority segment, KZ/ENG language skill. Top-3 scoring model (load + skill depth + position affinity). Atomic per-office round-robin via `SELECT FOR UPDATE`. Full geo cascade with Haversine distance sort. 50/50 Астана/Алматы for foreign/unknown. Single shared routing module — zero rule duplication between real-time and batch. |
| **AI analysis quality** | 25% | Gemini 2.5 Flash with detailed per-type prompt rules and priority modifiers. **RAG few-shot grounding**: incoming ticket embedded with `gemini-embedding-001`, pgvector cosine search over curated corpus, top-3 category-diverse examples injected into every prompt. Yandex Maps API v1 → Nominatim fallback for geo. HuggingFace sentiment as confidence-scored backup. |
| **UI & Visualisation** | 20% | Dark, professional Next.js 15 panel. Full ticket detail view (AI + assignment). Star Task: NL → live chart with 12 metric types, 4 chart styles, auto-saved gallery. Landing page + live submission form. Classification test panel. |
| **Architecture** | 15% | FastAPI + ARQ + Redis for async throughput. pgvector for RAG. Alembic migrations. `candidates_considered` audit column. Docker Compose one-command deploy. |
---
## 12. Key Technical Differentiators
1. **RAG-grounded classification** — every Gemini call is primed with the 3 most semantically similar real Freedom Finance tickets from a hand-curated, domain-specific corpus. Embeddings are generated by `gemini-embedding-001` (768 dims) and stored in pgvector with an IVFFlat cosine index. A category-diversity filter prevents the few-shot block from being dominated by a single class, giving the model maximum discriminatory signal for edge cases (fraud-vs-complaint, consultation-vs-data-change). The corpus is auto-seeded on every startup — zero manual DB operations needed.
2. **Single-source routing module** — `core/routing.py` + `core/assignment.py` are imported by both the real-time `/report` endpoint and the ARQ batch worker. Every business rule has exactly one implementation; no risk of a hotfix applied in one path and missed in the other.
3. **Complete TZ hard-filter cascade** — all four rules implemented precisely: (1) geo nearest-office or 50/50 fallback, (2) Chief-only for Смена данных, (3) VIP skill for VIP/Priority segment, (4) language skill for KZ/ENG tickets. Filters are applied in cascade order inside `apply_hard_filters()` before any scoring.
4. **Scoring model, not hard top-2** — the exponential load decay ($e^{-\text{load}/3}$) combined with skill-depth and position-affinity components gives a continuous score that naturally distributes work without ever hard-excluding a manager or deadlocking the queue.
5. **Atomic per-office round-robin** — `SELECT FOR UPDATE` on `office.rr_index` makes turn-taking safe under 8 concurrent ARQ workers. No application-level mutex, no race condition, no Redis counter — the database does the work.
6. **Full routing audit trail** — `candidates_considered` JSON on every ticket records every evaluated manager (name, score, load, skills, city). Any routing decision can be replayed and challenged.
7. **City cascade beyond Астана/Алматы** — all offices are sorted by Haversine distance and tried in order. A ticket in Кокпекты that has no qualified VIP manager locally will cascade to Усть-Каменогорск automatically.
8. **Multimodal support** — image attachments (PNG, JPG…) are forwarded directly to Gemini Vision; the LLM can read error screenshots to improve classification accuracy.
9. **AI analytics assistant (Star Task)** — free-text NL queries are dispatched by Gemini to a typed metrics catalogue (12 metrics, prevents SQL injection and column hallucination). Charts are rendered live in 4 styles and auto-saved to the gallery.
---
## 13. Live Demo Flow
1. **Seed data**: upload `business_units.csv` → `managers.csv` via `/admin/ingest`
2. **Batch ingest**: upload `tickets.csv` — watch ARQ worker process 30+ tickets in real time
3. **Single ticket**: submit via `/report` — get AI analysis + manager assignment in < 10 s
4. **Admin panel**: open `/admin/tickets` — see full enriched ticket list with category, priority, sentiment, summary, assigned manager
5. **Analytics**: open `/admin/analytics` — type *"Покажи нагрузку по менеджерам"* — get a bar chart
6. **Telegram**: manager registers via `/register` — receives push notification when assigned
---
## 14. Team & Stack Summary
```
FIRE — Freedom Intelligent Routing Engine
Backend : Python 3.12 · FastAPI · SQLAlchemy 2 · Alembic · ARQ · uvloop
AI/ML : Gemini 2.5 Flash · Google GenAI SDK · LangDetect · HuggingFace
Geo : Yandex Maps API v1 · Nominatim/OSM · geopy · Haversine
Database : PostgreSQL 16 · pgvector · Redis 7
Frontend : Next.js 15 · TypeScript · Recharts · Tailwind CSS
Bot : aiogram 3 · Telegram Bot API
Infra : Docker Compose · Uvicorn · Gunicorn-compatible
```
---
*Built for the FIRE hackathon — Freedom Finance, 2026.*