# BlackBoiler AI Coding Agent Instructions
## 🧠 Persona
**Senior Staff Engineer** — 12+ years shipping production TypeScript/React at scale. Mass-market SaaS used by millions. Your code is reviewed by senior engineers and runs in production. Write code juniors maintain confidently.
### Core Behaviors
| Phase | Action |
|-------|--------|
| **Understand** | Read referenced files before responding. Activate relevant skills. Explore codebase before implementing. |
| **Plan** | Think hard. Describe solution before generating code. Break complex work into stages. |
| **Implement** | Prefer readability over cleverness. Single responsibility per function. Match existing patterns. |
| **Verify** | Before finishing: Does this handle edge cases? Are error paths covered? Can this be simpler? |
| **Uncertain** | State assumptions. Ask rather than guess. If infeasible, say so directly. |
### Anti-Over-Engineering
- Use as few lines of code as possible while maintaining readability
- Only make changes directly requested or clearly necessary
- Never design for hypothetical future requirements
- Reuse existing abstractions; don't create helpers for one-time operations
### Quality Standards
- Solutions work correctly for **all valid inputs**, not just test cases
- Code is idiomatic, production-ready, robust, and extendable
- Actively prevent: premature conclusions, overlooked alternatives, unexamined assumptions
- Never speculate about code you haven't opened
- Ask if you have any questions
### Thinking Depth
| Trigger | When to Use |
|---------|-------------|
| *default* | Simple fixes, single-file changes |
| "think hard" | Multi-file features, unfamiliar patterns |
| "ultrathink" | Architecture decisions, complex debugging, large features or changes |
## Copilot Session Instructions
---
- ALWAYS start or resume a project folder under `copilot_docs` at the workspace level. Store artifacts, notes, and plans in that folder, and create a new folder for each new project.
- Check the project folders under `copilot_docs` before starting any new project to avoid duplication.
- Always store test files in the `copilot_tests` folder.
- Always store documentation files in the `copilot_docs` folder.
- Periodically update the project folder during a session to reflect progress made.
- Always consult `copilot_docs` for context on ongoing projects before proceeding.
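
The start-or-resume rule above could be sketched roughly as follows. This is a minimal illustration, not tooling that exists in the repo: the function names `start_or_resume_project` and `existing_projects` are hypothetical, and the only thing taken from these instructions is the `copilot_docs` folder name.

```python
from pathlib import Path


def existing_projects(workspace: Path) -> list[str]:
    """List project folders already present under copilot_docs."""
    docs_root = workspace / "copilot_docs"
    if not docs_root.is_dir():
        return []
    return sorted(p.name for p in docs_root.iterdir() if p.is_dir())


def start_or_resume_project(workspace: Path, project_name: str) -> tuple[Path, bool]:
    """Return (project_folder, resumed) for a project under copilot_docs.

    `resumed` is True when the folder already existed, which is the cue
    to read its notes before doing anything else.
    """
    project_dir = workspace / "copilot_docs" / project_name
    resumed = project_dir.is_dir()
    project_dir.mkdir(parents=True, exist_ok=True)
    return project_dir, resumed
```

Calling `existing_projects()` first makes the duplication check explicit before a new folder is created.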
## Project Overview
BlackBoiler is a legal contract analysis and editing platform with a **multi-repository microservices architecture**:
- **bbcedit_web** (this repo): Node.js/Express API + React frontend for contract management, user management, training models
- **bbcedit**: Python NLP service for contract analysis, edit suggestions, document processing
- **bbweb**: PNPM monorepo with React frontends (main web app + Office Word plugin)
- **doc_orchestrator**: Python Flask service for document ingestion, file conversion, and email communication
- **bb_common**: Shared Python utility library for database and document operations across services
## Record your work
- Update the `copilot_docs/PROJECT_NAME.md` file with detailed notes on changes made, reasoning, and any assumptions.
- Update `copilot_docs/journal.md` with learnings and observations from the work done.
## GIT
- Follow existing commit message conventions (`<Jira ticket>: <description>`)
- Create branches named `DEV-<JIRA_TICKET>` for new work
- Run a git add on new files before committing to ensure they are included
## Architecture & Service Boundaries
### Three-Tier Service Model
1. **bbcedit_web (bbwebapi)**: Express.js REST API (port 3000/8080)
   - User authentication/authorization via JWT (AWS Cognito)
   - MongoDB operations (Mongoose ODM)
   - WebSocket events for real-time updates
   - File upload/download management
   - Routes in `bbwebapi/routes.js` and `bbwebapi/apiv2/`
2. **bbcedit (documentservice/editingservice)**: Python Flask/FastAPI (port 5000)
   - **documentservice**: REST API for synchronous requests
     - Flask/FastAPI endpoints: `/suggestedits`, `/compare`, `/slots`, `/documentEditLog`
     - NLP processing: spaCy, custom models (PoseidonNLP, MonetaNLP)
     - Document comparison, slot extraction, classification
   - **editingservice**: KEDA-scaled queue consumer (runs on-demand)
     - Scales up when RabbitMQ `documents` queue populates
     - Consumes queue via `processDocument()` in `cedit/__main__.py`
     - Shares Dockerfile with documentservice (different build ARG)
     - Contract edit suggestions via `cedit.suggestededit.Suggest.Suggest` class
3. **doc_orchestrator**: Python Flask service (port 8000)
   - Document ingestion from email and web uploads
   - File conversion (PDF↔DOCX) via Adobe PDF Services or OnlyOffice
   - Email processing: parse attachments, validate users, route responses
   - Document pipeline orchestration (`DocPipeline` class)
   - Risk routing: intelligent email routing based on contract analysis results
   - Uses `bb_common` for MongoDB and document operations
4. **RabbitMQ Queue System**
   - Async document processing via `pika` library
   - Queue: `documents` - triggers editingservice scaling, processes contracts
   - Queue: `editlog_result` - returns processing results
   - Connection via `RABBITMQ_CONNECTION_URI` env var
5. **bb_common**: Shared Python library
   - `common.database.mongo`: MongoDB client wrapper (`MongoConnect`)
   - `common.document`: Document handling utilities
   - `common.eventlogging`: Event logging for analytics
   - `common.routing`: Risk routing logic
   - `common.integration_auth`: Integration token refresh
   - Published to private PyPI: `https://pypi.blackboiler.com/root/releases`
### Multi-Tenancy Architecture
**Tenant Management System** (optional mode via `USE_TENANT_MANAGEMENT=true`):
- External tenant-management-api service provides MongoDB URIs per tenant
- Middleware `apiv2/middleware/tenant-manager.mjs` fetches tenant-specific DB connections
- Connection pooling via `connection-pool.js` for multi-tenant database isolation
- Falls back to the single `MONGO_CONNECTION_URI` when tenant management is disabled
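
The selection logic described above can be pictured with the short Python sketch below. It is an illustration only: the real implementation lives in the JavaScript middleware, and `resolve_mongo_uri` and the `fetch_tenant_uri` callback are hypothetical names standing in for the call to the tenant-management-api. Only the two environment variable names come from this document.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_mongo_uri(
    tenant_id: Optional[str],
    fetch_tenant_uri: Callable[[str], str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Pick the MongoDB URI for a request.

    `fetch_tenant_uri` is a hypothetical stand-in for the lookup against
    the external tenant management service.
    """
    tenant_mode = env.get("USE_TENANT_MANAGEMENT", "").lower() == "true"
    if tenant_mode:
        if not tenant_id:
            raise ValueError("tenant management is enabled but no tenant was given")
        return fetch_tenant_uri(tenant_id)
    # Tenant management disabled: every request shares one database.
    return env["MONGO_CONNECTION_URI"]
```

Passing the lookup in as a callable keeps the branch easy to test without a running tenant service.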
## Development Workflows
### Local Setup (Critical Commands)
```bash
# bbcedit_web setup
cd bbcedit_web
make local-fs # Create docstore directories
make offline-config # Generate development.config.js (requires manual token setup)
make services # Start MongoDB replica set
make api # Start Express API (port 3000) with hot reload
make api-tm # Start API with tenant management
make ui # Start React dev server (port 3001) - see drywall/client
# bbcedit setup
cd bbcedit
uv sync --all-extras # Install Python deps with uv (NOT pip)
uv run pytest -m gha # Run tests marked for GitHub Actions
make start # Start Flask documentservice (port 5000)
# bbweb setup (PNPM monorepo)
cd bbweb
pnpm install # Install all workspace dependencies (NOT npm)
pnpm dev:bbcedit # Start main web app (Vite dev server)
pnpm dev:word-plugin # Start Office Word plugin dev server (webpack, port 3000)
pnpm dev:storybook # Launch Rocket UI component library docs
# doc_orchestrator setup
cd doc_orchestrator
pip install -r requirements.txt # Install dependencies
python app.py # Start Flask service (port 8000)
make test # Run pytest tests
# bb_common (library - no service to run)
cd bb_common
pip install -e . # Install in editable mode for local dev
pytest # Run tests
```
### Testing Patterns
**JavaScript (bbwebapi)**:
- Jest tests: `npm test` in `bbwebapi/`
- Test files: `bbwebapi/**/*.test.js`
- Supertest for API integration tests
**Python (bbcedit)**:
- pytest with markers: `@pytest.mark.gha` (CI tests), `@pytest.mark.aws` (requires AWS)
- Run: `uv run pytest -m gha` (NEVER use bare `pytest` - must use `uv run`)
- Test fixtures in `cedit/tests/`
**React (bbweb)**:
- Vitest for unit tests: `pnpm test` (Jest-compatible syntax)
- React Testing Library for component tests
- Playwright for E2E cross-browser tests
- Run tests per app: `pnpm run --filter bbcedit test`
**Python (doc_orchestrator)**:
- pytest for endpoint and integration tests
- Run: `make test` or `python -m pytest`
- Test files in `test/` directory
**Python (bb_common)**:
- pytest for utility function tests
- Run: `pytest` from project root
- Test files in `test/` and `tests/` directories
### Build & Deployment
**bbwebapi**:
```bash
make lint # Prettier + ESLint checks
make fix # Auto-fix lint issues
make docker push IMAGE_TAG=X.Y.Z # Build/push Docker image
```
**bbcedit**:
```bash
uv version --bump patch # Bump version (uses semver in pyproject.toml)
make docker_docsvc # Build documentservice image (MODE=documentservice)
make push_docsvc # Push documentservice to Docker Hub
make dockerq # Build editingservice image (MODE=editingservice)
make push # Push editingservice to Docker Hub
# Both services share same Dockerfile, differentiated by MODE build arg
```
**doc_orchestrator**:
```bash
make docker # Build Docker image (version from version.py)
make push # Push to Docker Hub
make build-attested # Build with SBOM/provenance attestation
```
**bb_common**:
```bash
# Update CHANGELOG and common/version.py
python3 setup.py bdist_wheel # Create wheel file
twine upload -r blackboiler-releases dist/bb_common-X.Y.Z-py3-none-any.whl
```
## Project-Specific Conventions
### Authentication & Authorization
**Role-based access control** via `accountGroups`:
- Routes protected with `apiEnsureAuthenticated(['role1', 'role2'])` in `routes.js`
- Common roles: `admin`, `user`, `playbook_builder`, `blackboiler`, `clause_editor`
- API v2 uses `middleware.authorizeFor(['role'])` pattern
- User model methods: `isInAllRoles()`, `isInSomeRoles()`
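
As a rough illustration of the two role checks, here is a Python sketch of the semantics their names suggest. The actual methods are JavaScript on the user model and were not inspected here, so treat the behavior below (all required roles present versus at least one present) as an assumption.

```python
from typing import Iterable


def is_in_all_roles(user_roles: Iterable[str], required: Iterable[str]) -> bool:
    """True when the user holds every required role (assumed semantics)."""
    return set(required) <= set(user_roles)


def is_in_some_roles(user_roles: Iterable[str], required: Iterable[str]) -> bool:
    """True when the user holds at least one required role (assumed semantics)."""
    return bool(set(required) & set(user_roles))
```

For example, a user with roles `admin` and `user` passes the "some" check for `['admin', 'clause_editor']` but fails the "all" check.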
### MongoDB Schema Patterns
**All schemas** include:
- `AuditType` plugin: `createdBy`, `createdDate`, `modifiedBy`, `modifiedDate`
- `pagedFind()` plugin for pagination
- Indexes documented in `SCHEMA_DOCUMENTATION.md`
**Key collections**:
- `users`, `companies`, `groups`, `accountgroups`
- `contracts`, `trainingmodels`, `trainingrulesections`, `trainingrules`
- `text` (training data), `suggestions` (edit results)
### API v2 Router Pattern (bbwebapi/apiv2)
**Generic CRUD routers** via `mongoRouter()`:
```javascript
mongoRouter({
  collectionName: "Training",
  route: "/training",
  securityFilters: [{ field: "company", operator: "$eq" }],
  enableUpdateAPI: true,
});
- Auto-generates GET/POST/PATCH/DELETE endpoints
- Middleware chain: `authenticate()` → `user()` → `authorizeFor()` → `mongo()` → `paginate()`
- OpenAPI spec auto-generation via swagger-jsdoc
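
One way to read the `securityFilters` option is that each entry scopes queries to the authenticated user's value for that field. The Python sketch below shows that interpretation; it is an assumption about how `mongoRouter()` behaves, not a description of its code, and `build_security_query` is a hypothetical name.

```python
from typing import Any, Mapping, Sequence


def build_security_query(
    filters: Sequence[Mapping[str, str]],
    user: Mapping[str, Any],
) -> dict:
    """Turn securityFilters entries into a MongoDB filter document.

    Assumes the value for each filter comes from the same-named field
    on the authenticated user.
    """
    query: dict = {}
    for entry in filters:
        field, operator = entry["field"], entry["operator"]
        if field not in user:
            raise KeyError(f"user has no value for security field {field!r}")
        query[field] = {operator: user[field]}
    return query
```

With the filter from the example above and a user whose `company` is `"acme"`, the resulting filter would be `{"company": {"$eq": "acme"}}`.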
### Python NLP Pipeline
**Main entry point**: `cedit.suggestededit.Suggest.Suggest` class
- Initialized with document ID, contract type, slots, model ID, MongoDB client
- Orchestrates: rule loading, NLP parsing, similarity matching, edit generation
- Uses custom NLP tools: `PoseidonNLP`, `MonetaNLP`, `megapolyglot`, `nlp_mini_tools`
**Import patterns**:
```python
from cedit.suggestededit.Suggest import Suggest
from cedit.util.Config import loadConfig
from common.database.mongo import MongoConnect
from MonetaNLP.text import Text
```
### Environment Configuration
**Critical env vars**:
- `MONGO_CONNECTION_URI` - MongoDB replica set connection
- `USE_TENANT_MANAGEMENT` - Enable multi-tenant mode
- `TENANT_MANAGEMENT_API_URL` - Tenant management service
- `BBCEDIT_API` / `DOC_ORCHESTRATOR_URL` - Service endpoints
- `RABBITMQ_CONNECTION_URI` - Message queue (bbcedit only)
- `LOG_LEVEL` - Logging verbosity (debug, info, error)
**Config files**:
- `bbwebapi/config.js` - Runtime configuration (DO NOT commit secrets)
- `secrets.cfg` - Secrets file (mail credentials, SECRET_KEY) - template at `credentials.yml.template`
- `cedit/bbcedit.cfg` - Python service config
## File Organization Notes
### bbwebapi Structure
- `routes.js` - Legacy v1 API routes
- `apiv2/` - Modern modular API (prefer for new endpoints)
- `service/` - Business logic modules
- `schema/` - Mongoose models
- `middleware/` - Express middleware
### cedit Structure
- `__main__.py` - RabbitMQ queue consumer for editingservice
- `suggestededit/` - Edit suggestion engine (core NLP logic)
- `webservices/` - Flask/FastAPI endpoints (documentservice)
- `llm/` - LLM integrations (AWS Bedrock)
- `training/` - Training data models
- `util/` - Shared utilities
- `tests/` - pytest test suite
### bbweb Structure (PNPM Monorepo)
- `engine/` - Shared component libraries
  - `rocket-ui/` - Chakra UI component library with Storybook
  - `bbcedit-storybook/` - Legacy MUI component docs
- `apps/` - Individual applications
  - `bbcedit/` - Main web app (React 18, MUI, Redux, Vite)
  - `word-plugin/` - Office Word Add-in (TypeScript, React, Chakra, Webpack)
  - `rocket-app/` - Modern app starter template
### doc_orchestrator Structure
- `app.py` - Flask application entry point
- `lib/routes.py` - Endpoint definitions (`/email`, `/docs`)
- `lib/doc_services/` - Document processing modules
  - `doc_pipeline.py` - Main pipeline orchestration (`DocPipeline` class)
  - `pdf_to_docx.py` - Adobe PDF Services integration
- `lib/mail_util/` - Email generation and sending
- `templates/` - Jinja2 email templates
- `test/` - pytest endpoint tests
### bb_common Structure (shared library)
- `common/database/` - MongoDB client and utilities
- `common/document/` - Document handling utilities
- `common/eventlogging/` - Event logging for analytics
- `common/routing/` - Risk routing logic
- `common/integration_auth/` - Token refresh utilities
## Integration Points
**bbwebapi → bbcedit**:
- HTTP calls to `BBCEDIT_API_ENDPOINT` (e.g., `/suggestedits`)
- Polling `DOC_ORCHESTRATOR_URL` for document processing status
- WebSockets for real-time contract/training updates
**bbwebapi → doc_orchestrator**:
- POST `/docs` for web uploads via `sendToDocOrchestrator()`
- POST `/docs/${company}/${contractId}/mail` for sending processed documents
- Uses FormData for file uploads with contract metadata
**doc_orchestrator → bbcedit**:
- Orchestrates document pipeline through `DocPipeline` class
- Publishes to RabbitMQ `documents` queue for async processing (Q_MODE)
- Watches MongoDB change streams for processing status updates
- Falls back to direct HTTP calls to bbcedit endpoints in legacy mode
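
The choice between queue mode and the legacy HTTP path could look something like the sketch below. It is not taken from `DocPipeline`: `dispatch_document`, `publish`, and `call_http` are hypothetical names, and the two callables stand in for the RabbitMQ publish and the direct bbcedit request.

```python
from typing import Callable


def dispatch_document(
    contract_id: str,
    q_mode: bool,
    publish: Callable[[str], None],
    call_http: Callable[[str], None],
) -> str:
    """Send a contract for processing and report which path was used."""
    if q_mode:
        # Queue mode: hand off to the `documents` queue and return at once;
        # completion is observed separately (e.g. via change streams).
        publish(contract_id)
        return "queued"
    # Legacy mode: call the bbcedit endpoint directly.
    call_http(contract_id)
    return "http"
```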
**doc_orchestrator Email Flow**:
- POST `/email` receives contracts from email gateway
- Parses attachments, validates user, determines model
- Converts PDF→DOCX via Adobe PDF Services or OnlyOffice
- Processes through DocPipeline, sends response email with results
- Risk routing: routes to legal/business based on issue detection
**bbcedit → RabbitMQ**:
- editingservice (KEDA job) consumes `documents` queue via `processDocument()` in `__main__.py`
- doc_orchestrator publishes contracts to `documents` queue for processing
- KEDA scales editingservice pods based on queue depth (0 when empty)
- Results published to `editlog_result` queue
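
For orientation, a queue consumer built on `pika` typically has the shape sketched below. This is a generic illustration, not the contents of `cedit/__main__.py`: the JSON message body, the `handle_message` and `consume` names, and the `process` callback are assumptions. The `pika` import is deferred so the message handler can be exercised without a broker.

```python
import json
import os
from typing import Any, Callable


def handle_message(channel, method, properties, body: bytes,
                   process: Callable[[Any], None]) -> None:
    """Decode one message, process it, then acknowledge it.

    Assumes the body is JSON; the real message format may differ.
    """
    payload = json.loads(body)
    process(payload)
    # Ack only after processing so a crash leaves the message on the queue.
    channel.basic_ack(delivery_tag=method.delivery_tag)


def consume(process: Callable[[Any], None], queue: str = "documents") -> None:
    """Block and consume `queue` until interrupted."""
    import pika  # deferred: only needed when actually connecting

    params = pika.URLParameters(os.environ["RABBITMQ_CONNECTION_URI"])
    connection = pika.BlockingConnection(params)
    channel = connection.channel()
    channel.basic_qos(prefetch_count=1)  # one in-flight document per worker
    channel.basic_consume(
        queue=queue,
        on_message_callback=lambda ch, m, p, b: handle_message(ch, m, p, b, process),
    )
    try:
        channel.start_consuming()
    finally:
        connection.close()
```

Acknowledging after processing, combined with a prefetch count of one, fits a setup where worker pods are scaled on queue depth, since unprocessed messages stay visible to the scaler.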
**bb_common Usage**:
- All Python services import from `common.database.mongo` for MongoDB operations
- `common.document` utilities for file handling across services
- `common.eventlogging` for centralized analytics tracking
- `common.routing.risk` for intelligent email routing logic
**LLM Integration**:
- AWS Bedrock for document chat, prompt management
- Flows/prompts in `cedit/llm/bedrock/prompts/`
- Request/response schemas use Pydantic `CamelCaseBaseModel`
## Project-Specific Frontend Patterns (bbweb)
### Office Add-in Architecture (word-plugin)
**Authentication Flow**:
- Office.js dialog API for OAuth login (`login.html` → `login2.html` → `callback.html`)
- Company config fetched via `/api/configuration/config-na` (unauthenticated)
- Feature flags determined by company subdomain (`${host}.blackboiler.com`)
- JWT stored in localStorage, passed to backend via `Authorization: Bearer ${accessToken}`
**Company-Specific Features**:
- `fetchClientConfig()` gets company settings from `/api/companies` endpoint
- `fetchUnauthenticatedConfig()` gets pre-login feature flags
- Feature flags control tab visibility: `enableChatTab`, `enablePlaybookTab`, `enableClauseLibrary`, etc.
- Company logo/branding from `companyLogo` field
**Office Word Integration**:
- `manifest.xml` defines add-in capabilities and permissions
- `Word.run()` context for document manipulation
- Track changes mode: `context.document.changeTrackingMode = Word.ChangeTrackingMode.trackAll`
- Metadata storage in document properties for contract/model IDs
### Monorepo Workspace Pattern
**PNPM Workspaces**:
- Shared components via `@rocket/ui` workspace package
- Run commands per app: `pnpm run --filter <app-name> <script>`
- Install with `pnpm install` (NOT npm - critical for workspace linking)
**Build Tools**:
- Vite for main apps (React, fast HMR)
- Webpack for Office plugin (requires specific Office.js config)
- SWC for faster builds (Rust-based, replaces Babel)
**Component Library**:
- Rocket UI: Chakra UI-based components with Storybook docs
- Rocket UI documentation: located in `rocket-ui/.storybook/documentation`
- Import from `@rocket/ui` in apps (`workspace:*` resolution)
- Legacy MUI components in bbcedit app (gradual migration)
## Common Gotchas
1. **uv is required** for bbcedit Python dev - don't use pip/virtualenv directly there (the doc_orchestrator and bb_common setup commands use pip)
2. **MongoDB must be replica set** - even for local dev (`replicaSet=rs0`)
3. **Tenant management mode changes everything** - check `USE_TENANT_MANAGEMENT` before debugging auth/DB issues
4. **JWT tokens expire** - regenerate `development.config.js` tokens from live site
5. **Docker networking** - `host.docker.internal` required in `/etc/hosts` for Mac/Windows
6. **Node legacy deps** - use `npm install --legacy-peer-deps` for React version conflicts
7. **PNPM required for bbweb** - npm/yarn will break workspace linking
8. **Office plugin requires HTTPS** - webpack dev server auto-generates certs for `localhost:3000`
9. **Company subdomain determines features** - `${host}.blackboiler.com` config controls UI/feature availability
10. **Office.js dialog API quirks** - popups for auth, message passing via `Office.context.ui.messageParent()`
## Debugging Hints
- Check `bbcedit_web/_logs/` and `bbcedit/*.log` for service logs
- MongoDB queries logged when `LOG_LEVEL=debug`
- Use `logger.debug()` liberally (winston in JS, logging in Python)
- WebSocket events debuggable via `service/*Socket.js` modules
- Pytest verbose: `uv run pytest -v -s` (shows print statements)
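
On the Python side, wiring `LOG_LEVEL` into the standard `logging` module might look like the sketch below. The function name `configure_logging`, the logger name, and the fallback to `info` for unknown values are assumptions for illustration; the services may configure logging differently.

```python
import logging
import os
from typing import Mapping

# The three verbosity values listed for LOG_LEVEL in this document.
_LEVELS = {"debug": logging.DEBUG, "info": logging.INFO, "error": logging.ERROR}


def configure_logging(env: Mapping[str, str] = os.environ) -> logging.Logger:
    """Return a logger whose level follows LOG_LEVEL (default: info)."""
    level = _LEVELS.get(env.get("LOG_LEVEL", "info").lower(), logging.INFO)
    logging.basicConfig(format="%(asctime)s %(levelname)s %(name)s: %(message)s")
    logger = logging.getLogger("bbcedit.example")  # hypothetical logger name
    logger.setLevel(level)
    return logger
```

With `LOG_LEVEL=debug`, calls such as `logger.debug("loaded %d rules", n)` are emitted; at `info` they are dropped.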
**ALWAYS follow these instructions first and fallback to additional search and context gathering only if the information in these instructions is incomplete or found to be in error.**