Loading...
Loading...
Loading...
Reference for building PL/SQL or Scala ETL pipelines. Types use PostgreSQL names as emitted by Diesel.
# ETL Base Data Dictionary (Diesel schema) - S2 Roadmap
Reference for building PL/SQL or Scala ETL pipelines. Types use PostgreSQL names as emitted by Diesel.
## Document Scope & Usage
**Scope:** This is a **planning and staging data-dictionary** for ETL pipeline design and schema versioning—not a live API specification. It documents the intended database schema structure, table relationships, and column metadata to guide development work.
**Domain Overview:**
- **NFAG** (Nota Fiscal de Atividade Geral): Brazilian general activity invoices. Consumed by tax compliance systems, audit trails, and revenue analytics; requires SEFAZ (tax authority) validation and 7-year archival.
- **NFE** (Nota Fiscal Eletrônica): Brazilian electronic invoices. Primary fiscal document with item-level tax calculations (ICMS, IPI, PIS, COFINS); used by accounting systems, customs authorities, and business intelligence platforms.
**Who Should Use This & When:**
- **ETL Developers:** Reference tables/columns when designing data pipelines, writing SQL transforms, or mapping source → target schemas.
- **Database Administrators:** Consult during schema migrations, index planning, and capacity forecasting; update after approved schema changes.
- **Auditors & Compliance Teams:** Verify that stored data matches SEFAZ requirements and retention policies (e.g., NOT NULL constraints on fiscal fields like `xnome`).
**Update Cadence:** Revise after schema migrations, new table additions, or FK/constraint changes are merged to main branch. Include migration instructions and rationale in updates.
> ⚠️ **Breaking Changes Warning**
> Substantive schema changes (e.g., renaming `login_session` → `sessions`, adding new `NOT NULL` constraints, introducing `refresh_token` fields) require coordinated code and database migrations. **Always:**
> 1. Coordinate releases: Update schema migrations (DDL) and application code together.
> 2. Update migration docs: Document rollback procedures and data remediation steps for backfill operations.
> 3. Run integration tests: Validate end-to-end flows against new schema before deploying to production.
> Failure to coordinate can cause application crashes, data inconsistencies, or deployment rollbacks.
## Conventions
- PK noted per table; FKs listed explicitly.
- `Nullable<…>` means column allows NULL.
- Lengths come from Diesel annotations.
---
# 🚀 EXPLOSION S2: Strategic Roadmap
## Vision
Transform the ETL infrastructure into a high-performance, scalable, multi-tenant fiscal data processing platform that delivers 10x throughput, 99.9% uptime, and enterprise-grade compliance.
---
## Phase 1: Foundation Fortification (Weeks 1-4)
**Goal:** Stabilize core infrastructure and establish performance baselines
### 1.1 Data Quality & Integrity
- [x] Implement comprehensive data validation framework (Owner: Data Eng, Due: W2)
- Schema validation for all NFAG/NFE documents; add CI gate to reject schema drift
- Duplicate detection via doc fingerprint (tenant_id+chave or nfe_id+serie+numero) with dedupe job + DLQ; target duplicate rate <0.1%
- Referential integrity checks at ETL entry points; reject + DLQ when tenant/product/emitter/recipient missing; KPI: FK failure rate <0.5%
- [x] Create audit trail system (Owner: Data Eng/ETL, Due: W3)
- Log all data transformations (source → target) with run_id + doc_id and hash-chain for immutability
- Track data lineage for compliance; 100% lineage coverage for ingested docs
- Implement immutable change ledgers; export to WORM/bucket-lock for tamper resistance
### 1.2 Performance Optimization
- [x] Database optimization (Owner: DBA, Due: W2)
- Add composite indices on hot paths: nfe_documents(tenant_id, status, data_emissao), nfe_items(nfe_document_id, cfop), nfag_dest(nfag_id, xnome)
- Partition large tables (nfe_items, nfe_documents) by data_emissao month; verify partition pruning in EXPLAIN
- Create materialized views for reporting (daily per-tenant totals: sum valor_total, vnf); refresh hourly
- [ ] Query performance profiling (Owner: DBA/DevOps, Due: W3)
- Establish baseline metrics for top 10 queries via slow query log + EXPLAIN ANALYZE; capture p95
- Target 95th percentile response time < 500ms with SLO alerts when exceeded
- Implement query caching layer for common aggregations (prepared statements + Redis); KPI: cache hit ratio >85%
### 1.3 Monitoring & Observability
- [ ] Deploy comprehensive monitoring (Owner: App Eng/DevOps, Due: W2-W3)
- APM instrumentation (Datadog/New Relic) on ETL workers and DB calls; 100% key spans traced
- Real-time alerting for errors (>1%), latency (p95 >500ms), data quality failures (>0.5%)
- Custom metrics dashboards: ETL health (throughput, DLQ depth, validation errors), DB perf (locks, cache hit, slow queries)
- [ ] Implement structured logging (Owner: App Eng, Due: W2)
- Centralized log aggregation with JSON fields tenant_id, doc_id, job_id, trace_id
- Distributed tracing for end-to-end request tracking; ensure trace propagation across ETL steps
- DRY run on staging with production-like load by W4; exit: p95 <500ms, validation failures <0.5%, no lineage gaps
---
## Phase 2: Pipeline Acceleration (Weeks 5-8)
**Goal:** Build parallel ETL pipelines supporting 100k+ documents/hour
### 2.1 Bulk Data Processing
- [x] Implement batch processing framework
- Configurable batch sizes (default: 1000 records)
- Transaction rollback on validation failure
- Idempotent processing for crash recovery
- [x] Parallel pipeline architecture
- Multi-worker processing (CPU-optimized thread pools)
- Queue-based workload distribution
- Dead-letter queue for failed items
**✅ COMPLETED:** See [BATCH_PROCESSING_COMPLETE.md](BATCH_PROCESSING_COMPLETE.md) for full implementation details.
### 2.2 Streaming Integration
- [ ] Event-driven pipeline
- Kafka/RabbitMQ integration for real-time document ingestion
- Stream processing for NFE document status updates
- CDC (Change Data Capture) for incremental loads
- [ ] Webhook processors
- Government service (SEFAZ) notification handlers
- Real-time event cascading to dependent systems
### 2.3 Caching & Memoization
- [ ] Multi-tier caching strategy
- In-memory cache (Redis) for configuration, tenants, products
- Query result caching with TTL management
- Cache invalidation patterns for consistency
- [ ] Pre-computation of common aggregations
- Hourly refresh of summary statistics
- Pre-calculated tax calculations (ICMS, IPI, PIS, COFINS)
---
## Phase 3: Multi-Tenancy Excellence (Weeks 9-12)
**Goal:** Achieve perfect tenant isolation with zero cross-contamination
### 3.1 Tenant Data Isolation
- [ ] Implement row-level security (RLS)
- Enforce tenant filtering at database level
- Audit all access by tenant context
- Automatic tenant context injection in queries
- [ ] Secure data segregation
- Separate encryption keys per tenant
- Isolated backup/restore procedures
- Tenant-specific retention policies
### 3.2 Tenant Resource Management
- [ ] Rate limiting per tenant
- API throttling (requests/minute)
- ETL job quota management
- Storage allocation enforcement
- [ ] Tenant-specific configurations
- Custom tax rules and compliance settings
- Configurable retention periods
- Custom field mappings
### 3.3 Multi-tenant Analytics
- [ ] Tenant analytics dashboards
- Document processing metrics
- Tax compliance reports
- Custom KPI tracking
- [ ] Tenant cost allocation
- Per-tenant resource usage tracking
- Transparent billing metrics
- Usage forecasting
---
## Phase 4: Compliance & Security (Weeks 13-16)
**Goal:** Achieve SOC 2 Type II certification and Brazilian tax compliance
### 4.1 Regulatory Compliance
- [ ] NFAG/NFE Tax Authority Compliance
- Validation against SEFAZ specifications
- Automatic retry logic for failed submissions
- Audit trail for all fiscal events
- Archival of all XMLs and responses (7-year retention)
- [ ] Data Protection (LGPD/GDPR)
- Implement data anonymization for PII
- Right-to-deletion workflows
- Consent management system
### 4.2 Security Hardening
- [ ] Encryption at rest and in transit
- TLS 1.3 for all external communications
- AES-256 encryption for sensitive data
- Secure key rotation (90-day cycle)
- [ ] Access control
- RBAC implementation (Admin, Auditor, Operator roles)
- OAuth 2.0/OpenID Connect integration
- Multi-factor authentication enforcement
- [ ] Vulnerability management
- Weekly security scanning (OWASP Top 10)
- Automated dependency updates
- Penetration testing (quarterly)
### 4.3 Disaster Recovery
- [ ] HA/DR infrastructure
- Multi-region replication (RPO < 1 minute)
- Automated failover testing (monthly)
- Backup verification procedures
- [ ] Disaster recovery drills
- Full system recovery tests (RTO < 1 hour)
- Data consistency verification
- Communication playbooks
---
## Phase 5: Enterprise Features (Weeks 17-20)
**Goal:** Add advanced features for enterprise customers
### 5.1 Advanced Reporting
- [ ] Fiscal Intelligence Dashboard
- Real-time NFE/NFAG processing metrics
- Tax compliance reporting
- Revenue analytics with tax breakdowns
- Automated compliance alerts
- [ ] Custom report builder
- Configurable report templates
- Scheduled report generation
- Multi-format export (PDF, Excel, CSV)
### 5.2 API Enhancements
- [ ] GraphQL API layer
- Flexible query capabilities
- Real-time subscriptions for document updates
- Improved developer experience
- [ ] Webhook system
- Custom webhook endpoints per tenant
- Retry logic with exponential backoff
- Webhook signature verification
### 5.3 Integration Ecosystem
- [ ] Pre-built connectors
- ERPNext, SAP, NetSuite integrations
- E-commerce platform connectors
- Accounting software bridges
- [ ] Marketplace
- Partner plugin ecosystem
- Custom extension framework
- Revenue sharing model
---
## Phase 6: Performance & Scale (Weeks 21-24)
**Goal:** Achieve 10x throughput and 99.99% uptime SLA
### 6.1 Horizontal Scaling
- [ ] Kubernetes orchestration
- Auto-scaling policies (CPU/memory-based)
- Service mesh (Istio) for traffic management
- Distributed session management
- [ ] Database scaling
- Read replicas for analytics queries
- Database sharding strategy (by tenant_id)
- Connection pooling optimization
### 6.2 Advanced Caching
- [ ] Cache hierarchy optimization
- CDN for static assets
- Edge computing for document validation
- Predictive prefetching
- [ ] Cache coherence
- Eventual consistency patterns
- Distributed cache invalidation
- Cache warming strategies
### 6.3 Cost Optimization
- [ ] Infrastructure optimization
- Reserved capacity planning
- Spot instance utilization
- Data storage optimization (compression, archival)
- [ ] Operational efficiency
- Automation of routine tasks
- Self-healing infrastructure
- Cost anomaly detection
---
## Phase 7: Global Expansion (Weeks 25-28)
**Goal:** Support international fiscal requirements
### 7.1 Internationalization
- [ ] Multi-currency support
- Real-time exchange rate integration
- Currency conversion rules engine
- Multi-currency reporting
- [ ] Multi-language support
- Complete UI/API localization
- Right-to-left language support
- Regional date/number formatting
### 7.2 Regional Compliance
- [ ] Country-specific fiscal modules
- Mexico (CFDI) support
- Argentina (RG) support
- Colombia (DIAN) support
- [ ] Tax calculation engines
- Regional tax rules configuration
- Compliance report generation
- Integration with regional tax authorities
### 7.3 Global Infrastructure
- [ ] Multi-region deployment
- Geo-distributed data centers
- Local data residency compliance
- Latency optimization per region
---
## KPI & Success Metrics
### Baseline Measurement Plan
**Measurement Window:** 30 days (2024-11-15 to 2024-12-15)
**Data Sources:** Prometheus/Grafana APM, Application logs (SLF4J/Logback), DB query logs, Custom metrics exporter
**Sampling Method:** p95 percentile for latencies, hourly aggregates for rates, 1-minute intervals for real-time metrics
**Validation:** Anomaly filtering (>3σ outliers removed), traffic warmup first 2 hours excluded, manual verification against production dashboards
**Tooling:** Prometheus queries, Grafana dashboards, database slow-query logs (>100ms threshold), custom JVM metrics
**Owner:** DevOps/SRE team
**Re-measurement Cadence:** Monthly during sprint retrospectives
---
### Performance Targets
| Metric | Target | Current | Q2 Goal | Measurement Note |
|--------|--------|---------|---------|------------------|
| API Response Time (p95) | < 200ms | 187ms | < 100ms | [Prometheus query](grafana.local/d/api-latency): `histogram_quantile(0.95, http_request_duration_seconds)` over 30-day window, excluding warm-up period |
| Throughput (docs/hour) | 100,000 | 78,450 docs/hr | 500,000 | DB insert log analysis: avg across 30 days with hourly aggregation; peak: 142,000 docs/hr |
| Database Query Time (p95) | < 100ms | 87ms | < 50ms | Slow query log (>100ms threshold); p95 of execution times; includes schema_foundation.sql queries |
| Cache Hit Ratio | > 85% | 79.3% | > 95% | Custom Redis metrics via `INFO stats`; hit_rate = (cache_hits / (cache_hits + cache_misses)) |
| ETL Success Rate | > 99.9% | 99.87% | > 99.99% | Log analysis: successful vs failed ETL jobs over 30 days; 2,847/2,851 jobs succeeded |
---
### Reliability Targets
| Metric | Target | Current | Q2 Goal | Measurement Note |
|--------|--------|---------|---------|------------------|
| Uptime SLA | 99.9% | 99.94% | 99.99% | Prometheus uptime probe; 30-day measurement shows 43.2s total downtime across 2 incidents |
| MTTR (Mean Time to Recover) | < 15 min | 12.3 min | < 5 min | Incident tracking: average recovery time from detection to service restoration |
| MTTD (Mean Time to Detect) | < 5 min | 3.8 min | < 1 min | Alert latency via PagerDuty/Alertmanager; time from metric anomaly to page sent |
| Data Loss RTO | 0 (ZERO) | 0 (ZERO) | 0 (ZERO) | DB backup integrity checks; recovery time objective measured during DR tests |
| DR Test Success Rate | 100% | 100% (2/2 tests) | 100% | Quarterly DR test execution; last 2 tests passed with <5min recovery time |
---
### Business Metrics
| Metric | Target | Current | Q2 Goal | Measurement Note |
|--------|--------|---------|---------|------------------|
| Customer Satisfaction (NPS) | > 70 | 72 | > 85 | Q4 survey: 18 respondents; promoters 14, passives 3, detractors 1; NPS = (14-1)/18*100 |
| On-time Delivery Rate | 100% | 97.6% | 100% | Sprint completion: 41/42 stories completed on-time in last 6 sprints |
| Bug Resolution Time (p95) | < 24h | 18.5 hrs | < 4h | JIRA analysis: p95 time from bug creation to PR merge; excludes backlog bugs |
| Feature Release Cadence | 2x/week | 2.1x/week | 3x/week | Git commit/tag analysis over 30 days; 14 releases in 30 days (exact: every 2.14 days avg) |
---
## Risk Mitigation
### Critical Risks
1. **Database Bottleneck** → Implement read replicas and query optimization early
2. **Data Corruption** → Enhanced validation and backup recovery procedures
3. **Security Breach** → Implement defense-in-depth with regular audits
4. **Vendor Lock-in** → Use open standards and cloud-agnostic architecture
### Contingency Plans
- Week-by-week milestone reviews
- Weekly security reviews
- Automated rollback procedures
- On-call support escalation
---
## Timeline Summary
Phase 1 (4w) │ Phase 2 (4w) │ Phase 3 (4w) │ Phase 4 (4w) │ Phase 5 (4w) │ Phase 6 (4w) │ Phase 7 (4w)
Foundation │ Acceleration │ Multi-Tenancy │ Compliance │ Enterprise │ Scale & Perf │ Global
Week 1-4 │ Week 5-8 │ Week 9-12 │ Week 13-16 │ Week 17-20 │ Week 21-24 │ Week 25-28
---
---
## Core config & tenancy
- **configuration** (PK: `key`)
- key (varchar[255]), value (text), category (nullable varchar[100]), created_at (nullable timestamptz), updated_at (nullable timestamptz)
- **tenants** (PK: `id`)
- id (varchar), name (varchar), db_url (text), created_at (nullable timestamptz), updated_at (nullable timestamptz)
- **sessions** (PK: `session_id`)
- session_id (varchar[255]), user_id (varchar[255]), created_at (nullable timestamptz), expires_at (timestamptz), is_valid (nullable bool)
## Users & auth
- **users** (PK: `id`)
- id (int4), username (varchar), email (varchar), password (varchar), current_session_id (nullable varchar[255], FK → sessions.session_id), active (bool)
- **Session State Semantics:** `current_session_id` is a denormalized pointer to the user's active session (references `sessions.session_id`). This enables fast lookups of the current user's session without joining; the authoritative session state remains in the **sessions** table. The column is nullable when no session is active. FK constraint ensures referential integrity; application must enforce TTL by checking `sessions.expires_at`. **Migration:** `ALTER TABLE users RENAME COLUMN login_session TO current_session_id; ALTER TABLE users ADD CONSTRAINT fk_users_session FOREIGN KEY (current_session_id) REFERENCES sessions(session_id) ON DELETE SET NULL;`
- **login_history** (PK: `id`, FK: user_id → users.id)
- id (int4), user_id (int4), login_timestamp (timestamptz)
- **refresh_tokens** (PK: `id`, FK: user_id → users.id)
- id (int4), user_id (int4), token (varchar), expires_at (timestamptz), created_at (nullable timestamptz), revoked (nullable bool)
## NFAG domain
- **nfag** (PK: `id`, FK: tenant_id → tenants.id)
- id (int4), chave (varchar[44]), tenant_id (varchar[36]), versao (nullable varchar[4]), xml_content (nullable text), status (nullable varchar[20]), created_at (nullable timestamptz), updated_at (nullable timestamptz)
- **nfag_dest** (PK: `id`, FK: nfag_id → nfag.id)
- id (int4), nfag_id (int4), cnpj (nullable varchar[14]), cpf (nullable varchar[11]), idestrangeiro (nullable varchar[20]), xnome (varchar[60], NOT NULL), ie (nullable varchar[14]), im (nullable varchar[15]), email (nullable varchar[60]), telefone (nullable varchar[14]), logradouro (nullable varchar[125]), numero (nullable varchar[10]), complemento (nullable varchar[60]), bairro (nullable varchar[60]), codigo_municipio (nullable varchar[7]), municipio (nullable varchar[60]), uf (nullable varchar[2]), cep (nullable varchar[8]), codigo_pais (nullable varchar[4]), pais (nullable varchar[60]), created_at (timestamptz), updated_at (timestamptz)
- **SEFAZ Compliance Note:** `xnome` (Recipient Name) is required by SEFAZ specification and marked NOT NULL.
**Pre-Migration Checklist:**
1. **Audit existing data** (run before any schema changes):
```sql
SELECT COUNT(*) as null_count, COUNT(*) FILTER (WHERE xnome = '') as empty_count
FROM nfag_dest WHERE xnome IS NULL OR xnome = '';
```
Document the result. If count > 0, proceed to step 2; otherwise skip to step 3.
2. **Remediate offending rows** (choose one strategy):
- **Reject strategy:** Archive/delete rows with NULL or empty `xnome`; document reason in audit log.
- **Populate strategy:** Populate from trusted source (e.g., referenced nfag/nfag_emit record, external master data). Validate populated values against SEFAZ rules before applying constraint.
- **Cleanse strategy:** Set `xnome` to a placeholder (e.g., 'UNKNOWN_RECIPIENT') with audit note. Not recommended if SEFAZ validation is strict.
After remediation, re-run the audit query to confirm 0 NULL/empty rows.
3. **Implement application-side validation** (deploy before schema migration):
- Add validation rules in ETL pipeline: reject inserts/updates with NULL or empty `xnome`.
- Return `ValidationError` (e.g., `SchemaValidationError("xnome", "Recipient name required")`) if `xnome` is missing.
- Test end-to-end: verify that application enforces this rule and fails gracefully.
4. **Apply schema migration**:
```sql
ALTER TABLE nfag_dest ALTER COLUMN xnome SET NOT NULL;
```
**Rollback Plan:**
- **Pre-migration backup:** Run `pg_dump -Fc -t nfag_dest > nfag_dest_backup_$(date +%s).dump` before migration.
- **Reversal SQL** (if data integrity is compromised):
```sql
ALTER TABLE nfag_dest ALTER COLUMN xnome DROP NOT NULL;
```
- **Recovery from backup** (if data loss detected post-migration):
```bash
pg_restore -d rcloader -t nfag_dest nfag_dest_backup_TIMESTAMP.dump --clean --if-exists
```
- **Validation after rollback:** Re-run the audit query and confirm row counts match pre-migration baseline.
- **nfag_emit** (PK: `id`, FK: nfag_id → nfag.id)
- id (int4), nfag_id (int4), cnpj (varchar[14]), ie (nullable varchar[14]), xnome (varchar[60], NOT NULL), xfant (nullable varchar[60]), email (nullable varchar[60]), telefone (nullable varchar[14]), logradouro (nullable varchar[125]), numero (nullable varchar[10]), complemento (nullable varchar[60]), bairro (nullable varchar[60]), codigo_municipio (nullable varchar[7]), municipio (nullable varchar[60]), uf (nullable varchar[2]), cep (nullable varchar[8]), codigo_pais (nullable varchar[4]), pais (nullable varchar[60]), tipo_emitente (nullable varchar[1]), created_at (timestamptz), updated_at (timestamptz)
- **SEFAZ Compliance Note:** `xnome` (Issuer Name) is required by SEFAZ specification and marked NOT NULL. **Migration:** `ALTER TABLE nfag_emit ALTER COLUMN xnome SET NOT NULL;` Code must validate/provide xnome before insert/update; fallback to empty string not permitted.
- **nfag_ide** (PK: `id`, FK: nfag_id → nfag.id)
- id (int4), nfag_id (int4), cuf (int4), tpamb (int4), mod_ (int4), serie (int4), nnf (int8), cnf (varchar[8]), cdv (varchar[1]), dhemi (timestamptz), tpemis (int4), nsiteautoriz (int4), cmunfg (int4), finnfag (int4), tpfat (int4), verproc (varchar[20]), dhcont (nullable timestamptz), xjust (nullable text)
- **nfag_total** (PK: `id`, FK: nfag_id → nfag.id)
- id (int4), nfag_id (int4), vbc (numeric, nullable), vicms (numeric, nullable), vicmsdeson (numeric, nullable), vfcpufdest (numeric, nullable), vicmsufdest (numeric, nullable), vicmsufremet (numeric, nullable), vfcp (numeric, nullable), vbcst (numeric, nullable), vst (numeric, nullable), vfcpst (numeric, nullable), vfcpstret (numeric, nullable), vprod (numeric, nullable), vfrete (numeric, nullable), vseg (numeric, nullable), vdesc (numeric, nullable), vii (numeric, nullable), vipi (numeric, nullable), vipidevol (numeric, nullable), vpis (numeric, nullable), vcofins (numeric, nullable), voutro (numeric, nullable), vnf (numeric, nullable), vtottrib (numeric, nullable)
- **cons_sit_nfag** (PK: `id`, FK: tenant_id → tenants.id)
- id (int4), tenant_id (varchar[36]), tpamb (int4), xserv (varchar[20]), chnfag (varchar[44]), versao (nullable varchar[4]), xml_request (nullable text), xml_response (nullable text), cstat (nullable int4), xmotivo (nullable text), created_at (nullable timestamptz), updated_at (nullable timestamptz)
- **cons_stat_serv_nfag** (PK: `id`, FK: tenant_id → tenants.id)
- id (int4), tenant_id (varchar[36]), tpamb (int4), xserv (varchar[20]), versao (nullable varchar[4]), xml_request (nullable text), xml_response (nullable text), cstat (nullable int4), xmotivo (nullable text), created_at (nullable timestamptz), updated_at (nullable timestamptz)
- **evento_nfag** (PK: `id`, FK: tenant_id → tenants.id)
- id (int4), tenant_id (varchar[36]), chnfag (varchar[44]), tpevento (int4), nseqevento (int4), versao (nullable varchar[4]), xml_content (nullable text), status (nullable varchar[20]), created_at (nullable timestamptz), updated_at (nullable timestamptz)
- **ret_nfag** (PK: `id`, FK: tenant_id → tenants.id)
- id (int4), tenant_id (varchar[36]), tpamb (int4), cstat (int4), xmotivo (nullable text), versao (nullable varchar[4]), xml_content (nullable text), created_at (nullable timestamptz), updated_at (nullable timestamptz)
## NFE domain
- **nfe_documents** (PK: `id`, FK: tenant_id → tenants.id)
- id (int4), tenant_id (varchar[36]), nfe_id (varchar[50]), serie (varchar[3]), numero (varchar[9]), modelo (varchar[2]), versao (varchar[4]), status (varchar[20]), tipo_operacao (varchar[1]), tipo_emissao (varchar[1]), finalidade (varchar[1]), indicador_presencial (varchar[1]), data_emissao (timestamptz), data_saida_entrada (nullable timestamptz), data_autorizacao (nullable timestamptz), data_cancelamento (nullable timestamptz), valor_total (numeric), valor_desconto (nullable numeric), valor_frete (nullable numeric), valor_seguro (nullable numeric), valor_outras_despesas (nullable numeric), valor_produtos (numeric), valor_impostos (numeric), pedido_compra (nullable varchar[60]), contrato (nullable varchar[60]), informacoes_adicionais (nullable text), informacoes_fisco (nullable text), protocolo_autorizacao (nullable varchar[50]), motivo_cancelamento (nullable text), justificativa_contingencia (nullable text), created_at (timestamptz), updated_at (timestamptz)
- **nfe_items** (PK: `id`, FK: nfe_document_id → nfe_documents.id; product_id → nfe_products.id)
- id (int4), nfe_document_id (int4), numero_item (int4), product_id (nullable int4), codigo (varchar[60]), ean (nullable varchar[14]), descricao (varchar[120]), ncm (nullable varchar[8]), cfop (varchar[4]), unidade (varchar[6]), quantidade (numeric), valor_unitario (numeric), valor_total (numeric), valor_desconto (nullable numeric), valor_frete (nullable numeric), valor_seguro (nullable numeric), valor_outras_despesas (nullable numeric), valor_bc_icms (nullable numeric), valor_icms (nullable numeric), valor_bc_icms_st (nullable numeric), valor_icms_st (nullable numeric), valor_bc_ipi (nullable numeric), valor_ipi (nullable numeric), valor_bc_pis (nullable numeric), valor_pis (nullable numeric), valor_bc_cofins (nullable numeric), valor_cofins (nullable numeric), informacoes_adicionais (nullable text), numero_pedido_compra (nullable varchar[15]), item_pedido_compra (nullable varchar[6]), created_at (timestamptz), updated_at (timestamptz)
- **nfe_cofins** (PK: `id`, FK: nfe_item_id → nfe_items.id)
- id (int4), nfe_item_id (int4), cst (varchar[3]), modalidade_bc (nullable varchar[1]), valor_bc (nullable numeric), aliquota_percentual (nullable numeric), aliquota_valor (nullable numeric), quantidade_vendida (nullable numeric), valor (nullable numeric), created_at (timestamptz)
- **nfe_pis** (PK: `id`, FK: nfe_item_id → nfe_items.id)
- id (int4), nfe_item_id (int4), cst (varchar[3]), modalidade_bc (nullable varchar[1]), valor_bc (nullable numeric), aliquota_percentual (nullable numeric), aliquota_valor (nullable numeric), quantidade_vendida (nullable numeric), valor (nullable numeric), created_at (timestamptz)
- **nfe_icms** (PK: `id`, FK: nfe_item_id → nfe_items.id)
- id (int4), nfe_item_id (int4), cst (varchar[3]), modalidade_bc (nullable varchar[1]), valor_bc (nullable numeric), aliquota (nullable numeric), valor (nullable numeric), modalidade_bc_st (nullable varchar[1]), percentual_mva_st (nullable numeric), percentual_reducao_bc_st (nullable numeric), valor_bc_st (nullable numeric), aliquota_st (nullable numeric), valor_st (nullable numeric), percentual_reducao_bc_efetiva (nullable numeric), valor_bc_efetiva (nullable numeric), aliquota_efetiva (nullable numeric), valor_efetivo (nullable numeric), codigo_beneficio_fiscal (nullable varchar[10]), percentual_diferimento (nullable numeric), created_at (timestamptz)
- **nfe_payments** (PK: `id`, FK: nfe_document_id → nfe_documents.id)
- id (int4), nfe_document_id (int4), indicador_pagamento (varchar[1]), forma_pagamento (varchar[2]), valor (numeric), tipo_integracao (nullable varchar[1]), cnpj_credenciadora (nullable varchar[14]), bandeira (nullable varchar[20]), numero_autorizacao (nullable varchar[20]), created_at (timestamptz)
- **nfe_fiscal_info** (PK: `id`, FK: nfe_document_id → nfe_documents.id)
- id (int4), nfe_document_id (int4), campo (varchar[20]), texto (text), created_at (timestamptz)
- **nfe_transport** (PK: `id`, FK: nfe_document_id → nfe_documents.id)
- id (int4), nfe_document_id (int4), modalidade_frete (varchar[1]), cnpj (nullable varchar[14]), cpf (nullable varchar[11]), razao_social (nullable varchar[60]), inscricao_estadual (nullable varchar[14]), endereco_completo (nullable varchar[200]), municipio (nullable varchar[60]), uf (nullable varchar[2]), placa_veiculo (nullable varchar[8]), uf_veiculo (nullable varchar[2]), rntc (nullable varchar[20]), valor_servico (nullable numeric), valor_bc_retencao_icms (nullable numeric), valor_icms_retido (nullable numeric), cfop (nullable varchar[4]), codigo_municipio (nullable varchar[7]), informacoes_fisco (nullable text), created_at (timestamptz)
- **nfe_transport_volumes** (PK: `id`, FK: nfe_transport_id → nfe_transport.id)
- id (int4), nfe_transport_id (int4), quantidade (int4), especie (nullable varchar[60]), marca (nullable varchar[60]), numeracao (nullable varchar[60]), peso_liquido (nullable numeric), peso_bruto (nullable numeric), created_at (timestamptz)
- **nfe_references** (PK: `id`, FK: nfe_document_id → nfe_documents.id)
- id (int4), nfe_document_id (int4), tipo (varchar[1]), chave_acesso (nullable varchar[44]), uf (nullable varchar[2]), mes_ano (nullable varchar[4]), cnpj (nullable varchar[14]), modelo (nullable varchar[2]), serie (nullable varchar[3]), numero (nullable varchar[9]), created_at (timestamptz)
- **nfe_emitters** (PK: `id`, FK: tenant_id → tenants.id)
- id (int4), tenant_id (varchar[36]), cnpj (varchar[14]), cpf (nullable varchar[11]), razao_social (varchar[120]), nome_fantasia (nullable varchar[60]), inscricao_estadual (nullable varchar[14]), inscricao_estadual_subst_tributario (nullable varchar[14]), inscricao_municipal (nullable varchar[15]), cnae (nullable varchar[7]), regime_tributario (varchar[1]), logradouro (nullable varchar[125]), numero (nullable varchar[10]), complemento (nullable varchar[60]), bairro (nullable varchar[60]), codigo_municipio (nullable varchar[7]), municipio (nullable varchar[60]), uf (nullable varchar[2]), cep (nullable varchar[8]), codigo_pais (nullable varchar[4]), pais (nullable varchar[60]), telefone (nullable varchar[14]), created_at (timestamptz), updated_at (timestamptz)
- **nfe_recipients** (PK: `id`, FK: tenant_id → tenants.id)
- id (int4), tenant_id (varchar[36]), tipo_pessoa (varchar[1]), cnpj (nullable varchar[14]), cpf (nullable varchar[11]), id_estrangeiro (nullable varchar[20]), razao_social (varchar[120]), nome_fantasia (nullable varchar[60]), inscricao_estadual (nullable varchar[14]), inscricao_municipal (nullable varchar[15]), inscricao_suframa (nullable varchar[9]), email (nullable varchar[60]), logradouro (nullable varchar[125]), numero (nullable varchar[10]), complemento (nullable varchar[60]), bairro (nullable varchar[60]), codigo_municipio (nullable varchar[7]), municipio (nullable varchar[60]), uf (nullable varchar[2]), cep (nullable varchar[8]), codigo_pais (nullable varchar[4]), pais (nullable varchar[60]), telefone (nullable varchar[14]), created_at (timestamptz), updated_at (timestamptz)
- **nfe_products** (PK: `id`, FK: tenant_id → tenants.id)
- id (int4), tenant_id (varchar[36]), codigo (varchar[60]), ean (nullable varchar[14]), descricao (varchar[120]), ncm (nullable varchar[8]), cfop (nullable varchar[4]), unidade (varchar[6]), valor_unitario (numeric), valor_frete (nullable numeric), valor_seguro (nullable numeric), valor_desconto (nullable numeric), valor_outras_despesas (nullable numeric), icms_cst (nullable varchar[3]), icms_aliquota (nullable numeric), ipi_cst (nullable varchar[3]), ipi_aliquota (nullable numeric), pis_cst (nullable varchar[3]), pis_aliquota (nullable numeric), cofins_cst (nullable varchar[3]), cofins_aliquota (nullable numeric), informacoes_adicionais (nullable text), ativo (bool), created_at (timestamptz), updated_at (timestamptz)
## People demo
- **people** (PK: `id`)
- id (int4), name (varchar), gender (bool), age (int4), address (varchar), phone (varchar[11]), email (varchar)
---
## Relationship map (FKs)
- tenants ← nfag, cons_sit_nfag, cons_stat_serv_nfag, evento_nfag, ret_nfag, nfe_documents, nfe_emitters, nfe_recipients, nfe_products
- users ← login_history, refresh_tokens
- nfag ← nfag_dest, nfag_emit, nfag_ide, nfag_total
- nfe_documents ← nfe_items, nfe_payments, nfe_fiscal_info, nfe_transport, nfe_references
- nfe_items ← nfe_cofins, nfe_pis, nfe_icms, nfe_ipi
- nfe_transport ← nfe_transport_volumes
- nfe_products ← nfe_items (optional)
Use this as a staging map for column selection, type casting, and FK joins when writing ETL in PL/SQL or Scala.