**The past few weeks have been wild.**
Our team recently adopted **AI-assisted programming** (not vibe-coding). We wanted speed, consistency, and fewer repetitive tasks.
What we got in week one was... chaos.
> TL;DR
> We had **inconsistent AI output** across teammates, treated it as a **systems problem**, applied Agentic SDLC principles, and built a `.codex` structure that made Codex outputs **far more consistent**.
---
## Quick Jump
- [What worked (and what didn’t)](#what-worked-and-what-didnt)
- [The shift: Agentic Software Development Lifecycle](#the-shift-agentic-software-development-lifecycle)
- [What’s inside `.codex` (and why it matters)](#whats-inside-codex-and-why-it-matters)
- [How we used it in this repo](#how-we-used-it-in-this-repo-github)
- [What’s next](#whats-next)

---

Even with detailed technical tickets, our AI agents kept producing outputs that didn’t line up:
- different **naming conventions**
- different **folder structures**
- different **approaches to the same problem**
- different **error-handling and testing styles**
Every merge felt like stitching code from three different universes.

> **Week one energy:** *us trying to keep standards while random AI output keeps walking by.*
---
## What worked (and what didn’t)
We tested different setups.
A strong `CLAUDE.md` helped a lot early on. Claude Opus was excellent at reasoning and breaking work down step by step.
The issue for us was practical: **limits and timeouts**.
Since our company sponsors Codex, we moved to Codex as our main tool. First impression? **A bit lackluster** compared to Claude out of the box.
That pushed us to a better question:
**Maybe this isn’t just a model problem. Maybe it’s a <u>systems problem</u>.**

---
## The shift: Agentic Software Development Lifecycle
Quick diff:
- Traditional SDLC: **mostly human implementation through linear phases**.
- A-SDLC: **human + AI-agent collaboration in tight feedback loops**.
In A-SDLC, developers don’t just write code. We **orchestrate**:
- guardrails
- context
- fast reviews
- memory updates
And yes, we’re **actively applying** these principles in real day-to-day work, not just talking about them:
- better prompts
- tighter feedback loops
- stronger project memory
- clear constraints, patterns, and checklists
Once we treated this as a systems problem, **output quality improved fast**.
That’s why I built this boilerplate: to showcase the `.codex` setup that worked best for us.
Repo: [jimzandueta/nestjs-http-server-boilerplate-codex-ai-assisted](https://github.com/jimzandueta/nestjs-http-server-boilerplate-codex-ai-assisted)

---
## What’s inside `.codex` (and why it matters)
This isn’t just a random folder. It’s the **operating system** for consistent AI-assisted engineering.
```text
.codex/
  START_HERE.md
  RULES.md
  MANIFEST.yaml
  instructions/
  patterns/
  anti-patterns/
  checklists/
  skills/
  prompts/
  templates/
  overrides/
  memory/
```

### 1) `START_HERE.md` + `RULES.md`
These are your baseline guardrails.
```md
## Output Rules
1. Reuse existing patterns before inventing new ones.
2. Keep diffs minimal.
3. Never commit secrets.
```
This alone prevents a lot of “same task, five coding styles” situations.
### 2) `MANIFEST.yaml`
This is context routing. It tells the agent what to load for each task type.
```yaml
task_routes:
  new-feature:
    read:
      - .codex/instructions/global.md
      - .codex/patterns/repo-structure.md
      - .codex/patterns/error-handling.md
    skills:
      - .codex/skills/new-feature/SKILL.md
```
So agents don’t start cold. They start with the right playbook.
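
To make the routing idea concrete, here is a minimal TypeScript sketch of how a loader could turn a task type into the list of files to read. The `Manifest` shape mirrors the YAML snippet; `resolveContext` and its fail-loudly behaviour are hypothetical names and choices for illustration, not something Codex ships.

```typescript
// Hypothetical types mirroring the MANIFEST.yaml snippet; not a Codex API.
interface TaskRoute {
  read: string[];
  skills: string[];
}

interface Manifest {
  task_routes: Record<string, TaskRoute>;
}

// In a real setup this object would be parsed from .codex/MANIFEST.yaml.
const manifest: Manifest = {
  task_routes: {
    "new-feature": {
      read: [
        ".codex/instructions/global.md",
        ".codex/patterns/repo-structure.md",
        ".codex/patterns/error-handling.md",
      ],
      skills: [".codex/skills/new-feature/SKILL.md"],
    },
  },
};

// Returns every file the agent should load for a task type, in order.
// Unknown task types throw instead of letting the agent start cold.
function resolveContext(m: Manifest, taskType: string): string[] {
  const route = m.task_routes[taskType];
  if (route === undefined) {
    throw new Error(`No route defined for task type "${taskType}"`);
  }
  return [...route.read, ...route.skills];
}

console.log(resolveContext(manifest, "new-feature"));
```

Failing on an unknown task type is a design choice in this sketch: a missing route is treated as a manifest bug rather than a reason to run without context.
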
### 3) `instructions/`, `patterns/`, `anti-patterns/`
- `instructions/`: how to work
- `patterns/`: preferred way to build
- `anti-patterns/`: what to avoid
Think of this as turning tribal team knowledge into repeatable, machine-readable engineering practice.
### 4) `checklists/`, `skills/`, `prompts/`, `templates/`
This is the day-to-day execution layer:
- `checklists/`: quality gates
- `skills/`: repeatable workflows
- `prompts/`: reusable prompt scaffolds
- `templates/`: starter artifacts
Example checklist snippet:
```md
## Tests
- [ ] New logic has happy path + failure test
- [ ] Coverage stays at the 100% threshold
- [ ] No flaky tests introduced
```
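
A checklist like this can also act as a mechanical gate. The sketch below is one assumed way to do that: a hypothetical `uncheckedItems` helper that scans Markdown for `- [ ]` items so a script could block a merge while any remain. It is an illustration, not part of the repo.

```typescript
// Hypothetical helper: list the checklist items that are still unchecked.
function uncheckedItems(markdown: string): string[] {
  return markdown
    .split("\n")
    .map((line) => line.match(/^\s*- \[ \] (.+)$/))
    .filter((m): m is RegExpMatchArray => m !== null)
    .map((m) => m[1].trim());
}

// Sample input: the first item has been ticked, the other two have not.
const checklist = [
  "## Tests",
  "- [x] New logic has happy path + failure test",
  "- [ ] Coverage stays at 100% threshold",
  "- [ ] No flaky tests introduced",
].join("\n");

const remaining = uncheckedItems(checklist);
console.log(`${remaining.length} item(s) still open`, remaining);
```
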
### 5) `overrides/`
This lets you keep the generic Codex assets reusable while recording the project-specific facts that replace them in this repo.
Example:
- generic pattern: “recommended structure”
- project override: “this NestJS repo uses `src/common`, `src/clients`, `src/modules`, etc.”
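
One way to picture the override mechanism is a simple merge where project entries win. This is a sketch under assumptions: the `Patterns` map, the `effectivePatterns` function, and the placeholder guidance strings are all hypothetical, and the real setup works on Markdown files rather than in-memory objects.

```typescript
// Hypothetical model: each pattern is a named piece of guidance.
type Patterns = Record<string, string>;

// Placeholder text standing in for the generic pattern files.
const genericPatterns: Patterns = {
  "repo-structure": "recommended structure",
  "error-handling": "generic error-handling guidance",
};

// Project-specific entries, as they might be declared under overrides/.
const projectOverrides: Patterns = {
  "repo-structure": "this NestJS repo uses src/common, src/clients, src/modules",
};

// Overrides replace generic entries with the same key; other keys fall through.
function effectivePatterns(generic: Patterns, overrides: Patterns): Patterns {
  return { ...generic, ...overrides };
}

console.log(effectivePatterns(genericPatterns, projectOverrides));
```
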
### 6) `memory/` (the secret sauce)
**This is where consistency compounds:**
- `memory/project-facts.md` → stable project truths
- `memory/decisions.md` → ADR-style decisions/tradeoffs
- `memory/learned-patterns.md` → recurring conventions discovered during work
As the developer and the AI agent make new decisions, the memory files are updated so future tasks inherit the **same context and tradeoffs**.
A realistic flow:
- Developer: “We need `requestId` in logs for traceability.”
- Agent: “Two options: `AsyncLocalStorage` or explicit propagation.”
- Team decision: explicit propagation first (simpler + easier to test).
- Memory updates: ADR + project convention + learned pattern.
Sample ADR:
```md
### ADR-002: Standardize request correlation IDs in HTTP logs
**Date**: 2026-04-02
**Status**: Accepted
**Context**: Debugging incidents was slow because logs across layers were hard to correlate.
**Decision**: Add `requestId` at the HTTP boundary and propagate it through services/clients.
**Consequences**: Better traceability, with slight method-signature overhead.
```
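
Keeping entries in a fixed shape makes them easier for an agent to append consistently. As a rough sketch, a hypothetical `formatAdr` helper could render an entry in the same layout as ADR-002; the `Adr` interface and the zero-padded numbering are assumptions for illustration.

```typescript
// Hypothetical shape for one entry in memory/decisions.md.
interface Adr {
  id: number;
  title: string;
  date: string; // ISO date, e.g. "2026-04-02"
  status: string;
  context: string;
  decision: string;
  consequences: string;
}

// Renders an ADR in the same Markdown layout as the sample entry.
function formatAdr(adr: Adr): string {
  const id = String(adr.id).padStart(3, "0");
  return [
    `### ADR-${id}: ${adr.title}`,
    `**Date**: ${adr.date}`,
    `**Status**: ${adr.status}`,
    `**Context**: ${adr.context}`,
    `**Decision**: ${adr.decision}`,
    `**Consequences**: ${adr.consequences}`,
  ].join("\n");
}

const adrEntry = formatAdr({
  id: 2,
  title: "Standardize request correlation IDs in HTTP logs",
  date: "2026-04-02",
  status: "Accepted",
  context: "Debugging incidents was slow because logs across layers were hard to correlate.",
  decision: "Add `requestId` at the HTTP boundary and propagate it through services/clients.",
  consequences: "Better traceability, with slight method-signature overhead.",
});
console.log(adrEntry);
```
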
Sample project facts update:
```md
## Conventions
- Logging: include `requestId` in structured logs for HTTP flows.
- Request context: generate/forward `x-request-id` at ingress and propagate downstream.
```
Sample learned pattern:
```md
### LP-001: Propagate requestId from boundary to integrations
**Observed**: Missing correlation fields made multi-step failures harder to debug.
**Rule**: Controllers create context; services/clients forward `requestId`; logs include it at each layer.
```
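
The rule in LP-001 can be sketched in a few lines of TypeScript. To keep the example runnable on its own, plain functions stand in for NestJS controllers, services, and clients, and the outgoing HTTP call is simulated; `logEvent`, `fetchPost`, `getPost`, and `handleRequest` are hypothetical names, not code from the repo.

```typescript
import { randomUUID } from "node:crypto";

type Headers = Record<string, string | undefined>;

// Hypothetical structured logger: one JSON line per event, always with requestId.
function logEvent(layer: string, requestId: string, message: string): string {
  const line = JSON.stringify({ layer, requestId, message });
  console.log(line);
  return line;
}

// Client layer: forwards requestId on the outgoing call (simulated here).
function fetchPost(id: number, requestId: string): { id: number; sentHeaders: Headers } {
  logEvent("client", requestId, `GET /posts/${id}`);
  return { id, sentHeaders: { "x-request-id": requestId } };
}

// Service layer: receives requestId explicitly and passes it on.
function getPost(id: number, requestId: string) {
  logEvent("service", requestId, `loading post ${id}`);
  return fetchPost(id, requestId);
}

// Boundary: reuse the incoming x-request-id or generate a new one.
function handleRequest(headers: Headers, postId: number) {
  const requestId = headers["x-request-id"] ?? randomUUID();
  logEvent("controller", requestId, "request received");
  return { requestId, result: getPost(postId, requestId) };
}

handleRequest({ "x-request-id": "demo-123" }, 1);
```
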
This memory layer is the difference between **“new agent, same mistakes”** and **“new agent, same team brain.”**

---
## How we used it in this repo ([GitHub](https://github.com/jimzandueta/nestjs-http-server-boilerplate-codex-ai-assisted))
Using this `.codex` setup, we built a NestJS sample HTTP server with consistent architecture and quality gates:
- clear boundaries (`common`, `clients`, `integrations`, `modules`, `http`, `errors`)
- validated runtime config (`HOST`, `PORT`, `NODE_ENV`, `LOG_LEVEL`)
- structured logging
- reusable HTTP client with timeout/retry
- typed external API errors + global exception filter
- sample feature module (`posts`) using JSONPlaceholder
- strict tests with **100% coverage thresholds**
- open-source docs (`LICENSE`, `CONTRIBUTING`, `CODE_OF_CONDUCT`, `SECURITY`)
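
As an illustration of the validated runtime config item, here is a minimal fail-fast sketch. It does not reproduce the repo's implementation: `loadConfig`, the helper names, and the allowed values for `NODE_ENV` and `LOG_LEVEL` are assumptions, and only the four variable names come from the list above.

```typescript
// Assumed allowed values; the real repo may accept a different set.
const NODE_ENVS = ["development", "test", "production"] as const;
const LOG_LEVELS = ["debug", "info", "warn", "error"] as const;

interface AppConfig {
  host: string;
  port: number;
  nodeEnv: (typeof NODE_ENVS)[number];
  logLevel: (typeof LOG_LEVELS)[number];
}

type Env = Record<string, string | undefined>;

// Throws unless the value is one of the allowed strings.
function requireOneOf<T extends string>(name: string, value: string | undefined, allowed: readonly T[]): T {
  if (value !== undefined && (allowed as readonly string[]).includes(value)) {
    return value as T;
  }
  throw new Error(`${name} must be one of: ${allowed.join(", ")}`);
}

// Throws unless the value is an integer TCP port between 1 and 65535.
function requirePort(value: string | undefined): number {
  const port = Number(value);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }
  return port;
}

// Validates the environment once at startup and returns a typed config.
function loadConfig(env: Env): AppConfig {
  const host = env.HOST;
  if (host === undefined || host === "") {
    throw new Error("HOST is required");
  }
  return {
    host,
    port: requirePort(env.PORT),
    nodeEnv: requireOneOf("NODE_ENV", env.NODE_ENV, NODE_ENVS),
    logLevel: requireOneOf("LOG_LEVEL", env.LOG_LEVEL, LOG_LEVELS),
  };
}

console.log(loadConfig({ HOST: "0.0.0.0", PORT: "3000", NODE_ENV: "development", LOG_LEVEL: "info" }));
```
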
So this repo isn’t just “another Nest starter.”
It’s a **working example of structured AI-assisted delivery**.

---
## One important note
Codex setups are usually **stack-specific**.
This `.codex` is tuned for a NestJS HTTP app. I maintain a different Codex baseline for Terraform/infrastructure because workflows, anti-patterns, and quality gates are different.
Same core idea, different playbook.

---
## What’s next
I’ll keep evolving this repo with:
- richer feature module examples
- better integration patterns
- stricter review automations
- stack-specific Codex variants
- deeper AI-agent orchestration experiments using open-source tools like LangChain, Langfuse, and local models
If your team is in that “week one AI chaos” phase, start with structure first.
Model quality matters, but **system quality matters more**.

