Architecture
Nano Banana is an Nx monorepo with six packages and two Cloud Run services.
Repo layout
packages/
├── server/ # Genkit backend (web API + Cloud Tasks stage handlers)
├── mcp-server/ # Remote MCP server (OAuth 2.1 + Streamable HTTP)
├── pipeline-core/ # Shared lib: Firestore client, Cloud Tasks enqueue, GCS,
│ # PipelineDocument state machine, subscribeToPipeline,
│ # startGeneration / startArchitect
├── frontend/ # React + Vite + MUI web client
├── color-edit/ # Python HSL color replacement tool (subprocess stage)
└── docs/ # Docusaurus documentation site (this site)
skills/
└── nano-banana/ # Claude skill that wraps the MCP for AI clients
Pipeline stages
- Architect (optional, kicked off by propose_concepts or by the web UI's ideation phase): produces 3–5 stylistically coherent concept proposals from a rough user idea. Each proposal carries a refined prompt + keywords.
- Prompt engineer (optional, kicked off when no concept is provided): rewrites a raw prompt against the corporate style guide so the diffusion model receives brand-aware language. Web UI normally skips this (architect already refined). MCP one-shot generate_image runs it. See Brand styling.
- Generate image: calls Gemini / Imagen with the (refined) prompt + reference icons + optional sketch. Outputs intermediate image to GCS.
- Enhance image (optional): post-processes for sharpness / aesthetic uplift.
- Color adjust: Python subprocess (packages/color-edit/color_tool.py) replaces brand colors via HSL with luminance preservation.
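
To make the color adjust idea concrete, here is a minimal sketch of an HSL replacement that keeps the original lightness. It is not the code in color_tool.py (which is Python); the function names are hypothetical, and it assumes "luminance preservation" means taking hue and saturation from the target brand color while keeping the source pixel's HSL lightness.

```typescript
type RGB = { r: number; g: number; b: number }; // channels in 0..255
type HSL = { h: number; s: number; l: number }; // h in degrees [0, 360), s and l in [0, 1]

// Standard RGB -> HSL conversion.
function rgbToHsl({ r, g, b }: RGB): HSL {
  const rn = r / 255, gn = g / 255, bn = b / 255;
  const max = Math.max(rn, gn, bn);
  const min = Math.min(rn, gn, bn);
  const l = (max + min) / 2;
  const d = max - min;
  if (d === 0) return { h: 0, s: 0, l }; // gray: hue is undefined, use 0
  const s = d / (1 - Math.abs(2 * l - 1));
  let h: number;
  if (max === rn) h = ((gn - bn) / d) % 6;
  else if (max === gn) h = (bn - rn) / d + 2;
  else h = (rn - gn) / d + 4;
  h *= 60;
  if (h < 0) h += 360;
  return { h, s, l };
}

// Standard HSL -> RGB conversion, clamped to valid channel values.
function hslToRgb({ h, s, l }: HSL): RGB {
  const c = (1 - Math.abs(2 * l - 1)) * s; // chroma
  const hp = h / 60;
  const x = c * (1 - Math.abs((hp % 2) - 1));
  let rgb: [number, number, number];
  if (hp < 1) rgb = [c, x, 0];
  else if (hp < 2) rgb = [x, c, 0];
  else if (hp < 3) rgb = [0, c, x];
  else if (hp < 4) rgb = [0, x, c];
  else if (hp < 5) rgb = [x, 0, c];
  else rgb = [c, 0, x];
  const m = l - c / 2;
  const to255 = (v: number) => Math.max(0, Math.min(255, Math.round((v + m) * 255)));
  return { r: to255(rgb[0]), g: to255(rgb[1]), b: to255(rgb[2]) };
}

// Hypothetical helper: take hue and saturation from the target color,
// keep the lightness of the source pixel so shading survives the swap.
function replaceColor(pixel: RGB, target: RGB): RGB {
  const src = rgbToHsl(pixel);
  const dst = rgbToHsl(target);
  return hslToRgb({ h: dst.h, s: dst.s, l: src.l });
}
```

Under these assumptions, a dark red pixel { r: 128, g: 0, b: 0 } recolored toward pure blue comes out as { r: 0, g: 0, b: 128 }: the hue changes, the darkness does not.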
Each stage is an independent HTTP handler on nano-server. Stages are connected
via three Cloud Tasks queues (text-generation, image-generation, processing)
with different concurrency and rate-limit profiles.
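
As an illustration of how optional stages and queues could fit together, the sketch below derives a stage order from the request and looks up a queue per stage. The stage identifiers, the `PlanInput` shape, and the stage-to-queue mapping are assumptions made for this example; only the three queue names come from the description above.

```typescript
type StageName =
  | 'architect'
  | 'prompt-engineer'
  | 'generate-image'
  | 'enhance-image'
  | 'color-adjust';

type QueueName = 'text-generation' | 'image-generation' | 'processing';

// Assumed mapping: text stages, image stages, and post-processing
// each get their own queue. The real assignment may differ.
const STAGE_QUEUE: Record<StageName, QueueName> = {
  'architect': 'text-generation',
  'prompt-engineer': 'text-generation',
  'generate-image': 'image-generation',
  'enhance-image': 'image-generation',
  'color-adjust': 'processing',
};

// Hypothetical request flags used to decide which optional stages run.
interface PlanInput {
  hasConcept: boolean; // true when an architect concept was already chosen
  enhance: boolean;    // true when the caller asked for the enhance stage
}

// Build the ordered list of stages for one generation run.
function planStages(input: PlanInput): StageName[] {
  const stages: StageName[] = [];
  if (!input.hasConcept) stages.push('prompt-engineer'); // only without a concept
  stages.push('generate-image');
  if (input.enhance) stages.push('enhance-image');
  stages.push('color-adjust');
  return stages;
}
```

Keeping the plan as a plain array matches the stageOrder field of the pipeline document, so each handler can look up its successor without extra state.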
Cloud Run services
| Service | Role | Public URL |
|---|---|---|
| nano-server | Web API + Cloud Tasks stage handlers | https://nano-server-bc5eqn62ka-ez.a.run.app |
| nano-mcp-server | Remote MCP server (4 tools, OAuth 2.1) | https://mcp.nano.cpl.ai (Cloud Run domain mapping) |
Both services share @nano/pipeline-core and write to the same Firestore
pipelines collection. The mcp-server enqueues Cloud Tasks that target
nano-server's /tasks/* handlers — there's only ever one set of stage
implementations.
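
A rough sketch of what "enqueue a task that targets nano-server" might look like from either service follows. The `buildStageTask` helper, the `/tasks/<stage>` route naming, and the payload fields are assumptions; the object only mirrors the general shape of a Cloud Tasks HTTP target and does not call any client library.

```typescript
// Simplified shape of a Cloud Tasks HTTP-target task.
interface StageTask {
  queue: string;
  httpRequest: {
    httpMethod: 'POST';
    url: string;
    headers: Record<string, string>;
    // The real Cloud Tasks API expects the body as bytes (base64 in JSON);
    // it is left as a plain JSON string here for readability.
    body: string;
  };
}

// Hypothetical helper: both services would build the same task, so the
// handler on nano-server is the only place a stage is implemented.
function buildStageTask(
  serverBaseUrl: string,
  queue: string,
  stage: string,
  pipelineId: string,
): StageTask {
  const base = serverBaseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return {
    queue,
    httpRequest: {
      httpMethod: 'POST',
      url: `${base}/tasks/${stage}`, // assumed route layout
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ pipelineId, stage }),
    },
  };
}

// Example with a placeholder host (not the real service URL).
const exampleTask = buildStageTask(
  'https://nano-server.example/',
  'image-generation',
  'generate-image',
  'pipeline-123',
);
```

Because the task carries only the pipeline id and stage name, the handler would read everything else from the pipelines document rather than from the task payload.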
Firestore state machine
Each generation creates a pipelines/<uuid> document:
{
id: string;
userId: string;
status: 'pending' | 'running' | 'completed' | 'failed';
stageOrder: string[]; // deterministic iteration order
stages: Record<string, {
status: 'pending' | 'queued' | 'running' | 'completed' | 'failed';
startedAt?, completedAt?, error?, result?: unknown;
}>;
input: { prompt, aspectRatio, enhance, ... };
results?: Array<{ image, image_uri, thumbnail }>;
createdAt, updatedAt;
}
- The frontend subscribes to the document with onSnapshot for live progress.
- The mcp-server subscribes server-side with subscribeToPipeline(id, cb) and pushes MCP progress notifications to the connected Claude client. The client never touches Firestore directly.
- The markStage* and markPipeline* helpers (in pipeline-core/state.ts) are the single source of truth for transitions.
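
The transition rules can be pictured as a small pure function over the document. The sketch below is not the code in pipeline-core/state.ts: the allowed-transition table, the `applyStageTransition` and `progressOf` names, and the rule for deriving the pipeline status are all assumptions chosen to match the status values listed above.

```typescript
type StageStatus = 'pending' | 'queued' | 'running' | 'completed' | 'failed';
type PipelineStatus = 'pending' | 'running' | 'completed' | 'failed';

interface StageState { status: StageStatus; error?: string }

interface PipelineDoc {
  status: PipelineStatus;
  stageOrder: string[];
  stages: Record<string, StageState>;
}

// Assumed transition table: stages only move forward, and any
// non-terminal stage may fail.
const ALLOWED: Record<StageStatus, StageStatus[]> = {
  pending: ['queued', 'failed'],
  queued: ['running', 'failed'],
  running: ['completed', 'failed'],
  completed: [],
  failed: [],
};

// Hypothetical helper: returns a new document with the stage moved to
// `next` and the pipeline status re-derived from all stage statuses.
function applyStageTransition(doc: PipelineDoc, stage: string, next: StageStatus): PipelineDoc {
  const current = doc.stages[stage];
  if (!current) throw new Error(`unknown stage: ${stage}`);
  if (ALLOWED[current.status].indexOf(next) === -1) {
    throw new Error(`illegal transition ${current.status} -> ${next} for ${stage}`);
  }
  const stages: Record<string, StageState> = { ...doc.stages, [stage]: { ...current, status: next } };
  const statuses = doc.stageOrder.map((name) => stages[name]?.status ?? 'pending');
  let status: PipelineStatus = 'running';
  if (statuses.some((s) => s === 'failed')) status = 'failed';
  else if (statuses.every((s) => s === 'completed')) status = 'completed';
  else if (statuses.every((s) => s === 'pending')) status = 'pending';
  return { ...doc, stages, status };
}

// Progress figure a subscriber could forward as a notification.
function progressOf(doc: PipelineDoc): { done: number; total: number } {
  const done = doc.stageOrder.filter((name) => doc.stages[name]?.status === 'completed').length;
  return { done, total: doc.stageOrder.length };
}
```

Routing every status change through one function like this is what lets both the frontend listener and the server-side subscriber trust that a document never moves backwards.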
Data storage
| What | Where |
|---|---|
| Pipeline state | Firestore (default) DB, pipelines collection |
| Reference icons | Firestore icons collection (vector RAG) |
| Conversation history | Firestore conversations, history |
| User settings | Firestore users/<uid>/settings |
| Generated images | GCS bucket cpl-gen-ai-marketing-images |
| OAuth state (MCP) | Firestore oauth_* + mcp_jwks collections |