MCP tools

The nano-banana MCP server exposes four tools. They all map onto the same pipeline-core helpers that power the web app, so behavior is identical regardless of which surface triggers the call.

All four tools return both a human-readable text content block (JSON, kept for back-compat) and a typed structuredContent object validated against an output schema. Programmatic callers (agents, MCP clients) should consume structuredContent rather than parsing the text block.
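A minimal sketch of that advice, assuming the tool result has already been decoded into a plain dict with the `content` and `structuredContent` keys of an MCP tool result. The helper name `extract_payload` is hypothetical and not part of the server; the text-block fallback is only there for clients that cannot see `structuredContent`.

```python
import json
from typing import Any


def extract_payload(result: dict[str, Any]) -> dict[str, Any]:
    """Return the typed payload of a tool result, preferring structuredContent."""
    structured = result.get("structuredContent")
    if isinstance(structured, dict):
        return structured
    # Fallback: the first text block carries the same data serialized as JSON.
    for block in result.get("content", []):
        if block.get("type") == "text":
            return json.loads(block["text"])
    raise ValueError("tool result has neither structuredContent nor a text block")
```

Preferring `structuredContent` keeps callers on the schema-validated object and leaves the text block as a compatibility path only.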

Per-user quota

Each call to propose_concepts, generate_image, or generate_image_async counts as one unit against your daily and hourly quotas; iterations does not multiply the count. get_image is free. Defaults: 300 per day and 60 per hour, per user. When the quota is exhausted, the call returns a tool error, which the Claude client surfaces. See MCP auth & operations → Quotas for details.
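The counting rule can be sketched as a small client-side estimator. This is an illustration of the rule as stated here, not the server's accounting code; `quota_units` and `BILLABLE_TOOLS` are hypothetical names.

```python
from typing import Any

# Tools that consume one quota unit per call; get_image is free.
BILLABLE_TOOLS = {"propose_concepts", "generate_image", "generate_image_async"}


def quota_units(calls: list[tuple[str, dict[str, Any]]]) -> int:
    """Count quota units for a planned sequence of (tool_name, arguments) calls.

    The arguments are ignored on purpose: iterations does not multiply the count.
    """
    return sum(1 for name, _args in calls if name in BILLABLE_TOOLS)


plan = [
    ("propose_concepts", {"prompt": "marketing hero"}),
    ("generate_image_async", {"concept_id": "abc-123#0", "iterations": 4}),
    ("get_image", {"pipeline_id": "def-456", "wait_seconds": 270}),
    ("get_image", {"pipeline_id": "def-457", "wait_seconds": 270}),
]
print(quota_units(plan))  # prints 2
```

Under the default limits, a plan like this one uses 2 of the 60 hourly units no matter how many times the pipelines are polled.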

`propose_concepts`

Synchronous. Turns a rough idea into one stylistically coherent visual concept, consisting of a title, a refined prompt, thematic keywords, and an id of the form <pipelineId>#0 that can be passed back to generate_image as concept_id.

Tool annotations: side-effecting (readOnlyHint: false, idempotentHint: false, openWorldHint: true).

{
  "prompt": "marketing hero for our gemini enterprise launch",
  "conversationId": "optional, for stitching multi-turn sessions"
}

Returns:

{
  "pipelineId": "abc-123",
  "concepts": [
    {
      "id": "abc-123#0",
      "title": "Translucent prism",
      "refined_prompt": "…full styled prompt the diffusion model will see…",
      "keywords": ["isometric", "translucent", "blue"]
    }
  ]
}

concepts is always a single-element array today, because the architect emits one refined concept per call. To explore alternative directions, call propose_concepts again with a tweaked prompt. To explore visual variants of the same concept, call generate_image_async with iterations: 2..4 and the returned id as concept_id.
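As a sketch of the hand-off from propose_concepts to a generation call, the helpers below split a concept id of the form `<pipelineId>#0` and build follow-up arguments from the response shape shown above. Both function names are hypothetical, and the pipeline-id consistency check is an extra sanity check rather than something the server is known to require.

```python
from typing import Any, Optional


def split_concept_id(concept_id: str) -> tuple[str, int]:
    """Split '<pipelineId>#<index>' into (pipeline_id, index)."""
    pipeline_id, sep, index = concept_id.rpartition("#")
    if not sep or not pipeline_id or not index.isdigit():
        raise ValueError(f"not a concept id: {concept_id!r}")
    return pipeline_id, int(index)


def followup_arguments(
    proposal: dict[str, Any], refinement: Optional[str] = None
) -> dict[str, Any]:
    """Build generate_image / generate_image_async arguments from a proposal."""
    concept = proposal["concepts"][0]  # single-element array today
    pipeline_id, _index = split_concept_id(concept["id"])
    if pipeline_id != proposal["pipelineId"]:
        raise ValueError("concept id does not belong to this pipeline")
    args: dict[str, Any] = {"concept_id": concept["id"]}
    if refinement:
        args["refinement"] = refinement
    return args
```

For the example response above, `followup_arguments(response, "warmer palette")` yields `{"concept_id": "abc-123#0", "refinement": "warmer palette"}`.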

`generate_image` (synchronous)

Blocks for up to 180 seconds while the pipeline runs, streams progress notifications, and returns the final signed GCS URLs. Best for interactive flows where a human is actively watching.

Tool annotations: side-effecting (readOnlyHint: false, idempotentHint: false, openWorldHint: true).

{
  "prompt": "raw user prompt (optional if concept_id provided)",
  "concept_id": "abc-123#0 (optional, supersedes prompt)",
  "refinement": "optional free-text tweak applied on top of concept",
  "aspectRatio": "1:1 | 16:9 | 9:16 | 4:3 | 3:4",
  "iterations": 1,
  "enhance": true,
  "resolution": "1k | 2k | 4k",
  "response_format": "concise"
}

response_format (default concise):

Value	Fields returned per result
`concise`	`image` (signed URL), `inline` status
`detailed`	above + `image_uri` (gs://…) and `prompt`

In detailed mode the top-level response object also includes duration_ms (generate_image only).

Returns (concise):

{
  "images": [
    {
      "pipeline_id": "def-456",
      "status": "completed",
      "results": [
        { "image": "<7d signed GCS URL>", "inline": "ok" }
      ]
    }
  ]
}

In detailed mode duration_ms is added to the top-level object and each result also includes image_uri and prompt.
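The input shape can be assembled and checked client-side before the call is made. The sketch below uses the aspect ratios, resolutions, and response formats listed in this section; the `1..4` bound on iterations is an assumption inferred from the examples on this page, and `build_generate_image_args` is a hypothetical helper, not part of the server.

```python
from typing import Any, Optional

ASPECT_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:4"}
RESOLUTIONS = {"1k", "2k", "4k"}
RESPONSE_FORMATS = {"concise", "detailed"}


def build_generate_image_args(
    prompt: Optional[str] = None,
    concept_id: Optional[str] = None,
    refinement: Optional[str] = None,
    aspect_ratio: Optional[str] = None,
    resolution: Optional[str] = None,
    enhance: Optional[bool] = None,
    iterations: int = 1,
    response_format: str = "concise",
) -> dict[str, Any]:
    """Validate inputs and return an arguments dict for generate_image."""
    if not prompt and not concept_id:
        raise ValueError("provide prompt or concept_id")
    if aspect_ratio is not None and aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspectRatio: {aspect_ratio}")
    if resolution is not None and resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
    if not 1 <= iterations <= 4:  # assumed bound, not confirmed by the docs
        raise ValueError("iterations must be between 1 and 4")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")

    args: dict[str, Any] = {"iterations": iterations, "response_format": response_format}
    # concept_id supersedes prompt, so only one of the two is sent.
    if concept_id:
        args["concept_id"] = concept_id
    else:
        args["prompt"] = prompt
    optional = {
        "refinement": refinement,
        "aspectRatio": aspect_ratio,
        "resolution": resolution,
        "enhance": enhance,
    }
    # Omit unset options so the server applies its own defaults.
    args.update({key: value for key, value in optional.items() if value is not None})
    return args
```

For generate_image_async, the same dict can be reused after dropping `response_format`, since that tool does not accept it.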

`generate_image_async`

Same input shape as generate_image (minus response_format), but returns immediately with pipeline_ids. Use this when you want to fire-and-forget several pipelines and poll them in parallel, or when the model should stay free to do other work while images render.

Tool annotations: side-effecting (readOnlyHint: false, idempotentHint: false, openWorldHint: true).

Returns:

{ "pipeline_ids": ["def-456", "def-457"], "status": "running" }
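Polling the returned pipeline_ids in parallel might look like the sketch below. It assumes a caller-supplied, thread-safe `call_tool(name, arguments)` function that invokes an MCP tool and returns its structured payload as a dict; that function and `poll_all` are hypothetical and stand in for whatever MCP client is in use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def poll_all(
    call_tool: ToolCaller, pipeline_ids: list[str], wait_seconds: int = 270
) -> dict[str, dict[str, Any]]:
    """Issue one get_image long-poll per pipeline, all in parallel."""

    def poll(pipeline_id: str) -> dict[str, Any]:
        return call_tool(
            "get_image",
            {"pipeline_id": pipeline_id, "wait_seconds": wait_seconds},
        )

    if not pipeline_ids:
        return {}
    with ThreadPoolExecutor(max_workers=len(pipeline_ids)) as pool:
        snapshots = list(pool.map(poll, pipeline_ids))  # map preserves input order
    return dict(zip(pipeline_ids, snapshots))
```

Each snapshot may still report `running` if its long-poll deadline passed first, so callers should check `status` before reading `results`.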

`get_image`

Returns an instant snapshot of one pipeline, or long-polls it until it finishes.

Tool annotations: read-only and idempotent (readOnlyHint: true, idempotentHint: true, openWorldHint: true).

{
  "pipeline_id": "def-456",
  "wait_seconds": 270,
  "response_format": "concise"
}

wait_seconds: 0 (default): returns an instant snapshot.
wait_seconds: 1..270: blocks for up to N seconds and returns as soon as status transitions to completed or failed, or when the deadline passes.

response_format (default concise):

Value	Fields returned per result
`concise`	`image` (signed URL), `inline` status
`detailed`	above + `image_uri` (gs://…) and `prompt`

Returns:

{
  "pipeline_id": "def-456",
  "status": "running",
  "progress_percent": 60,
  "stage": "enhance-image",
  "results": []
}
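Because a single long-poll is capped at 270 seconds, waiting longer means chaining get_image calls. The loop below is a sketch under the same assumption of a caller-supplied `call_tool(name, arguments)` function returning the structured payload as a dict; `wait_for_image` and the 600-second default are illustrative choices, not server behavior.

```python
import time
from typing import Any, Callable

MAX_WAIT_SECONDS = 270  # upper bound of wait_seconds
TERMINAL_STATUSES = {"completed", "failed"}

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def wait_for_image(
    call_tool: ToolCaller, pipeline_id: str, total_timeout: float = 600.0
) -> dict[str, Any]:
    """Chain get_image long-polls until the pipeline finishes or time runs out."""
    deadline = time.monotonic() + total_timeout
    while True:
        remaining = deadline - time.monotonic()
        # Never ask for more than the cap, and never for a negative wait.
        wait = max(0, min(MAX_WAIT_SECONDS, int(remaining)))
        snapshot = call_tool(
            "get_image",
            {"pipeline_id": pipeline_id, "wait_seconds": wait, "response_format": "concise"},
        )
        # wait == 0 means the budget is spent: return the final instant snapshot.
        if snapshot["status"] in TERMINAL_STATUSES or wait == 0:
            return snapshot
```

The caller still has to inspect `status` on the returned snapshot, since a timeout returns the last `running` snapshot rather than raising.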

Tool selection cheat sheet

Scenario	Tool combination
"Make me an image of X"	`generate_image` (raw prompt, auto-styled via prompt-engineer)
"Show me a direction first"	`propose_concepts` → user accepts/refines → `generate_image` (`concept_id`)
Bulk variations in parallel	`generate_image_async` (`iterations: 4`) → wait → `get_image`
Long-running flow that survives client restart	`generate_image_async` → `get_image` with `wait_seconds`
