Image generation

Generate images from text prompts. Nano Banana, Flux, Recraft, Imagen — all via the AI SDK.

FIG. 00 · IMAGE GENERATION · PROMPT → PIXELS

Image generation models on Synapse Garden split into two families with different return shapes. The AI SDK covers both: use generateText for the multimodal Nano Banana family and experimental_generateImage (imported as generateImage in the examples on this page) for image-only models, then pick the model that fits the job.

FIG. 01 · TWO FAMILIES · SCHEMATIC

Nano Banana models go through `generateText` and return image bytes in `result.files`. Image-only models go through `generateImage` (the `experimental_generateImage` export) and return base64 strings in `result.images`. The two return shapes are decoded and saved differently, so pick the helper that matches the model.

Two families, two functions

Function	Model family	Returns
`generateText`	Nano Banana (`google/gemini-3-pro-image`, `google/gemini-2.5-flash-image`) and `openai/gpt-image-2`	`result.files` (array of files, each with a `uint8Array`)
`experimental_generateImage`	Image-only models (`bfl/flux-2-flex`, `recraft/recraft-v3`, `google/imagen-4.0-generate-001`)	`result.images` (array with `base64`)

The Nano Banana family is multimodal LLM territory — text and images both flow through generateText. The image-only models live behind experimental_generateImage. Use whichever the upstream model supports.
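If the model is chosen at runtime, a small lookup keeps the function choice in one place. This is a minimal sketch: the model IDs come from the table above, while `helperFor` and the `Helper` type are hypothetical names, not part of the AI SDK.

```typescript
// Hypothetical helper: maps a model ID to the AI SDK function that serves it.
// The IDs are the ones listed in the table above; extend the map as needed.
type Helper = "generateText" | "experimental_generateImage"

const HELPER_BY_MODEL = new Map<string, Helper>([
  ["google/gemini-3-pro-image", "generateText"],
  ["google/gemini-2.5-flash-image", "generateText"],
  ["openai/gpt-image-2", "generateText"],
  ["bfl/flux-2-flex", "experimental_generateImage"],
  ["recraft/recraft-v3", "experimental_generateImage"],
  ["google/imagen-4.0-generate-001", "experimental_generateImage"],
])

// Returns the function name for a known model, or undefined for anything else.
function helperFor(modelId: string): Helper | undefined {
  return HELPER_BY_MODEL.get(modelId)
}
```

Returning undefined for unknown IDs, rather than guessing a default, forces the caller to decide what to do with a model that is not in the list.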

Nano Banana family

import { generateText } from "ai"
import fs from "node:fs"

const result = await generateText({
  model: "google/gemini-3-pro-image",
  baseURL: "https://synapse.garden/api/v1",
  apiKey: process.env.MG_KEY,
  prompt: "A serene mountain landscape at sunset, watercolor style.",
})

const image = result.files.find((f) => f.mediaType?.startsWith("image/"))
if (image) {
  const ext = image.mediaType?.split("/")[1] ?? "png"
  fs.writeFileSync(`output.${ext}`, image.uint8Array)
}
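The snippet above saves only the first image part. A response may carry several files, so here is a sketch that names every image part. `FileLike` and `namedImages` are hypothetical names; the sketch assumes only the two fields used above (`mediaType` and `uint8Array`).

```typescript
// Structural type covering only the fields the snippet above relies on.
interface FileLike {
  mediaType?: string
  uint8Array: Uint8Array
}

// Builds one { name, bytes } entry per image part; non-image files are skipped.
// The extension is taken from the media type, falling back to "png".
function namedImages(files: FileLike[], stem = "output") {
  return files
    .filter((f) => f.mediaType?.startsWith("image/"))
    .map((f, i) => {
      const ext = f.mediaType?.split("/")[1] ?? "png"
      return { name: `${stem}-${i}.${ext}`, bytes: f.uint8Array }
    })
}
```

To write the results to disk, loop over `namedImages(result.files)` and call `fs.writeFileSync(name, bytes)` for each entry.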

Nano Banana models can take text + reference images as input, generate text and/or images, and reference earlier images in a conversation. Useful for editing, variation, and "show me what this looks like" workflows.

const result = await generateText({
  model: "google/gemini-3-pro-image",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Make this room cozier — add a fireplace and warm lighting." },
        { type: "image", image: fs.readFileSync("room.jpg") },
      ],
    },
  ],
})

Image-only models

import { experimental_generateImage as generateImage } from "ai"
import fs from "node:fs"

const result = await generateImage({
  model: "bfl/flux-2-flex",
  prompt: "A vibrant coral reef with tropical fish, photorealistic.",
  aspectRatio: "16:9",
})

const buffer = Buffer.from(result.images[0].base64, "base64")
fs.writeFileSync("output.png", buffer)

Common parameters

generateImage({
  model: "bfl/flux-2-flex",
  prompt: "...",
  aspectRatio: "16:9",   // "1:1" | "4:3" | "16:9" | "21:9" | "3:4" | "9:16" | "9:21"
  // size: "1024x1024",  // alternative to aspectRatio; set one or the other, not both
  n: 4,                  // generate 4 variations
  seed: 42,              // reproducible only where the model honors seeds
})

n and seed support varies per model. aspectRatio is the most portable.
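If a model takes size but not aspectRatio (an assumption; check the model's detail page), the size string can be derived from a ratio and a target width. `sizeFromAspectRatio` is a hypothetical helper; the "WIDTHxHEIGHT" format follows the size value shown above.

```typescript
// Converts a "W:H" ratio plus a target width into a "WIDTHxHEIGHT" string.
// The height is rounded to the nearest whole pixel.
function sizeFromAspectRatio(aspectRatio: string, width: number): string {
  const parts = aspectRatio.split(":").map(Number)
  const w = parts[0] ?? NaN
  const h = parts[1] ?? NaN
  // Rejects malformed input such as "abc", "16:" or "0:9".
  if (!(w > 0) || !(h > 0)) {
    throw new Error(`Invalid aspect ratio: ${aspectRatio}`)
  }
  const height = Math.round((width * h) / w)
  return `${width}x${height}`
}
```

For example, `sizeFromAspectRatio("16:9", 1024)` gives `"1024x576"`. Whether a given model accepts arbitrary sizes is not guaranteed, so treat the result as a request rather than a promise.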

Choosing a model

Model	Style	Speed	Notes
`google/gemini-3-pro-image` (Nano Banana Pro)	Photorealistic, painterly, illustration	Medium	Multimodal — accepts reference images
`google/gemini-2.5-flash-image` (Nano Banana)	General purpose	Fast	Cheaper than Pro
`openai/gpt-image-2`	Stylized illustration	Medium	Strong typography rendering
`bfl/flux-2-flex`	Photorealistic	Fast	Best photo realism in the catalog
`bfl/flux-2-pro`	Photorealistic, ultra-detail	Slow	Higher cost; more detail
`recraft/recraft-v3`	Vector + raster, brand work	Medium	Strong at logos, illustrations
`google/imagen-4.0-generate-001`	Photorealistic	Medium	Google's flagship; strong on faces

When in doubt, run the same prompt through bfl/flux-2-flex and google/gemini-3-pro-image and compare. They have very different aesthetic defaults.

Editing images

Some Nano Banana models accept an input image and edit it:

const result = await generateText({
  model: "google/gemini-3-pro-image",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Remove the background. Replace with a soft gradient." },
        { type: "image", image: fs.readFileSync("portrait.jpg") },
      ],
    },
  ],
})

For OpenAI's image models with edit/inpaint support, the OpenAI SDK route is more direct:

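import OpenAI from "openai"

// Assumption: the OpenAI-compatible endpoint uses the same base URL and key
// as the AI SDK examples above.
const client = new OpenAI({
  baseURL: "https://synapse.garden/api/v1",
  apiKey: process.env.MG_KEY,
})

const response = await client.images.edit({
  model: "openai/gpt-image-2",
  image: fs.createReadStream("input.png"),
  mask: fs.createReadStream("mask.png"), // transparent (alpha) areas mark where to edit
  prompt: "Replace the sky with a sunset.",
  n: 1,
  size: "1024x1024",
})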

Saving the result

The two return shapes:

Shape	How to save
`result.files[i].uint8Array` (Nano Banana)	`fs.writeFileSync(path, image.uint8Array)`
`result.images[i].base64` (image-only)	`fs.writeFileSync(path, Buffer.from(image.base64, "base64"))`
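When downstream code should not care which family produced the image, both shapes can be reduced to raw bytes first. A minimal sketch for Node.js: `ImageShape` and `toBytes` are hypothetical names that model only the two fields from the table above.

```typescript
// The two shapes from the table above, reduced to the field that matters.
type ImageShape =
  | { uint8Array: Uint8Array } // element of result.files
  | { base64: string }         // element of result.images

// Returns raw bytes for either shape. Buffer is a Uint8Array subclass,
// so the decoded base64 branch fits the same return type.
function toBytes(image: ImageShape): Uint8Array {
  if ("uint8Array" in image) return image.uint8Array
  return Buffer.from(image.base64, "base64")
}
```

With this in place, saving is the same call for both families: `fs.writeFileSync(path, toBytes(image))`.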

For uploading to a CDN, pass the bytes to your storage provider's SDK: the Uint8Array as-is, or the base64 string decoded into a Buffer first (e.g. @vercel/blob's put() accepts a Buffer or Uint8Array).

import { put } from "@vercel/blob"

const { url } = await put(`generated/${Date.now()}.png`, image.uint8Array, {
  access: "public",
  contentType: "image/png",
})
console.log(url)

Pricing

Image generation is billed per image for image-only models, and per request + per image for Nano Banana (which also charges for the text portion of the conversation). The model detail page on /models shows the live rate.

Rough ranges:

Cheap (Nano Banana, Flux 2 Flex, Recraft) — $0.02–$0.08 per image
Mid (Imagen, Flux Pro) — $0.04–$0.12 per image
Premium (with editing, ultra-detail) — $0.10–$0.30 per image
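For budgeting a batch, the ranges above turn into simple arithmetic. The sketch below hard-codes those rough ranges in cents (integers, to avoid floating-point drift); `estimateCost` and the tier names are hypothetical, and the live rate on /models is the figure that counts. It ignores the text portion that Nano Banana requests also bill.

```typescript
// Rough per-image ranges in US cents, copied from the list above.
// These are ballpark figures, not live rates.
const RANGE_CENTS = {
  cheap: [2, 8],
  mid: [4, 12],
  premium: [10, 30],
} as const

type Tier = keyof typeof RANGE_CENTS

// Returns the low and high estimate in dollars for `count` images.
function estimateCost(tier: Tier, count: number): { low: number; high: number } {
  const [lowCents, highCents] = RANGE_CENTS[tier]
  return { low: (lowCents * count) / 100, high: (highCents * count) / 100 }
}
```

For example, 100 images in the cheap tier come out at roughly $2 to $8.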

Caveats

Resolution caps vary by model. Most cap at 2048×2048; if you request a larger size, the output is silently limited to the cap.
Faces, brands, IP — provider safety policies may refuse certain prompts. The error code in the response makes it explicit.
Aspect ratio support varies. 1:1, 4:3, 16:9 are universal; obscure ratios fall back to the closest supported.
Seeds aren't always honored. If you need reproducibility, log the response's seed field and re-use it.
Safety overlays — most models add an invisible watermark for provenance. This doesn't affect normal use; it's detectable by C2PA tools.
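Since unsupported aspect ratios fall back silently, it can help to make that choice on the client so you know what you will get. A sketch under assumptions: `closestAspectRatio` is a hypothetical helper, and the provider's own fallback rule may differ from the log-scale distance used here.

```typescript
// Parses "W:H" into a numeric width/height ratio (NaN if malformed).
function ratioValue(ratio: string): number {
  const parts = ratio.split(":").map(Number)
  return (parts[0] ?? NaN) / (parts[1] ?? NaN)
}

// Picks the supported ratio nearest to the requested one. Distances are
// compared on a log scale so that 2:1 and 1:2 are equally far from 1:1.
function closestAspectRatio(requested: string, supported: string[]): string {
  const target = Math.log(ratioValue(requested))
  let best: string | undefined
  let bestDistance = Infinity
  for (const candidate of supported) {
    const distance = Math.abs(Math.log(ratioValue(candidate)) - target)
    if (distance < bestDistance) {
      best = candidate
      bestDistance = distance
    }
  }
  if (best === undefined) {
    throw new Error("No usable aspect ratio in the supported list")
  }
  return best
}
```

Calling it with the three universal ratios, for example `closestAspectRatio("21:9", ["1:1", "4:3", "16:9"])`, picks `"16:9"`, which you can then pass as aspectRatio explicitly.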
