Image generation

Generate images from text prompts. Nano Banana, Flux, Recraft, Imagen — all via the AI SDK.

FIG. 00 · IMAGE GENERATION · PROMPT → PIXELS

Image generation models on Synapse Garden split into two families with slightly different return shapes. The AI SDK covers both: use the multimodal generateText for the Nano Banana family and generateImage (exported by the SDK as experimental_generateImage) for image-only models. Pick the function that matches the model and you're done.

FIG. 01 · TWO FAMILIES (SCHEMATIC)
Nano Banana models lower into `generateText` and return image bytes in `result.files`. Image-only models lower into `generateImage` and return base64 strings in `result.images`. The wire shape and storage shape differ — pick the helper that matches the model.

Two families, two functions

| Function | Model family | Returns |
| --- | --- | --- |
| generateText | Nano Banana (google/gemini-3-pro-image, google/gemini-2.5-flash-image, openai/gpt-image-2) | result.files (array of file objects, each with mediaType and uint8Array) |
| experimental_generateImage | Image-only models (bfl/flux-2-flex, recraft/recraft-v3, google/imagen-4.0-generate-001) | result.images (array of objects, each with a base64 string) |

The Nano Banana family is multimodal LLM territory — text and images both flow through generateText. The image-only models live behind experimental_generateImage. Use whichever the upstream model supports.
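One way to keep that choice in a single place is a small lookup from model ID to function family. The sketch below is illustrative only: `familyFor` and the `Family` type are hypothetical names, and the ID lists simply mirror the table above rather than any catalog API.

```typescript
// Hypothetical helper: map a model ID to the function family it needs.
// The ID lists mirror the table above; extend them as the catalog changes.
type Family = "generateText" | "generateImage"

const MULTIMODAL_IDS = new Set([
  "google/gemini-3-pro-image",
  "google/gemini-2.5-flash-image",
  "openai/gpt-image-2",
])

const IMAGE_ONLY_IDS = new Set([
  "bfl/flux-2-flex",
  "recraft/recraft-v3",
  "google/imagen-4.0-generate-001",
])

function familyFor(modelId: string): Family {
  if (MULTIMODAL_IDS.has(modelId)) return "generateText"
  if (IMAGE_ONLY_IDS.has(modelId)) return "generateImage"
  // Fail loudly rather than guess for IDs this sketch does not know about.
  throw new Error(`Unknown image model: ${modelId}`)
}
```

Throwing on unknown IDs is a deliberate choice here: calling the wrong function for a model tends to fail in less obvious ways.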

Nano Banana family

import { generateText } from "ai"
import fs from "node:fs"

const result = await generateText({
  model: "google/gemini-3-pro-image",
  baseURL: "https://synapse.garden/api/v1",
  apiKey: process.env.MG_KEY,
  prompt: "A serene mountain landscape at sunset, watercolor style.",
})

const image = result.files.find((f) => f.mediaType?.startsWith("image/"))
if (image) {
  const ext = image.mediaType?.split("/")[1] ?? "png"
  fs.writeFileSync(`output.${ext}`, image.uint8Array)
}

Nano Banana models can take text + reference images as input, generate text and/or images, and reference earlier images in a conversation. Useful for editing, variation, and "show me what this looks like" workflows.

const result = await generateText({
  model: "google/gemini-3-pro-image",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Make this room cozier — add a fireplace and warm lighting." },
        { type: "image", image: fs.readFileSync("room.jpg") },
      ],
    },
  ],
})
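Because a Nano Banana response can carry text, images, or both, it helps to separate the two before saving anything. The following is a minimal sketch, not an SDK helper: `splitOutput` and the `FileLike` / `ResultLike` types are hypothetical and model only the `text`, `files`, `mediaType`, and `uint8Array` fields used in the examples above.

```typescript
// Minimal structural types covering only the fields used in this guide.
type FileLike = { mediaType?: string; uint8Array: Uint8Array }
type ResultLike = { text?: string; files: FileLike[] }

// Hypothetical helper: split a multimodal result into its text and a list of
// named image payloads ready to be written to disk.
function splitOutput(result: ResultLike, baseName = "output") {
  const images = result.files
    .filter((f) => f.mediaType?.startsWith("image/"))
    .map((f, i) => ({
      // Derive the extension from the media type, e.g. "image/png" -> "png".
      name: `${baseName}-${i}.${f.mediaType!.split("/")[1]}`,
      bytes: f.uint8Array,
    }))
  return { text: result.text ?? "", images }
}
```

Each entry in `images` can then be passed to `fs.writeFileSync(entry.name, entry.bytes)`.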

Image-only models

import { experimental_generateImage as generateImage } from "ai"
import fs from "node:fs"

const result = await generateImage({
  model: "bfl/flux-2-flex",
  prompt: "A vibrant coral reef with tropical fish, photorealistic.",
  aspectRatio: "16:9",
})

const buffer = Buffer.from(result.images[0].base64, "base64")
fs.writeFileSync("output.png", buffer)

Common parameters

generateImage({
  model: "bfl/flux-2-flex",
  prompt: "...",
  aspectRatio: "16:9",   // "1:1" | "4:3" | "16:9" | "21:9" | "3:4" | "9:16" | "9:21"
  size: "1024x1024",     // alternative to aspectRatio
  n: 4,                  // generate 4 variations
  seed: 42,              // reproducible where the model honors seeds (see Caveats)
})

n and seed support varies per model. aspectRatio is the most portable.
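When a request with `n` greater than 1 succeeds, `result.images` holds one entry per variation. A minimal sketch for decoding all of them follows. `decodeAll` is a hypothetical helper, and it assumes each entry exposes a `base64` string as in the image-only example above.

```typescript
import { Buffer } from "node:buffer"

type Base64Image = { base64: string }

// Hypothetical helper: decode every variation into a Buffer and pair it with
// a numbered filename. The ".png" extension is an assumption carried over
// from the example above; adjust it if your model returns another format.
function decodeAll(images: Base64Image[], prefix = "variation") {
  return images.map((img, i) => ({
    path: `${prefix}-${i + 1}.png`,
    data: Buffer.from(img.base64, "base64"),
  }))
}
```

Writing the files is then a loop over the returned array with `fs.writeFileSync(entry.path, entry.data)`.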

Choosing a model

| Model | Style | Speed | Notes |
| --- | --- | --- | --- |
| google/gemini-3-pro-image (Nano Banana Pro) | Photorealistic, painterly, illustration | Medium | Multimodal; accepts reference images |
| google/gemini-2.5-flash-image (Nano Banana) | General purpose | Fast | Cheaper than Pro |
| openai/gpt-image-2 | Stylized illustration | Medium | Strong typography rendering |
| bfl/flux-2-flex | Photorealistic | Fast | Best photo realism in the catalog |
| bfl/flux-2-pro | Photorealistic, ultra-detail | Slow | Higher cost; more detail |
| recraft/recraft-v3 | Vector + raster, brand work | Medium | Strong at logos, illustrations |
| google/imagen-4.0-generate-001 | Photorealistic | Medium | Google's flagship; strong on faces |

When in doubt, run the same prompt through bfl/flux-2-flex and google/gemini-3-pro-image and compare. They have very different aesthetic defaults.

Editing images

Some Nano Banana models accept an input image and edit it:

const result = await generateText({
  model: "google/gemini-3-pro-image",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Remove the background. Replace with a soft gradient." },
        { type: "image", image: fs.readFileSync("portrait.jpg") },
      ],
    },
  ],
})

For OpenAI's image models with edit/inpaint support, the OpenAI SDK route is more direct:

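import OpenAI from "openai"

// Assumes the same base URL and key as the first example on this page.
const client = new OpenAI({
  baseURL: "https://synapse.garden/api/v1",
  apiKey: process.env.MG_KEY,
})

const file = await client.images.edit({
  model: "openai/gpt-image-2",
  image: fs.createReadStream("input.png"),
  mask: fs.createReadStream("mask.png"), // transparent areas mark where to edit
  prompt: "Replace the sky with a sunset.",
  n: 1,
  size: "1024x1024",
})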

Saving the result

The two return shapes:

| Shape | How to save |
| --- | --- |
| result.files[i].uint8Array (Nano Banana) | fs.writeFileSync(path, image.uint8Array) |
| result.images[i].base64 (image-only) | fs.writeFileSync(path, Buffer.from(image.base64, "base64")) |
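If the rest of your code should not care which family produced the image, the two shapes can be folded into one. This is a sketch under the assumption that the shapes are exactly as listed above. `toBytes`, `FileShape`, and `Base64Shape` are hypothetical names.

```typescript
import { Buffer } from "node:buffer"

// The two shapes listed above, reduced to the fields needed here.
type FileShape = { uint8Array: Uint8Array } // result.files[i]
type Base64Shape = { base64: string } // result.images[i]

// Hypothetical helper: return raw bytes regardless of which shape came back,
// so the saving or uploading code only has to handle one type.
function toBytes(image: FileShape | Base64Shape): Uint8Array {
  if ("uint8Array" in image) return image.uint8Array
  // Buffer is a Uint8Array subclass, so it satisfies the return type.
  return Buffer.from(image.base64, "base64")
}
```

With this in place, `fs.writeFileSync(path, toBytes(image))` works for either family.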

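For uploading to a CDN, pass the image bytes to the SDK of your storage provider (e.g. @vercel/blob's put() accepts a Buffer or Uint8Array). A uint8Array can be passed directly; decode a base64 string with Buffer.from(image.base64, "base64") first, otherwise the base64 text itself is what gets stored.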

import { put } from "@vercel/blob"

const { url } = await put(`generated/${Date.now()}.png`, image.uint8Array, {
  access: "public",
  contentType: "image/png",
})
console.log(url)

Pricing

Image generation is billed per image for image-only models, and per request + per image for Nano Banana (which also charges for the text portion of the conversation). The model detail page on /models shows the live rate.

Rough ranges:

  • Cheap (Nano Banana, Flux 2 Flex, Recraft) — $0.02–$0.08 per image
  • Mid (Imagen, Flux Pro) — $0.04–$0.12 per image
  • Premium (with editing, ultra-detail) — $0.10–$0.30 per image
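For budgeting, the ranges above turn into a quick estimate. The sketch below is only arithmetic on those rough figures: `estimateCents` and `TIER_CENTS` are hypothetical names, amounts are kept in whole cents to avoid floating-point rounding, and the text portion that Nano Banana requests also incur is not included.

```typescript
// Rough per-image ranges from the list above, in US cents.
// These are ballpark figures, not live rates.
const TIER_CENTS = {
  cheap: [2, 8],
  mid: [4, 12],
  premium: [10, 30],
} as const

type Tier = keyof typeof TIER_CENTS

// Hypothetical helper: estimate the low and high cost of a batch in cents.
function estimateCents(tier: Tier, imageCount: number) {
  const [low, high] = TIER_CENTS[tier]
  return { low: low * imageCount, high: high * imageCount }
}

// Four variations on a cheap-tier model: 4 x 2 = 8 cents up to 4 x 8 = 32 cents.
const batch = estimateCents("cheap", 4)
console.log(`$${(batch.low / 100).toFixed(2)} to $${(batch.high / 100).toFixed(2)}`)
```

Treat the output as an order-of-magnitude check and rely on the live rate on the model detail page for anything billing-related.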

Caveats

  • Resolution caps vary by model. Most cap at 2048×2048; a request for anything larger is silently clamped to the cap.
  • Faces, brands, IP — provider safety policies may refuse certain prompts. The error code in the response makes it explicit.
  • Aspect ratio support varies. 1:1, 4:3, and 16:9 are universal; less common ratios fall back to the closest supported one.
  • Seeds aren't always honored. If you need reproducibility, log the response's seed field and re-use it.
  • Safety overlays — most models add an invisible watermark for provenance. This doesn't affect normal use; it's detectable by C2PA tools.
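Since a resolution cap is applied without any error, one way to notice it is to read the dimensions of the returned image and compare them with what you asked for. The sketch below handles PNG output only and relies on the standard PNG header layout. `pngSize` and `wasDownscaled` are hypothetical helpers, not part of any SDK.

```typescript
// Hypothetical helper: read width and height from a PNG's IHDR chunk.
// Layout: 8-byte signature, 4-byte chunk length, 4-byte "IHDR" tag,
// then big-endian 32-bit width (offset 16) and height (offset 20).
function pngSize(bytes: Uint8Array): { width: number; height: number } {
  const signature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]
  if (bytes.length < 24 || signature.some((b, i) => bytes[i] !== b)) {
    throw new Error("Not a PNG file")
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
  // DataView reads big-endian by default, which matches the PNG format.
  return { width: view.getUint32(16), height: view.getUint32(20) }
}

// Compare what came back with the requested size string, e.g. "4096x4096".
function wasDownscaled(bytes: Uint8Array, requested: string): boolean {
  const [w, h] = requested.split("x").map(Number)
  const actual = pngSize(bytes)
  return actual.width < w || actual.height < h
}
```

Other formats such as JPEG or WebP store dimensions differently, so a general-purpose check would need an image library.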