Image Generation
POST
Introduction
The image generation API supports text-to-image, image-to-image, image editing, and more. Through a unified API interface, you can call multiple mainstream image generation models including Gemini, Doubao Seedream, GPT Image, and Tongyi Qianwen.
Authentication
Bearer Token, e.g. Bearer sk-xxxxxxxxxx
Request Parameters
Model identifier. Supported models include:
- Gemini series: gemini-2.5-flash-image (Nano Banana), gemini-3-pro-image-preview (Nano Banana Pro), etc.
- Doubao Seedream series: doubao-seedream-3-0-t2i-250415, doubao-seedream-4-0-250828, doubao-seedream-4-5-251128, doubao-seededit-3-0-i2i-250628, etc.
- GPT Image series: gpt-image-1, etc.
- Tongyi Qianwen series: qwen-image-plus, qwen-image-edit-plus, etc.
Text prompt for text-to-image generation
Response format (response_format): b64_json or url
Note: Different models have different support for response_format:
- Gemini series: Only supports the b64_json format; always returns base64-encoded image data regardless of the value passed
- Doubao Seedream series: Usually returns URL links; the response_format parameter may not take effect
- GPT Image series: Only supports the b64_json format; always returns base64-encoded image data
- Tongyi Qianwen series: Supports both b64_json and url, and returns the format matching the parameter value (for b64_json, the image is downloaded from the URL and converted to base64)
Multi-turn content for image-to-image or contextual conversation
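Because the models differ in whether they honor response_format, client code may want to accept either form. The sketch below assumes that each returned image is a JSON object carrying either a b64_json field or a url field; this page does not specify the response shape, so treat that as an assumption. The helper name image_bytes_from_item is hypothetical.

```python
import base64
from urllib.request import urlopen


def image_bytes_from_item(item: dict, timeout: float = 30.0) -> bytes:
    """Return raw image bytes from one result item.

    Assumed item shape (not confirmed by this page):
    {"b64_json": "..."} or {"url": "https://..."}.
    """
    if item.get("b64_json"):
        # Gemini and GPT Image always return base64-encoded data.
        return base64.b64decode(item["b64_json"])
    if item.get("url"):
        # Doubao Seedream usually returns a URL link instead.
        with urlopen(item["url"], timeout=timeout) as resp:
            return resp.read()
    raise ValueError("item has neither 'b64_json' nor 'url'")
```

Handling both branches in one place keeps the calling code independent of which model series served the request.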
Basic Examples
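A basic text-to-image call can be sketched as follows. This is a minimal illustration, not the official client: the endpoint URL is a placeholder because this page does not give one, and the JSON field names model and prompt are assumptions (only response_format is named on this page). The Bearer header follows the Authentication section.

```python
import json
from urllib.request import Request

# Placeholder endpoint: the real URL is not given on this page.
API_URL = "https://api.example.com/v1/images/generations"


def build_text_to_image_request(api_key: str, model: str, prompt: str,
                                response_format: str = "b64_json") -> Request:
    """Build (but do not send) a POST request for text-to-image generation."""
    if response_format not in ("b64_json", "url"):
        raise ValueError("response_format must be 'b64_json' or 'url'")
    body = {
        "model": model,      # assumed field name
        "prompt": prompt,    # assumed field name
        "response_format": response_format,
    }
    return Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_text_to_image_request(
    "sk-xxxxxxxxxx",
    "gemini-2.5-flash-image",
    "A lighthouse at dusk, horizontal composition",
)
print(req.get_method(), req.full_url)
```

To send the request, pass req to urllib.request.urlopen once API_URL points at the real endpoint.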
Model-Specific Parameters
Different models support different parameters. Below are detailed parameter descriptions for each model.
Doubao Seedream
Input image(s), given as a URL or as Base64-encoded data. doubao-seedream-3-0-t2i-250415 does not support this parameter. doubao-seedream-4.5 and doubao-seedream-4.0 support single-image or multi-image input (see the Multi-Image Fusion example), while doubao-seededit-3.0-i2i only supports single-image input.
Image size. Supported sizes depend on the model version:
- doubao-seedream-3.0: 1024x1024, 1152x864, 864x1152, 1280x720, 720x1280, 1248x832, 832x1248, 1512x648
- doubao-seedream-4.0/4.5: 2048x2048, 2304x1728, 1728x2304, 2560x1440, 1440x2560, 2496x1664, 1664x2496, 3024x1296 (2K) or 4096x4096, 4704x3520, 3520x4704, 5504x3040, 3040x5504, 4992x3328, 3328x4992, 6240x2656 (4K)
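Checking the requested size on the client before sending can catch mismatches early. The sketch below copies the size lists above into a lookup; the helper name supported_sizes and the prefix matching on model IDs are illustrative choices, not part of the API.

```python
SIZES_3_0 = frozenset({
    "1024x1024", "1152x864", "864x1152", "1280x720",
    "720x1280", "1248x832", "832x1248", "1512x648",
})
SIZES_4_2K = frozenset({
    "2048x2048", "2304x1728", "1728x2304", "2560x1440",
    "1440x2560", "2496x1664", "1664x2496", "3024x1296",
})
SIZES_4_4K = frozenset({
    "4096x4096", "4704x3520", "3520x4704", "5504x3040",
    "3040x5504", "4992x3328", "3328x4992", "6240x2656",
})


def supported_sizes(model: str) -> frozenset:
    """Return the sizes listed on this page for a Doubao Seedream model ID."""
    if model.startswith("doubao-seedream-3-0"):
        return SIZES_3_0
    if model.startswith(("doubao-seedream-4-0", "doubao-seedream-4-5")):
        return SIZES_4_2K | SIZES_4_4K
    raise ValueError(f"no size list known for model {model!r}")


print("2048x2048" in supported_sizes("doubao-seedream-4-0-250828"))
```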
Whether to add a watermark.
Random seed for controlling the randomness of generation results. The same seed produces similar results. Range: 0 to 2147483647.
Guidance scale, which controls how closely the generated image matches the prompt. Higher values follow the prompt more strictly; lower values allow more freedom. Recommended range: 1.0-10.0, default: 2.5. Only supported by doubao-seedream-3-0-t2i-250415 and doubao-seededit-3-0-i2i-250628.
Sequential image generation toggle (sequential_image_generation), only supported by doubao-seedream-4.0 and doubao-seedream-4.5:
- "auto": Enable sequential image generation
- "disabled": Disable sequential image generation (default)
Sequential image generation configuration options, only effective when sequential_image_generation is "auto":
- max_images (integer): Maximum number of images, range 1-4, default 4
Only doubao-seedream-4.5 (currently standard mode only) and doubao-seedream-4.0 support this parameter.
- mode (string): Optimization mode
- "standard": Standard mode, higher quality but slower (default, supported by both 4.0 and 4.5)
- "fast": Fast mode, faster but average quality (4.0 only)
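The constraints on seed, guidance scale, and sequential generation can be enforced before a request is sent. The sketch below is an illustration under assumptions: sequential_image_generation and max_images are named on this page, but the field names seed, guidance_scale, and sequential_image_generation_options are guesses, and the function seedream_params is hypothetical.

```python
MAX_SEED = 2147483647
GUIDANCE_MODELS = {
    "doubao-seedream-3-0-t2i-250415",
    "doubao-seededit-3-0-i2i-250628",
}
SEQUENTIAL_PREFIXES = ("doubao-seedream-4-0", "doubao-seedream-4-5")


def seedream_params(model, seed=None, guidance_scale=None,
                    sequential="disabled", max_images=4):
    """Validate and assemble Doubao Seedream parameters as a dict."""
    params = {"model": model}
    if seed is not None:
        if not 0 <= seed <= MAX_SEED:
            raise ValueError("seed must be in 0..2147483647")
        params["seed"] = seed  # assumed field name
    if guidance_scale is not None:
        if model not in GUIDANCE_MODELS:
            raise ValueError("guidance scale is not supported by this model")
        params["guidance_scale"] = guidance_scale  # assumed field name
    if sequential == "auto":
        if not model.startswith(SEQUENTIAL_PREFIXES):
            raise ValueError("sequential generation needs Seedream 4.0 or 4.5")
        if not 1 <= max_images <= 4:
            raise ValueError("max_images must be in 1..4")
        params["sequential_image_generation"] = "auto"
        # Assumed name for the options object that holds max_images.
        params["sequential_image_generation_options"] = {"max_images": max_images}
    elif sequential != "disabled":
        raise ValueError("sequential must be 'auto' or 'disabled'")
    return params


print(seedream_params("doubao-seedream-4-5-251128", seed=42,
                      sequential="auto", max_images=3))
```

When sequential is "disabled" the options object is omitted entirely, since the page states it only takes effect in "auto" mode.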
Response Format
Supported Models
Gemini Series
Model Name: gemini-2.5-flash-image (Nano Banana)
Core Capabilities:
- ✅ Text-to-image (pure text description generates images)
- ✅ Image-to-image (single image + text generates new image)
- ✅ Multi-image-to-one (2-5 images fusion generation)
- ✅ Multi-turn conversational image generation (contextual continuous modification)
Model Name: gemini-3-pro-image-preview (Nano Banana Pro)
Core Capabilities:
- ✅ Text-to-image (pure text description generates images)
- ✅ Image-to-image (single image + text generates new image)
- ✅ Multi-image-to-one (2-5 images fusion generation)
- ✅ Multi-turn conversational image generation (contextual continuous modification)
- ✅ Higher quality output
Doubao Seedream Series
Model Name: doubao-seedream-3-0-t2i-250415
Core Capabilities:
- ✅ Text-to-image (pure text description generates images)
- ✅ Supports guidance scale adjustment
- ✅ Supports random seed control
- ❌ Does not support image-to-image
Model Name: doubao-seedream-4-0-250828
Core Capabilities:
- ✅ Text-to-image (pure text description generates images)
- ✅ Image-to-image (single image + text generates new image)
- ✅ Multi-image fusion (2-5 images fusion generation)
- ✅ Sequential image generation
- ✅ Supports 2K/4K resolution
- ✅ Supports multiple image formats
Supported Sizes:
- 2K: 2048×2048, 2304×1728, 1728×2304, 2560×1440, 1440×2560, 2496×1664, 1664×2496, 3024×1296
- 4K: 4096×4096, 4704×3520, 3520×4704, 5504×3040, 3040×5504, 4992×3328, 3328×4992, 6240×2656
Model Name: doubao-seedream-4-5-251128
Core Capabilities:
- ✅ Text-to-image (pure text description generates images)
- ✅ Image-to-image (single image + text generates new image)
- ✅ Multi-image fusion (2-5 images fusion generation)
- ✅ Sequential image generation
- ✅ Supports 2K/4K resolution
- ✅ Supports prompt optimization options
- ✅ Supports multiple image formats
Model Name: doubao-seededit-3-0-i2i-250628
Core Capabilities:
- ✅ Image editing (single image + text editing; content modification, style transfer, etc.)
- ✅ Supports guidance scale adjustment
- ✅ Supports random seed control
- ❌ Does not support pure text-to-image
GPT Image Generation Series
Model Name: gpt-image-1
Core Capabilities:
- ✅ Text-to-image (pure text description generates images)
- ✅ Image-to-image (up to 10 images + text)
- ✅ Supports image quality selection
- ✅ Supports input fidelity adjustment
- ✅ Multi-image fusion generation
Image Quality: low, medium, high
Generation Count: Can generate 1-10 images per request
Image Input: Supports JPEG, PNG, GIF, WEBP formats, max 10MB, up to 10 images
Model Name: gpt-image-1-mini
Core Capabilities:
- ✅ Text-to-image (pure text description generates images)
- ✅ Image-to-image (up to 10 images + text)
- ✅ Supports image quality selection
- ✅ Faster generation speed
- ✅ Lower cost
Image Quality: low, medium, high
Generation Count: Can generate 1-10 images per request
Tongyi Qianwen Series
Model Name: qwen-image-plus
Core Capabilities:
- ✅ Text-to-image (pure text description generates images)
- ✅ Chinese and English text rendering (excels at generating complex text in images)
- ✅ Multiple artistic styles
- ✅ Intelligent prompt extension
- ❌ Does not support image-to-image
Model Name: qwen-image-edit-plus
Core Capabilities:
- ✅ Image editing (input one image, output up to 6 images)
- ✅ Modify text in images
- ✅ Add/remove/move objects
- ✅ Transfer image styles
- ✅ Enhance image details
Best Practices
Prompt Optimization Tips
- Specify aspect ratio needs: Describe the composition direction in the prompt
  - Landscape: Use “horizontal composition”, “widescreen view”
  - Portrait: Use “vertical composition”, “vertical view”
- High-quality keywords:
  - “high quality”, “HD”, “professional photography”
  - “8k resolution”, “rich details”
- Multi-image fusion techniques:
  - Clearly describe the role of each image
  - Specify the fusion method (style transfer, element combination, etc.)
Size Selection Tips
- Design Purposes
FAQ
What image formats are supported?
Different models support different formats:
- Gemini: PNG, JPEG, JPG, WEBP, max 7MB
- Doubao Seedream 3.0/4.0: JPEG, PNG, max 10MB
- Doubao Seedream 4.5: JPEG, PNG, WEBP, BMP, TIFF, GIF, max 10MB
- GPT Image: JPEG, PNG, GIF, WEBP, max 10MB
- Tongyi Qianwen: JPEG, JPG, PNG, BMP, TIFF, WEBP, max 10MB
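The limits above can be checked locally before uploading an input image. The sketch below is one way to do it: the family keys, the helper name check_input_image, the treatment of .jpg as JPEG, and the reading of 1 MB as 1024 × 1024 bytes are all assumptions made for illustration.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: the page does not say whether MB means 10^6 or 2^20 bytes

# Formats and size limits copied from the list above; keys are illustrative labels.
INPUT_LIMITS = {
    "gemini": ({"png", "jpeg", "webp"}, 7 * MB),
    "doubao-seedream-3.0/4.0": ({"jpeg", "png"}, 10 * MB),
    "doubao-seedream-4.5": ({"jpeg", "png", "webp", "bmp", "tiff", "gif"}, 10 * MB),
    "gpt-image": ({"jpeg", "png", "gif", "webp"}, 10 * MB),
    "tongyi-qianwen": ({"jpeg", "png", "bmp", "tiff", "webp"}, 10 * MB),
}


def check_input_image(family: str, filename: str, size_bytes: int) -> bool:
    """Return True if the file extension and size fit the listed limits."""
    formats, max_bytes = INPUT_LIMITS[family]
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext == "jpg":
        ext = "jpeg"  # treat .jpg as JPEG
    return ext in formats and size_bytes <= max_bytes


print(check_input_image("gemini", "photo.PNG", 5 * MB))
```

The check looks only at the file extension, so it will not catch a file whose content does not match its name.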
How long are generated images valid?
Image URLs are valid for approximately 24 hours. It is recommended to download and save immediately after receiving the response, or upload to your own storage service.
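Since returned URLs expire, saving the image as soon as the response arrives is the safer pattern. The following is a minimal sketch using only the standard library; the helper name save_image and the example destination path are hypothetical.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen


def save_image(url: str, dest: Path, timeout: float = 30.0) -> Path:
    """Download an image URL to a local file and return the file path."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Stream the body to disk instead of holding the whole image in memory.
    with urlopen(url, timeout=timeout) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

A call such as save_image(image_url, Path("images/result.png")) can then be followed by an upload to your own storage service.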
Can I generate multiple images at once?
Tongyi Qianwen series generates 1 image per request. For multiple images, please make multiple concurrent requests.
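Concurrent requests can be issued with a thread pool. In the sketch below, generate_one stands for whatever function sends a single request and returns its parsed result; it is a hypothetical callable supplied by the caller, not part of the API, and the worker count is an arbitrary default.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def generate_many(generate_one: Callable[[str], dict],
                  prompts: List[str],
                  max_workers: int = 4) -> List[dict]:
    """Run one request per prompt concurrently; results keep prompt order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order of the inputs.
        return list(pool.map(generate_one, prompts))
```

Any exception raised by generate_one is re-raised when its result is collected, so callers may want to wrap individual requests with their own retry or error handling.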
Related Resources
Video Generation
View video generation API documentation
Model List
View all supported model information
