Gemini Native (Video)
Google Veo video generation models
Introduction
Veo is Google Vertex AI’s multimodal video generation model. It supports text-to-video (T2V), first-frame constraint, and first-and-last-frame constraint (3.1 series only) for coherent video generation. Use LeapX’s unified video API: submit a task to get a `task_id`, then poll the task to check its status and retrieve the result.
Authentication
Bearer Token, e.g. `Bearer sk-xxxxxxxxxx`
Supported models
| Model ID | Description |
|---|---|
| `veo-3.0-fast-generate-001` | Text-to-video, first-frame; fast (audio included by default) |
| `veo-3.1-fast-generate-preview` | Text-to-video, first/first+last frame; fast |
| `veo-3.0-generate-preview` | Text-to-video, first-frame |
| `veo-3.1-generate-preview` | Text-to-video, first/first+last frame |
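The capability column can be mirrored in client code to reject unsupported combinations before a request is sent. The following is a minimal sketch: the dictionary layout and the helper name `supports_last_frame` are illustrative assumptions, and only the model IDs and their capabilities come from the table.

```python
# Capabilities transcribed from the model table; the dict layout is illustrative.
VEO_MODELS = {
    "veo-3.0-fast-generate-001": {"last_frame": False, "fast": True},
    "veo-3.1-fast-generate-preview": {"last_frame": True, "fast": True},
    "veo-3.0-generate-preview": {"last_frame": False, "fast": False},
    "veo-3.1-generate-preview": {"last_frame": True, "fast": False},
}


def supports_last_frame(model_id: str) -> bool:
    """Return True if the model accepts a last-frame reference image."""
    try:
        return VEO_MODELS[model_id]["last_frame"]
    except KeyError:
        # Hypothetical error handling; the API may report unknown models differently.
        raise ValueError(f"unknown Veo model: {model_id}") from None
```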
Call flow
- Submit task: `POST /v1/video/generations` with `model`, `prompt`, and Veo-specific parameters.
- Poll status: `GET /v1/video/generations/{task_id}` until `status` is `succeeded` or `failed`.
- Get result: On success, `url` in the response contains the video (Veo may return `data:video/mp4;base64,...` or an OSS link).
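The submit-then-poll flow might look like the sketch below, written with the Python standard library only. Several details are assumptions rather than documented behavior: `BASE_URL` is a placeholder host, `task_id`, `status`, and `url` are assumed to be top-level JSON fields, and the polling interval and timeout are arbitrary. The `request` parameter is injectable so the loop can be exercised without network access.

```python
import base64
import json
import time
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder; substitute the real API host


def http_json(method, path, api_key, body=None):
    """Send one JSON request with Bearer authentication and decode the JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def generate_video(api_key, payload, request=http_json,
                   interval=5.0, timeout=600.0, sleep=time.sleep):
    """Submit a task, poll until it succeeds or fails, and return the result URL."""
    task = request("POST", "/v1/video/generations", api_key, payload)
    task_id = task["task_id"]  # assumed to be a top-level field of the response
    deadline = time.monotonic() + timeout
    while True:
        result = request("GET", f"/v1/video/generations/{task_id}", api_key)
        status = result.get("status")
        if status == "succeeded":
            return result["url"]
        if status == "failed":
            raise RuntimeError(f"video task {task_id} failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"video task {task_id} did not finish in {timeout}s")
        sleep(interval)


def video_bytes(url):
    """Return raw MP4 bytes from either a data: URI or a downloadable link."""
    prefix = "data:video/mp4;base64,"
    if url.startswith(prefix):
        return base64.b64decode(url[len(prefix):])
    with urllib.request.urlopen(url) as resp:
        return resp.read()
```

A call such as `generate_video("sk-xxxxxxxxxx", {"model": "veo-3.1-generate-preview", "prompt": "..."})` would return the `url` value, which `video_bytes` then turns into bytes regardless of whether it is a Base64 data URI or a link.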
Veo-specific parameters
- Prompt (`prompt`): Video generation prompt describing the scene and motion.
- Duration: Video duration in seconds; supported values: `4`, `6`, `8`.
- Aspect ratio: Only `16:9` and `9:16` are supported.
- Resolution: `720p` or `1080p`.
- First frame: First-frame reference image (URL or Base64) for image-to-video / first-frame constraint.
- Last frame: Last-frame reference image (veo-3.1 series only); use with a first frame for a start-and-end constraint.
- Audio: Whether to generate synchronized audio. Fast models ignore this and always include audio.
- Number of videos: Number of videos to generate per request; range `1`-`4`.
- Other parameters (`personGeneration`, `addWatermark`, `seed`): see Submit Video Task.
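The constraints in this section can be checked client-side before a task is submitted. The sketch below validates values only; the function name and its argument names (`duration`, `aspect_ratio`, `resolution`, `count`, `first_frame`, `last_frame`) are hypothetical Python names and not the API's request field names, which should be taken from Submit Video Task.

```python
ALLOWED_DURATIONS = {4, 6, 8}              # seconds
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16"}
ALLOWED_RESOLUTIONS = {"720p", "1080p"}


def check_veo_options(model, duration, aspect_ratio, resolution,
                      count=1, first_frame=None, last_frame=None):
    """Return a list of problems; an empty list means the options look valid."""
    problems = []
    if duration not in ALLOWED_DURATIONS:
        problems.append(f"duration must be one of {sorted(ALLOWED_DURATIONS)}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append("aspect ratio must be 16:9 or 9:16")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append("resolution must be 720p or 1080p")
    if not (isinstance(count, int) and 1 <= count <= 4):
        problems.append("number of videos must be an integer from 1 to 4")
    if last_frame is not None:
        # Last-frame constraint is limited to the veo-3.1 series.
        if not model.startswith("veo-3.1"):
            problems.append("last frame is supported by veo-3.1 models only")
        # The last frame is meant to be paired with a first frame.
        if first_frame is None:
            problems.append("last frame requires a first frame")
    return problems
```

Returning a list instead of raising on the first error lets a caller report every invalid option at once.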
