Vidu Q3 Pro vs Turbo on Novita AI: Which Video Mode Should You Use?

Choose Vidu Q3 Turbo first when you need lower cost, fast iteration, or high-volume video tests. Choose Vidu Q3 Pro when you are willing to pay its higher per-second price and want to compare it against Turbo for a final creative pass. On Novita AI, both variants expose text-to-video, image-to-video, and start-end-to-video endpoints and support asynchronous generation, and each variant is priced per second at the same rate across those three modes.

Vidu Q3 Pro vs Turbo selection summary

The clearest source-backed difference between Vidu Q3 Pro and Vidu Q3 Turbo on Novita AI is pricing. The public Novita AI pricing payload lists Turbo at lower per-second rates than Pro for 540p, 720p, and 1080p. The API docs also show that both variants are available through separate asynchronous endpoints for text-to-video, image-to-video, and start-end-to-video.

| Decision point | Start with Vidu Q3 Turbo | Start with Vidu Q3 Pro |
| --- | --- | --- |
| Main goal | Explore prompts, run more variants, reduce per-second spend | Compare the Pro variant for final candidate clips |
| Budget profile | Lower peak and off-peak prices at every listed resolution | Higher per-second prices at every listed resolution |
| API modes on Novita AI | Text-to-video, image-to-video, start-end-to-video | Text-to-video, image-to-video, start-end-to-video |
| Output options in docs | Up to 1080p; 1-16 seconds | Up to 1080p; 1-16 seconds |
| Audio support in docs | Q3 audio-video generation controls are available | Q3 audio-video generation controls are available |
| Best first test | High-volume iteration, prompt search, rough cuts, social variants | Final comparison pass after Turbo narrows the prompt and mode |

Turbo and Pro are better viewed as two pricing and workflow options than as a simple good-versus-bad ranking. The public docs and pricing pages support a cost and endpoint comparison, but they do not publish a universal benchmark, latency score, or scene-quality ranking that settles the question for every prompt. If the output really matters, the more reliable way to decide is to run the same prompt or image set through both variants and compare the results side by side.

Vidu Q3 text-to-video, image-to-video, and start-end modes

Vidu Q3 is not a single setup. On Novita AI, the useful choice is two-dimensional: pick Pro or Turbo, then pick the generation mode that matches your source material.

| Mode | What you provide | Use it when | Pro endpoint | Turbo endpoint |
| --- | --- | --- | --- | --- |
| Text-to-video | A text prompt | You are exploring a new scene, character, camera move, ad concept, or storyboard idea from scratch | /v3/async/vidu-q3-pro-t2v | /v3/async/vidu-q3-turbo-t2v |
| Image-to-video | One reference image plus optional motion prompt | You already have a product image, character frame, style reference, or still composition to animate | /v3/async/vidu-q3-pro-i2v | /v3/async/vidu-q3-turbo-i2v |
| Start-end-to-video | Two images, one start frame and one end frame | You need the model to interpolate between a known first and last frame | /v3/async/vidu-q3-pro-f2v | /v3/async/vidu-q3-turbo-f2v |
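
The variant-plus-mode choice maps directly onto an endpoint path, so it can be captured in a small lookup. The following is a minimal Python sketch: the function name `endpoint_for` and the mode keys are illustrative choices, not part of the Novita AI API, while the paths and host are the ones shown in this article.

```python
# Illustrative helper; names are hypothetical, paths come from the table above.
BASE_URL = "https://api.novita.ai"

# Path suffix for each generation mode.
MODE_SUFFIX = {
    "text-to-video": "t2v",
    "image-to-video": "i2v",
    "start-end-to-video": "f2v",
}

VARIANTS = ("pro", "turbo")


def endpoint_for(variant: str, mode: str) -> str:
    """Return the full URL for a Vidu Q3 variant and generation mode."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    if mode not in MODE_SUFFIX:
        raise ValueError(f"unknown mode: {mode!r}")
    return f"{BASE_URL}/v3/async/vidu-q3-{variant}-{MODE_SUFFIX[mode]}"
```

Calling `endpoint_for("turbo", "image-to-video")` yields the Turbo I2V URL, which keeps the variant and mode as two independent switches in your own tooling.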

For text-to-video, the docs list a required prompt, an audio boolean, duration, resolution, aspect_ratio, off_peak, and watermark controls. Pro text-to-video accepts prompts up to 2,000 characters; Turbo text-to-video accepts prompts up to 5,000 characters.

For image-to-video, the docs require an images array. Pro image-to-video currently supports one image input, with JPG, JPEG, PNG, and WebP accepted, a maximum 50 MB per image, and an aspect ratio between 1:4 and 4:1. The Pro image-to-video docs list audio as a custom audio URL field for background music. Turbo image-to-video also uses a reference image array, supports the same listed image formats and 50 MB limit, and lists an audio boolean plus an audio_type option: all, speech_only, or sound_effect_only.

For start-end-to-video, both Pro and Turbo docs require exactly two images: the first image is the start frame and the second image is the end frame. The docs list 1-16 second duration and 540p, 720p, and 1080p resolution options. Use this mode when you care about where a transition begins and ends more than you care about discovering a scene from a blank prompt.
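
Because the three modes have different input rules, it can help to validate a request locally before spending credits on it. The sketch below encodes only the limits quoted in this section. The function name `build_payload` and the short mode keys are hypothetical. How each image entry is encoded (URL or otherwise), and whether the image modes use the same `prompt` field name, are assumptions to check against the endpoint docs.

```python
ALLOWED_RESOLUTIONS = ("540p", "720p", "1080p")
# Prompt limits quoted above for text-to-video only.
T2V_PROMPT_LIMIT = {"pro": 2000, "turbo": 5000}
# Number of images each mode expects.
IMAGE_COUNT = {"t2v": 0, "i2v": 1, "f2v": 2}
TURBO_I2V_AUDIO_TYPES = ("all", "speech_only", "sound_effect_only")


def build_payload(variant, mode, prompt=None, images=(), duration=5,
                  resolution="720p", audio=None, audio_type=None):
    """Validate inputs against the limits above and return a request body."""
    if variant not in T2V_PROMPT_LIMIT:
        raise ValueError(f"unknown variant: {variant!r}")
    if mode not in IMAGE_COUNT:
        raise ValueError(f"unknown mode: {mode!r}")
    if not 1 <= duration <= 16:
        raise ValueError("duration must be between 1 and 16 seconds")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if len(images) != IMAGE_COUNT[mode]:
        raise ValueError(f"{mode} expects {IMAGE_COUNT[mode]} image(s)")
    if mode == "t2v":
        if not prompt:
            raise ValueError("text-to-video requires a prompt")
        if len(prompt) > T2V_PROMPT_LIMIT[variant]:
            raise ValueError("prompt exceeds the limit for this variant")
    # Pro image-to-video takes an audio URL; the other endpoints take a boolean.
    audio_is_url = (variant, mode) == ("pro", "i2v")
    if audio is not None and isinstance(audio, str) != audio_is_url:
        raise ValueError("audio has the wrong type for this endpoint")
    if audio_type is not None:
        # audio_type is only described above for Turbo image-to-video.
        if (variant, mode) != ("turbo", "i2v") or audio_type not in TURBO_I2V_AUDIO_TYPES:
            raise ValueError("audio_type is not valid for this endpoint")
    payload = {"duration": duration, "resolution": resolution}
    for key, value in (("prompt", prompt), ("audio", audio), ("audio_type", audio_type)):
        if value is not None:
            payload[key] = value
    if images:
        # For start-end-to-video the order matters: start frame, then end frame.
        payload["images"] = list(images)
    return payload
```

A check like this catches the most common mistakes early, such as sending one image to the start-end endpoint or reusing a 3,000-character Turbo prompt on Pro text-to-video.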

Vidu Q3 Pro and Turbo pricing

Novita AI pricing is listed per second for Vidu Q3 Pro and Vidu Q3 Turbo. Current public pricing, checked on June 23, 2026, shows the same rates across text-to-video, image-to-video, and start-end-to-video for each variant and resolution.

| Resolution | Vidu Q3 Pro peak | Vidu Q3 Pro off-peak | Vidu Q3 Turbo peak | Vidu Q3 Turbo off-peak |
| --- | --- | --- | --- | --- |
| 540p | $0.0625/s | $0.0313/s | $0.0357/s | $0.0179/s |
| 720p | $0.1339/s | $0.0670/s | $0.0536/s | $0.0268/s |
| 1080p | $0.1429/s | $0.0714/s | $0.0714/s | $0.0357/s |

Here is what that means for common test clips:

| Test clip | Pro peak | Pro off-peak | Turbo peak | Turbo off-peak |
| --- | --- | --- | --- | --- |
| 5 seconds at 540p | $0.3125 | $0.1565 | $0.1785 | $0.0895 |
| 10 seconds at 720p | $1.3390 | $0.6700 | $0.5360 | $0.2680 |
| 16 seconds at 1080p | $2.2864 | $1.1424 | $1.1424 | $0.5712 |
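
The clip costs above are simply the per-second rate multiplied by the clip length. If you want to price other durations, a small estimator is enough. This is a sketch that hard-codes the rates from the pricing table in this article; the function name `clip_cost` is illustrative, and the rates should be refreshed from the live pricing page before you rely on them.

```python
from decimal import Decimal

# Per-second USD prices copied from the pricing table above.
PRICE_PER_SECOND = {
    ("pro", "540p"): {"peak": Decimal("0.0625"), "off_peak": Decimal("0.0313")},
    ("pro", "720p"): {"peak": Decimal("0.1339"), "off_peak": Decimal("0.0670")},
    ("pro", "1080p"): {"peak": Decimal("0.1429"), "off_peak": Decimal("0.0714")},
    ("turbo", "540p"): {"peak": Decimal("0.0357"), "off_peak": Decimal("0.0179")},
    ("turbo", "720p"): {"peak": Decimal("0.0536"), "off_peak": Decimal("0.0268")},
    ("turbo", "1080p"): {"peak": Decimal("0.0714"), "off_peak": Decimal("0.0357")},
}


def clip_cost(variant: str, resolution: str, seconds: int, off_peak: bool = False) -> Decimal:
    """Return the listed cost in USD for one clip of the given length."""
    tier = "off_peak" if off_peak else "peak"
    return PRICE_PER_SECOND[(variant, resolution)][tier] * seconds
```

For example, `clip_cost("pro", "720p", 10)` reproduces the $1.3390 figure in the table. At the listed peak rates, Pro works out to roughly 1.75 times the Turbo price at 540p, about 2.5 times at 720p, and about 2 times at 1080p.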

Off-peak mode makes the most sense when turnaround is flexible. The Vidu Q3 API docs describe off-peak tasks as lower-cost tasks processed within 48 hours, which can work well when you are exploring prompts and want a broader batch of tests at a lower cost. If you are building a user-facing flow, peak mode is still the safer default unless delayed delivery is already part of the product experience.
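
One way to apply that rule consistently is to derive the `off_peak` flag from how soon the clip is needed. The sketch below is a deliberate simplification: the helper names are hypothetical, and it treats every user-facing request as peak, which ignores products where delayed delivery is acceptable.

```python
# Processing window for off-peak tasks, as described in the paragraph above.
OFF_PEAK_WINDOW_HOURS = 48


def can_run_off_peak(hours_until_needed: float, user_facing: bool = False) -> bool:
    """Return True when a job can tolerate the off-peak processing window."""
    if user_facing:
        # Simplifying assumption: interactive flows always run at peak.
        return False
    return hours_until_needed >= OFF_PEAK_WINDOW_HOURS


def with_schedule(payload: dict, hours_until_needed: float, user_facing: bool = False) -> dict:
    """Return a copy of the request body with off_peak set from the deadline."""
    scheduled = dict(payload)
    scheduled["off_peak"] = can_run_off_peak(hours_until_needed, user_facing)
    return scheduled
```

A batch of exploratory prompts due next week would get `off_peak: true`, while anything needed within two days stays on peak pricing.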

Which Vidu Q3 mode should you test first?

The easiest way to choose a mode is to start with the input you already have. A lot of disappointing tests come from picking the most exciting option first, instead of the one that best matches the material on hand.

| Situation | First mode to test | Recommended variant | Why |
| --- | --- | --- | --- |
| You only have a written idea | Text-to-video | Turbo | It lets you explore more prompt directions at a lower per-second cost. |
| You have a product render or character still | Image-to-video | Turbo first, then Pro for finalists | The reference image constrains the visual target, and Turbo keeps iteration cheaper. |
| You have a storyboard with a known first and last frame | Start-end-to-video | Turbo first, then Pro if needed | The two images give the model explicit endpoints, which is useful for controlled transitions. |
| You need a silent clip for later editing | Text-to-video or image-to-video with audio disabled | Turbo | The docs expose an audio control, so you can avoid generating audio you will replace. |
| You are deciding between final candidate clips | Same mode in both variants | Pro and Turbo side by side | Use identical inputs and compare outputs for your scene instead of relying on generic assumptions. |

If you are new to Vidu Q3 on Novita AI, this is usually the smoothest way to start:

  1. Run Turbo text-to-video at 540p or 720p to find the prompt direction.
  2. Move to image-to-video if you need identity, product, or visual-style control from a still image.
  3. Use start-end-to-video only when you have a real first frame and last frame.
  4. Re-run your strongest candidate in Pro at the target resolution before deciding whether the higher price is justified for that scene.

That sequence keeps the more expensive comparison step close to the final decision, when you already have a promising direction. It also helps you avoid spending Pro budget on early prompt exploration that you may end up discarding anyway.

Vidu Q3 API endpoints and request flow

All six Vidu Q3 endpoints in this comparison use Novita AI’s v3 asynchronous task pattern. You submit a generation request, receive a task_id, then call the Task Result API with that task_id to retrieve the generated video when the task succeeds.

| Endpoint | Method | Result pattern |
| --- | --- | --- |
| /v3/async/vidu-q3-pro-t2v | POST | Returns task_id |
| /v3/async/vidu-q3-pro-i2v | POST | Returns task_id |
| /v3/async/vidu-q3-pro-f2v | POST | Returns task_id |
| /v3/async/vidu-q3-turbo-t2v | POST | Returns task_id |
| /v3/async/vidu-q3-turbo-i2v | POST | Returns task_id |
| /v3/async/vidu-q3-turbo-f2v | POST | Returns task_id |
| /v3/async/task-result | GET | Returns task status and generated media when available |

A minimal Turbo text-to-video request looks like this:

curl --request POST \
  --url https://api.novita.ai/v3/async/vidu-q3-turbo-t2v \
  --header "Authorization: Bearer $NOVITA_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "prompt": "A close-up product launch video on a clean studio table, soft camera push-in, subtle lighting movement",
    "duration": 5,
    "resolution": "720p",
    "aspect_ratio": "16:9",
    "audio": true,
    "off_peak": false
  }'

Then poll the task result endpoint:

curl --request GET \
  --url "https://api.novita.ai/v3/async/task-result?task_id=$NOVITA_TASK_ID" \
  --header "Authorization: Bearer $NOVITA_API_KEY"

For image-to-video, replace the endpoint with the I2V endpoint and provide the images array. For start-end-to-video, use the F2V endpoint and provide two images in order: start frame first, end frame second.
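
The same submit-then-poll flow can be sketched in Python using only the standard library. Treat this as an illustration under stated assumptions rather than a reference client. The helper names are hypothetical, and the code assumes that `task_id` is a top-level key in the submit response. It also assumes that the result reports status at `task.status` with values such as `TASK_STATUS_SUCCEED` and `TASK_STATUS_FAILED`. Confirm both against the Task Result API docs.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "https://api.novita.ai"


def _call(method, url, api_key, body=None):
    """Send one JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def submit(path, payload, api_key):
    """POST a generation request; assumes task_id is a top-level response key."""
    return _call("POST", BASE_URL + path, api_key, payload)["task_id"]


def fetch_result(task_id, api_key):
    """GET the current task result for a task_id."""
    query = urllib.parse.urlencode({"task_id": task_id})
    return _call("GET", f"{BASE_URL}/v3/async/task-result?{query}", api_key)


def is_finished(result):
    """ASSUMPTION: status lives at result['task']['status'] with these values."""
    status = result.get("task", {}).get("status", "")
    return status in {"TASK_STATUS_SUCCEED", "TASK_STATUS_FAILED"}


def wait_for(fetch, finished=is_finished, interval=5.0, max_attempts=120, sleep=time.sleep):
    """Call fetch() until finished(result) is true or the attempts run out."""
    for attempt in range(max_attempts):
        result = fetch()
        if finished(result):
            return result
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError("task did not finish within the polling budget")
```

A typical call would be `task_id = submit("/v3/async/vidu-q3-turbo-t2v", payload, api_key)` followed by `wait_for(lambda: fetch_result(task_id, api_key))`. The polling loop takes the fetch function as an argument so the interval and retry budget can be tuned, or the loop tested, without touching the network code.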

Practical Vidu Q3 test plan

Use a small test matrix instead of one-off impressions. The goal is not to prove a universal winner; it is to choose the right variant and mode for your use case.

| Test pass | Variant | Mode | Resolution | What to evaluate |
| --- | --- | --- | --- | --- |
| Prompt search | Turbo | Text-to-video | 540p or 720p | Which prompt structure gives the right scene, motion, and framing? |
| Reference control | Turbo | Image-to-video | 720p | Does the model preserve the subject or product enough for your use case? |
| Transition control | Turbo | Start-end-to-video | 720p | Does the motion between first and last frame feel usable? |
| Final comparison | Turbo and Pro | Same winning mode | Target resolution | Is the Pro result worth the higher per-second cost for this scene? |
| Cost pass | Winning variant | Same winning mode | Target resolution | Should this run peak, or can it move to off-peak? |

When you compare Pro and Turbo, keep these variables the same:

  • Same prompt, image inputs, duration, resolution, and aspect ratio.
  • Same audio setting.
  • Same off-peak setting when you are comparing output results.
  • Same evaluation criteria: identity consistency, motion clarity, camera movement, audio usefulness, and editability.

If you change the prompt and the model variant at the same time, the comparison gets muddy, because you can no longer tell which change actually improved the result.
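
A simple guard against that is to generate both requests from one payload and check that nothing else differs. The helper names below are hypothetical, and the sketch assumes the same request body is valid on both variants. That holds for the shared fields listed in this article, but not for image-to-video audio, where Pro takes an audio URL and Turbo takes a boolean plus `audio_type`. Pro text-to-video also has the shorter prompt limit.

```python
import copy

# Settings that should stay fixed when only the variant is being compared.
CONTROLLED_KEYS = (
    "prompt", "images", "duration", "resolution",
    "aspect_ratio", "audio", "off_peak",
)


def paired_requests(mode_suffix: str, payload: dict) -> list:
    """Return (path, payload) pairs for Pro and Turbo with identical inputs."""
    return [
        (f"/v3/async/vidu-q3-{variant}-{mode_suffix}", copy.deepcopy(payload))
        for variant in ("pro", "turbo")
    ]


def changed_keys(run_a: dict, run_b: dict) -> list:
    """List the controlled settings that differ between two request bodies."""
    return [key for key in CONTROLLED_KEYS if run_a.get(key) != run_b.get(key)]
```

If `changed_keys` returns anything other than an empty list for a Pro and Turbo pair, the two clips are not a clean comparison of the variant alone.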

FAQ

Is Vidu Q3 Turbo cheaper than Vidu Q3 Pro on Novita AI?

Yes. Current Novita AI pricing, checked on June 23, 2026, lists Turbo below Pro at 540p, 720p, and 1080p for text-to-video, image-to-video, and start-end-to-video.

Do Vidu Q3 Pro and Turbo support the same modes?

Yes. Novita AI docs list separate Pro and Turbo endpoints for text-to-video, image-to-video, and start-end-to-video. Each endpoint returns a task_id and uses the v3 asynchronous task result flow.

Should I use text-to-video or image-to-video first?

Use text-to-video first when you only have an idea or written scene. Use image-to-video first when a reference image matters, such as a product shot, character frame, or fixed visual style.

When should I use start-end-to-video?

Use start-end-to-video when you have two frames and need the model to create the motion between them. It is the most structured of the three modes because the first and last frame are both specified.

Does Vidu Q3 support audio controls?

Yes. The Vidu Q3 docs include audio controls. Text-to-video and start-end-to-video expose an audio boolean. Pro image-to-video lists audio as a custom audio URL field for background music, while Turbo image-to-video lists an audio boolean plus audio_type options for all, speech_only, and sound_effect_only.

Should I run both Vidu Q3 Turbo and Pro for the same prompt?

Run Turbo first when you are exploring prompts, references, durations, and aspect ratios. If one result is close to what you need, rerun the same setup on Pro so the comparison isolates the model variant instead of mixing prompt and input changes.