MiniMax Hailuo 2.3 on Novita AI is easiest to choose by starting with your input. Use Text to Video when the scene only exists as a prompt. Use Image to Video when a first frame or reference image must anchor the clip. Use Fast Image to Video when you already have an image and want to test the lower-priced Fast I2V endpoint before spending on standard I2V.
MiniMax Hailuo 2.3 Mode Selection Table
| Decision | Start here | Why |
|---|---|---|
| You only have a written scene | Hailuo 2.3 Text to Video | The T2V endpoint requires prompt and does not require an image. |
| You have a product frame, character frame, storyboard frame, or approved visual | Hailuo 2.3 Image to Video | The I2V endpoint requires both prompt and image, so the input image anchors the first visual state. |
| You have an image and need a cheaper first I2V pass | Hailuo 2.3 Fast Image to Video | The Fast I2V pricing rows are lower than standard I2V for the same visible duration and resolution combinations. |
| You need 1080P output | Any of the three modes, limited to 6 seconds | Novita docs list 1080P support for 6-second Hailuo 2.3 jobs; 10-second jobs are listed at 768P only. |
| You need a 10-second clip | T2V, I2V, or Fast I2V at 768P | The docs list 10 seconds as an available duration, with 768P as the supported 10-second resolution. |
| You need prompt camera commands documented in the API reference | T2V or standard I2V | The T2V and I2V docs list 15 supported camera commands; the Fast I2V page does not list that camera-command section. |
| You are budgeting a broad test batch | Start at 6s 768P; use Fast I2V only when an image is available | 6s 768P is the lowest visible row for each mode, and Fast I2V is the lowest visible Hailuo 2.3 I2V row. |
| You are producing a near-final prompt-only clip | T2V at the target resolution and duration | T2V is the only Hailuo 2.3 endpoint here that does not need image input. |
| You are producing a near-final asset-led clip | Standard I2V at the target resolution and duration | Standard I2V keeps the image-led workflow while using the non-Fast endpoint. |
The important distinction is not “which mode is best” in the abstract. It is whether your first useful test should be prompt-only, image-led, or a lower-priced Fast I2V pass. If a source image is not available, Fast mode is not an option because Fast Hailuo 2.3 is documented as an image-to-video endpoint.
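The decision table reduces to two questions: is there a source image, and is a lower-priced first pass wanted? A minimal sketch of that rule follows. The function name `choose_mode` and its parameters are illustrative assumptions, not part of the Novita API; only the endpoint slugs come from the documented paths.

```python
def choose_mode(has_image: bool, prefer_lower_price: bool = False) -> str:
    """Return the endpoint slug suggested by the mode selection table.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if not has_image:
        # Fast I2V still requires an image, so prompt-only work goes to T2V
        # even when a cheaper pass would be preferred.
        return "minimax-hailuo-2.3-t2v"
    if prefer_lower_price:
        return "minimax-hailuo-2.3-fast-i2v"
    return "minimax-hailuo-2.3-i2v"


print(choose_mode(has_image=False, prefer_lower_price=True))
print(choose_mode(has_image=True, prefer_lower_price=True))
```

Note that asking for a lower price without an image still returns the T2V slug, which mirrors the point that Fast mode is not available for prompt-only work.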
MiniMax Hailuo 2.3 API Modes on Novita AI
Novita AI documents three separate asynchronous MiniMax Hailuo 2.3 video APIs:
| Mode | Novita API path | Required inputs | Shared controls |
|---|---|---|---|
| MiniMax Hailuo 2.3 Text to Video | POST /v3/async/minimax-hailuo-2.3-t2v | prompt | duration, resolution, enable_prompt_expansion, fast_pretreatment |
| MiniMax Hailuo 2.3 Image to Video | POST /v3/async/minimax-hailuo-2.3-i2v | prompt, image | duration, resolution, enable_prompt_expansion, fast_pretreatment |
| MiniMax Hailuo 2.3 Fast Image to Video | POST /v3/async/minimax-hailuo-2.3-fast-i2v | prompt, image | duration, resolution, enable_prompt_expansion |
All three endpoints are asynchronous. The create request returns a task_id, not a finished video URL. Applications should store the task ID and use the Novita AI Task Result API to retrieve the generated output when the job completes.
The Hailuo 2.3 API references checked on June 23, 2026 list two duration options, 6 and 10 seconds. The default resolution is 768P; 6-second videos can use 768P or 1080P, while 10-second videos are listed at 768P only. The prompt field is required in all three modes and accepts 1 to 2000 characters.
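Those limits are easy to check on the client before a job is submitted. The sketch below encodes only the constraints stated above; `validate_job` and `ALLOWED_RESOLUTIONS` are hypothetical names, and the live API may enforce additional rules that are not covered here.

```python
# Duration (seconds) -> resolutions listed for that duration in the docs.
ALLOWED_RESOLUTIONS = {6: {"768P", "1080P"}, 10: {"768P"}}


def validate_job(prompt: str, duration: int, resolution: str = "768P") -> list[str]:
    """Return a list of problems; an empty list means the inputs look valid.

    Hypothetical client-side check based on the documented limits only.
    """
    errors = []
    if not 1 <= len(prompt) <= 2000:
        errors.append("prompt must be 1 to 2000 characters")
    if duration not in ALLOWED_RESOLUTIONS:
        errors.append("duration must be 6 or 10")
    elif resolution not in ALLOWED_RESOLUTIONS[duration]:
        errors.append(f"{resolution} is not listed for {duration}-second jobs")
    return errors


print(validate_job("A slow push-in on a smart speaker.", 10, "1080P"))
```

Returning a list rather than raising on the first problem lets a form show every issue at once.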
For image-led modes, the image field supports a public URL or Base64 data URL such as data:image/jpeg;base64,.... That makes I2V and Fast I2V better suited to workflows where the visual seed already exists in storage, a design tool, a product catalog, or a previous generation step.
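When the image lives in memory or on disk rather than at a public URL, it can be encoded into the data URL form. This is a minimal sketch using only the Python standard library; `bytes_to_data_url` is a hypothetical helper name, and any size or format limits on the image field should be confirmed in the Novita docs.

```python
import base64


def bytes_to_data_url(data: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a Base64 data URL for the image field.

    Hypothetical helper; the caller is responsible for passing the MIME
    type that matches the actual image bytes.
    """
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Example with placeholder bytes; in practice read them from a file:
#   data = open("product-frame.jpg", "rb").read()
print(bytes_to_data_url(b"abc"))
```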
What Is the Difference Between T2V, I2V, and Fast I2V?
Choose Hailuo 2.3 Text to Video when the first version of the clip should come from language alone. This is the cleaner starting point for concept exploration, scene ideation, shot planning, and prompt tests where you do not yet have a fixed product image or character frame.
T2V is also the simplest request shape. It has no image upload or image URL requirement, so a product can collect a prompt, choose duration and resolution, submit the task, and poll for the result. Use it when the acceptance criteria are about the scene idea rather than fidelity to a specific starting image.
Choose Hailuo 2.3 Image to Video when the input image is part of the acceptance criteria. A product still, approved character frame, brand visual, storyboard panel, or generated keyframe should not be recreated from text if you already have the asset. Use I2V so the generation starts from the image you provide.
Standard I2V also fits review workflows where a designer, marketer, or product team signs off on a still frame before motion is added. The input image becomes the reference point for the clip, while the prompt describes how the scene should move.
Choose Hailuo 2.3 Fast Image to Video when you already have an image and want to test the Fast I2V endpoint’s lower listed prices. Fast I2V is not a prompt-only mode; it still requires image plus prompt. It is therefore a cost and endpoint choice inside an image-led workflow, not a replacement for T2V.
The docs describe Fast Hailuo 2.3 as accelerated and positioned to balance quality and performance at a more cost-effective rate. For practical planning, treat that as a reason to test it early with your own assets instead of assuming it will always replace standard I2V. If Fast I2V passes your visual acceptance criteria, it may be the better iteration lane. If it does not, move the same image and prompt direction to standard I2V.
There is one documented control difference to notice. The T2V and standard I2V request bodies include fast_pretreatment; the Fast I2V request body shown in the docs does not. The T2V and standard I2V docs also list 15 supported camera commands, including pan, tilt, zoom, truck, push, pull, pedestal, shake, tracking shot, and static shot. The Fast I2V page checked for this article does not include that camera-command section, so avoid promising identical camera-command behavior across all three modes unless your own tests confirm it.
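One way to respect that control difference in code is to build the request body per mode and include fast_pretreatment only where the docs show it. The sketch below is an assumption-laden illustration: `build_payload` and the short mode keys are hypothetical, and the `True`/`False` defaults simply mirror the article's example requests rather than documented API defaults.

```python
IMAGE_MODES = {"i2v", "fast-i2v"}
# Modes whose documented request body shows fast_pretreatment.
MODES_WITH_FAST_PRETREATMENT = {"t2v", "i2v"}


def build_payload(mode, prompt, duration, resolution="768P", image=None,
                  enable_prompt_expansion=True, fast_pretreatment=False):
    """Build a JSON-ready request body for one of the three modes.

    Hypothetical helper; mode keys "t2v", "i2v", "fast-i2v" are illustrative.
    """
    if mode != "t2v" and mode not in IMAGE_MODES:
        raise ValueError(f"unknown mode: {mode}")
    payload = {
        "prompt": prompt,
        "duration": duration,
        "resolution": resolution,
        "enable_prompt_expansion": enable_prompt_expansion,
    }
    if mode in IMAGE_MODES:
        if not image:
            raise ValueError(f"{mode} requires an image URL or data URL")
        payload["image"] = image
    if mode in MODES_WITH_FAST_PRETREATMENT:
        # Omitted for Fast I2V because its documented body does not show it.
        payload["fast_pretreatment"] = fast_pretreatment
    return payload


print(build_payload("fast-i2v", "Slow push-in.", 6,
                    image="https://example.com/product-frame.jpg"))
```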
How Much Does Hailuo 2.3 Cost on Novita AI?
Novita model-library and pricing data, checked on June 23, 2026, lists the following MiniMax Hailuo 2.3 rows:
| Mode | Duration | Resolution | Price |
|---|---|---|---|
| Hailuo 2.3 Text to Video | 6s | 768P | $0.28/video |
| Hailuo 2.3 Text to Video | 10s | 768P | $0.56/video |
| Hailuo 2.3 Text to Video | 6s | 1080P | $0.49/video |
| Hailuo 2.3 Image to Video | 6s | 768P | $0.28/video |
| Hailuo 2.3 Image to Video | 10s | 768P | $0.56/video |
| Hailuo 2.3 Image to Video | 6s | 1080P | $0.49/video |
| Hailuo 2.3 Fast Image to Video | 6s | 768P | $0.19/video |
| Hailuo 2.3 Fast Image to Video | 10s | 768P | $0.32/video |
| Hailuo 2.3 Fast Image to Video | 6s | 1080P | $0.33/video |
Two pricing takeaways matter for planning. First, standard T2V and standard I2V have the same visible price rows at the same duration and resolution. Choose between them based on input, not cost. Second, Fast I2V is lower priced than standard I2V across the visible Hailuo 2.3 rows, but it requires an image and should be evaluated against your own output criteria.
For early exploration, 6s at 768P is the lowest visible option in each mode. For a prompt-only test, that means $0.28/video with T2V. For an image-led test, that means $0.28/video with standard I2V or $0.19/video with Fast I2V.
For 1080P, the visible Hailuo 2.3 rows are 6-second jobs: $0.49/video for T2V or standard I2V, and $0.33/video for Fast I2V. For 10-second jobs, the visible rows are 768P: $0.56/video for T2V or standard I2V, and $0.32/video for Fast I2V.
Prices can change. Before a high-volume batch, verify the exact row in the Novita AI model library or the console pricing view.
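For batch budgeting, the rows above can be kept in a small lookup table. The sketch below hard-codes the prices as checked on June 23, 2026, so it is a snapshot rather than a live source; `PRICES`, `batch_cost`, and the short mode keys are hypothetical names.

```python
from decimal import Decimal

# (mode, duration in seconds, resolution) -> USD per video, snapshot only.
PRICES = {
    ("t2v", 6, "768P"): Decimal("0.28"),
    ("t2v", 10, "768P"): Decimal("0.56"),
    ("t2v", 6, "1080P"): Decimal("0.49"),
    ("i2v", 6, "768P"): Decimal("0.28"),
    ("i2v", 10, "768P"): Decimal("0.56"),
    ("i2v", 6, "1080P"): Decimal("0.49"),
    ("fast-i2v", 6, "768P"): Decimal("0.19"),
    ("fast-i2v", 10, "768P"): Decimal("0.32"),
    ("fast-i2v", 6, "1080P"): Decimal("0.33"),
}


def batch_cost(mode: str, duration: int, resolution: str, count: int) -> Decimal:
    """Estimate the cost of `count` videos; raises KeyError for unlisted rows."""
    return PRICES[(mode, duration, resolution)] * count


print(batch_cost("fast-i2v", 6, "768P", 50))
print(batch_cost("i2v", 6, "768P", 50))
```

Under these rows, 50 six-second 768P tests come to $9.50 on Fast I2V versus $14.00 on standard I2V. `Decimal` is used instead of floats so that totals do not pick up binary rounding noise.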
Which Mode Should You Test First?
If the idea still lives only in a brief, start with T2V. Keep the first prompt short, choose 6s 768P, and add camera direction only when it will affect the review. This keeps the first pass lightweight: no image prep, no asset upload, just a quick check of whether the scene concept is worth developing.
If the image is already part of the brief, use standard I2V first. A product still, character frame, or storyboard panel changes the job from “invent a scene” to “animate this exact starting point.” T2V may create something plausible, but it cannot preserve a specific source image unless that image is passed into an I2V endpoint.
Fast I2V is useful when you already have the image and want more room to experiment before picking finalists. Because it has the lowest visible Hailuo 2.3 price rows for image-led jobs, it is a sensible lane for testing motion direction, prompt wording, and whether the source image works as a seed. After that pass, keep using Fast I2V if the output meets your bar, or move the stronger candidates to standard I2V.
The main trap is choosing Fast I2V too early. It is still image-to-video, so it is not a shortcut for a written brief with no image attached. In that case, T2V is the better first test.
For broad exploration, 768P is usually enough to judge prompt direction, image fit, and motion ideas. Save 1080P for the smaller set of clips that are close enough to inspect in detail.
Use 10-second tests when the extra time changes the creative decision, not as the default first pass. The documented Hailuo 2.3 row for 10-second jobs is 768P; if your team needs 1080P, plan around 6-second jobs or confirm whether the live console has added a newer option before committing to a batch.
MiniMax Hailuo 2.3 API Workflow
A production integration should treat Hailuo 2.3 as an async job workflow:
- Choose T2V, I2V, or Fast I2V from the available input.
- Choose 6s or 10s duration.
- Choose 768P or 1080P, noting that 1080P is documented for 6-second jobs.
- Submit the request to the model-specific async endpoint.
- Store the returned task_id.
- Poll the Task Result API until the task succeeds or fails.
- Store the returned media URL according to your product’s retention rules.
Here is a minimal text-to-video request:
curl --location --request POST 'https://api.novita.ai/v3/async/minimax-hailuo-2.3-t2v' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${NOVITA_API_KEY}" \
--data-raw '{
"prompt": "A compact smart speaker on a kitchen counter lights up as the camera slowly pushes in. Soft morning light, clean product demo, no text overlays.",
"duration": 6,
"resolution": "768P",
"enable_prompt_expansion": true,
"fast_pretreatment": false
}'
Here is a minimal image-to-video request:
curl --location --request POST 'https://api.novita.ai/v3/async/minimax-hailuo-2.3-i2v' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${NOVITA_API_KEY}" \
--data-raw '{
"image": "https://example.com/product-frame.jpg",
"prompt": "Animate the product with a subtle light pulse while the camera makes a slow push-in. Keep the product centered and avoid adding text.",
"duration": 6,
"resolution": "768P",
"enable_prompt_expansion": true,
"fast_pretreatment": false
}'
Here is the same image-led test using Fast I2V:
curl --location --request POST 'https://api.novita.ai/v3/async/minimax-hailuo-2.3-fast-i2v' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${NOVITA_API_KEY}" \
--data-raw '{
"image": "https://example.com/product-frame.jpg",
"prompt": "Animate the product with a subtle light pulse while the camera makes a slow push-in. Keep the product centered and avoid adding text.",
"duration": 6,
"resolution": "768P",
"enable_prompt_expansion": true
}'
In all three cases, build for queued, processing, succeeded, and failed states. The initial response gives you a task ID; it does not mean the video is already available.
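The polling side of that workflow might look like the sketch below. It is deliberately transport-agnostic: `fetch_state` is a hypothetical callable that you would implement to call the Task Result API and map its response onto the four state names used here. The exact status strings and response shape are not covered in this article, so check them against the Novita docs before relying on this.

```python
import time
from typing import Callable

TERMINAL_STATES = {"succeeded", "failed"}


def wait_for_task(task_id: str,
                  fetch_state: Callable[[str], str],
                  interval_s: float = 5.0,
                  timeout_s: float = 600.0,
                  sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the task reaches a terminal state, then return that state.

    Any non-terminal state (for example "queued" or "processing") is treated
    as pending. Raises TimeoutError once `timeout_s` of waiting has elapsed.
    """
    waited = 0.0
    while True:
        state = fetch_state(task_id)
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout_s:
            raise TimeoutError(f"task {task_id} still {state} after {waited}s")
        sleep(interval_s)
        waited += interval_s


# Usage with a stand-in for the real Task Result call:
fake_states = iter(["queued", "processing", "succeeded"])
print(wait_for_task("demo-task", lambda _id: next(fake_states),
                    sleep=lambda _s: None))
```

Injecting `fetch_state` and `sleep` keeps the loop testable without network access or real delays. The interval and timeout defaults are arbitrary placeholders, not recommended values.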
Final Recommendation
For most teams, the first MiniMax Hailuo 2.3 test should be 6s 768P. Use T2V if you only have a prompt. Use standard I2V if the first frame or reference image is non-negotiable. Use Fast I2V if you have an image and want a lower-priced iteration lane before deciding which outputs deserve standard I2V or 1080P review.
That sequence keeps the first pass tied to the actual input constraint. It also avoids a common mistake: treating Fast mode as a universal shortcut. Fast Hailuo 2.3 is an image-to-video endpoint, so it is useful only after an image exists.
FAQ
Is MiniMax Hailuo 2.3 available for text-to-video on Novita AI?
Yes. Novita AI documents POST /v3/async/minimax-hailuo-2.3-t2v for MiniMax Hailuo 2.3 Text to Video.
Is MiniMax Hailuo 2.3 available for image-to-video on Novita AI?
Yes. Novita AI documents POST /v3/async/minimax-hailuo-2.3-i2v for standard Image to Video and POST /v3/async/minimax-hailuo-2.3-fast-i2v for Fast Image to Video.
What is the difference between Hailuo 2.3 I2V and Fast I2V?
Both require prompt and image. The Fast I2V endpoint has lower visible price rows than standard I2V and is documented separately as minimax-hailuo-2.3-fast-i2v. Standard I2V includes fast_pretreatment in the documented request body; Fast I2V does not show that field.
Does Hailuo 2.3 support 1080P?
Yes, for 6-second jobs. The Hailuo 2.3 docs checked on June 23, 2026 list 768P and 1080P for 6-second videos and 768P only for 10-second videos.
How much does MiniMax Hailuo 2.3 cost on Novita AI?
The visible rows checked on June 23, 2026 start at $0.28/video for 6s 768P T2V or standard I2V, $0.19/video for 6s 768P Fast I2V, $0.49/video for 6s 1080P T2V or standard I2V, and $0.33/video for 6s 1080P Fast I2V.
