Novita AI Now Offers Qwen-Image: Advanced 20B Text-to-Image Model with Superior Text Rendering

Qwen-Image on Novita AI

We’re excited to announce that Qwen-Image is now available on Novita AI at just $0.02 per image! This groundbreaking 20B MMDiT image foundation model brings significant advances in complex text rendering and precise image editing to our AI inference platform.

About Qwen-Image

Qwen-Image is a 20B-parameter MMDiT (Multimodal Diffusion Transformer) image foundation model from the Qwen team. It targets two areas that are hard for text-to-image models: rendering complex text accurately and editing images precisely.

The key features include:

  • Superior Text Rendering: Qwen-Image excels at complex text rendering, including multi-line layouts, paragraph-level semantics, and fine-grained details. It supports both alphabetic languages (e.g., English) and logographic languages (e.g., Chinese) with high fidelity.
  • Consistent Image Editing: Through an enhanced multi-task training paradigm, Qwen-Image preserves both semantic meaning and visual realism during editing operations.
  • Strong Cross-Benchmark Performance: Evaluated on multiple public benchmarks, Qwen-Image consistently outperforms existing models across diverse generation and editing tasks, establishing a strong foundation model for image generation.

Figure: Overview of the Qwen-Image architecture.
Source: Tech Report

Proven Performance

Qwen-Image has been evaluated on multiple public benchmarks: GenEval, DPG, and OneIG-Bench for general image generation, and GEdit, ImgEdit, and GSO for image editing. According to the Qwen team's reported results, it achieves state-of-the-art performance on all of them, demonstrating strong capabilities in both image generation and editing.

Furthermore, results on LongText-Bench, ChineseWord, and TextCraft show that it excels in text rendering—particularly in Chinese text generation—outperforming existing state-of-the-art models by a significant margin. This highlights Qwen-Image’s unique position as a leading image generation model that combines broad general capability with exceptional text rendering precision.

Figure: Qwen-Image benchmark results.
Source: Blog

Access Qwen-Image on Novita AI

As an AI inference provider, Novita AI serves Qwen-Image through a text-to-image API. The model is especially strong at creating graphic posters with native text, which makes it well suited to professional applications that require high-quality text integration. For full implementation details, please refer to our documentation.

How Our API Works

We’ve implemented Qwen-Image as an asynchronous API. When you submit a request, the initial response contains only a task_id. You then pass that task_id to our Task Result API to retrieve the image generation results.

API Specifications

Endpoint: https://api.novita.ai/v3/async/qwen-image-txt2img

Request Headers:

  • Content-Type (string, required): Must be application/json
  • Authorization (string, required): Bearer authentication format, for example: Bearer {{API Key}}

Request Body:

  • prompt (string, required): Text prompt for image generation
  • size (string, optional): Size of the generated image in pixels, written as width*height. Default: 1024*1024. Each dimension must be between 256 and 1536.
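
The request body rules above can be checked on the client before anything is sent. The following is a minimal Python sketch, not part of the official API: the helper name build_body and the choice to validate locally are assumptions made for illustration, while the field names, the default size, and the 256 to 1536 range come from the specification above.

```python
import json
import re

MIN_DIM, MAX_DIM = 256, 1536   # per-dimension range from the spec above
DEFAULT_SIZE = "1024*1024"     # documented default


def build_body(prompt: str, size: str = DEFAULT_SIZE) -> str:
    """Return the JSON request body, validating `size` locally first.

    Hypothetical helper: the API itself may apply additional checks.
    """
    if not prompt:
        raise ValueError("prompt is required")

    # size must be two integers joined by '*', e.g. "1024*768"
    match = re.fullmatch(r"(\d+)\*(\d+)", size)
    if match is None:
        raise ValueError(f"size must look like 'width*height', got {size!r}")

    width, height = int(match.group(1)), int(match.group(2))
    for name, value in (("width", width), ("height", height)):
        if not MIN_DIM <= value <= MAX_DIM:
            raise ValueError(f"{name} {value} is outside {MIN_DIM}-{MAX_DIM}")

    return json.dumps({"prompt": prompt, "size": size})
```

Rejecting a malformed size locally avoids submitting a task that could not succeed.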

Response:

  • task_id (string, required): Use the task_id to request our Task Result API to retrieve the generated outputs

Getting Started with Qwen-Image on Novita AI

Here’s how to use Qwen-Image through our API:

Step 1: Generate a task_id

Send a POST request to our Qwen-Image Text to Image API:

Request:

curl --location 'https://api.novita.ai/v3/async/qwen-image-txt2img' \
--header 'Authorization: Bearer {{API Key}}' \
--header 'Content-Type: application/json' \
--data '{
    "prompt": "A cinematic scene of a quiet girl with short brown hair sitting by a misty lake at dawn. She wears an oversized sweater, holding a warm mug. Soft morning light filters through the trees, cool tones, tranquil mood, light fog, 50mm photography style.",
    "size": "1024*1024"
}'

Response:

{
    "task_id": "{Returned Task ID}"
}
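
For readers who prefer Python to curl, the same Step 1 request might look like the sketch below. It uses only the standard library. The function names build_submit_request and submit_task are illustrative assumptions; the endpoint, headers, body fields, and the task_id response field are the ones shown above.

```python
import json
import urllib.request

SUBMIT_URL = "https://api.novita.ai/v3/async/qwen-image-txt2img"


def build_submit_request(api_key: str, prompt: str,
                         size: str = "1024*1024") -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    body = json.dumps({"prompt": prompt, "size": size}).encode("utf-8")
    return urllib.request.Request(
        SUBMIT_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_task(api_key: str, prompt: str, size: str = "1024*1024") -> str:
    """Send the request and return the task_id from the JSON response."""
    request = build_submit_request(api_key, prompt, size)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["task_id"]
```

Calling submit_task with your API key and a prompt returns the task_id string used in the next step.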

Step 2: Retrieve your generated images

Use the task_id to get your output images:

curl --location --request GET 'https://api.novita.ai/v3/async/task-result?task_id={Returned Task ID}' \
--header 'Authorization: Bearer {{API Key}}'

HTTP status codes in the 2xx range indicate that the request was accepted, while status codes in the 5xx range indicate internal server errors. Once the task has finished, the image URL is available in the images field of the response.
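
Because the API is asynchronous, a client usually has to call the Task Result API more than once. The Python sketch below shows one way to poll, under a stated assumption: it treats a non-empty images field as the signal that the task has finished. The function names, polling interval, and attempt limit are illustrative choices, and a production client should also check the task status fields described in the documentation so that failed tasks are reported rather than timing out.

```python
import json
import time
import urllib.parse
import urllib.request

RESULT_URL = "https://api.novita.ai/v3/async/task-result"


def fetch_result(api_key: str, task_id: str) -> dict:
    """Perform one GET against the Task Result API and decode the JSON."""
    query = urllib.parse.urlencode({"task_id": task_id})
    request = urllib.request.Request(
        f"{RESULT_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())


def wait_for_images(task_id: str, fetch, interval: float = 2.0,
                    max_attempts: int = 30) -> list:
    """Poll `fetch(task_id)` until the response has a non-empty `images` list.

    Assumption: `images` is missing or empty while the task is still running.
    """
    for attempt in range(max_attempts):
        images = fetch(task_id).get("images") or []
        if images:
            return images
        if attempt < max_attempts - 1:
            time.sleep(interval)  # wait before asking again
    raise TimeoutError(f"no images for task {task_id} after {max_attempts} attempts")
```

A typical call would be wait_for_images(task_id, lambda tid: fetch_result(api_key, tid)). Passing the fetch function as a parameter keeps the polling loop easy to test without network access.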

Why We Added Qwen-Image to Our Platform

As an AI inference provider, we chose to integrate Qwen-Image because it addresses a critical gap in AI image generation: high-quality text rendering. Our users can now:

  • Create professional graphic posters with clear, readable text
  • Generate images with multi-line text layouts and paragraph-level semantics
  • Support both English and Chinese text with high fidelity
  • Build on a model that reports state-of-the-art results across multiple image generation benchmarks
  • Access flexible sizing options from 256×256 to 1536×1536 pixels

Qwen-Image Demo

Mount Fuji with cherry blossoms in the foreground, clear sky, peaceful spring day, soft natural light, realistic landscape.

A man in a suit is standing in front of the window, looking at the bright moon outside the window. The man is holding a yellowed paper with handwritten words on it: “A lantern moon climbs through the silver night, Unfurling quiet dreams across the sky, Each star a whispered promise wrapped in light, That dawn will bloom, though darkness wanders by.” There is a cute cat on the windowsill.

A young girl wearing school uniform stands in a classroom, writing on a chalkboard. The text “Introducing Qwen-Image, a foundational image generation model that excels in complex text rendering and precise image editing” appears in neat white chalk at the center of the blackboard. Soft natural light filters through windows, casting gentle shadows. The scene is rendered in a realistic photography style with fine details, shallow depth of field, and warm tones. The girl’s focused expression and chalk dust in the air add dynamism. Background elements include desks and educational posters, subtly blurred to emphasize the central action. Ultra-detailed 32K resolution, DSLR-quality, soft bokeh effect, documentary-style composition


The text ‘Qwen-Image on Novita AI’ designed in a sleek, translucent glass style. Each letter appears as if made from frosted or glossy glass, with realistic lighting, soft shadows, and subtle reflections. The background is minimal and modern — possibly a soft gradient, abstract blur, or dark surface — to enhance the glass effect. The overall look is elegant, futuristic, and visually striking.


Start Using Qwen-Image Today

Ready to experience superior text rendering in AI-generated images? Get started with Qwen-Image on our AI inference platform:

  1. Sign up for your Novita AI account
  2. Get your API key from the dashboard
  3. Use our comprehensive API documentation
  4. Start generating images with exceptional text quality

Qwen-Image is now available on Novita AI – bringing you the next generation of text-to-image generation with unmatched text rendering capabilities through our AI inference platform.

Novita AI is an AI cloud platform that helps developers easily deploy AI models through a simple API, backed by affordable and reliable GPU cloud infrastructure. By supporting open-source libraries for LLM inference and serving, Novita AI is driving the future of AI innovation.

