How to Access ERNIE 4.5: Effortless Ways via Web, API, and Code

https://blogs.novita.ai/how-to-access-ernie-4-5-effortless-ways-via-web-api-and-code/

ERNIE 4.5 is Baidu’s advanced AI model family for powerful text and multimodal processing. With options for both large-scale and lightweight deployment, ERNIE 4.5 offers efficient performance and cost-effective access for developers and businesses. Whether you’re working with text, images, or both, ERNIE 4.5 can be accessed easily through web interfaces, APIs, and cloud platforms—no complex setup required.

Simple Introduction to ERNIE 4.5

ERNIE 4.5 is a family of advanced AI models developed by Baidu, focusing on efficient multimodal and text-based processing. These models utilize Mixture of Experts (MoE) architectures for the larger variants and dense architectures for the smaller ones. They support text and vision modalities, with both post-trained (PT) and base versions available. Below is a table of the key model variants and a summary of ERNIE's innovations in the AI training flow.

| Model | Base Params | Active Params | Type | Modality | Training |
|---|---|---|---|---|---|
| ERNIE 4.5 VL 424B A47B | 424B | 47B | MoE | Text + Vision | PT |
| ERNIE 4.5 VL 424B A47B Base | 424B | 47B | MoE | Text + Vision | Base |
| ERNIE 4.5 VL 28B A3B | 28B | 3B | MoE | Text + Vision | PT |
| ERNIE 4.5 VL 28B A3B Base | 28B | 3B | MoE | Text + Vision | Base |
| ERNIE 4.5 VL 28B A3B Thinking | 28B | 3B | MoE | Text + Vision | PT |
| ERNIE 4.5 300B A47B | 300B | 47B | MoE | Text | PT |
| ERNIE 4.5 300B A47B Base | 300B | 47B | MoE | Text | Base |
| ERNIE 4.5 21B A3B | 21B | 3B | MoE | Text | PT |
| ERNIE 4.5 21B A3B Base | 21B | 3B | MoE | Text | Base |
| ERNIE 4.5 21B A3B Thinking | 21B | 3B | MoE | Text | PT |
| ERNIE 4.5 0.3B | 0.3B | - | Dense | Text | PT |
| ERNIE 4.5 0.3B Base | 0.3B | - | Dense | Text | Base |

AI Training Flow: ERNIE Innovations Highlighted

1. Multimodal Heterogeneous MoE Pre-Training

Joint text & vision pre-training with heterogeneous MoE structure, modality-isolated routing, and balanced multimodal loss.

2. Scaling-Efficient Infrastructure

Hybrid parallelism, hierarchical load balancing, expert parallelism, memory-optimized scheduling, and lossless quantization for high throughput and efficient inference.

3. Modality-Specific Post-Training

Fine-tuning for text or vision tasks, supporting SFT, DPO, and UPO to meet diverse real-world application needs.

Performance Comparison: ERNIE 4.5 vs. GPT-4o

ERNIE 4.5 delivers superior performance and exceptional cost efficiency compared to GPT-4o, making it a highly competitive choice for large-scale AI deployments. The pricing referenced here is from Novita AI.


Access ERNIE 4.5 Through the Baidu Platform (Free Trial)

You can try it out directly through the Baidu platform’s web interface, with no installation required. Simply visit the website and start your free trial instantly.


Alternatively, you can use the Novita API Playground to experiment with ERNIE 4.5 in a developer-friendly environment.


Access ERNIE 4.5 Locally

What are the system requirements for using ERNIE 4.5?

FP16 Precision

| Model | Parameters (Active) | VRAM Needed | Ideal GPU(s) |
|---|---|---|---|
| ERNIE 4.5 VL 424B | 424B (47B active) | ~945 GB | NVIDIA H100 (80GB) × 12 |
| ERNIE 4.5 300B | 300B (47B active) | ~668 GB | NVIDIA H100 (80GB) × 9 |
| ERNIE 4.5 VL 28B / VL 28B A3B Thinking | 28B (3B active) | ~80 GB | NVIDIA A100/H100 (80GB) |
| ERNIE 4.5 21B / 21B A3B Thinking | 21B (3B active) | ~48 GB | NVIDIA RTX 4090 (24GB) × 2 |
| ERNIE 4.5 0.3B | 0.3B | ~2.5 GB | NVIDIA RTX 4090 (24GB) / RTX 3060 (12GB) |

INT4 Precision

| Model | Parameters (Active) | VRAM Needed | Ideal GPU(s) |
|---|---|---|---|
| ERNIE 4.5 VL 424B | 424B (47B active) | ~237 GB | NVIDIA H100 (80GB) × 3 |
| ERNIE 4.5 300B | 300B (47B active) | ~168 GB | NVIDIA H100 (80GB) × 3 |
| ERNIE 4.5 VL 28B / VL 28B A3B Thinking | 28B (3B active) | ~17 GB | NVIDIA RTX 4090 (24GB) / A10G (24GB) |
| ERNIE 4.5 21B / 21B A3B Thinking | 21B (3B active) | ~13 GB | NVIDIA RTX 4080 (16GB) / A10G (24GB) |
| ERNIE 4.5 0.3B | 0.3B | ~1.8 GB | Most GPUs with >4 GB VRAM |
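The figures in these tables follow roughly from bytes per parameter (2 bytes in FP16, 0.5 bytes in INT4) times the total parameter count, plus runtime overhead. Note that for MoE models, all expert weights must be resident in VRAM even though only the active subset runs per token. A rough estimator (the ~12% overhead factor is an assumption for illustration, not an official number):

```python
# Approximate bytes stored per parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int4": 0.5}

def estimate_vram_gb(total_params_billions, precision, overhead=1.12):
    """Rough VRAM estimate in GB: weights at the given precision
    plus ~12% runtime overhead (assumed, for illustration).

    For MoE models, pass the total (not active) parameter count,
    since all expert weights must fit in memory.
    """
    return total_params_billions * BYTES_PER_PARAM[precision] * overhead

print(round(estimate_vram_gb(424, "fp16")))  # ~950 GB, close to the ~945 GB above
print(round(estimate_vram_gb(300, "int4")))  # 168 GB, matching the INT4 table
```

This is why quantization matters so much here: dropping from FP16 to INT4 cuts the weight footprint by 4x, turning a 12-GPU deployment into a 3-GPU one.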

Based on the official ERNIEKit toolkit and the open-source release:

  • OS: Linux is strongly recommended (Ubuntu or similar).
  • Framework: PaddlePaddle (latest version) required.
    • For inference/training: use ERNIEKit (based on PaddlePaddle).
    • Deployment can be accelerated with FastDeploy.
  • Dependencies:
    • Python 3.8+
    • CUDA and cuDNN matching your GPU setup.
    • For PyTorch environments: models are also available via Hugging Face transformers with trust_remote_code=True.
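For the PyTorch route, here is a minimal loading sketch via Hugging Face transformers. The repo id baidu/ERNIE-4.5-0.3B-PT refers to the smallest variant; check the model hub for the exact name of the checkpoint you want:

```python
def load_ernie(model_id="baidu/ERNIE-4.5-0.3B-PT"):
    """Load an ERNIE 4.5 checkpoint through transformers.

    trust_remote_code=True is required because the checkpoints ship
    custom modeling code. transformers is imported lazily so the
    sketch can be read without it installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        trust_remote_code=True,
        device_map="auto",  # spread layers across available GPUs
    )
    return tokenizer, model
```

Calling load_ernie() downloads the weights on first use, so make sure you have the disk space and the VRAM listed in the tables above before running it.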

If purchasing a GPU seems too costly, you can take advantage of Novita AI’s cost-effective and reliable cloud GPU services. For instance, you can access a 1× H100 SXM instance with 80 GB of VRAM for just $2.56 per hour, or scale up to 8 GPUs for $20.48 per hour.
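As a quick sanity check on those rates, the hourly cost scales linearly with GPU count:

```python
H100_HOURLY_RATE = 2.56  # USD per H100 SXM 80GB per hour, from the text above

def instance_cost(num_gpus, hours, rate=H100_HOURLY_RATE):
    """Total cost in USD for an instance with num_gpus GPUs over `hours` hours."""
    return round(num_gpus * rate * hours, 2)

print(instance_cost(8, 1))   # 20.48 -- matches the 8-GPU rate quoted above
print(instance_cost(1, 24))  # 61.44 -- one H100 for a full day
```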

Access ERNIE 4.5 from Python Application

  • Hugging Face: Use ERNIE 4.5 in Spaces, pipelines, or with the Transformers library via Novita AI endpoints.
  • Agent & Orchestration Frameworks: Easily connect Novita AI with partner platforms like Continue, AnythingLLM, LangChain, Dify, and Langflow through official connectors and step-by-step integration guides.
  • OpenAI-Compatible API: Enjoy hassle-free migration and integration with tools such as Cline and Cursor, designed for the OpenAI API standard.
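As one example of the framework route, LangChain's OpenAI-compatible chat wrapper can point at the Novita endpoint. This is a sketch assuming the langchain-openai package; parameter names follow its ChatOpenAI interface:

```python
def make_novita_chat(api_key, model="baidu/ernie-4.5-300b-a47b-paddle"):
    """Build a LangChain chat model backed by Novita's OpenAI-compatible API.

    langchain-openai is imported lazily so this sketch is readable
    without it installed.
    """
    from langchain_openai import ChatOpenAI

    return ChatOpenAI(
        model=model,
        api_key=api_key,
        base_url="https://api.novita.ai/v3/openai",
    )

# Usage (requires a valid key):
# llm = make_novita_chat(api_key="...")
# print(llm.invoke("Hi there!").content)
```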
You can find more details in the Docs.

Access ERNIE 4.5 via API

Step 1: Log In and Access the Model Library

Log in to your account and click on the Model Library button.


Step 2: Choose Your Model

Browse through the available options and select the model that suits your needs.


Step 3: Start Your Free Trial

Begin your free trial to explore the capabilities of the selected model.


Step 4: Get Your API Key

To authenticate with the API, you will need an API key. Open the “Settings” page and copy your API key from there.
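Rather than pasting the key directly into source code, a common pattern is to read it from an environment variable. The name NOVITA_API_KEY here is just a convention, not something the platform requires:

```python
import os

# Read the key from the environment so it never lands in version control.
api_key = os.environ.get("NOVITA_API_KEY", "")
if not api_key:
    print("Warning: NOVITA_API_KEY is not set; API calls will fail.")
```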


Step 5: Install the Client Library

Install the client library using the package manager for your programming language.

After installation, import the necessary libraries into your development environment. Initialize the client with your API key to start interacting with the Novita AI LLM API. Below is an example of using the chat completions API in Python.

from openai import OpenAI
  
client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",
    api_key="",  # your Novita AI API key
)

model = "baidu/ernie-4.5-300b-a47b-paddle"
stream = True # or False
max_tokens = 6000
system_content = "Be a helpful assistant"
temperature = 1
top_p = 1
min_p = 0
top_k = 50
presence_penalty = 0
frequency_penalty = 0
repetition_penalty = 1
response_format = { "type": "text" }

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": system_content,
        },
        {
            "role": "user",
            "content": "Hi there!",
        }
    ],
    stream=stream,
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    response_format=response_format,
    extra_body={
      "top_k": top_k,
      "repetition_penalty": repetition_penalty,
      "min_p": min_p
    }
  )

if stream:
    for chunk in chat_completion_res:
        print(chunk.choices[0].delta.content or "", end="")
else:
    print(chat_completion_res.choices[0].message.content)
  
  

Accessing ERNIE 4.5 is flexible and straightforward—choose the approach that fits your workflow, from instant web trials to robust API integration and local deployment. With superior performance and efficient pricing, ERNIE 4.5 is a practical choice for next-generation AI applications.

Frequently Asked Questions

Is ERNIE 4.5 really better than other big AI models?

Yes, ERNIE 4.5 scores higher than DeepSeek V3 671B in most benchmarks and is very competitive with other top models.

What are the system requirements for running ERNIE 4.5 locally?

Requirements vary by model size, but you’ll need a Linux system, Python 3.8+, PaddlePaddle, and a compatible NVIDIA GPU. Cloud GPU options are available if you don’t have local hardware.

How much VRAM do I need to run ERNIE 4.5?

Running the largest versions of ERNIE 4.5 (like 424B or 300B) requires very high VRAM—hundreds of GBs and multiple high-end GPUs. Smaller or quantized versions need much less VRAM.

Novita AI is the All-in-one cloud platform that empowers your AI ambitions. Integrated APIs, serverless, GPU Instance — the cost-effective tools you need. Eliminate infrastructure, start free, and make your AI vision a reality.
