How to Access Gemma 3 27B Locally, via API, on Cloud GPU

Key Highlights

Gemma 3 27B is an open-source, multimodal LLM released by Google in March 2025.

Supports 140+ languages with a new tokenizer and 128K context window.

Handles both text and image input, outputs text.

Trained on 14 trillion tokens, excels in math, code, and instruction following.

Benchmark scores: 1339 Elo, 69.0 (MATH), 67.5 (MMLU-Pro).

Can run on a single NVIDIA H100 or be deployed via Ollama (local) or Novita AI API / Cloud GPU.

Gemma 3 27B is a powerful, flexible LLM built by Google. It combines multilingual reach, multimodal input, and high performance, making it ideal for diverse AI workloads—locally or in the cloud.

What is Gemma 3 27B?

Notable Features

  • Advanced Multilingual Support: With its new tokenizer, Gemma 3 is highly effective across 140+ languages.
  • Multimodal Input: The ability to process both images and text makes it a versatile tool for a range of applications.
  • Extended Context Window: The 128K token capacity allows for handling extensive and detailed inputs.
  • Open Source and Community-Friendly: Being open-source, the model encourages community experimentation and broad adoption.
| Category | Item | Details |
|---|---|---|
| Basic Info | Release Date | March 12, 2025 |
| Basic Info | Model Size | 27 billion parameters |
| Basic Info | Open Source | Yes (released by Google) |
| Language Support | Supported Languages | Over 140 languages |
| Training | Training Data | 14 trillion tokens |
| Training | Strengths | Math, coding, instruction following |
| Multimodal | Multimodal Capability | Yes (processes images and text, outputs text) |
| Context | Context Window | 128K tokens |
| Model Size by Precision | bf16 (raw) | Weights: 54.0 GB; Weights + KV Cache: 72.7 GB |
| Model Size by Precision | INT4 | Weights: 14.1 GB; Weights + KV Cache: 32.8 GB |
| Model Size by Precision | INT4 (blocks=32) | Weights: 15.3 GB; Weights + KV Cache: 34.0 GB |
| Model Size by Precision | SFP8 | Weights: 27.4 GB; Weights + KV Cache: 46.1 GB |
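
The precision figures above follow directly from parameter count times bytes per parameter; a quick sanity check in Python (the small gap between this estimate and the table comes from embeddings and quantization block scales):

```python
# Rough weight-memory estimate for a 27B-parameter model:
# bytes = parameters * bits_per_param / 8.
PARAMS = 27e9

def weight_gb(bits_per_param: float) -> float:
    """Approximate raw weight size in GB for a given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

print(f"bf16: {weight_gb(16):.1f} GB")  # ~54.0 GB, matching the table
print(f"INT4: {weight_gb(4):.1f} GB")   # ~13.5 GB before block-scale overhead
print(f"FP8:  {weight_gb(8):.1f} GB")   # ~27.0 GB
```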

Gemma 3 27B Benchmark

| Benchmark | Gemma 3 27B | DeepSeek R1 | LLaMA 3.3 70B |
|---|---|---|---|
| LMSys Elo Score | 1339 | ~1360 | ~1260 |
| MMLU-Pro | 67.5 | 84.0 | 66.4 |
| LiveCodeBench | 29.7 | 65.9 | ~29 |
| GPQA Diamond | 42.4 | 71.5 | 50.5 |
| MATH | 69.0 | 97.3 | 77.0 |
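
To put the Elo gaps in perspective: Elo differences map to expected head-to-head win rates through the standard logistic formula, so Gemma 3 27B's roughly 79-point lead over LLaMA 3.3 70B corresponds to about a 61% expected win rate:

```python
def win_prob(elo_a: float, elo_b: float) -> float:
    """Expected probability that model A is preferred over model B
    under the standard Elo model."""
    return 1 / (1 + 10 ** ((elo_b - elo_a) / 400))

print(f"{win_prob(1339, 1260):.2f}")  # Gemma 3 27B vs LLaMA 3.3 70B -> ~0.61
print(f"{win_prob(1339, 1360):.2f}")  # Gemma 3 27B vs DeepSeek R1 -> ~0.47
```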

How to Access Gemma 3 27B Locally?

Hardware Requirements

Google describes Gemma 3 27B as “the most capable model you can run on a single GPU.”

| Setup | VRAM Requirement | Notes |
|---|---|---|
| Cloud Deployment | About 80 GB VRAM (single or multi-GPU) | A100 or H100 GPUs are recommended for optimal performance; alternatively, three RTX 4090 24 GB cards. |
| Apple Silicon | Gemma 3 4B supported via mlx-vlm | Gemma 3 4B ships with day-zero support in mlx-vlm, an open-source library for running vision-language models on Apple Silicon devices, including Macs and iPhones. |
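
As a rule of thumb, you can compare your available VRAM against the weights-plus-KV-cache figures from the specification table to pick a precision. A small illustrative helper (the function name is ours; the thresholds come from the table earlier in this article):

```python
# Approximate "weights + KV cache" VRAM needs in GB, taken from the
# precision table earlier in this article.
VRAM_NEEDED_GB = {"bf16": 72.7, "SFP8": 46.1, "INT4": 32.8}

def precisions_that_fit(available_gb: float) -> list[str]:
    """Return the precisions whose weights + KV cache fit in the given VRAM."""
    return [p for p, need in VRAM_NEEDED_GB.items() if need <= available_gb]

print(precisions_that_fit(80))  # single H100 80 GB -> all three precisions
print(precisions_that_fit(48))  # a 48 GB card -> SFP8 and INT4 only
```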

Step-by-step process to install Gemma 3 27B locally 

# Step 0: Check NVIDIA GPU
nvidia-smi

# Step 1: Update Ubuntu package sources
apt update

# Step 2: Install Ollama dependencies for GPU detection
apt install pciutils lshw

# Step 3: Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Step 4: Start Ollama server (run this in one terminal and keep it open)
ollama serve

# Step 5: (In a new terminal) Verify the Ollama CLI responds
ollama -v

# Step 6: Install Gemma-3 models (choose one)

# Run Gemma-3 1B
# ollama run gemma3:1b

# Run Gemma-3 4B
# ollama run gemma3:4b

# Run Gemma-3 12B
# ollama run gemma3:12b

# ✅ Recommended: Run Gemma-3 27B
ollama run gemma3:27b

# Step 7: Interact with the model directly via prompt in the console
# Example:
# You are an AI-powered trading analyst specializing in cryptocurrency markets.
# Your task is to design an autonomous AI agent that can predict market trends,
# execute trades, and manage risks efficiently. Your response should include:
# - A strategy for analyzing on-chain + off-chain data
# - Model choice for price prediction and sentiment
# - A Python code snippet
# - Risk management methods
# - Ethical concerns
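
Once `ollama serve` is running, the model is also reachable programmatically over Ollama's local REST API (default port 11434). A minimal sketch, assuming the 27B model has been pulled as above; the `ask_ollama` helper name is ours:

```python
import json
import urllib.request

def ask_ollama(prompt: str, model: str = "gemma3:27b",
               host: str = "http://localhost:11434") -> str:
    """Send a non-streaming prompt to a locally running Ollama server
    and return the generated text."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires `ollama serve` running and the model pulled):
# print(ask_ollama("Summarize Gemma 3 27B in one sentence."))
```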

How to Access Gemma 3 27B via Novita API?

Step 1: Log In and Access the Model Library

Log in to your account and click on the Model Library button.

Step 2: Start Your Free Trial

Begin your free trial to explore the capabilities of the selected model.

Step 3: Get Your API Key

To authenticate with the API, you will need an API key. Open the “Settings” page and copy your API key as shown in the image.

Step 4: Install the Client Library

Install the API client library using the package manager for your programming language; for Python, this is the `openai` package used in the example below.

After installation, import the necessary libraries into your development environment and initialize the client with your API key to start interacting with Novita AI LLMs. Here is an example of using the Chat Completions API in Python:

from openai import OpenAI
  
client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",
    api_key="<YOUR Novita AI API Key>",
)

model = "google/gemma-3-27b-it"
stream = True # or False
max_tokens = 2048
system_content = """Be a helpful assistant"""
temperature = 1
top_p = 1
min_p = 0
top_k = 50
presence_penalty = 0
frequency_penalty = 0
repetition_penalty = 1
response_format = { "type": "text" }

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": system_content,
        },
        {
            "role": "user",
            "content": "Hi there!",
        }
    ],
    stream=stream,
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    response_format=response_format,
    extra_body={
      "top_k": top_k,
      "repetition_penalty": repetition_penalty,
      "min_p": min_p
    }
  )

if stream:
    for chunk in chat_completion_res:
        print(chunk.choices[0].delta.content or "", end="")
else:
    print(chat_completion_res.choices[0].message.content)

Using Gemma 3 27B via Chatbox

Step 1: Install Chatbox

  1. Select the “Settings” option. This ensures compatibility with APIs that follow the OpenAI API standard, such as Novita AI.
  2. Fill in the configuration fields:
    • Base URL: Enter https://api.novita.ai/v3/openai.
    • API Key: Paste your Novita AI API Key here.
    • Model Name: Paste the model name you copied earlier (e.g., google/gemma-3-27b-it).
  3. Once the configuration is filled out, click Done.

Using Gemma 3 27B via Cloud GPU

Step 1: Register an Account

If you’re new to Novita AI, begin by creating an account on our website. Once you’re registered, head to the “GPUs” tab to explore available resources and start your journey.

Step 2: Explore Templates and GPU Servers

Start by selecting a template that matches your project needs, such as PyTorch, TensorFlow, or CUDA. Choose the version that fits your requirements, like PyTorch 2.2.1 or CUDA 11.8.0. Then, select the A100 GPU server configuration, which offers powerful performance to handle demanding workloads with ample VRAM, RAM, and disk capacity.

Step 3: Tailor Your Deployment

After selecting a template and GPU, customize your deployment settings by adjusting parameters like the operating system version (e.g., CUDA 11.8). You can also tweak other configurations to tailor the environment to your project’s specific requirements.

Step 4: Launch an Instance

Once you’ve finalized the template and deployment settings, click “Launch Instance” to set up your GPU instance. This will start the environment setup, enabling you to begin using the GPU resources for your AI tasks.

With strong benchmarks and simple deployment options, Gemma 3 27B is a top choice for developers and researchers seeking open, high-quality AI tools.

Frequently Asked Questions

What is Gemma 3 27B?

Gemma 3 27B is a 27-billion-parameter open-source large language model developed by Google. It supports multimodal input (text + image), over 140 languages, and features a 128K token context window.

What are the hardware requirements for running Gemma 3 27B locally?

You’ll need approximately 80GB VRAM. A single NVIDIA H100 is sufficient. You can also run it with multiple RTX 4090s (e.g., 3×24GB).

Is there an API version of Gemma 3 27B available?

Yes! You can access Gemma 3 27B through the Novita AI API, which is fully compatible with the OpenAI API standard.

Novita AI is an AI cloud platform that offers developers an easy way to deploy AI models using our simple API, while also providing an affordable and reliable GPU cloud for building and scaling.

