How to Access Gemma-3-12B-IT in 3 Ways

Gemma-3-12B-IT belongs to Google’s Gemma family of open models and delivers strong multimodal performance while remaining lightweight and efficient. Built on the same advanced foundation as Gemini, it handles tasks such as text generation, summarization, reasoning, and image understanding with ease, offering a powerful yet accessible option for developers and researchers alike.

In this guide, we will begin with a quick review of Gemma-3-12B-IT and then walk through different ways to access it, including web interfaces, API integration, and local deployment.

What is Gemma-3-12B-IT?

Basic Information

| Feature | Details |
| --- | --- |
| Model Size | 12B parameters |
| Architecture | Dense |
| Open Source | Yes |
| Context Window | 128K tokens |
| Multilingual Support | Excels in English; supports over 140 languages |
| Multimodality | Text and images (normalized to 896 × 896 resolution) |
| License | Gemma |

| Benchmark | Performance |
| --- | --- |
| GPQA Diamond | 35% |
| MMLU-Pro | 60% |
| IFBench | 37% |
| SciCode | 17% |
| LiveCodeBench | 14% |
| AIME 2025 | 18% |
| Humanity’s Last Exam | 4.8% |
| AA-LCR | 7% |

Extended Context Processing

With a 128,000-token context window, Gemma-3-12B-IT moves beyond a mere technical upgrade as it redefines how organizations process lengthy documents and intricate analytical workflows. Its advanced design removes the fragmentation issues found in conventional models, allowing seamless comprehension across large volumes of text without losing coherence or context.

This expanded capacity opens new frontiers for document intelligence, letting AI systems retain understanding throughout entire research papers, contracts, or technical manuals while also interpreting visual components such as graphs, charts, and illustrations.
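Even with a 128K-token window, very long corpora may need to be pre-segmented. The sketch below is a minimal illustration of that idea, splitting a document at paragraph boundaries using a rough characters-per-token heuristic (about 4 characters per token is an assumption, not the model's real tokenizer):

```python
def chunk_document(text: str, max_tokens: int = 120_000, chars_per_token: int = 4) -> list[str]:
    """Split text into chunks that stay under an approximate token budget.

    Uses a rough chars-per-token heuristic; for exact counts, run the
    model's actual tokenizer. A single paragraph longer than the budget
    is emitted as its own (oversized) chunk.
    """
    max_chars = max_tokens * chars_per_token
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        # +2 accounts for the paragraph separator we re-add below
        if current and len(current) + len(paragraph) + 2 > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = f"{current}\n\n{paragraph}" if current else paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as a separate request, with a final pass summarizing the per-chunk results.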

Advanced Multimodal Integration

Built with a vision-language framework, Gemma-3-12B-IT advances far beyond standard image recognition to achieve human-like analytical reasoning. By linking textual and visual information, it can interpret relationships between the two modalities and extract deeper insights that would be inaccessible through text-only or image-only analysis.

Key Highlights

  • Document Analysis: Pull out useful insights from reports that include charts, graphs, and visuals.
  • Visual Understanding: Answer complex image-based questions with clear and logical reasoning.
  • Content Generation: Write clear descriptions, captions, and explanations that connect visuals and text naturally.
  • Learning Support: Offer thorough, easy-to-grasp explanations that combine text with helpful visual examples.
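To pass an image alongside a text question, the OpenAI-compatible chat format uses a list of content parts. The sketch below base64-encodes a local image into a data URL; it assumes Novita AI's endpoint accepts OpenAI-style `image_url` parts for this model:

```python
import base64

def build_image_question(question: str, image_path: str, mime: str = "image/png") -> list[dict]:
    """Build an OpenAI-style multimodal message list: one user turn
    containing a text part and a base64 data-URL image part."""
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
            ],
        }
    ]
```

The returned list can be passed as `messages` to `client.chat.completions.create(...)` exactly like a text-only request.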

Instruction-Tuned Architecture

Gemma-3-12B-IT’s refined instruction-tuning design streamlines the AI deployment process by minimizing the need for complex prompt engineering or advanced technical setup. It naturally interprets human language commands and preserves context through extended, multi-turn conversations, enabling smoother and more intuitive interaction with the model.
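In practice, multi-turn context is preserved by resending the accumulated message history with every request. A minimal sketch of that bookkeeping, using the standard OpenAI chat roles:

```python
def start_conversation(system_prompt: str = "Be a helpful assistant") -> list[dict]:
    """Initialize a chat history with a system instruction."""
    return [{"role": "system", "content": system_prompt}]

def record_turn(history: list[dict], user_msg: str, assistant_msg: str) -> list[dict]:
    """Append one user/assistant exchange; the full list is resent as
    `messages` on the next API call so the model keeps context."""
    history.append({"role": "user", "content": user_msg})
    history.append({"role": "assistant", "content": assistant_msg})
    return history
```

For very long sessions, older turns can be summarized or trimmed to stay within the context window.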

How to Access Gemma-3-12B-IT: Web Interface (for Beginners)

Gemma3 12B IT Web Interface on Novita AI

How to Access Gemma-3-12B-IT: Using the API (for Developers)

Novita AI provides a Gemma-3-12B-IT API with a 131K-token context at $0.05/input and $0.10/output, letting developers seamlessly tap into Google’s lightweight multimodal model for advanced reasoning, summarization, and generation tasks through one unified API.

Novita AI

Step 1: Log In and Access the Model Library

Log in to your account and click on the Model Library button.

Log In and Access the Model Library

Step 2: Start Your Free Trial

Select your model and begin your free trial to explore its capabilities.

Gemma3-12b-it playground

Step 3: Get Your API Key

To authenticate with the API, you need an API key. Open the “Settings” page and copy your key as shown in the image.

get API Key

Step 4: Install the Client Library

Install the OpenAI-compatible client library using your language’s package manager (for Python, run `pip install openai`).

After installation, import the necessary libraries and initialize the client with your API key to start calling Novita AI’s LLM endpoint. Below is an example of using the chat completions API in Python.

from openai import OpenAI
  
client = OpenAI(
    base_url="https://api.novita.ai/openai",
    api_key="<YOUR_NOVITA_API_KEY>",  # paste the key copied from the Settings page
)

model = "google/gemma-3-12b-it"
stream = True # or False
max_tokens = 4096
system_content = "Be a helpful assistant"
temperature = 1
top_p = 1
min_p = 0
top_k = 50
presence_penalty = 0
frequency_penalty = 0
repetition_penalty = 1
response_format = { "type": "text" }

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": system_content,
        },
        {
            "role": "user",
            "content": "Hi there!",
        }
    ],
    stream=stream,
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    response_format=response_format,
    extra_body={
      "top_k": top_k,
      "repetition_penalty": repetition_penalty,
      "min_p": min_p
    }
  )

if stream:
    for chunk in chat_completion_res:
        print(chunk.choices[0].delta.content or "", end="")
else:
    print(chat_completion_res.choices[0].message.content)

Option 2: Multi-Agent Workflows with OpenAI Agents SDK

Build sophisticated multi-agent systems that leverage Gemma-3-12B-IT’s multimodal capabilities:

  • Plug-and-Play Integration: Use Gemma-3-12B-IT in any OpenAI Agents workflow
  • Advanced Agent Capabilities: Support for handoffs, routing, and tool integration
  • Scalable Architecture: Design agents that build on Gemma-3-12B-IT’s strengths
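The routing/handoff idea can be illustrated without the Agents SDK itself. The sketch below is a hypothetical keyword router (the agent names, system prompts, and keyword list are illustrative, not part of any SDK) that picks which persona handles a query, with every agent sharing the same Gemma-3-12B-IT model id:

```python
AGENTS = {
    # Hypothetical personas: each is just a system prompt plus the shared model id.
    "coder": {"model": "google/gemma-3-12b-it", "system": "You write and debug code."},
    "writer": {"model": "google/gemma-3-12b-it", "system": "You draft and edit prose."},
}

def route(query: str) -> str:
    """Naive keyword-based routing: code-related queries go to the
    'coder' agent, everything else to 'writer'. A real Agents SDK
    setup would express this as handoffs and tools instead."""
    code_keywords = ("code", "bug", "function", "python", "error")
    q = query.lower()
    return "coder" if any(k in q for k in code_keywords) else "writer"
```

The chosen agent’s `system` prompt and `model` would then be used to build the chat completion request for that turn.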

How to Access Gemma-3-12B-IT: Local Deployment (for Advanced Users)

Gemma3-12B-IT Hardware Requirements

| Quantization | Weights Only (approx.) | With KV Cache (approx.) | Minimum Configuration | Recommended GPU |
| --- | --- | --- | --- | --- |
| BF16 | 24.0 GB | 38.9 GB | 1× Nvidia L40S | 1× Nvidia H100 |
| FP8 | 12.4 GB | 27.3 GB | 1× Nvidia L40S | 1× Nvidia A100 |
| INT4 | 6.6 GB | 21.5 GB | 1× Nvidia L4 | 1× Nvidia L40S |
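The weights-only column can be sanity-checked with a back-of-envelope formula: parameters × bits per parameter ÷ 8. Small gaps versus the table figures (embeddings, quantization scales, runtime overhead) are expected:

```python
def weight_memory_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate GPU memory (GB) for model weights alone:
    (params_billions * 1e9 params) * (bits / 8 bytes) / 1e9 bytes-per-GB."""
    return params_billions * bits_per_param / 8

# 12B parameters at different precisions:
print(weight_memory_gb(12, 16))  # 16-bit (BF16)  -> 24.0 GB
print(weight_memory_gb(12, 8))   # 8-bit          -> 12.0 GB
print(weight_memory_gb(12, 4))   # 4-bit (INT4)   ->  6.0 GB
```

Add the KV-cache column on top of this when sizing a GPU for long-context inference.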

For users seeking greater control and flexibility, Novita AI provides on-demand cloud GPU instances including the L40S, A100, and H100, as well as options such as the RTX 4090, RTX 5090, and RTX 6000 Ada, allowing users to run high-performance workloads without relying on local hardware.

GPU List 1 on Novita AI
GPU List 2 on Novita AI

Best Practices for Using Gemma-3-12B-IT

  • Choose the Right Access Method: Beginners can start with the web interface for quick trials, while developers should use the Novita AI API for integration into apps and workflows. Advanced users may prefer local deployment for full control and offline use.
  • Mind Resource Requirements: If deploying locally, confirm your GPU meets the minimum configuration; quantized variants such as INT4 or FP8 are ideal for balancing performance and memory efficiency.
  • Optimize for Context and Throughput: Gemma-3-12B-IT supports up to 128K tokens. For longer inputs, split content into structured segments or use summarization to maintain coherent results.
  • Leverage Multimodal Strengths: Combine text and images in prompts to explore the model’s analytical reasoning and descriptive generation capabilities.
  • Experiment and Iterate: Adjust parameters like temperature, top_p, and max_tokens to fine-tune creativity, factuality, and response length according to your task.

Frequently Asked Questions

What is Gemma-3-12B-IT?

Gemma-3-12B-IT is an instruction-tuned, multimodal model from Google’s Gemma series, capable of handling both text and image inputs to generate natural, context-aware text outputs.

How is Gemma-3-12B-IT different from other Gemma models?

It offers a balanced combination of performance and efficiency, featuring 12 billion parameters optimized for reasoning, summarization, and visual understanding tasks.

How can I start with Gemma-3-12B-IT?

You can access it through the official web interface, Novita AI API or GPU instances, or local deployment using Hugging Face. Novita AI offers affordable pricing and robust performance.

Novita AI is a leading AI cloud platform that provides developers with easy-to-use APIs and affordable, reliable GPU infrastructure for building and scaling AI applications.
