Gemini 3 Developer Guide

TL;DR

Gemini 3 is Google's advanced AI model family with features like dynamic thinking levels, media resolution control, and thought signatures for enhanced reasoning. This guide explains how to optimize performance and handle API parameters for tasks like coding and multimodal analysis.

Key Takeaways

  • Gemini 3 Pro defaults to high thinking for complex reasoning but allows low thinking for faster responses in simple tasks.
  • Media resolution settings (low, medium, high) control token usage and detail in image, PDF, and video processing.
  • Thought signatures must be managed in API calls to maintain reasoning context, especially in function calling and chat interactions.
  • Temperature should remain at the default 1.0 to avoid degraded performance in reasoning tasks.
  • The model supports structured outputs with tools like Google Search and code execution for advanced applications.

Tags

gemini, api, ai, programming

Gemini 3 is our most intelligent model family to date, built on a foundation of state-of-the-art reasoning. It is designed to bring any idea to life by mastering agentic workflows, autonomous coding, and complex multimodal tasks. This guide covers key features of the Gemini 3 model family and how to get the most out of it.

High/Dynamic Thinking

Gemini 3 Pro uses dynamic thinking by default to reason through prompts. For faster, lower-latency responses when complex reasoning isn't required, you can constrain the model's thinking level to low.

from google import genai
from google.genai import types

client = genai.Client()

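# thinking_level is not set, so Gemini 3 Pro defaults to high (dynamic) thinking.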
response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Find the race condition in this multi-threaded C++ snippet: [code here]",
)

print(response.text)

Low Thinking

from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="How does AI work?",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_level="low")
    ),
)

print(response.text)

Explore

Gemini 3 Applets Overview

Explore our collection of Gemini 3 apps to see how the model handles advanced reasoning, autonomous coding, and complex multimodal tasks.

Meet Gemini 3

Gemini 3 Pro is the first model in the new series. gemini-3-pro-preview is best suited to complex tasks that require broad world knowledge and advanced reasoning across modalities.

| Model ID | Context Window (In / Out) | Knowledge Cutoff | Pricing (Input / Output)* |
| --- | --- | --- | --- |
| gemini-3-pro-preview | 1M / 64k | Jan 2025 | $2 / $12 (<200k tokens); $4 / $18 (>200k tokens) |

* Pricing is per 1 million tokens. Prices listed are for standard text; multimodal input rates may vary.

For detailed rate limits, batch pricing, and additional information, see the models page.

New API features in Gemini 3

Gemini 3 introduces new parameters designed to give developers more control over latency, cost, and multimodal fidelity.

Thinking level

The thinking_level parameter controls the maximum depth of the model's internal reasoning process before it produces a response. Gemini 3 treats these levels as relative allowances for thinking rather than strict token guarantees. If thinking_level is not specified, Gemini 3 Pro will default to high.

  • low: Minimizes latency and cost. Best for simple instruction following, chat, or high-throughput applications.
  • medium: Coming soon; not supported at launch.
  • high (Default): Maximizes reasoning depth. The model may take significantly longer to reach a first token, but the output will be more carefully reasoned.

⚠️ Warning: You cannot use both thinking_level and the legacy thinking_budget parameter in the same request. Doing so will return a 400 error.

Media resolution

Gemini 3 introduces granular control over multimodal vision processing via the media_resolution parameter. Higher resolutions improve the model's ability to read fine text or identify small details, but increase token usage and latency. The media_resolution parameter determines the maximum number of tokens allocated per input image or video frame.

You can now set the resolution to media_resolution_low, media_resolution_medium, or media_resolution_high per individual media part or globally (via generation_config). If unspecified, the model uses optimal defaults based on the media type.

Recommended settings

| Media Type | Recommended Setting | Max Tokens | Usage Guidance |
| --- | --- | --- | --- |
| Images | media_resolution_high | 1120 | Recommended for most image analysis tasks to ensure maximum quality. |
| PDFs | media_resolution_medium | 560 | Optimal for document understanding; quality typically saturates at medium. Increasing to high rarely improves OCR results for standard documents. |
| Video (General) | media_resolution_low (or media_resolution_medium) | 70 (per frame) | low and medium are treated identically for video (70 tokens) to optimize context usage; sufficient for most action recognition and description tasks. |
| Video (Text-heavy) | media_resolution_high | 280 (per frame) | Required only when the use case involves reading dense text (OCR) or small details within video frames. |

⭐ Note: The media_resolution parameter maps to different token counts depending on the input type. While images scale linearly (media_resolution_low: 280, media_resolution_medium: 560, media_resolution_high: 1120), video is compressed more aggressively. For video, both media_resolution_low and media_resolution_medium are capped at 70 tokens per frame, and media_resolution_high is capped at 280 tokens.

from google import genai
from google.genai import types
import base64

# The media_resolution parameter is currently only available in the v1alpha API version.
client = genai.Client(http_options={'api_version': 'v1alpha'})

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents=[
        types.Content(
            parts=[
                types.Part(text="What is in this image?"),
                types.Part(
                    inline_data=types.Blob(
                        mime_type="image/jpeg",
                        data=base64.b64decode("..."),
                    ),
                    media_resolution={"level": "media_resolution_high"}
                )
            ]
        )
    ]
)

print(response.text)
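The example above sets the resolution on an individual part. To apply one setting to every media part in a request, you can set it globally instead. A minimal sketch, assuming the v1alpha GenerateContentConfig accepts the MediaResolution enum for this purpose:

from google import genai
from google.genai import types

# media_resolution is currently only available in the v1alpha API version.
client = genai.Client(http_options={'api_version': 'v1alpha'})

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents=[
        "Describe the action in this clip.",
        types.Part(
            inline_data=types.Blob(mime_type="video/mp4", data=b"...")  # placeholder bytes
        ),
    ],
    config=types.GenerateContentConfig(
        # Applies to every media part in the request; per-part settings override it.
        media_resolution=types.MediaResolution.MEDIA_RESOLUTION_LOW
    ),
)

print(response.text)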

Temperature

For Gemini 3, we strongly recommend keeping the temperature parameter at its default value of 1.0.

While previous models often benefited from tuning temperature to control creativity versus determinism, Gemini 3's reasoning capabilities are optimized for the default setting. Changing the temperature (setting it below 1.0) may lead to unexpected behavior, such as looping or degraded performance, particularly in complex mathematical or reasoning tasks.
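In practice this means either omitting the parameter entirely or, if your framework always sets one, pinning it explicitly to the default, as in this minimal sketch:

from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Prove that the square root of 2 is irrational.",
    # Leave temperature at 1.0 (the default) for Gemini 3 reasoning tasks.
    config=types.GenerateContentConfig(temperature=1.0),
)

print(response.text)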

Thought signatures

Gemini 3 uses Thought signatures to maintain reasoning context across API calls. These signatures are encrypted representations of the model's internal thought process. To ensure the model maintains its reasoning capabilities, you must return these signatures to the model in your request exactly as they were received:

  • Function Calling (Strict): The API enforces strict validation on the "Current Turn". Missing signatures will result in a 400 error.
  • Text/Chat: Validation is not strictly enforced, but omitting signatures will degrade the model's reasoning and answer quality.

Success: If you use the official SDKs (Python, Node, Java) and standard chat history, Thought Signatures are handled automatically. You do not need to manually manage these fields.
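For example, the Python SDK's chat helper stores each model response in its history verbatim, so any signature parts are echoed back automatically on the next turn:

from google import genai

client = genai.Client()
chat = client.chats.create(model="gemini-3-pro-preview")

# The chat history retains the model's responses exactly as returned, including
# any thoughtSignature parts, so reasoning context carries across turns.
print(chat.send_message("What are the risks of this investment?").text)
print(chat.send_message("Summarize that in one sentence.").text)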

Function calling (strict validation)

When Gemini generates a functionCall, it relies on the thoughtSignature to process the tool's output correctly in the next turn. The "Current Turn" includes all Model (functionCall) and User (functionResponse) steps that occurred since the last standard User text message.

  • Single Function Call: The functionCall part contains a signature. You must return it.
  • Parallel Function Calls: Only the first functionCall part in the list will contain the signature. You must return the parts in the exact order received.
  • Multi-Step (Sequential): If the model calls a tool, receives a result, and calls another tool (within the same turn), both function calls have signatures. You must return all accumulated signatures in the history.
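If you manage history manually rather than using the chat helper, the simplest way to satisfy these rules is to append the model's returned Content object to your history verbatim, since it already carries the signatures in the right places. A minimal sketch (the flight tool is a hypothetical declaration, not from the original):

from google import genai
from google.genai import types

client = genai.Client()

# Hypothetical tool declaration, for illustration only.
check_flight = types.FunctionDeclaration(
    name="check_flight",
    description="Look up the status of a flight.",
    parameters={"type": "OBJECT", "properties": {"flight": {"type": "STRING"}}},
)
config = types.GenerateContentConfig(
    tools=[types.Tool(function_declarations=[check_flight])]
)

history = [
    types.Content(role="user", parts=[types.Part(text="Check flight AA100.")])
]
response = client.models.generate_content(
    model="gemini-3-pro-preview", contents=history, config=config
)

# Append the model's content exactly as received. This carries every
# thoughtSignature (including those on functionCall parts) back to the model.
history.append(response.candidates[0].content)

# Append your tool result, then call the model again with the full history.
history.append(types.Content(
    role="user",
    parts=[types.Part.from_function_response(
        name="check_flight", response={"status": "delayed"}
    )],
))
follow_up = client.models.generate_content(
    model="gemini-3-pro-preview", contents=history, config=config
)
print(follow_up.text)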

Text and streaming

For standard chat or text generation, the presence of a signature is not guaranteed.

  • Non-Streaming: The final content part of the response may contain a thoughtSignature, though it is not always present. If one is returned, you should send it back to maintain best performance.
  • Streaming: If a signature is generated, it may arrive in a final chunk that contains an empty text part. Ensure your stream parser checks for signatures even if the text field is empty.
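A sketch of a stream loop that does this, assuming the Python SDK surfaces the field as part.thought_signature:

from google import genai

client = genai.Client()

last_signature = None
for chunk in client.models.generate_content_stream(
    model="gemini-3-pro-preview",
    contents="What are the risks of this investment?",
):
    candidate = chunk.candidates[0] if chunk.candidates else None
    if not candidate or not candidate.content or not candidate.content.parts:
        continue
    for part in candidate.content.parts:
        if part.text:
            print(part.text, end="")
        # A signature may ride on a final chunk whose text part is empty.
        if part.thought_signature:
            last_signature = part.thought_signature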

Code examples

Multi-step Function Calling (Sequential)

The user asks a question requiring two separate steps (Check Flight -> Book Taxi) in one turn.

Step 1: Model calls Flight Tool.
The model returns a signature, <Sig_A>.

// Model Response (Turn 1, Step 1)
{
  "role": "model",
  "parts": [
    {
      "functionCall": { "name": "check_flight", "args": {...} },
      "thoughtSignature": "<Sig_A>" // SAVE THIS
    }
  ]
}

Step 2: User sends Flight Result
We must send back <Sig_A> to keep the model's train of thought.

// User Request (Turn 1, Step 2)
[
  { "role": "user", "parts": [{ "text": "Check flight AA100..." }] },
  { 
    "role": "model", 
    "parts": [
      { 
        "functionCall": { "name": "check_flight", "args": {...} }, 
        "thoughtSignature": "<Sig_A>" // REQUIRED
      } 
    ]
  },
  { "role": "user", "parts": [{ "functionResponse": { "name": "check_flight", "response": {...} } }] }
]

Step 3: Model calls Taxi Tool
The model remembers the flight delay via <Sig_A> and now decides to book a taxi. It generates a new signature <Sig_B>.

// Model Response (Turn 1, Step 3)
{
  "role": "model",
  "parts": [
    {
      "functionCall": { "name": "book_taxi", "args": {...} },
      "thoughtSignature": "<Sig_B>" // SAVE THIS
    }
  ]
}

Step 4: User sends Taxi Result
To complete the turn, you must send back the entire chain: <Sig_A> AND <Sig_B>.

// User Request (Turn 1, Step 4)
[
  // ... previous history ...
  { 
    "role": "model", 
    "parts": [
       { "functionCall": { "name": "check_flight", ... }, "thoughtSignature": "<Sig_A>" } 
    ]
  },
  { "role": "user", "parts": [{ "functionResponse": {...} }] },
  { 
    "role": "model", 
    "parts": [
       { "functionCall": { "name": "book_taxi", ... }, "thoughtSignature": "<Sig_B>" } 
    ]
  },
  { "role": "user", "parts": [{ "functionResponse": {...} }] }
]

Parallel Function Calling

The user asks: "Check the weather in Paris and London." The model returns two function calls in one response.

// User Request (Sending Parallel Results)
[
  {
    "role": "user",
    "parts": [
      { "text": "Check the weather in Paris and London." }
    ]
  },
  {
    "role": "model",
    "parts": [
      // 1. First Function Call has the signature
      {
        "functionCall": { "name": "check_weather", "args": { "city": "Paris" } },
        "thoughtSignature": "<Signature_A>" 
      },
      // 2. Subsequent parallel calls DO NOT have signatures
      {
        "functionCall": { "name": "check_weather", "args": { "city": "London" } }
      } 
    ]
  },
  {
    "role": "user",
    "parts": [
      // 3. Function Responses are grouped together in the next block
      {
        "functionResponse": { "name": "check_weather", "response": { "temp": "15C" } }
      },
      {
        "functionResponse": { "name": "check_weather", "response": { "temp": "12C" } }
      }
    ]
  }
]

Text/In-Context Reasoning (No Validation)

The user asks a question that requires in-context reasoning without external tools. While not strictly validated, including the signature helps the model maintain the reasoning chain for follow-up questions.

// User Request (Follow-up question)
[
  { 
    "role": "user", 
    "parts": [{ "text": "What are the risks of this investment?" }] 
  },
  { 
    "role": "model", 
    "parts": [
      {
        "text": "I need to calculate the risk step-by-step. First, I'll look at volatility...",
        "thoughtSignature": "<Signature_C>" // Recommended to include
      }
    ]
  },
  { 
    "role": "user", 
    "parts": [{ "text": "Summarize that in one sentence." }] 
  }
]

Migrating from other models

If you are transferring a conversation trace from another model (e.g., Gemini 2.5) or injecting a custom function call that was not generated by Gemini 3, you will not have a valid signature.

To bypass strict validation in these specific scenarios, populate the field with this specific dummy string: "thoughtSignature": "context_engineering_is_the_way_to_go"
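For example, an injected model step from an older trace would look like this (the function call itself is illustrative):

// Injected Model step (not generated by Gemini 3)
{
  "role": "model",
  "parts": [
    {
      "functionCall": { "name": "check_flight", "args": { "flight": "AA100" } },
      "thoughtSignature": "context_engineering_is_the_way_to_go" // Dummy value to bypass strict validation
    }
  ]
}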

Structured Outputs with tools

Gemini 3 allows you to combine Structured Outputs with built-in tools, including Grounding with Google Search, URL Context, and Code Execution.

from google import genai
from google.genai import types
from pydantic import BaseModel, Field

class MatchResult(BaseModel):
    winner: str = Field(description="The name of the winner.")
    final_match_score: str = Field(description="The final match score.")
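Continuing from the schema above, here is a sketch of combining it with Grounding with Google Search (the prompt is illustrative):

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Who won the most recent Euro final, and what was the final score?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())],  # Grounding with Google Search
        response_mime_type="application/json",
        response_schema=MatchResult,
    ),
)

print(response.text)  # JSON conforming to MatchResult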
