Gemini 3.1 Flash Live Explorer

Voice Chat

Real-time audio-to-audio conversation. Native speech understanding with 30 selectable voices.

Ready

Configuration

Voice

System Instruction

Enable transcription

Audio Controls

Input

Output

Transcription

Click the microphone button below to start talking.

Text Chat

You type; the model replies with native audio. Choose transcript-only (speaker off), audio-only, or both. This setup matches the Live API requirements for Gemini 3.1 Flash Live.

Ready

Configuration

Response Mode

Voice (for audio responses)

System Instruction

Chat

Type a message below and hit Enter to begin.

Vision

Send camera or screen video alongside audio/text. The model sees JPEG frames at up to 1 FPS.
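The 1 FPS ceiling means a capture loop that runs faster has to drop frames before sending. A minimal sketch of such a throttle follows; the `FrameThrottle` class and its method names are hypothetical and are not taken from this app or from the Live API.

```python
from typing import Optional


class FrameThrottle:
    """Let at most `max_fps` frames per second through (hypothetical helper)."""

    def __init__(self, max_fps: float = 1.0) -> None:
        if max_fps <= 0:
            raise ValueError("max_fps must be positive")
        self.min_interval = 1.0 / max_fps
        self._last_sent: Optional[float] = None

    def should_send(self, now: float) -> bool:
        """Return True if a frame captured at `now` (seconds) may be sent."""
        if self._last_sent is None or now - self._last_sent >= self.min_interval:
            self._last_sent = now
            return True
        return False


throttle = FrameThrottle(max_fps=1.0)
capture_times = [0.0, 0.25, 0.5, 1.0, 1.5, 2.0]  # capture loop faster than 1 FPS
sent = [t for t in capture_times if throttle.should_send(t)]
print(sent)  # [0.0, 1.0, 2.0]
```

Passing timestamps in explicitly, rather than reading a clock inside the class, keeps the throttle easy to test.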

Ready

Video Source

Frame Rate

Resolution

Voice

Preview

Response

Click Camera or Screen Share to begin.

Function Calling

The model can call functions you define. Supports synchronous tool execution with the Live API.

Ready

Registered Functions

get_weather

Get current weather for a city. Params: city (string)

calculate

Evaluate a math expression. Params: expression (string)

get_current_time

Get the current time. Params: timezone (string, optional)

roll_dice

Roll dice. Params: sides (int), count (int)
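One way to execute these four functions synchronously is a name-to-handler table: look up the function the model asked for, call it with the supplied arguments, and return the result (or an error) as the response. The sketch below assumes calls arrive as a name plus a dict of arguments; the handler bodies, the `dispatch` helper, and the result shapes are all illustrative assumptions, and `get_weather` is a stub with no real data source.

```python
import ast
import datetime as dt
import operator
import random
from zoneinfo import ZoneInfo

# Arithmetic operators the calculator accepts; anything else is rejected.
_BINARY_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
               ast.Mult: operator.mul, ast.Div: operator.truediv}


def _eval(node):
    """Evaluate a parsed arithmetic expression without calling eval()."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")


def get_weather(city):
    # Stub: a real handler would query a weather service here.
    return {"city": city, "note": "stub handler, no real weather data"}


def calculate(expression):
    return {"result": _eval(ast.parse(expression, mode="eval").body)}


def get_current_time(timezone=None):
    # Falls back to UTC when the optional timezone is omitted.
    tz = ZoneInfo(timezone) if timezone else dt.timezone.utc
    return {"time": dt.datetime.now(tz).isoformat()}


def roll_dice(sides, count):
    if sides < 2 or count < 1:
        raise ValueError("sides must be >= 2 and count >= 1")
    rolls = [random.randint(1, sides) for _ in range(count)]
    return {"rolls": rolls, "total": sum(rolls)}


HANDLERS = {"get_weather": get_weather, "calculate": calculate,
            "get_current_time": get_current_time, "roll_dice": roll_dice}


def dispatch(name, args):
    """Run the named handler; report failures as data instead of raising."""
    handler = HANDLERS.get(name)
    if handler is None:
        return {"error": f"unknown function: {name}"}
    try:
        return handler(**args)
    except Exception as exc:  # bad arguments, division by zero, etc.
        return {"error": str(exc)}


print(dispatch("calculate", {"expression": "2 + 3 * 4"}))  # {'result': 14}
```

Returning errors as ordinary results lets the model see what went wrong and recover in conversation, instead of the session failing on a bad argument.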

Voice

Conversation

Try: "What's the weather in Tokyo?" or "Roll 3 dice"

Function Call Log

Function calls and responses will appear here.

Google Search Grounding

Ground responses in real-time Google Search results for up-to-date, factual answers.

Ready

Search-Grounded Chat

Ask questions that benefit from real-time search. E.g., "What happened in the news today?"

Thinking

Configure thinking depth: minimal (fastest) to high (most thorough). View the model's reasoning process.

Ready

Configuration

Thinking Level

Show thinking process

Playback Voice

Conversation

Ask a complex question to see the thinking process.

Settings

Global configuration for session behavior.

Connection

API Key Status

Configured via .env

Model

Voice Activity Detection

Automatic VAD

Used for Voice Chat and Vision (realtime audio). Not sent for text-only features.

Start-of-speech sensitivity

End-of-speech sensitivity

Session Management

Enable session resumption

Allows reconnecting to a session within 2 hours if disconnected.
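On the client side, the 2-hour limit comes down to checking how long ago the session dropped before attempting to reconnect. The sketch below assumes the window is measured from the moment of disconnection; the `ResumptionHandle` type and its fields are hypothetical and do not mirror the Live API's message shapes.

```python
from dataclasses import dataclass

RESUMPTION_WINDOW_S = 2 * 60 * 60  # 2 hours, per the setting's description


@dataclass
class ResumptionHandle:
    """Hypothetical client-side record of a resumable session."""
    token: str
    disconnected_at: float  # seconds, on the same clock as `now`

    def is_resumable(self, now: float) -> bool:
        """True while `now` is still inside the resumption window."""
        elapsed = now - self.disconnected_at
        return 0 <= elapsed <= RESUMPTION_WINDOW_S


handle = ResumptionHandle(token="example-token", disconnected_at=1_000.0)
print(handle.is_resumable(now=1_000.0 + 90 * 60))   # True  (90 minutes later)
print(handle.is_resumable(now=1_000.0 + 121 * 60))  # False (just past 2 hours)
```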

Context window compression

Sliding window compression for unlimited session length.
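The idea behind a sliding window can be shown with a short sketch: once the context exceeds a trigger size, the oldest turns are dropped until the total falls back to a target size. This only illustrates the general technique; the compression itself is performed by the service, and the function name, token counts, and thresholds here are made up.

```python
from collections import deque


def slide_window(turns, trigger_tokens, target_tokens):
    """Drop the oldest turns once the total exceeds `trigger_tokens`.

    `turns` is a list of (text, token_count) pairs, oldest first.
    Turns are removed from the front until the total is at or below
    `target_tokens`. Below the trigger, the history is returned unchanged.
    """
    window = deque(turns)
    total = sum(count for _, count in window)
    if total <= trigger_tokens:
        return list(window)
    while window and total > target_tokens:
        _, count = window.popleft()
        total -= count
    return list(window)


history = [("a", 40), ("b", 30), ("c", 20), ("d", 10)]  # 100 tokens in total
print(slide_window(history, trigger_tokens=80, target_tokens=50))
# [('c', 20), ('d', 10)]
```

Using a trigger that is higher than the target means compression runs occasionally and removes several turns at once, rather than running on every new turn.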

Audio

Input: 16-bit PCM, 16 kHz mono

Output: 16-bit PCM, 24 kHz mono

Audio formats are fixed by the API. The browser handles conversion automatically.
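The conversion involved can be sketched in a few lines: microphone samples usually arrive as floats in [-1, 1] at the device rate, so they are resampled to 16 kHz and packed as 16-bit integers. This is an illustration under assumptions rather than the app's actual code: the function names are hypothetical, little-endian byte order is assumed, and the linear-interpolation resampler omits the low-pass filter that a production downsampler would apply.

```python
import struct

INPUT_RATE_HZ = 16_000   # rate expected for audio sent to the model
OUTPUT_RATE_HZ = 24_000  # rate of audio received from the model
BYTES_PER_SAMPLE = 2     # 16-bit PCM, mono


def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation (no anti-aliasing filter)."""
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # position in the source signal
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def float_to_pcm16(samples):
    """Clamp floats to [-1, 1] and pack as little-endian signed 16-bit."""
    out = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))
        out += struct.pack("<h", int(round(s * 32767)))
    return bytes(out)


mic = [0.0] * 48  # 1 ms of silence captured at 48 kHz
chunk = float_to_pcm16(resample_linear(mic, 48_000, INPUT_RATE_HZ))
print(len(chunk))                        # 32 bytes: 16 samples x 2 bytes
print(INPUT_RATE_HZ * BYTES_PER_SAMPLE)  # 32000 bytes per second of input
```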

Model Capabilities

Audio Input/Output

Text Input/Output

Image/Video Input

Input Transcription

Output Transcription

Voice Selection (30)

Function Calling (sync)

Google Search

Thinking (4 levels)

Session Resumption

Context Compression

VAD / Barge-in

97 Languages

Ephemeral Tokens

Async Function Calling

Code Execution

Image Generation

Structured Output

Proactive Audio

Affective Dialog

Protocol Log

Raw WebSocket messages exchanged with the Gemini Live API.

WebSocket messages will appear here when a session is active.