Voice Conversation

Last updated: May 21, 2026

Voice Conversation turns your chatbot into a real-time voice agent. Visitors press the call button in the chat widget and hold a spoken conversation with the bot: they talk, and the bot answers in a synthesised voice. You choose the voice model, the voice, the language and a few safety limits.

Where to find Voice Conversation settings

Select your chatbot, click the Settings tab, then select Voice Conversation in the left sidebar.

[Image: Voice Conversation in the settings sidebar]

At the top of the page you will find the Enable voice conversation switch. The rest of the settings only appear once voice conversation is enabled.

[Image: Enable voice conversation toggle]

Choose a voice model

ChatLab supports four voice models, each with its own per-minute rate:

  • GPT Realtime - OpenAI's flagship realtime model. Best overall quality and low latency. 30 credits / minute.
  • GPT Realtime Mini - a faster, cheaper version of GPT Realtime. It is the default for new bots and a good fit for most use cases. 20 credits / minute.
  • Gemini 3.1 Flash Voice - Google's voice model (Gemini Live). Cheapest option. 15 credits / minute.
  • ElevenLabs Voice - high-quality ElevenLabs voices. Uses the bot's regular text model (set under Settings > Model & Advanced) to generate replies, then ElevenLabs converts them to speech. 30 credits / minute plus the text model cost per response.
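
To compare the rates listed above, you can put them in a small lookup table. The Python sketch below is only an illustration: the model keys and the voice_fee function are made-up names, not part of any ChatLab API, and the ElevenLabs text model cost is left out because it depends on the bot's text model.

```python
# Per-minute rates in credits, copied from the list above.
# The dictionary keys are hypothetical identifiers, not ChatLab API names.
RATES_PER_MINUTE = {
    "gpt-realtime": 30,
    "gpt-realtime-mini": 20,
    "gemini-flash-voice": 15,
    "elevenlabs-voice": 30,  # plus the text model cost per response (not modelled)
}


def voice_fee(model: str, minutes: int) -> int:
    """Return the voice fee in credits for a call lasting whole minutes."""
    return RATES_PER_MINUTE[model] * minutes


# Compare what a 10-minute call would cost on each model.
for model in RATES_PER_MINUTE:
    print(f"{model}: {voice_fee(model, 10)} credits")
```

Under these rates, a 10-minute call costs twice as much on GPT Realtime as on Gemini 3.1 Flash Voice.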

The Cost breakdown card below the model selector shows the per-minute rate for the currently selected model, so you can always see how a call will be billed.

[Image: Voice model cost breakdown]

Pick a voice

Each model exposes its own set of voices in the Voice dropdown:

  • OpenAI (GPT Realtime and GPT Realtime Mini) - Alloy, Ash, Ballad, Cedar, Coral, Echo, Fable, Marin, Onyx, Nova, Sage, Shimmer, Verse.
  • Gemini - Kore (recommended), Charon, Puck, Fenrir, Aoede, Leda, Orus, Zephyr.
  • ElevenLabs - Brian, Helen, Chris, Daniel, Jessica, Sarah, Lily, Liam, Will, Janet.

New bots default to Alloy. You can change the voice at any time - the new voice takes effect on the next call.

[Image: Voice dropdown]

Set the language

The Language section controls which language (or languages) the bot speaks during a call.

  • OpenAI and Gemini - pick a single primary language. If a visitor speaks another supported language during the call, the bot switches to that language on its own.
  • ElevenLabs - pick a default language plus any number of additional languages the agent is allowed to use.

Click a language tile to open a picker with country flags and search.

[Image: Language picker]

Voice welcome message

The Voice Welcome Message is the first sentence the bot speaks when a call starts. It plays once at the beginning of the session, so keep it short and natural - something like "Hi, how can I help you?" works well.

The maximum length is 500 characters.
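
If you draft welcome messages outside ChatLab, a quick length check can catch text that will not fit. This is a minimal sketch that assumes the limit counts characters the way Python's len() does; the function name is hypothetical.

```python
MAX_WELCOME_LENGTH = 500  # limit stated above


def welcome_message_fits(message: str) -> bool:
    """Return True if the message is within the 500-character limit.

    Assumption: the limit counts characters as Python's len() does.
    """
    return len(message) <= MAX_WELCOME_LENGTH


print(welcome_message_fits("Hi, how can I help you?"))  # prints True
```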

[Image: Voice Welcome Message field]

Limit conversation duration

Voice calls consume credits per minute, so two safety limits are available to protect the bot owner's budget.

Max Conversation Duration

The Max Conversation Duration field caps the length of a single call. The default is 300 seconds (5 minutes); the allowed range is 30 seconds to 3600 seconds (60 minutes). When the limit is reached, the bot speaks an after-limit message (which you can customise just below the field) and the call ends.

If you want to allow unlimited single-call duration, turn on the No cap toggle. A warning is shown because a single call may then consume all remaining credits.
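
The rules for this field can be summarised in a short Python sketch. The function and constant names are hypothetical, and rejecting out-of-range values with an error is an assumption; the settings page may handle them differently.

```python
from typing import Optional

MIN_SECONDS = 30       # lower bound stated above
MAX_SECONDS = 3600     # upper bound stated above (60 minutes)
DEFAULT_SECONDS = 300  # default stated above (5 minutes)


def effective_max_duration(seconds: int = DEFAULT_SECONDS,
                           no_cap: bool = False) -> Optional[int]:
    """Return the per-call cap in seconds, or None when No cap is on."""
    if no_cap:
        return None  # unlimited single-call duration
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        # Assumption: values outside the allowed range are rejected.
        raise ValueError(
            f"duration must be between {MIN_SECONDS} and {MAX_SECONDS} seconds"
        )
    return seconds


print(effective_max_duration())             # prints 300
print(effective_max_duration(no_cap=True))  # prints None
```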

[Image: Max conversation duration]

Daily voice cap

The Daily voice cap (minutes) field caps how many voice minutes the bot can use in a single day, across all visitors combined. Threshold notifications are sent to the bot owner when usage reaches 80% and 95% of the daily cap.

Turn on No daily cap to remove the per-day limit.
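
To see when the threshold notifications described above would fire, consider this Python sketch. It assumes a notification is triggered the moment usage first reaches a threshold; the function name and the way usage is tracked are hypothetical.

```python
from typing import List

THRESHOLDS_PERCENT = (80, 95)  # notification thresholds stated above


def crossed_thresholds(used_before: int, used_after: int,
                       daily_cap: int) -> List[int]:
    """Return the thresholds (in percent) reached when daily usage grows
    from used_before to used_after minutes."""
    return [
        t for t in THRESHOLDS_PERCENT
        # Integer arithmetic avoids floating-point rounding at the boundary.
        if used_before * 100 < t * daily_cap <= used_after * 100
    ]


# With a 60-minute daily cap, 80% is 48 minutes and 95% is 57 minutes.
print(crossed_thresholds(40, 50, 60))  # prints [80]
print(crossed_thresholds(50, 58, 60))  # prints [95]
```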

[Image: Daily voice cap]

Voice behavior and system prompt

Voice calls can use a different system prompt from text chat, which is useful when you want a more conversational, spoken-style personality during calls.

In the Voice Behavior section, click the "Manage voice behavior and prompt in Role & Behavior settings" link to jump to Settings > Role & Behavior. There, a dedicated Voice Conversation tab lets you edit the voice-specific prompt independently of the text chat prompt.

[Image: Voice behavior link]

How voice billing works

  • The first minute of every call is charged in full at the selected model's per-minute rate. This is the minimum charge for any call, even very short ones.
  • After the first minute, billing is per second.
  • Silent or disconnected time is not charged - billing stops at the last confirmed heartbeat from the visitor's browser.
  • Before a call starts, ChatLab checks that the bot has enough credits to cover at least the first minute. If not, the call cannot be started.
  • For ElevenLabs Voice, the per-minute fee is added on top of the bot's text model cost per response, because ElevenLabs uses the text model to generate replies and then converts them to speech.
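
The billing rules above can be worked through with a short Python sketch. It is an illustration under stated assumptions, not ChatLab's actual billing code: the function names are hypothetical, billable_seconds stands for the time up to the last confirmed heartbeat, no rounding is applied, and the ElevenLabs text model cost is not included.

```python
def call_cost(rate_per_minute: float, billable_seconds: float) -> float:
    """Credits charged for one started call.

    The first minute is charged in full; after that, billing is per second.
    billable_seconds is the time up to the last confirmed heartbeat.
    Assumption: the result is not rounded.
    """
    billed_seconds = max(60, billable_seconds)  # first-minute minimum
    return rate_per_minute * billed_seconds / 60


def can_start_call(credits_left: float, rate_per_minute: float) -> bool:
    """A call can start only if the first minute can be covered."""
    return credits_left >= rate_per_minute


# GPT Realtime Mini at 20 credits / minute:
print(call_cost(20, 15))  # 15-second call, minimum charge: prints 20.0
print(call_cost(20, 90))  # 90-second call: prints 30.0
```

A 15-second call costs the same as a 60-second call because of the first-minute minimum. A 90-second call adds 30 seconds at the per-second rate.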

Knowledge base sync for ElevenLabs

When the ElevenLabs Voice model is selected, a status panel appears below the voice settings showing whether the bot's training data has been uploaded to ElevenLabs. The sync runs automatically after training: ChatLab bundles up to 8 MB of training material into a single document and sends it to ElevenLabs, so the voice agent's answers reflect the same knowledge base as the text chat.
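
One way to picture the 8 MB bundle is the Python sketch below. It is a guess at the general idea only: how ChatLab orders, joins or truncates the training material is not documented here, so the function name, the separator, reading 8 MB as 8 * 1024 * 1024 bytes, and dropping whatever does not fit are all assumptions.

```python
from typing import Iterable

# Assumption: "8 MB" is read as 8 * 1024 * 1024 bytes of UTF-8 text.
MAX_BUNDLE_BYTES = 8 * 1024 * 1024


def bundle_training_material(documents: Iterable[str],
                             limit: int = MAX_BUNDLE_BYTES) -> str:
    """Join documents into one text without exceeding limit bytes."""
    separator = "\n\n"
    parts = []
    size = 0
    for text in documents:
        extra = len(text.encode("utf-8"))
        if parts:
            extra += len(separator.encode("utf-8"))
        if size + extra > limit:
            break  # assumption: material that does not fit is left out
        parts.append(text)
        size += extra
    return separator.join(parts)


bundle = bundle_training_material(["Opening hours: 9-17.", "Returns: 30 days."])
print(len(bundle.encode("utf-8")), "bytes in the bundle")
```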

Phone calls

Once voice conversation is enabled, you can also assign the bot a real phone number so customers can call in over the regular phone network. See the dedicated article: Phone calls.

Voice input vs voice conversation

These two features sound similar but are different:

  • Voice input (under Settings > Chat Conversation) adds a microphone button to the message bar so visitors can dictate a question instead of typing it. The bot still answers with text. It uses the browser's Web Speech API.
  • Voice conversation (this page) is a full real-time speech-to-speech call - visitors hear the bot speak back, powered by OpenAI Realtime, Gemini Live or ElevenLabs.