The `Say` command allows you to play text-to-speech (TTS) messages to a caller. It converts text to speech and then plays it back to the caller in a synthesized voice. `Say` is useful when pre-recording a prompt is impractical. When translating text to speech, the `Say` action makes assumptions about how to pronounce numbers, dates, times, amounts of money, and abbreviations.
FreeClimb now supports **enhanced voice engines** for more natural and customizable speech, while still keeping the classic version available for compatibility.
`Say` can only be used on a Call that is not in a Conference.
Note: The `Say` command always flushes the DTMF buffer.
Versions of Say
You can now use the `Say` command in the following ways:
- Classic (default): Simple, quick TTS with basic controls.
- `freeclimb.standard` Engine: Uses Microsoft UCMA voices for more control (voices, cultures, and content types).
- `freeclimb.neural` Engine: Powered by Coqui for advanced, lifelike TTS with SSML effects (pitch, rate, volume, etc.).
- ElevenLabs Engine: AI-powered voices with highly natural, expressive speech and multilingual support.
Quick Decision Guide
- Choose **Classic** if you need fast, lightweight TTS with minimal setup.
- Choose `freeclimb.standard` if you want Microsoft UCMA voices, accents, and cultural variations.
- Choose `freeclimb.neural` if you need lifelike, expressive speech and full SSML control.
- Choose ElevenLabs if you want the most natural, human-like voices with emotional range and multilingual options.
Feature / Attribute | Classic | freeclimb.standard | freeclimb.neural (NEW 🚀) | ElevenLabs (In Development) |
---|---|---|---|---|
Engine name | (none) | freeclimb.standard | freeclimb.neural | ElevenLabs |
Powered By | Built-in FreeClimb TTS | Microsoft UCMA | Coqui Neural TTS | ElevenLabs |
text | ✅ Limit 4 KB | ✅ Limit 4 KB | ✅ Limit 256 characters | 4 KB |
language | ✅ (en-US default, must match table) | ❌ (cannot be used when an engine is provided) | Set in engine.parameters.language (default en-US) | |
Voice selection | Implied from language | ✅ Choose from 25+ named voices (Voice) | ✅ Choose voice (default: Eve) | |
Culture/Locale | Controlled by language | ✅ Explicitly set via Culture | ✅ Controlled by parameters.language | |
Content type | Plain text only | text/plain or application/ssml+xml | text or ssml | |
SSML support | ❌ | ✅ https://www.w3.org/TR/speech-synthesis/ | ✅ | https://elevenlabs.io/docs/api-reference/text-to-speech/stream |
Prosody controls | ❌ | Limited (UCMA SSML only) | ✅ Full Coqui SSML support | |
privacyMode | ✅ Hides text in logs | ✅ Hides text in logs | ✅ Hides text in logs | |
Best for... | Quick & simple TTS | Regional voices and UCMA compatibility | Natural, expressive speech with SSML effects | |
Nesting Rules
`Say` can exist as a standalone command or as a nested command. It does not allow barge-in unless it is nested within one of the following commands: `GetDigits`, `GetSpeech`. The message is always played to completion when `Say` is standalone, or when it is nested and not interrupted by barge-in.
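As a rough illustration of these nesting rules, the Python sketch below builds a PerCL script in which one `Say` is standalone and another is nested inside `GetDigits` so the caller can barge in. The `prompts` and `actionUrl` attribute names, the helper names, and the example URL are assumptions that this section does not state; check the `GetDigits` reference before relying on them.

```python
import json


def say(text):
    """Build a classic Say command as a plain dict."""
    return {"Say": {"text": text}}


def get_digits_with_prompt(action_url, prompt_text):
    """Nest a Say inside GetDigits so the prompt can be interrupted by barge-in.

    Assumption: GetDigits accepts `actionUrl` and a `prompts` list of commands.
    """
    return {
        "GetDigits": {
            "actionUrl": action_url,
            "prompts": [say(prompt_text)],
        }
    }


percl = [
    say("Welcome to FreeClimb."),  # standalone: plays to completion
    get_digits_with_prompt(
        "https://example.com/digits",  # hypothetical webhook URL
        "Press 1 for sales.",
    ),
]

print(json.dumps(percl, indent=2))
```

The nested prompt is an ordinary `Say` object, so the same builder can serve both the standalone and the nested case.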
Classic Say
Attribute | Description |
---|---|
text (string, required) | The message to be played to the caller using TTS. The size of the string is limited to 4 KB (or 4,096 bytes). An empty string will cause the command to be skipped. |
loop (integer, optional, default = 1) | Number of times the text is said. 0 = loop until hangup or barge-in. |
language (string, optional, default = en-US) | Language and (by implication) the locale to use. This implies the accent and pronunciations to be used for the TTS. The complete list of valid values for the language attribute is shown below. Note: Language codes are case-sensitive and must be specified as shown in the table. |
privacyMode (boolean, optional, default = false) | Indicates if the request contains sensitive information which should be hidden. When set to `true`, the text is hidden in logs. |
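The attributes above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name `classic_say` is hypothetical, and counting the 4,096-byte limit in UTF-8 bytes is an assumption, since the encoding is not stated here. Returning `None` for empty text mirrors the note that an empty string causes the command to be skipped.

```python
import json

MAX_TEXT_BYTES = 4096  # 4 KB limit from the attribute table


def classic_say(text, loop=1, language="en-US", privacy_mode=False):
    """Return a classic Say command dict, or None when the text is empty."""
    if text == "":
        return None  # an empty string would cause the command to be skipped
    size = len(text.encode("utf-8"))  # assumption: limit counted in UTF-8 bytes
    if size > MAX_TEXT_BYTES:
        raise ValueError(f"text is {size} bytes; the limit is {MAX_TEXT_BYTES}")
    return {
        "Say": {
            "text": text,
            "loop": loop,
            "language": language,  # case-sensitive, e.g. "en-US"
            "privacyMode": privacy_mode,
        }
    }


command = classic_say("Your code is 1 2 3 4.", loop=2, privacy_mode=True)
print(json.dumps([command], indent=2))
```

Checking the size before sending keeps an oversized prompt from reaching the platform at all, which is usually easier to debug than a rejected script.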
freeclimb.standard Engine
Description: Uses Microsoft UCMA voices for more natural speech.
Attribute | Description |
---|---|
text (string, required) | The message to be played to the caller using TTS. The size of the string is limited to 4 KB (or 4,096 bytes). An empty string will cause the command to be skipped. |
loop (integer, optional, default = 1) | How many times the text repeats. 0 = loop forever until hangup/barge-in. |
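language (string, optional, default = en-US) | Cannot be used when an engine is provided. Set the locale with `Culture` in the engine parameters instead. |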
privacyMode (boolean, optional, default = false) | Indicates if the request contains sensitive information which should be hidden. When set to `true`, the text is hidden in logs. |
Voice/Culture Options
Voice | Culture |
---|---|
Herena | ca-ES |
Helle | da-DK |
Hedda | de-DE |
Hayley | en-AU |
Heather | en-CA |
Hazel | en-GB |
Heera | en-IN |
Helen | en-US |
ZiraPro | en-US |
Helena | es-ES |
Hilda | es-MX |
Heidi | fi-FI |
Harmonie | fr-CA |
Hortense | fr-FR |
Lucia | it-IT |
Haruka | ja-JP |
Heami | ko-KR |
Hulda | nb-NO |
Hanna | nl-NL |
Paulina | pl-PL |
Heloisa | pt-BR |
Helia | pt-PT |
Elena | ru-RU |
Hedvig | sv-SE |
HuiHui | zh-CN |
HunYee | zh-HK |
HanHan | zh-TW |
Note: Language codes are case-sensitive and must be specified as shown in the table.
Examples
Example (plain text):
[
  {
    "Say": {
      "text": "Welcome to FreeClimb.",
      "engine": {
        "name": "freeclimb.standard",
        "parameters": {
          "Voice": "Helen",
          "Culture": "en-US",
          "Content-Type": "text/plain"
        }
      }
    }
  }
]
Example (SSML):
[
  {
    "Say": {
      "text": "<prosody rate='50%'>Welcome to FreeClimb!</prosody>",
      "engine": {
        "name": "freeclimb.standard",
        "parameters": {
          "Voice": "Helen",
          "Culture": "en-US",
          "Content-Type": "application/ssml+xml"
        }
      }
    }
  }
]
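To keep `Voice` and `Culture` paired as listed in the table, a small lookup can fill in the culture automatically. This is a hedged Python sketch: the helper name `standard_say` is hypothetical, the mapping contains only a few rows of the table, and whether the platform rejects mismatched pairs is an assumption rather than documented behavior.

```python
import json

# Subset of the Voice/Culture table; add the remaining rows as needed.
VOICE_CULTURES = {
    "Helen": "en-US",
    "ZiraPro": "en-US",
    "Hazel": "en-GB",
    "Hedda": "de-DE",
    "Hortense": "fr-FR",
}


def standard_say(text, voice="Helen", ssml=False):
    """Build a Say command for the freeclimb.standard engine."""
    if voice not in VOICE_CULTURES:  # dict lookup is case-sensitive
        raise ValueError(f"unknown voice: {voice!r}")
    content_type = "application/ssml+xml" if ssml else "text/plain"
    return {
        "Say": {
            "text": text,
            "engine": {
                "name": "freeclimb.standard",
                "parameters": {
                    "Voice": voice,
                    "Culture": VOICE_CULTURES[voice],
                    "Content-Type": content_type,
                },
            },
        }
    }


german = standard_say("Guten Tag.", voice="Hedda")
print(json.dumps([german], indent=2))
```

Deriving `Culture` from `Voice` means a caller only chooses one value, which avoids combinations that do not appear in the table.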
freeclimb.neural Engine (NEW 🚀)
Description: This engine delivers lifelike voice and advanced speech control.
Attribute | Description |
---|---|
text (string, required) | The message to be played to the caller using TTS. The size of the string is limited to 4 KB (or 4,096 bytes). An empty string will cause the command to be skipped. |
loop (integer, optional, default = 1) | How many times the text repeats. 0 = loop forever until hangup/barge-in. |
language (string, optional, default = en-US) | Set in `engine.parameters.language`. |
privacyMode (boolean, optional, default = false) | Indicates if the request contains sensitive information which should be hidden. When set to `true`, the text is hidden in logs. |
Neural Features:
- SSML support with `<say-as>`
- Prosody controls for:
  - Volume (increase/decrease)
  - Speaking rate (faster/slower)
  - Pitch adjustments
  - Contour shaping (change pitch gradually over time)
Example (plain text):
[
  {
    "Say": {
      "text": "Welcome to FreeClimb.",
      "engine": {
        "name": "freeclimb.neural",
        "parameters": {
          "voice": "Eve"
        }
      }
    }
  }
]
Example (SSML with prosody):
[
  {
    "Say": {
      "text": "<prosody rate='50%'>Welcome to FreeClimb!</prosody>",
      "engine": {
        "name": "freeclimb.neural",
        "parameters": {
          "voice": "Eve",
          "textType": "ssml"
        }
      }
    }
  }
]
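When the spoken text comes from user data, characters such as `&` or `<` need escaping before they are placed inside SSML. The Python sketch below shows one way to wrap text in a `<prosody>` element and build the matching `freeclimb.neural` command. The helper names `prosody` and `neural_say` are hypothetical; only the `rate='50%'` value comes from the example above, and the attribute values accepted for pitch, volume, or contour are not listed here, so treat any others as assumptions.

```python
import json
from xml.sax.saxutils import escape, quoteattr


def prosody(text, **attrs):
    """Wrap text in an SSML <prosody> element, escaping XML special characters."""
    rendered = "".join(
        f" {name}={quoteattr(value)}" for name, value in attrs.items()
    )
    return f"<prosody{rendered}>{escape(text)}</prosody>"


def neural_say(text, voice="Eve", ssml=False):
    """Build a Say command for the freeclimb.neural engine."""
    parameters = {"voice": voice}
    if ssml:
        parameters["textType"] = "ssml"  # omitted for plain text
    return {
        "Say": {
            "text": text,
            "engine": {"name": "freeclimb.neural", "parameters": parameters},
        }
    }


ssml_text = prosody("Terms & conditions apply.", rate="50%")
command = neural_say(ssml_text, ssml=True)
print(json.dumps([command], indent=2))
```

Here `ssml_text` is `<prosody rate="50%">Terms &amp; conditions apply.</prosody>`: the ampersand is escaped, and the attribute is emitted with double quotes, which `json.dumps` then escapes inside the JSON string.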
ElevenLabs Engine (In Development)
Description: This engine converts text into natural, human-like speech for narration, voiceovers, and conversations.
Attribute | Description |
---|---|
text (string, required) | The message to be played to the caller using TTS. The size of the string is limited to 4 KB (or 4,096 bytes). An empty string will cause the command to be skipped. |
loop (integer, optional, default = 1) | How many times the text repeats. |
language (string, optional, default = en-US) | |
privacyMode (boolean, optional, default = false) | Indicates if the request contains sensitive information which should be hidden. When set to `true`, the text is hidden in logs. |
Example (plain text):
{
  "Say": {
    "text": "Hello World",
    "engine": {
      "name": "ElevenLabs",
      "parameters": {
        "apply_text_normalization": false,
        "language_code": "en",
        "voice_id": "o11yegU3CL24TZ1qcm6b"
      }
    }
  }
}