The GetSpeech command enables the caller to respond to the application by speaking in a supported language. Unlike DTMF entry, which restricts the caller to the buttons on the phone keypad, speech input accepts flexible audio input constrained by a grammar. FreeClimb supports grammars written in GRXML that are compatible with the Microsoft Speech Platform.
GetSpeech is supported only on a single Call leg. It is not supported when two or more Call legs are connected (for example, within a Conference).
GetSpeech is a terminal command: any commands that follow it in the same PerCL script are never executed. After GetSpeech is executed, control of the Call picks up using the PerCL received in response to the actionUrl request. If the command terminated because of a hangup (see reason below), any PerCL returned will not be executed.
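Because GetSpeech is terminal, anything placed after it in the same PerCL script is effectively dead code. The following Python sketch illustrates that rule under the assumption that a script is held as a list of single-key dictionaries; `effective_commands` is a hypothetical helper written for this illustration, not part of any FreeClimb SDK.

```python
def effective_commands(script):
    """Return the commands that would actually run, stopping after GetSpeech."""
    executed = []
    for command in script:
        executed.append(command)
        # Each PerCL command is a single-key object; the key is the command name.
        if "GetSpeech" in command:
            break  # terminal: nothing after this point is executed
    return executed


script = [
    {"Say": {"text": "Welcome."}},
    {"GetSpeech": {
        "actionUrl": "http://www.foo.com/purpose.php",
        "grammarFile": "http://www.foo.com/grammars/purpose.xml",
    }},
    {"Say": {"text": "This is never spoken."}},
]
executed = effective_commands(script)
```

A check like this can run before a script is returned to FreeClimb, so that follow-up commands are moved into the actionUrl response instead.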
Nesting Rules
You can nest the following commands within the GetSpeech command:
Say
Play
Pause
These commands are not nested directly; they are contained in the prompts attribute as a list of commands.
The nested commands (Say, Play, and Pause) have barge-in enabled.
The GetSpeech command cannot be nested within any other command.
Note that nested commands do not inherit the value of privacyMode from the parent command. To use privacyMode in a nested command, specify it within that command's own parameters.
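The nesting rules above can be sketched as a small builder that rejects anything other than Say, Play, or Pause in prompts and sets privacyMode explicitly on both the parent and the nested command. The helper name `get_speech` and the PIN-related URLs and text are hypothetical and used for illustration only.

```python
import json

ALLOWED_PROMPTS = {"Say", "Play", "Pause"}


def get_speech(action_url, grammar_file, prompts, privacy_mode=False):
    """Build a GetSpeech command, rejecting prompts other than Say, Play, Pause."""
    for prompt in prompts:
        (name,) = prompt.keys()  # each command is a single-key object
        if name not in ALLOWED_PROMPTS:
            raise ValueError(f"{name} cannot be nested within GetSpeech")
    return {"GetSpeech": {
        "actionUrl": action_url,
        "grammarFile": grammar_file,
        "prompts": prompts,
        "privacyMode": privacy_mode,
    }}


command = get_speech(
    "http://www.foo.com/pin.php",           # hypothetical URL
    "http://www.foo.com/grammars/pin.xml",  # hypothetical grammar file
    prompts=[
        # privacyMode is repeated here because it is not inherited.
        {"Say": {"text": "Please say your PIN.", "privacyMode": True}},
    ],
    privacy_mode=True,
)
print(json.dumps([command], indent=2))
```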
Example
In this example, the caller is prompted to state the purpose of their call.
[
  {
    "GetSpeech": {
      "actionUrl": "http://www.foo.com/purpose.php",
      "grammarType": "URL",
      "grammarFile": "http://www.foo.com/grammars/purpose.xml",
      "grammarRule": "reason",
      "prompts": [
        {
          "Say": {
            "text": "Please state the purpose of your call."
          }
        },
        {
          "Say": {
            "text": "You can say report an accident, check claim, etc."
          }
        }
      ]
    }
  }
]
Command Attributes
The GetSpeech command supports the following attributes that modify its behavior:
Attribute | Description |
---|---|
actionUrl | When the caller has finished speaking or the command has timed out, FreeClimb will make a POST request to this URL. |
grammarFile | The grammar file to use for speech recognition. |
grammarType | The grammar file type to use for speech recognition. |
grammarRule | The grammar rule within the specified grammar file to use for speech recognition. |
playBeep | Indicates whether a beep should be played just before speech recognition is initiated so that the speaker can start to speak. |
prompts | The JSON array of PerCL commands to nest within the GetSpeech command. |
noInputTimeoutMs | When recognition is started and there is no speech detected for noInputTimeoutMs milliseconds, the recognizer will terminate the recognition operation. |
recognitionTimeoutMs | When playback of prompts ends and there is no match for recognitionTimeoutMs milliseconds, the recognizer will terminate the recognition operation. |
confidenceThreshold | Specifies what confidence level is considered a successful match. |
sensitivityLevel | The sensitivityLevel attribute allows for filtering out background noise, so it is not mistaken for speech. |
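speechCompleteTimeoutMs | Specifies the length of silence required following user speech before the speech recognizer finalizes a result, when the speech so far is a complete match against an active grammar. |
speechIncompleteTimeoutMs | Specifies the length of silence following user speech after which the recognizer finalizes a result, when the speech so far is an incomplete match of all active grammars. |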
privacyMode | Indicates if the response will contain sensitive information which should be hidden. |
actionUrl
REQUIRED
Type: absolute URL
When the caller has finished speaking or the command has timed out, FreeClimb will make a POST request to this URL. A PerCL response is expected to continue handling the call.
Additional Request Parameters
Request Parameters | Description |
---|---|
reason | This field explains how the GetSpeech action ended. The value is one of those listed below. |
recognitionResult | Semantic content (either a string if speech was recognized or a digit if a digit was input instead of speech) returned from the entry or tag that was recognized within the grammar. This field is populated only if the reason field is set to recognition or digit. |
confidence | The level of confidence in the obtained result. This is a value in the range 0 to 100, with 0 being a total lack of confidence and 100 being absolute certainty in the recognition. This field is populated only if the reason field is set to recognition. |
reason Valid Values
Value | Description |
---|---|
error | The speech recognition engine failed to process the request for any reason (e.g. invalid grammar file). |
hangup | The caller hung up. |
digit | The caller input a digit during recognition. |
noInput | An initial timeout occurred: no speech was detected after recognition started (see noInputTimeoutMs). |
noMatch | There was audio input but it could not be matched with the grammar. |
recognition | The audio input matched the specified grammar to some degree of confidence. |
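One way to act on these values in the actionUrl handler is sketched below. It assumes the request body has already been parsed into a Python dict, and the spoken replies are placeholders; `handle_get_speech_result` is a hypothetical function name, not part of any FreeClimb SDK.

```python
import json


def handle_get_speech_result(body):
    """Map the actionUrl request body (a dict) to a PerCL script (a JSON string)."""
    reason = body.get("reason")
    if reason == "hangup":
        return json.dumps([])  # any PerCL returned here would not be executed
    if reason in ("recognition", "digit"):
        # recognitionResult is populated only for these two reasons.
        text = f"You said {body['recognitionResult']}."
    elif reason in ("noInput", "noMatch"):
        text = "Sorry, I did not catch that."
    else:  # "error" or anything unexpected
        text = "Sorry, something went wrong."
    return json.dumps([{"Say": {"text": text}}])
```

In practice the noInput and noMatch branches would often return a fresh GetSpeech command to prompt the caller again.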
grammarFile
REQUIRED
Type: absolute URL or built-in grammar file name
The grammar file to use for speech recognition. If grammarType is set to URL, this attribute is specified as a download URL. FreeClimb respects Cache-Control headers for this file; use them to limit repeated requests for unchanged grammars. If no Cache-Control header is provided, the file is cached for 5 seconds by default. Grammar files of type URL must specify the speech language to use; see the Supported Languages table below.
If grammarType is set to BUILTIN, this attribute is set to the name of one of the platform built-in grammar files; see the Built-In Grammar Files table below. Platform built-in grammar files are available only in English (United States).
Supported Languages
Value | Description |
---|---|
en-US | English (United States) |
ca-ES | Catalan (Spain) |
da-DK | Danish (Denmark) |
de-DE | German (Germany) |
en-AU | English (Australia) |
en-CA | English (Canada) |
en-GB | English (United Kingdom) |
en-IN | English (India) |
es-ES | Spanish (Spain) |
es-MX | Spanish (Mexico) |
fi-FI | Finnish (Finland) |
fr-CA | French (Canada) |
fr-FR | French (France) |
it-IT | Italian (Italy) |
ja-JP | Japanese (Japan) |
ko-KR | Korean (Korea) |
nb-NO | Norwegian (Norway) |
nl-NL | Dutch (Netherlands) |
pl-PL | Polish (Poland) |
pt-BR | Portuguese (Brazil) |
pt-PT | Portuguese (Portugal) |
ru-RU | Russian (Russia) |
sv-SE | Swedish (Sweden) |
zh-CN | Chinese (China) |
zh-HK | Chinese (Hong Kong) |
zh-TW | Chinese (Taiwan) |
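As an illustration of a grammar that declares its language, the sketch below embeds a small grammar in the W3C SRGS XML form and reads back its xml:lang and root attributes with Python's standard library. The grammar contents, the phrases, and the semantic tag values are assumptions made for this example; check the exact syntax against the Microsoft Speech Platform grammar documentation before using it.

```python
import xml.etree.ElementTree as ET

# Illustrative grammar only: phrases and tag values are made up for this sketch.
GRAMMAR = """<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0" xml:lang="en-US" root="reason"
         tag-format="semantics/1.0"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="reason" scope="public">
    <one-of>
      <item>report an accident <tag>out = "report";</tag></item>
      <item>check claim <tag>out = "claim";</tag></item>
    </one-of>
  </rule>
</grammar>
"""

# ElementTree exposes the reserved xml: prefix under this namespace URI.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"
root = ET.fromstring(GRAMMAR.encode("utf-8"))
language = root.get(XML_NS + "lang")  # should be a value from the table above
root_rule = root.get("root")          # candidate value for grammarRule
```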
Built-In Grammar Files
Filename | Description |
---|---|
ALPHNUM6 | Get six alpha-numeric values from the caller |
ANY_DIG | Get 1 to 50 digits from the caller |
DIG1 | Get one digit from the caller |
DIG2 | Get two digits from the caller |
DIG3 | Get three digits from the caller |
DIG4 | Get four digits from the caller |
DIG5 | Get five digits from the caller |
DIG6 | Get six digits from the caller |
DIG7 | Get seven digits from the caller |
DIG8 | Get eight digits from the caller |
DIG9 | Get nine digits from the caller |
DIG10 | Get ten digits from the caller |
DIG11 | Get eleven digits from the caller |
UP_TO_20_DIGIT_SEQUENCE | Get 1 to 20 digits from the caller |
VERSAY_YESNO | Get a Yes or No indication from the caller. Different variations of Yes and No are accepted. |
grammarType
OPTIONAL
Type: string (URL, BUILTIN)
Default: URL
The grammar file type to use for speech recognition. A value of URL indicates that the grammarFile attribute specifies a URL that points to the grammar file. A value of BUILTIN indicates that the grammarFile attribute specifies the name of one of the platform built-in grammar files.
grammarRule
OPTIONAL
Type: string
Default: null
The grammar rule within the specified grammar file to use for speech recognition. This attribute is optional if grammarType is URL and ignored if grammarType is BUILTIN.
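The interaction between grammarType, grammarFile, and grammarRule can be sketched as a helper that assembles only the grammar-related attributes. `grammar_attributes` is a hypothetical name; dropping grammarRule for BUILTIN grammars is a tidy-up choice of this sketch, since the platform ignores it in that case anyway.

```python
def grammar_attributes(grammar_file, grammar_type="URL", grammar_rule=None):
    """Return the grammar-related GetSpeech attributes for either grammar type."""
    if grammar_type not in ("URL", "BUILTIN"):
        raise ValueError("grammarType must be URL or BUILTIN")
    attrs = {"grammarType": grammar_type, "grammarFile": grammar_file}
    # grammarRule is ignored for BUILTIN grammars, so only send it for URL.
    if grammar_type == "URL" and grammar_rule is not None:
        attrs["grammarRule"] = grammar_rule
    return attrs


# A built-in four-digit grammar versus a custom grammar fetched by URL.
pin = grammar_attributes("DIG4", grammar_type="BUILTIN", grammar_rule="ignored")
purpose = grammar_attributes("http://www.foo.com/grammars/purpose.xml",
                             grammar_rule="reason")
```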
playBeep
OPTIONAL
Type: boolean
Default: false
Indicates whether a beep should be played just before speech recognition is initiated so that the speaker can start to speak.
prompts
OPTIONAL
Type: PerCL command array
Default: null
The JSON array of PerCL commands to nest within the GetSpeech command. The Say, Play, and Pause commands can be used. The nested commands are executed while FreeClimb is waiting for input from the caller. This allows you to play menu options to the caller and to prompt for the expected input. These commands stop executing when the caller begins to speak.
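A prompts array for a spoken menu might be assembled as in the sketch below, which uses only Say commands. The helper name `menu_prompts` and the wording of the second prompt are illustrative assumptions.

```python
def menu_prompts(intro, options):
    """Build a prompts array: one intro Say, then one Say listing the options."""
    listing = ", ".join(options)
    return [
        {"Say": {"text": intro}},
        {"Say": {"text": f"You can say {listing}."}},
    ]


prompts = menu_prompts(
    "Please state the purpose of your call.",
    ["report an accident", "check claim"],
)
```

The resulting list can be assigned directly to the prompts attribute of a GetSpeech command.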
noInputTimeoutMs
OPTIONAL
Type: integer > 0
Default: 7000 (ms)
When recognition is started and no speech is detected for noInputTimeoutMs milliseconds, the recognizer terminates the recognition operation.
recognitionTimeoutMs
OPTIONAL
Type: integer > 0
Default: 10000 (ms)
When playback of prompts ends and there is no match for recognitionTimeoutMs milliseconds, the recognizer terminates the recognition operation.
confidenceThreshold
OPTIONAL
Type: float range 0.0-1.0
Default: 0.4
When the recognizer matches a spoken phrase, it associates a confidence level with that match. The confidenceThreshold attribute specifies the confidence level that is considered a successful match.
sensitivityLevel
OPTIONAL
Type: float range 0.0-1.0
Default: 0.5
The speech recognizer supports a variable level of sound sensitivity. The sensitivityLevel attribute allows background noise to be filtered out so that it is not mistaken for speech. A higher value means higher sensitivity to background noise, which increases the chance that background noise is interpreted as speech. A lower value means lower sensitivity to background noise, which increases the chance that quiet speech is interpreted as background noise.
speechCompleteTimeoutMs
OPTIONAL
Type: integer > 0
Default: 1000 (ms)
The speechCompleteTimeoutMs attribute specifies the length of silence required following user speech before the speech recognizer finalizes a result. This timeout applies when the recognizer currently has a complete match against an active grammar. Reasonable values are typically in the range of 300 to 1000 ms (0.3 to 1.0 seconds).
speechIncompleteTimeoutMs
OPTIONAL
Type: integer > 0
Default: 2000 (ms)
The speechIncompleteTimeoutMs attribute specifies the length of silence following user speech after which the recognizer finalizes a result. This timeout applies when the speech prior to the silence is an incomplete match of all active grammars. speechIncompleteTimeoutMs is usually longer than speechCompleteTimeoutMs to allow users to pause mid-utterance.
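The timeout and threshold attributes can be checked against their documented types and ranges before a command is sent. The sketch below merges overrides into the defaults listed on this page; `tuning` is a hypothetical helper, and the example override values are arbitrary rather than recommendations.

```python
# Defaults as documented in the attribute sections of this page.
DEFAULTS = {
    "noInputTimeoutMs": 7000,
    "recognitionTimeoutMs": 10000,
    "confidenceThreshold": 0.4,
    "sensitivityLevel": 0.5,
    "speechCompleteTimeoutMs": 1000,
    "speechIncompleteTimeoutMs": 2000,
}
TIMEOUTS = {name for name in DEFAULTS if name.endswith("Ms")}


def tuning(**overrides):
    """Merge overrides into the documented defaults and check their ranges."""
    settings = {**DEFAULTS, **overrides}
    for name, value in settings.items():
        if name not in DEFAULTS:
            raise ValueError(f"unknown attribute: {name}")
        if name in TIMEOUTS:
            if not (isinstance(value, int) and value > 0):
                raise ValueError(f"{name} must be an integer > 0")
        elif not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in the range 0.0-1.0")
    return settings


# Example: lower sensitivity and allow longer mid-utterance pauses.
noisy_line = tuning(sensitivityLevel=0.3, speechIncompleteTimeoutMs=3000)
```

The returned dictionary can be merged into the body of a GetSpeech command alongside actionUrl and grammarFile.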
privacyMode
OPTIONAL
Type: boolean
Default: false
Indicates whether the response will contain sensitive information that should be hidden. When set to true, the contents of the recognitionResult attribute are replaced with the string "xxxxx" in the logs. Note that privacyMode is set at the command level and is not inherited by nested commands.