GetSpeech

The GetSpeech command enables the caller to respond to the application by speaking in a supported language. Unlike DTMF entry, which implicitly restricts the user to the buttons available on the phone keypad, speech input allows for flexible audio input constrained by a grammar. FreeClimb supports grammars written using GRXML compatible with the Microsoft Speech Platform.

GetSpeech is only supported on a single Call leg. It is not supported when two or more Call legs are connected (for example, within a Conference).

GetSpeech is a terminal command: any actions following it are never executed. After GetSpeech is executed, control of the Call picks up using the PerCL received in response to the actionUrl request. If the command terminated because of a hangup (see reason below), any PerCL returned will not be executed.

Nesting Rules

You can nest the following actions within the GetSpeech command.

  • Say
  • Play
  • Pause

The commands are not directly nested but are contained in the prompts attribute as a list of commands.

The nested commands (Say, Play, and Pause) will have barge-in enabled.
The GetSpeech command cannot be nested within any other command.

Note that nested commands do not inherit the value of privacyMode from the parent command. To use privacyMode in any nested command, it must be specified within that command's parameters.

Example

In this example, the caller is prompted to state the purpose of their call.

[
   {
      "GetSpeech": {
         "actionUrl": "http://www.foo.com/purpose.php",
         "grammarType": "URL",
         "grammarFile": "http://www.foo.com/grammars/purpose.xml",
         "grammarRule": "reason",
         "prompts": [
             {
               "Say": {
                 "text": "Please state the purpose of your call."
               }
             },
             {
               "Say": {
                 "text": "You can say report an accident, check claim, etc."
               }
             }
         ]
      }
   }
]
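
The same script can also be assembled programmatically. The following is a minimal Python sketch, not an official SDK: the helper name build_get_speech is hypothetical, and only the attribute names and values are taken from the example above.

```python
import json


def build_get_speech(action_url, grammar_file, **optional):
    """Return a PerCL script (a list) containing one GetSpeech command.

    build_get_speech is a hypothetical helper, not part of any SDK.
    """
    if not action_url or not grammar_file:
        raise ValueError("actionUrl and grammarFile are required")
    command = {"actionUrl": action_url, "grammarFile": grammar_file}
    # Optional attributes (grammarType, grammarRule, prompts, ...) are
    # passed through unchanged; None values are dropped.
    command.update({k: v for k, v in optional.items() if v is not None})
    return [{"GetSpeech": command}]


percl = build_get_speech(
    "http://www.foo.com/purpose.php",
    "http://www.foo.com/grammars/purpose.xml",
    grammarType="URL",
    grammarRule="reason",
    prompts=[
        {"Say": {"text": "Please state the purpose of your call."}},
        {"Say": {"text": "You can say report an accident, check claim, etc."}},
    ],
)
print(json.dumps(percl, indent=3))
```

Because GetSpeech is a terminal command, the sketch returns it as the only entry in the script.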

Command Attributes

The GetSpeech command supports the following attributes that modify its behavior:

Attribute                  Description
actionUrl                  When the caller has finished speaking or the command has timed out, FreeClimb will make a POST request to this URL.
grammarFile                The grammar file to use for speech recognition.
grammarType                The grammar file type to use for speech recognition.
grammarRule                The grammar rule within the specified grammar file to use for speech recognition.
playBeep                   Indicates whether a beep should be played just before speech recognition is initiated so that the speaker can start to speak.
prompts                    The JSON array of PerCL commands to nest within the GetSpeech command.
noInputTimeoutMs           When recognition is started and there is no speech detected for noInputTimeoutMs milliseconds, the recognizer will terminate the recognition operation.
recognitionTimeoutMs       When playback of prompts ends and there is no match for recognitionTimeoutMs milliseconds, the recognizer will terminate the recognition operation.
confidenceThreshold        Specifies what confidence level is considered a successful match.
sensitivityLevel           Allows for filtering out background noise so that it is not mistaken for speech.
speechCompleteTimeoutMs    Specifies the length of silence required following user speech before the speech recognizer finalizes a result, when the speech is a complete match against an active grammar.
speechIncompleteTimeoutMs  Specifies the length of silence following user speech after which the recognizer finalizes a result, when the speech is an incomplete match of all active grammars.
privacyMode                Indicates if the response will contain sensitive information which should be hidden.

actionUrl

REQUIRED

Type: absolute URL

When the caller has finished speaking or the command has timed out, FreeClimb will make a POST request to this URL. A PerCL response is expected to continue handling the call.

Additional Request Parameters

Request Parameter  Description
reason             Explains how the GetSpeech action ended. The value is one of those listed under reason Valid Values below.
recognitionResult  Semantic content (either a string if speech was recognized or a digit if a digit was input instead of speech) returned from the entry or tag that was recognized within the grammar. This field is populated only if the reason field is set to recognition or digit.
confidence         The level of confidence in the obtained result. This is a value in the range 0 to 100, with 0 being total lack of confidence and 100 being absolute certainty in the recognition. This field is populated only if the reason field is set to recognition.

reason Valid Values

Value        Description
error        The speech recognition engine failed to process the request for any reason (e.g. invalid grammar file).
hangup       The caller hung up.
digit        The caller input a digit during recognition.
noInput      An initial timeout occurred.
noMatch      There was audio input but it could not be matched with the grammar.
recognition  The audio input matched the specified grammar to some degree of confidence.
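
The reason values above map naturally onto a branch in the actionUrl handler. The following is a minimal Python sketch of that branching, assuming the request parameters have already been parsed into a dict. The function name handle_get_speech_result, the retry_command argument, and the spoken texts are illustrative assumptions; only the parameter names and reason values come from this page.

```python
def handle_get_speech_result(params, retry_command):
    """Choose the next PerCL script from the actionUrl request parameters.

    params: dict with reason, and optionally recognitionResult and confidence.
    retry_command: a GetSpeech command dict to reissue when nothing usable
    was heard (hypothetical; supplied by the application).
    """
    reason = params.get("reason")
    if reason == "hangup":
        # Any PerCL returned after a hangup is not executed.
        return []
    if reason in ("recognition", "digit"):
        # recognitionResult is populated only for these two reasons.
        result = params.get("recognitionResult")
        return [{"Say": {"text": f"You selected {result}."}}]
    if reason in ("noInput", "noMatch"):
        # Apologize, then prompt again with the same GetSpeech command.
        return [{"Say": {"text": "Sorry, I did not catch that."}}, retry_command]
    # "error" or any unexpected value.
    return [{"Say": {"text": "We are unable to process your request right now."}}]
```

In the retry branch the GetSpeech command is placed last in the script, since it is a terminal command and anything after it would not run.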

grammarFile

REQUIRED

Type: absolute URL or built-in grammar file name

The grammar file to use for speech recognition. If grammarType is set to URL, this attribute is specified as a download URL. FreeClimb respects Cache-Control headers for this file; use them to limit repeated requests for unchanged grammars. If no Cache-Control header is provided, the file is cached for 5 seconds by default. Grammar files of type URL must specify the speech language to use; see the Supported Languages table below.
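
As an illustration of a grammar that declares its language, the following Python sketch holds a small grammar in the W3C SRGS XML form and reads back its xml:lang value. It is a sketch under assumptions: the phrases are taken from the example prompts, the rule id reason matches the grammarRule in the example, and whether FreeClimb accepts this exact file has not been verified here. The CACHE_HEADERS value is likewise only an illustrative choice for the server hosting the file.

```python
import xml.etree.ElementTree as ET

# Illustrative grammar; the language is declared with xml:lang on <grammar>.
GRAMMAR = """<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0" xml:lang="en-US" root="reason"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="reason" scope="public">
    <one-of>
      <item>report an accident</item>
      <item>check claim</item>
    </one-of>
  </rule>
</grammar>
"""

# Example response header for the grammar host (assumed value: one hour).
CACHE_HEADERS = {"Cache-Control": "max-age=3600"}

XML_NS = "{http://www.w3.org/XML/1998/namespace}"


def grammar_language(grxml_text):
    """Return the xml:lang declared on the root <grammar> element, or None."""
    root = ET.fromstring(grxml_text.encode("utf-8"))
    return root.get(XML_NS + "lang")


print(grammar_language(GRAMMAR))
```

A check like grammar_language can be run before publishing a grammar to confirm that the declared language is one of the supported values.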

If grammarType is set to BUILTIN, this attribute is set to the name of one of the platform built-in grammar files; see the Built-In Grammar Files table below. Platform built-in grammar files are only available in English (United States).

Supported Languages

Value  Description
en-US  English (United States)
ca-ES  Catalan (Spain)
da-DK  Danish (Denmark)
de-DE  German (Germany)
en-AU  English (Australia)
en-CA  English (Canada)
en-GB  English (United Kingdom)
en-IN  English (India)
es-ES  Spanish (Spain)
es-MX  Spanish (Mexico)
fi-FI  Finnish (Finland)
fr-CA  French (Canada)
fr-FR  French (France)
it-IT  Italian (Italy)
ja-JP  Japanese (Japan)
ko-KR  Korean (Korea)
nb-NO  Norwegian (Norway)
nl-NL  Dutch (Netherlands)
pl-PL  Polish (Poland)
pt-BR  Portuguese (Brazil)
pt-PT  Portuguese (Portugal)
ru-RU  Russian (Russia)
sv-SE  Swedish (Sweden)
zh-CN  Chinese (China)
zh-HK  Chinese (Hong Kong)
zh-TW  Chinese (Taiwan)

Built-In Grammar Files

Filename                 Description
ALPHNUM6                 Get six alpha-numeric values from the caller
ANY_DIG                  Get 1 to 50 digits from the caller
DIG1                     Get one digit from the caller
DIG2                     Get two digits from the caller
DIG3                     Get three digits from the caller
DIG4                     Get four digits from the caller
DIG5                     Get five digits from the caller
DIG6                     Get six digits from the caller
DIG7                     Get seven digits from the caller
DIG8                     Get eight digits from the caller
DIG9                     Get nine digits from the caller
DIG10                    Get ten digits from the caller
DIG11                    Get eleven digits from the caller
UP_TO_20_DIGIT_SEQUENCE  Get 1 to 20 digits from the caller
VERSAY_YESNO             Get a Yes or No indication from the caller. Different variations of Yes and No are accepted.
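
To show how a built-in grammar is referenced, the following Python sketch builds a GetSpeech command with grammarType set to BUILTIN and rejects names that are not in the table above. The helper name builtin_get_speech is hypothetical, and the actionUrl and prompt text are supplied by the caller.

```python
# Built-in grammar names from the table above (DIG1 through DIG11 generated).
BUILTIN_GRAMMARS = {
    "ALPHNUM6",
    "ANY_DIG",
    "UP_TO_20_DIGIT_SEQUENCE",
    "VERSAY_YESNO",
} | {f"DIG{n}" for n in range(1, 12)}


def builtin_get_speech(action_url, grammar_name, prompt_text):
    """Return a GetSpeech command that uses a platform built-in grammar."""
    if grammar_name not in BUILTIN_GRAMMARS:
        raise ValueError(f"unknown built-in grammar: {grammar_name!r}")
    return {
        "GetSpeech": {
            "actionUrl": action_url,
            "grammarType": "BUILTIN",
            # For BUILTIN, grammarFile is a name rather than a URL.
            "grammarFile": grammar_name,
            "prompts": [{"Say": {"text": prompt_text}}],
        }
    }
```

For example, builtin_get_speech(url, "DIG4", "Please say your four digit code.") yields a command that collects four digits. grammarRule is omitted because it is ignored for built-in grammars.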

grammarType

OPTIONAL

Type: string (URL, BUILTIN)
Default: URL

The grammar file type to use for speech recognition. A value of URL indicates the grammarFile attribute specifies a URL that points to the grammar file. A value of BUILTIN indicates the grammarFile attribute specifies the name of one of the platform built-in grammar files.


grammarRule

OPTIONAL

Type: string
Default: null

The grammar rule within the specified grammar file to use for speech recognition. This attribute is optional if grammarType is URL and ignored if grammarType is BUILTIN.


playBeep

OPTIONAL

Type: boolean
Default: false

Indicates whether a beep should be played just before speech recognition is initiated so that the speaker can start to speak.


prompts

OPTIONAL

Type: PerCL command array
Default: null

The JSON array of PerCL commands to nest within the GetSpeech command. The Say, Play, and Pause commands can be used. The nested actions are executed while FreeClimb is waiting for input from the caller, which allows the application to play menu options and prompt for the expected input. These commands stop executing when the caller begins to speak.
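
Since only Say, Play, and Pause may appear in prompts, an application can check the array before sending it. The following is a minimal Python sketch of such a check; the function name validate_prompts is hypothetical, and it verifies only the command names, not each command's own attributes.

```python
ALLOWED_PROMPT_COMMANDS = {"Say", "Play", "Pause"}


def validate_prompts(prompts):
    """Raise ValueError if prompts contains anything other than Say, Play, or Pause."""
    for index, command in enumerate(prompts):
        # Each PerCL command is an object with exactly one key: its name.
        if not isinstance(command, dict) or len(command) != 1:
            raise ValueError(f"prompts[{index}] must be a single-command object")
        (name,) = command
        if name not in ALLOWED_PROMPT_COMMANDS:
            raise ValueError(f"prompts[{index}]: {name} cannot be nested in GetSpeech")
    return prompts
```

A nested GetSpeech, for instance, is rejected, which matches the rule that GetSpeech cannot be nested within any other command.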


noInputTimeoutMs

OPTIONAL

Type: integer > 0
Default: 7000 (ms)

When recognition is started and there is no speech detected for noInputTimeoutMs milliseconds, the recognizer will terminate the recognition operation.


recognitionTimeoutMs

OPTIONAL

Type: integer > 0
Default: 10000 (ms)

When playback of prompts ends and there is no match for recognitionTimeoutMs milliseconds, the recognizer will terminate the recognition operation.


confidenceThreshold

OPTIONAL

Type: float range 0.0-1.0
Default: 0.4

When a recognition resource recognizes a spoken phrase, it associates a confidence level with that match. The confidenceThreshold attribute specifies what confidence level is considered a successful match.


sensitivityLevel

OPTIONAL

Type: float range 0.0-1.0
Default: 0.5

The speech recognizer supports a variable level of sound sensitivity. The sensitivityLevel attribute allows for filtering out background noise so that it is not mistaken for speech. A higher value means higher sensitivity to background noise, increasing the chances of background noise being interpreted as speech. A lower value means lower sensitivity to background noise, increasing the chances of quiet speech being interpreted as background noise.
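
Both confidenceThreshold and sensitivityLevel are floats in the range 0.0 to 1.0. The following Python sketch range-checks the two values before they are added to a command; the helper names check_unit_range and recognition_tuning are hypothetical, and the defaults are the ones documented on this page.

```python
def check_unit_range(name, value):
    """Return value as a float, or raise if it is not a number in 0.0-1.0."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"{name} must be a number, got {type(value).__name__}")
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be within 0.0-1.0, got {value}")
    return float(value)


def recognition_tuning(confidence_threshold=0.4, sensitivity_level=0.5):
    """Return the two tuning attributes, defaulting to the documented values."""
    return {
        "confidenceThreshold": check_unit_range("confidenceThreshold", confidence_threshold),
        "sensitivityLevel": check_unit_range("sensitivityLevel", sensitivity_level),
    }
```

The returned dict can be merged into the GetSpeech attributes, for example with a lower sensitivityLevel for callers in noisy environments.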


speechCompleteTimeoutMs

OPTIONAL

Type: integer > 0
Default: 1000 (ms)

The speechCompleteTimeoutMs attribute specifies the length of silence required following user speech before the speech recognizer finalizes a result. This timeout applies when the recognizer currently has a complete match against an active grammar. Reasonable values are typically in the range of 300 to 1000 ms (0.3 to 1.0 seconds).


speechIncompleteTimeoutMs

OPTIONAL

Type: integer > 0
Default: 2000 (ms)

The speechIncompleteTimeoutMs attribute specifies the length of silence following user speech after which the recognizer finalizes a result. This timeout applies when the speech prior to the silence is an incomplete match of all active grammars. speechIncompleteTimeoutMs is usually set longer than speechCompleteTimeoutMs to allow users to pause mid-utterance.
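
The four timeout attributes share the same type (integer greater than 0) and interact with each other. The following Python sketch merges overrides into the documented defaults and flags a configuration where speechIncompleteTimeoutMs is not longer than speechCompleteTimeoutMs. The helper name timeout_attributes is hypothetical, and the note is advisory only, since the page describes that relationship as usual rather than required.

```python
DEFAULT_TIMEOUTS_MS = {
    "noInputTimeoutMs": 7000,
    "recognitionTimeoutMs": 10000,
    "speechCompleteTimeoutMs": 1000,
    "speechIncompleteTimeoutMs": 2000,
}


def timeout_attributes(**overrides):
    """Merge overrides into the documented defaults and sanity-check them.

    Returns (attributes, notes); notes lists non-fatal observations.
    """
    unknown = set(overrides) - set(DEFAULT_TIMEOUTS_MS)
    if unknown:
        raise KeyError(f"unknown timeout attribute(s): {sorted(unknown)}")
    attributes = {**DEFAULT_TIMEOUTS_MS, **overrides}
    for name, value in attributes.items():
        # Each timeout is documented as an integer greater than 0.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be an integer > 0, got {value!r}")
    notes = []
    if attributes["speechIncompleteTimeoutMs"] <= attributes["speechCompleteTimeoutMs"]:
        notes.append(
            "speechIncompleteTimeoutMs is usually longer than speechCompleteTimeoutMs"
        )
    return attributes, notes
```

Calling timeout_attributes() with no arguments returns the defaults with no notes.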


privacyMode

OPTIONAL

Type: boolean
Default: false

Indicates if the response will contain sensitive information which should be hidden. When set to true, the contents of the recognitionResult attribute are replaced with the string "xxxxx" in the logs. Note that privacyMode is set at the command level, so it is not inherited by any nested commands.
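
Because nested commands do not inherit privacyMode, it has to be set on each one explicitly. The following Python sketch copies a GetSpeech command and sets privacyMode on the command and on its nested prompts. The helper name with_privacy is hypothetical, and limiting the nested flag to Say and Play is an assumption; check each command's own attribute list before relying on it.

```python
import copy

# Assumption: nested commands that accept a privacyMode attribute.
PRIVACY_CAPABLE = {"Say", "Play"}


def with_privacy(get_speech_command):
    """Return a copy of the command with privacyMode set at every level."""
    command = copy.deepcopy(get_speech_command)  # leave the input untouched
    body = command["GetSpeech"]
    body["privacyMode"] = True
    for prompt in body.get("prompts") or []:
        for name, params in prompt.items():
            if name in PRIVACY_CAPABLE:
                # Not inherited from GetSpeech, so set it per nested command.
                params["privacyMode"] = True
    return command
```

The deep copy keeps the original command reusable for calls that do not need privacy.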