GetSpeech

The GetSpeech command lets the caller respond to the application by speaking in a supported language. Unlike DTMF entry, which restricts the caller to the buttons on the phone keypad, speech input accepts flexible spoken input constrained by a grammar. FreeClimb supports grammars written in GRXML compatible with the Microsoft Speech Platform.

GetSpeech is only supported on a single Call leg. It is not supported when two or more Call legs are connected (as in a Conference).

GetSpeech is a terminal command: any actions following it are never executed. After GetSpeech is executed, control of the Call resumes with the PerCL returned in response to the actionUrl request. If the reason the command terminated is hangup (see reason below), any PerCL returned will not be executed.
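Because commands placed after GetSpeech never run, it can help to check a PerCL script before returning it. The following is a minimal sketch, assuming the script is held as a Python list of dictionaries; the helper name `unreachable_after_getspeech` is hypothetical and not part of any FreeClimb SDK.

```python
import json


def unreachable_after_getspeech(percl):
    """Return the commands that follow the first GetSpeech and so never run."""
    for index, command in enumerate(percl):
        if "GetSpeech" in command:
            return percl[index + 1:]
    return []


# Illustrative script: the trailing Say is placed after a terminal command.
script = [
    {"GetSpeech": {
        "actionUrl": "http://www.foo.com/purpose.php",
        "grammarFile": "http://www.foo.com/grammars/purpose.xml",
    }},
    {"Say": {"text": "This prompt is never spoken."}},
]

unreachable = unreachable_after_getspeech(script)
print(json.dumps(unreachable, indent=2))
```

Any follow-up commands belong in the PerCL returned from the actionUrl request instead.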

Nesting Rules

You can nest the following actions within the GetSpeech command:

Say
Play
Pause

The commands are not directly nested but are contained in the prompts attribute as a list of commands.

The nested commands (Say, Play, and Pause) will have barge-in enabled.
The GetSpeech command cannot be nested within any other command.

Note that nested commands do not inherit the value of privacyMode from the parent command. To use privacyMode in any nested command, it must be specified within that command's parameters.
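Since privacyMode is not inherited, a nested prompt that needs it must carry its own flag. The sketch below builds such a command as Python data; the `say` helper, the prompt text, and the `pin.php` URL are illustrative assumptions.

```python
import json


def say(text, private=False):
    """Build a Say command, setting privacyMode only when requested."""
    params = {"text": text}
    if private:
        params["privacyMode"] = True
    return {"Say": params}


command = {
    "GetSpeech": {
        "actionUrl": "http://www.foo.com/pin.php",  # hypothetical endpoint
        "grammarType": "BUILTIN",
        "grammarFile": "DIG4",
        "privacyMode": True,  # applies to GetSpeech itself only
        "prompts": [
            # The nested Say does not inherit the flag, so it is set again here.
            say("The account ending in 1234 was selected. "
                "Please say your four digit PIN.", private=True),
        ],
    }
}

print(json.dumps([command], indent=2))
```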

Example

In this example, the caller is prompted to state the purpose of their call.

[
   {
      "GetSpeech": {
         "actionUrl": "http://www.foo.com/purpose.php",
         "grammarType": "URL",
         "grammarFile": "http://www.foo.com/grammars/purpose.xml",
         "grammarRule": "reason",
         "prompts": [
             {
               "Say": {
                 "text": "Please state the purpose of your call."
               }
             },
             {
               "Say": {
                 "text": "You can say report an accident, check claim, etc."
               }
             }
         ]
      }
   }
]

Command Attributes

The GetSpeech command supports the following attributes that modify its behavior:

Attribute	Description
`actionUrl`	When the caller has finished speaking or the command has timed out, FreeClimb will make a POST request to this URL.
`grammarFile`	The grammar file to use for speech recognition.
`grammarType`	The grammar file type to use for speech recognition.
`grammarRule`	The grammar rule within the specified grammar file to use for speech recognition.
`playBeep`	Indicates whether a beep should be played just before speech recognition is initiated so that the speaker can start to speak.
`prompts`	The JSON array of PerCL commands to nest within the `GetSpeech` command.
`noInputTimeoutMs`	When recognition is started and there is no speech detected for `noInputTimeoutMs` milliseconds, the recognizer will terminate the recognition operation.
`recognitionTimeoutMs`	When playback of prompts ends and there is no match for `recognitionTimeoutMs` milliseconds, the recognizer will terminate the recognition operation.
`confidenceThreshold`	Specifies what confidence level is considered a successful match.
`sensitivityLevel`	The `sensitivityLevel` attribute allows for filtering out background noise, so it is not mistaken for speech.
`speechCompleteTimeoutMs`	Specifies the length of silence required following user speech before the speech recognizer finalizes a result.
`speechIncompleteTimeoutMs`	Specifies the length of silence following user speech after which a recognizer finalizes a result.
`privacyMode`	Indicates if the response will contain sensitive information which should be hidden.

actionUrl

REQUIRED

Type: absolute URL

When the caller has finished speaking or the command has timed out, FreeClimb will make a POST request to this URL. A PerCL response is expected to continue handling the call.

Additional Request Parameters

Request Parameters	Description
`reason`	This field explains how the `GetSpeech` action ended. The value is one of those listed under `reason` Valid Values below.
`recognitionResult`	Semantic content (either a string if speech was recognized or a digit if a digit was input instead of speech) returned from the entry or tag that was recognized within the grammar. This field is populated only if the `reason` field is set to `recognition` or `digit`.
`confidence`	The level of confidence in the obtained result. This is a value in the range 0 to 100, with 0 being total lack of confidence and 100 being absolute certainty in the recognition. This field is populated only if the `reason` field is set to `recognition`.

`reason` Valid Values

Value	Description
`error`	The speech recognition engine failed to process the request for any reason (e.g. invalid grammar file).
`hangup`	The caller hung up.
`digit`	The caller input a digit during recognition.
`noInput`	An initial timeout occurred.
`noMatch`	There was audio input but it could not be matched with the grammar.
`recognition`	The audio input matched the specified grammar to some degree of confidence.
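The values above suggest a simple dispatch in the actionUrl handler. The following sketch assumes the request body has already been parsed into a Python dictionary (the wire format is not described on this page); the function name, the spoken replies, and the reprompt policy are illustrative choices.

```python
# The GetSpeech command from the earlier example, reused for a second attempt.
REPROMPT = {
    "GetSpeech": {
        "actionUrl": "http://www.foo.com/purpose.php",
        "grammarType": "URL",
        "grammarFile": "http://www.foo.com/grammars/purpose.xml",
        "grammarRule": "reason",
        "prompts": [{"Say": {"text": "Please state the purpose of your call."}}],
    }
}


def handle_getspeech_result(params):
    """Map the `reason` request parameter to the next PerCL script."""
    reason = params.get("reason")
    if reason == "hangup":
        # PerCL returned after a hangup is not executed, so send nothing.
        return []
    if reason in ("recognition", "digit"):
        # recognitionResult is populated only for these two reasons.
        result = params.get("recognitionResult")
        return [{"Say": {"text": f"You selected {result}."}}]
    if reason in ("noInput", "noMatch"):
        # Apologize, then ask again; GetSpeech stays last because it is terminal.
        return [{"Say": {"text": "Sorry, I did not catch that."}}, REPROMPT]
    # `error` or any unexpected value.
    return [{"Say": {"text": "We are unable to process your request right now."}}]


print(handle_getspeech_result({"reason": "recognition",
                               "recognitionResult": "claim",
                               "confidence": 87}))
```

A production handler would likely cap the number of reprompts so a silent line does not loop indefinitely.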

grammarFile

REQUIRED

Type: absolute URL or built-in grammar file name

The grammar file to use for speech recognition. If grammarType is set to URL, this attribute is specified as a download URL. Grammar files of type URL must specify the speech language to use; see the Supported Languages table below.

If grammarType is set to BUILTIN, this attribute is set to the name of one of the platform built-in grammar files; see the Built-In Grammar Files table below. Platform built-in grammar files are only available in English (United States).

Supported Languages

Value	Description
en-US	English (United States)
ca-ES	Catalan (Spain)
da-DK	Danish (Denmark)
de-DE	German (Germany)
en-AU	English (Australia)
en-CA	English (Canada)
en-GB	English (United Kingdom)
en-IN	English (India)
es-ES	Spanish (Spain)
es-MX	Spanish (Mexico)
fi-FI	Finnish (Finland)
fr-CA	French (Canada)
fr-FR	French (France)
it-IT	Italian (Italy)
ja-JP	Japanese (Japan)
ko-KR	Korean (Korea)
nb-NO	Norwegian (Norway)
nl-NL	Dutch (Netherlands)
pl-PL	Polish (Poland)
pt-BR	Portuguese (Brazil)
pt-PT	Portuguese (Portugal)
ru-RU	Russian (Russia)
sv-SE	Swedish (Sweden)
zh-CN	Chinese (China)
zh-HK	Chinese (Hong Kong)
zh-TW	Chinese (Taiwan)
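As an illustration of a grammar file that declares its language, the sketch below generates a small grammar with Python's standard library. It assumes the platform accepts a W3C SRGS 1.0 XML grammar with `semantics/1.0` tags, which is the usual GRXML form for the Microsoft Speech Platform but is not spelled out on this page. The function name, rule name, phrases, and semantic values are hypothetical.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Language codes copied from the Supported Languages table above.
SUPPORTED_LANGUAGES = {
    "en-US", "ca-ES", "da-DK", "de-DE", "en-AU", "en-CA", "en-GB", "en-IN",
    "es-ES", "es-MX", "fi-FI", "fr-CA", "fr-FR", "it-IT", "ja-JP", "ko-KR",
    "nb-NO", "nl-NL", "pl-PL", "pt-BR", "pt-PT", "ru-RU", "sv-SE", "zh-CN",
    "zh-HK", "zh-TW",
}


def build_grammar(language, rule_id, phrases):
    """Return SRGS XML mapping each spoken phrase to a semantic value."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    ET.register_namespace("", SRGS_NS)  # serialize SRGS as the default namespace
    grammar = ET.Element(f"{{{SRGS_NS}}}grammar", {
        "version": "1.0",
        f"{{{XML_NS}}}lang": language,  # written out as xml:lang
        "root": rule_id,
        "tag-format": "semantics/1.0",
    })
    rule = ET.SubElement(grammar, f"{{{SRGS_NS}}}rule",
                         {"id": rule_id, "scope": "public"})
    one_of = ET.SubElement(rule, f"{{{SRGS_NS}}}one-of")
    for phrase, value in phrases.items():
        item = ET.SubElement(one_of, f"{{{SRGS_NS}}}item")
        item.text = phrase
        tag = ET.SubElement(item, f"{{{SRGS_NS}}}tag")
        tag.text = f'out = "{value}";'  # semantic value for this phrase
    return ET.tostring(grammar, encoding="unicode")


grammar_xml = build_grammar("en-US", "reason", {
    "report an accident": "accident",
    "check claim": "claim",
})
print(grammar_xml)
```

The rule id used here would be the value passed as grammarRule, and the generated file would be hosted at the URL given in grammarFile.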

Built-In Grammar Files

Filename	Description
ALPHNUM6	Get six alpha-numeric values from the caller
ANY_DIG	Get 1 to 50 digits from the caller
DIG1	Get one digit from the caller
DIG2	Get two digits from the caller
DIG3	Get three digits from the caller
DIG4	Get four digits from the caller
DIG5	Get five digits from the caller
DIG6	Get six digits from the caller
DIG7	Get seven digits from the caller
DIG8	Get eight digits from the caller
DIG9	Get nine digits from the caller
DIG10	Get ten digits from the caller
DIG11	Get eleven digits from the caller
UP_TO_20_DIGIT_SEQUENCE	Get 1 to 20 digits from the caller
VERSAY_YESNO	Get a Yes or No indication from the caller. Different variations of Yes and No are accepted.
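For fixed-length digit entry, the built-in file name can be derived from the digit count. This is a minimal sketch; the helper name `digits_command` and the `digits.php` URL are hypothetical, while the `DIG1` to `DIG11` names come from the table above.

```python
import json


def digits_command(action_url, count):
    """Build a GetSpeech command that collects exactly `count` digits."""
    if not 1 <= count <= 11:
        # Only DIG1 through DIG11 are listed as built-in grammar files.
        raise ValueError("count must be between 1 and 11")
    return {
        "GetSpeech": {
            "actionUrl": action_url,
            "grammarType": "BUILTIN",
            "grammarFile": f"DIG{count}",
        }
    }


zip_code = digits_command("http://www.foo.com/digits.php", 5)
print(json.dumps([zip_code], indent=2))
```

grammarRule is left out because it is ignored when grammarType is BUILTIN.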

grammarType

OPTIONAL

Type: string (URL, BUILTIN)
Default: URL

The grammar file type to use for speech recognition. A value of URL indicates the grammarFile attribute specifies a URL that points to the grammar file. A value of BUILTIN indicates the grammarFile attribute specifies the name of one of the platform built-in grammar files.

grammarRule

OPTIONAL

Type: string
Default: null

The grammar rule within the specified grammar file to use for speech recognition. This attribute is optional if grammarType is URL and ignored if grammarType is BUILTIN.

playBeep

OPTIONAL

Type: boolean
Default: false

Indicates whether a beep should be played just before speech recognition is initiated so that the speaker can start to speak.

prompts

OPTIONAL

Type: PerCL command array
Default: null

The JSON array of PerCL commands to nest within the GetSpeech command. The Say, Play, and Pause commands can be used. The nested actions are executed while FreeClimb is waiting for input from the caller. This allows you to play menu options to the caller and prompt for the expected input. These commands stop executing when the caller begins to speak.

noInputTimeoutMs

OPTIONAL

Type: integer > 0
Default: 7000 (ms)

When recognition is started and there is no speech detected for noInputTimeoutMs milliseconds, the recognizer will terminate the recognition operation.

recognitionTimeoutMs

OPTIONAL

Type: integer > 0
Default: 10000 (ms)

When playback of prompts ends and there is no match for recognitionTimeoutMs milliseconds, the recognizer will terminate the recognition operation.

confidenceThreshold

OPTIONAL

Type: float range 0.0-1.0
Default: 0.4

When a recognition resource recognizes a spoken phrase, it associates a confidence level with that match. Parameter confidenceThreshold specifies what confidence level is considered a successful match.

sensitivityLevel

OPTIONAL

Type: float range 0.0-1.0
Default: 0.5

The speech recognizer supports a variable level of sound sensitivity. The sensitivityLevel attribute allows for filtering out background noise so that it is not mistaken for speech. A higher value means higher sensitivity to background noise, which increases the chance of background noise being interpreted as speech. A lower value means lower sensitivity to background noise, which increases the chance of quiet speech being interpreted as background noise.

speechCompleteTimeoutMs

OPTIONAL

Type: integer > 0
Default: 1000 (ms)

Parameter speechCompleteTimeoutMs specifies the length of silence required following user speech before the speech recognizer finalizes a result. This timeout applies when the recognizer currently has a complete match against an active grammar. Reasonable speech complete timeout values are typically in the range of 0.3 seconds to 1.0 seconds.

speechIncompleteTimeoutMs

OPTIONAL

Type: integer > 0
Default: 2000 (ms)

Parameter speechIncompleteTimeoutMs specifies the length of silence following user speech after which a recognizer finalizes a result. This timeout applies when the speech prior to the silence is an incomplete match of all active grammars. The speechIncompleteTimeoutMs value is usually longer than speechCompleteTimeoutMs to allow users to pause mid-utterance.
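The type and range constraints on the tuning attributes above can be checked before a command is sent. The sketch below is one way to do that, using the defaults documented on this page; the function name and the wording of the messages are illustrative, and the final check reflects the guidance above rather than a rule the platform is known to enforce.

```python
TIMEOUT_ATTRS = ("noInputTimeoutMs", "recognitionTimeoutMs",
                 "speechCompleteTimeoutMs", "speechIncompleteTimeoutMs")
RATIO_ATTRS = ("confidenceThreshold", "sensitivityLevel")


def _is_positive_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def validate_tuning(attrs):
    """Return a list of problems found in the GetSpeech tuning attributes."""
    problems = []
    for name in TIMEOUT_ATTRS:
        if name in attrs and not _is_positive_int(attrs[name]):
            problems.append(f"{name} must be an integer > 0")
    for name in RATIO_ATTRS:
        if name in attrs:
            value = attrs[name]
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not 0.0 <= value <= 1.0:
                problems.append(f"{name} must be in the range 0.0-1.0")
    # Fall back to the documented defaults (1000 ms and 2000 ms) when unset.
    complete = attrs.get("speechCompleteTimeoutMs", 1000)
    incomplete = attrs.get("speechIncompleteTimeoutMs", 2000)
    if (_is_positive_int(complete) and _is_positive_int(incomplete)
            and incomplete <= complete):
        problems.append("speechIncompleteTimeoutMs is usually longer than "
                        "speechCompleteTimeoutMs")
    return problems


print(validate_tuning({"confidenceThreshold": 1.5,
                       "speechCompleteTimeoutMs": 3000}))
```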

privacyMode

OPTIONAL

Type: boolean
Default: false

Indicates if the response will contain sensitive information which should be hidden. When set to true, the contents of the recognitionResult attribute will be replaced with the string "xxxxx" in the logs. It's important to note that privacyMode is set at the command level, meaning it will not be inherited by any nested commands.
