GetSpeech

The GetSpeech command enables the caller to respond to the application by speaking in a supported language. Unlike DTMF entry, which implicitly restricts the user to the buttons available on the phone keypad, speech input allows flexible spoken responses constrained by a grammar. FreeClimb supports grammars written using GRXML compatible with the Microsoft Speech Platform.

GetSpeech is supported only on a single Call leg. It is not supported when two or more Call legs are connected (for example, within a Conference).

GetSpeech is a terminal command: any actions following it are never executed. After GetSpeech is executed, control of the Call picks up using the PerCL received in response to the actionUrl request. If the command terminated because of a hangup (see reason below), any PerCL returned will not be executed.

Nesting Rules

You can nest the following commands within the GetSpeech command:

  • Say
  • Play
  • Pause

The commands are not directly nested but are contained in the prompts attribute as a list of commands.

The nested commands (Say, Play, and Pause) have barge-in enabled: they stop executing when the caller begins to speak.
The GetSpeech command cannot be nested within any other command.
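
The nesting rules above can be checked before a command is sent. The following is a minimal sketch, not part of the FreeClimb API: the helper name validate_prompts is hypothetical, and it assumes each PerCL command is a JSON object with a single key naming the command, as in the example in this section.

```python
# Commands that the nesting rules allow inside the prompts attribute.
ALLOWED_PROMPTS = {"Say", "Play", "Pause"}


def validate_prompts(prompts):
    """Return a list of problems found in a GetSpeech prompts array."""
    problems = []
    for index, command in enumerate(prompts):
        # Assumption: a PerCL command is an object with exactly one key,
        # the command name, mapped to that command's attributes.
        if not isinstance(command, dict) or len(command) != 1:
            problems.append(f"prompts[{index}]: expected an object with one command name")
            continue
        name = next(iter(command))
        if name not in ALLOWED_PROMPTS:
            problems.append(f"prompts[{index}]: {name} cannot be nested in GetSpeech")
    return problems
```

An empty list means the prompts array respects the nesting rules; any other result lists the offending entries by position.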

Example

In this example, the caller is prompted to state the purpose of the call.

[
   {
      "GetSpeech": {
         "actionUrl": "http://www.foo.com/purpose.php",
         "grammarType": "URL",
         "grammarFile": "http://www.foo.com/grammars/purpose.xml",
         "grammarRule": "reason",
         "prompts": [
             {
               "Say": {
                 "text": "Please state the purpose of your call."
               }
             },
             {
               "Say": {
                 "text": "You can say report an accident, check claim, etc."
               }
             }
         ]
      }
   }
]

Command Attributes

The GetSpeech command supports the following attributes that modify its behavior:

| Attribute | Description |
| --- | --- |
| actionUrl | When the caller has finished speaking or the command has timed out, FreeClimb will make a POST request to this URL. |
| grammarFile | The grammar file to use for speech recognition. |
| grammarType | The grammar file type to use for speech recognition. |
| grammarRule | The grammar rule within the specified grammar file to use for speech recognition. |
| playBeep | Indicates whether a beep is played just before speech recognition starts, to signal that the caller can begin speaking. |
| prompts | The JSON array of PerCL commands to nest within the GetSpeech command. |
| noInputTimeoutMs | When recognition is started and there is no speech detected for noInputTimeoutMs milliseconds, the recognizer will terminate the recognition operation. |
| recognitionTimeoutMs | When playback of prompts ends and there is no match for recognitionTimeoutMs milliseconds, the recognizer will terminate the recognition operation. |
| confidenceThreshold | Specifies what confidence level is considered a successful match. |
| sensitivityLevel | Allows background noise to be filtered out so that it is not mistaken for speech. |
| speechCompleteTimeoutMs | The length of silence required after user speech before the recognizer finalizes a result, when the speech is a complete match of an active grammar. |
| speechIncompleteTimeoutMs | The length of silence after user speech after which the recognizer finalizes a result, when the speech is an incomplete match of all active grammars. |
| privacyMode | Indicates whether the response will contain sensitive information that should be hidden. |


actionUrl

REQUIRED

Type: absolute URL
Default: null

When the caller has finished speaking or the command has timed out, FreeClimb will make a POST request to this URL. A PerCL response is expected to continue handling the call.

Additional Request Parameters

| Request Parameter | Description |
| --- | --- |
| `reason` | Explains how the GetSpeech action ended. The value is one of those listed under reason Valid Values. |
| `recognitionResult` | Semantic content returned from the entry or tag that was recognized within the grammar: a string if speech was recognized, or a digit if a digit was input instead of speech. This field is populated only if the reason field is set to recognition or digit. |
| `confidence` | The level of confidence in the obtained result, in the range 0 to 100, where 0 is a total lack of confidence and 100 is absolute certainty in the recognition. This field is populated only if the reason field is set to recognition. |

reason Valid Values

| Value | Description |
| --- | --- |
| `error` | The speech recognition engine failed to process the request (for example, because of an invalid grammar file). |
| `hangup` | The caller hung up. |
| `digit` | The caller input a digit during recognition. |
| `noInput` | An initial timeout occurred: no speech was detected (see noInputTimeoutMs). |
| `noMatch` | There was audio input, but it could not be matched with the grammar. |
| `recognition` | The audio input matched the specified grammar to some degree of confidence. |
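
The reason values above can drive the PerCL that the actionUrl endpoint returns. The sketch below is one possible mapping, not a prescribed one. It assumes the request body has already been parsed into a dict and that confidence arrives as a number. The function names and the MIN_CONFIDENCE floor are hypothetical, and the URLs are reused from the example in this section.

```python
# Hypothetical application-level floor on the 0-100 confidence value.
MIN_CONFIDENCE = 60


def reprompt(text):
    """Build a PerCL response that asks again with the same grammar."""
    return [{
        "GetSpeech": {
            "actionUrl": "http://www.foo.com/purpose.php",
            "grammarType": "URL",
            "grammarFile": "http://www.foo.com/grammars/purpose.xml",
            "grammarRule": "reason",
            "prompts": [{"Say": {"text": text}}],
        }
    }]


def handle_get_speech_result(payload):
    """Map the request parameters of an actionUrl request to a PerCL response."""
    reason = payload.get("reason")
    if reason == "hangup":
        # PerCL returned after a hangup is not executed, so return nothing.
        return []
    if reason == "recognition":
        # confidence is populated only when reason is "recognition".
        if payload.get("confidence", 0) < MIN_CONFIDENCE:
            return reprompt("Sorry, I am not sure I understood. Please repeat that.")
        return [{"Say": {"text": f"You said {payload['recognitionResult']}."}}]
    if reason == "digit":
        return [{"Say": {"text": f"You pressed {payload['recognitionResult']}."}}]
    if reason in ("noInput", "noMatch"):
        return reprompt("You can say report an accident or check claim.")
    # "error" or an unexpected value: answer with a spoken apology.
    return [{"Say": {"text": "Sorry, something went wrong."}}]
```

Returning a new GetSpeech for noInput and noMatch keeps the caller in the same dialog step. A production handler would likely cap the number of retries.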


grammarFile

REQUIRED

Type: absolute URL or built-in grammar file name
Default: null

The grammar file to use for speech recognition. If grammarType is set to URL, this attribute is specified as a download URL. FreeClimb will respect Cache-Control headers for this file. Use them to limit repeated requests for unchanged grammars. If no Cache-Control header is provided, the file will be cached for 5 seconds by default. Grammar files of type URL must specify the speech language to use; see Supported Languages table below.
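
As an illustration of the caching behavior described above, the sketch below serves a small grammar with a Cache-Control header using Python's standard library. It rests on several assumptions: the grammar content is hypothetical, the language is declared with the standard SRGS xml:lang attribute, and the one-hour max-age, the application/srgs+xml content type, and the helper names are illustrative choices rather than FreeClimb requirements.

```python
import xml.etree.ElementTree as ET
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical grammar: one public rule named "reason" with two phrases.
GRAMMAR_XML = """<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0" xml:lang="en-US" root="reason"
         xmlns="http://www.w3.org/2001/06/grammar"
         tag-format="semantics/1.0">
  <rule id="reason" scope="public">
    <one-of>
      <item>report an accident <tag>out = "accident";</tag></item>
      <item>check claim <tag>out = "claim";</tag></item>
    </one-of>
  </rule>
</grammar>
"""

CACHE_SECONDS = 3600  # hypothetical: allow the grammar to be reused for an hour


def grammar_response_headers(body):
    """Headers sent with the grammar file, including the caching policy."""
    return [
        ("Content-Type", "application/srgs+xml"),
        ("Cache-Control", f"max-age={CACHE_SECONDS}"),
        ("Content-Length", str(len(body))),
    ]


def grammar_language(xml_text):
    """Return the xml:lang value declared on the grammar's root element."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return root.get("{http://www.w3.org/XML/1998/namespace}lang")


class GrammarHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Every GET returns the same grammar with the caching headers.
        body = GRAMMAR_XML.encode("utf-8")
        self.send_response(200)
        for name, value in grammar_response_headers(body):
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Serve the grammar until interrupted."""
    HTTPServer(("", port), GrammarHandler).serve_forever()
```

Calling serve() starts the server. grammar_language(GRAMMAR_XML) can be used to confirm that the declared language is one of the supported values before publishing the file.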

If grammarType is set to BUILTIN, this attribute is set to the name of one of the platform built-in grammar files; see Built-In table below. Platform built-in grammar files are only available in English (United States).

Supported Languages

| Value | Description |
| --- | --- |
| en-US | English (United States) |
| ca-ES | Catalan (Spain) |
| da-DK | Danish (Denmark) |
| de-DE | German (Germany) |
| en-AU | English (Australia) |
| en-CA | English (Canada) |
| en-GB | English (United Kingdom) |
| en-IN | English (India) |
| es-ES | Spanish (Spain) |
| es-MX | Spanish (Mexico) |
| fi-FI | Finnish (Finland) |
| fr-CA | French (Canada) |
| fr-FR | French (France) |
| it-IT | Italian (Italy) |
| ja-JP | Japanese (Japan) |
| ko-KR | Korean (Korea) |
| nb-NO | Norwegian (Norway) |
| nl-NL | Dutch (Netherlands) |
| pl-PL | Polish (Poland) |
| pt-BR | Portuguese (Brazil) |
| pt-PT | Portuguese (Portugal) |
| ru-RU | Russian (Russia) |
| sv-SE | Swedish (Sweden) |
| zh-CN | Chinese (China) |
| zh-HK | Chinese (Hong Kong) |
| zh-TW | Chinese (Taiwan) |

Built-In Grammar Files

| Filename | Description |
| --- | --- |
| ALPHNUM6 | Get six alphanumeric characters from the caller |
| ANY_DIG | Get 1 to 50 digits from the caller |
| DIG1 | Get one digit from the caller |
| DIG2 | Get two digits from the caller |
| DIG3 | Get three digits from the caller |
| DIG4 | Get four digits from the caller |
| DIG5 | Get five digits from the caller |
| DIG6 | Get six digits from the caller |
| DIG7 | Get seven digits from the caller |
| DIG8 | Get eight digits from the caller |
| DIG9 | Get nine digits from the caller |
| DIG10 | Get ten digits from the caller |
| DIG11 | Get eleven digits from the caller |
| UP_TO_20_DIGIT_SEQUENCE | Get 1 to 20 digits from the caller |
| VERSAY_YESNO | Get a Yes or No indication from the caller. Different variations of Yes and No are accepted. |
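
A command that uses one of the built-in grammar files listed above might be assembled as follows. This is a sketch only: the helper name builtin_get_speech and the actionUrl in the usage note are hypothetical, and the set of names simply mirrors the table.

```python
# Built-in grammar names from the table above (DIG1 through DIG11 are generated).
BUILTIN_GRAMMARS = (
    {"ALPHNUM6", "ANY_DIG", "UP_TO_20_DIGIT_SEQUENCE", "VERSAY_YESNO"}
    | {f"DIG{n}" for n in range(1, 12)}
)


def builtin_get_speech(action_url, grammar_name, prompt_text):
    """Build a GetSpeech command that uses a platform built-in grammar."""
    if grammar_name not in BUILTIN_GRAMMARS:
        raise ValueError(f"unknown built-in grammar: {grammar_name}")
    return {
        "GetSpeech": {
            "actionUrl": action_url,
            "grammarType": "BUILTIN",
            "grammarFile": grammar_name,
            # grammarRule is omitted: it is ignored for BUILTIN grammars.
            "prompts": [{"Say": {"text": prompt_text}}],
        }
    }
```

For example, builtin_get_speech("http://www.foo.com/confirm.php", "VERSAY_YESNO", "Is that correct?") produces a yes/no confirmation step. Serializing a list containing the result with json.dumps gives the PerCL response body.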


grammarType

OPTIONAL

Type: string (URL, BUILTIN)
Default: URL

The grammar file type to use for speech recognition. A value of URL indicates the grammarFile attribute specifies a URL that points to the grammar file. A value of BUILTIN indicates the grammarFile attribute specifies the name of one of the platform built-in grammar files.


grammarRule

OPTIONAL

Type: string
Default: null

The grammar rule within the specified grammar file to use for speech recognition. This attribute is optional if grammarType is URL and ignored if grammarType is BUILTIN.


playBeep

OPTIONAL

Type: boolean
Default: false

Indicates whether a beep is played just before speech recognition starts, to signal that the caller can begin speaking.


prompts

OPTIONAL

Type: PerCL command array
Default: null

The JSON array of PerCL commands to nest within the GetSpeech command. The Say, Play, and Pause commands can be used. The nested commands are executed while FreeClimb is waiting for input from the caller. This lets you play menu options to the caller and prompt for the expected input. These commands stop executing when the caller begins to speak.


noInputTimeoutMs

OPTIONAL

Type: integer > 0
Default: 7000 (ms)

When recognition is started and there is no speech detected for noInputTimeoutMs milliseconds, the recognizer will terminate the recognition operation.


recognitionTimeoutMs

OPTIONAL

Type: integer > 0
Default: 10000 (ms)

When playback of prompts ends and there is no match for recognitionTimeoutMs milliseconds, the recognizer will terminate the recognition operation.


confidenceThreshold

OPTIONAL

Type: float range 0.0-1.0
Default: 0.4

When a recognition resource recognizes a spoken phrase, it associates a confidence level with that match. The confidenceThreshold attribute specifies the confidence level at which a match is considered successful.


sensitivityLevel

OPTIONAL

Type: float range 0.0-1.0
Default: 0.5

The speech recognizer supports a variable level of sound sensitivity. The sensitivityLevel attribute allows background noise to be filtered out so that it is not mistaken for speech. A higher value means higher sensitivity to background noise, which increases the chance that background noise is interpreted as speech. A lower value means lower sensitivity to background noise, which increases the chance that quiet speech is interpreted as background noise.


speechCompleteTimeoutMs

OPTIONAL

Type: integer > 0
Default: 1000 (ms)

The speechCompleteTimeoutMs attribute specifies the length of silence required following user speech before the speech recognizer finalizes a result. This timeout applies when the recognizer currently has a complete match against an active grammar. Reasonable values are typically in the range of 0.3 to 1.0 seconds (300 to 1000 ms).


speechIncompleteTimeoutMs

OPTIONAL

Type: integer > 0
Default: 2000 (ms)

The speechIncompleteTimeoutMs attribute specifies the length of silence following user speech after which the recognizer finalizes a result. This timeout applies when the speech prior to the silence is an incomplete match of all active grammars. speechIncompleteTimeoutMs is usually set longer than speechCompleteTimeoutMs to allow users to pause mid-utterance.
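
The documented types and ranges of the tuning attributes can be checked before a command is sent. The following is a rough sketch, with a hypothetical helper name (check_tuning). It treats the "usually longer" guidance for speechIncompleteTimeoutMs as a note rather than an error, and it uses the documented defaults of 1000 ms and 2000 ms when an attribute is absent.

```python
# Documented constraints: timeouts are integers > 0; levels are floats in 0.0-1.0.
TIMEOUT_ATTRIBUTES = (
    "noInputTimeoutMs",
    "recognitionTimeoutMs",
    "speechCompleteTimeoutMs",
    "speechIncompleteTimeoutMs",
)
LEVEL_ATTRIBUTES = ("confidenceThreshold", "sensitivityLevel")


def check_tuning(options):
    """Return a list of problems in the tuning attributes of a GetSpeech command."""
    problems = []
    for name in TIMEOUT_ATTRIBUTES:
        if name not in options:
            continue
        value = options[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            problems.append(f"{name} must be an integer greater than 0")
    for name in LEVEL_ATTRIBUTES:
        if name not in options:
            continue
        value = options[name]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be a number between 0.0 and 1.0")
    if not problems:
        # Soft check only: compare against the documented defaults when unset.
        complete = options.get("speechCompleteTimeoutMs", 1000)
        incomplete = options.get("speechIncompleteTimeoutMs", 2000)
        if incomplete < complete:
            problems.append(
                "note: speechIncompleteTimeoutMs is usually longer than speechCompleteTimeoutMs"
            )
    return problems
```

An empty list means every supplied attribute is within its documented range.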


privacyMode

OPTIONAL

Type: boolean
Default: false

Indicates whether the response will contain sensitive information that should be hidden. When set to true, the contents of the recognitionResult field are replaced with the string "xxxxx" in the logs. privacyMode is set at the command level and is not inherited by nested commands.
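
The masking described above applies to FreeClimb's logs. An application that logs incoming actionUrl requests may want similar masking in its own logs; whether it needs this is a design decision outside the scope of this command. The sketch below uses a hypothetical helper name (redact_for_log) and assumes the request body has been parsed into a dict.

```python
def redact_for_log(payload, privacy_mode):
    """Return a copy of an actionUrl request payload that is safe to log."""
    safe = dict(payload)  # shallow copy, so the caller's dict is left unchanged
    if privacy_mode and "recognitionResult" in safe:
        # Mirror the documented placeholder string.
        safe["recognitionResult"] = "xxxxx"
    return safe
```

The original payload is not modified, so the handler can still act on the actual value while logging only the redacted copy.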