[Freeswitch-users] Stopping TTS when play_and_detect_speech gets MRCP START-OF-INPUT

mayamatakeshi mayamatakeshi at gmail.com
Thu Mar 18 06:59:59 UTC 2021

On Thu, Mar 18, 2021 at 12:23 PM mayamatakeshi <mayamatakeshi at gmail.com> wrote:

> On Wed, Mar 17, 2021 at 6:52 PM mayamatakeshi <mayamatakeshi at gmail.com>
> wrote:
>> Hi,
>> I'm trying play_and_detect_speech with an ESL app but START-OF-INPUT
>> doesn't interrupt the TTS being played.
>> Is this expected?
>> (
>> https://freeswitch.org/confluence/display/FREESWITCH/mod_dptools%3A+play_and_detect_speech
>> is not clear about it)
>> Or maybe I need to set some channel variable for this to work.
> Hi,
> My app is a fork of https://github.com/plivo/plivoframework
> I debugged FS code and found the cause of the problem:
> I verified that the DETECTED_SPEECH events (including the one
> with Speech-Type begin-speaking) were not being fired.
> This was because my app was not setting this variable:
>   https://freeswitch.org/confluence/display/FREESWITCH/fire_asr_events
> Then I set fire_asr_events=true
> but still the prompt didn't get interrupted by speech.
> Then I checked how FS code was fetching (dequeuing) events and I realized
> it checks for divert_events.
> Then I changed my app to send
>   divert_events off
> in the ESL socket
> and after that it worked and FS stopped the prompt upon speech start.
> However, it is not stopping the prompt when a DTMF digit is received (it
> was already this way before I did the changes).
> So things are better but still there is something amiss.
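
The two changes described in the quoted message can be sketched as the raw text an ESL client would write to the socket. This is only an illustration: the helper names are hypothetical, and using "sendmsg" with the "set" application to assign fire_asr_events is an assumption (the message does not say how the variable was set). Only "divert_events off" is taken from the message itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format an ESL "sendmsg" that executes the dialplan
 * application "set" on the given call, e.g. fire_asr_events=true.
 * Returns the number of bytes written, or -1 if buf is too small. */
static int esl_format_set_variable(char *buf, size_t cap, const char *uuid,
                                   const char *name, const char *value)
{
    int n = snprintf(buf, cap,
                     "sendmsg %s\n"
                     "call-command: execute\n"
                     "execute-app-name: set\n"
                     "execute-app-arg: %s=%s\n\n",
                     uuid, name, value);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Hypothetical helper: format the "divert_events on|off" command.
 * ESL commands are terminated by an empty line. */
static int esl_format_divert_events(char *buf, size_t cap, int on)
{
    int n = snprintf(buf, cap, "divert_events %s\n\n", on ? "on" : "off");
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

A client would write the two resulting buffers to its ESL connection before executing play_and_detect_speech.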

OK, I found the reason:
I was using 'playback_terminators=any',
which, according to the FS code, results in:
  terminators = "1234567890*#"
But I was sending DTMF 'a', so it was not causing the prompt to be
terminated.
Then I changed my test script to send '1' instead, and this caused the
prompt to terminate.
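
To make the mismatch concrete, here is a small standalone sketch of the check. It is not FreeSWITCH code: both function names are hypothetical, and the expansion of "any" simply follows the description above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: expand playback_terminators=any to the digit set
 * described above; any other value is used as-is. */
static const char *expand_terminators(const char *value)
{
    if (value != NULL && strcmp(value, "any") == 0) {
        return "1234567890*#";
    }
    return value;
}

/* Hypothetical helper: returns 1 if the digit is in the terminator set.
 * The digit != '\0' guard is needed because strchr() also matches the
 * string's terminating NUL. */
static int is_terminator(const char *terminators, char digit)
{
    return terminators != NULL && digit != '\0' &&
           strchr(terminators, digit) != NULL;
}
```

With this model, is_terminator(expand_terminators("any"), 'a') is 0 while the same call with '1' is 1, which matches what I saw in my tests.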
However, this also terminated the speech detection operation, as FS sent
STOP to the MRCP server.
The reason is that, unlike begin-speaking, a terminator marks the
operation as done (switch_ivr_async.c):

                      } else if (!strcasecmp(speech_type, "begin-speaking")) {
                          switch_log_printf(..., "(%s) START OF SPEECH\n", switch_channel_get_name(channel));
                          return SWITCH_STATUS_BREAK;

                      [...]

                      if (terminators && strchr(terminators, dtmf->digit)) {
                          switch_log_printf(..., "(%s) ACCEPT TERMINATOR %c\n", switch_channel_get_name(channel), dtmf->digit);
                          state->result = switch_core_session_sprintf(session, "DIGIT: %c", dtmf->digit);
                          state->done = PLAY_AND_DETECT_DONE;
                          return SWITCH_STATUS_BREAK;
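
The difference between the two branches can be modelled in a few lines. This is a simplified sketch, not the FreeSWITCH implementation: the types and function names are hypothetical stand-ins, and strcmp is used in place of strcasecmp to keep it to standard C.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the FreeSWITCH state and status values. */
enum pad_done { PAD_NOT_DONE = 0, PAD_DONE = 1 };
enum cb_status { CB_SUCCESS, CB_BREAK };

struct pad_state {
    enum pad_done done;
    char result[16];
};

/* begin-speaking breaks out of the playback but leaves state->done
 * untouched, so recognition keeps running. */
static enum cb_status on_speech_event(struct pad_state *state,
                                      const char *speech_type)
{
    (void)state;
    if (strcmp(speech_type, "begin-speaking") == 0) {
        return CB_BREAK;
    }
    return CB_SUCCESS;
}

/* A terminator digit also breaks out of the playback, but additionally
 * records the digit and marks the whole operation as done. */
static enum cb_status on_dtmf(struct pad_state *state,
                              const char *terminators, char digit)
{
    if (terminators != NULL && digit != '\0' &&
        strchr(terminators, digit) != NULL) {
        snprintf(state->result, sizeof state->result, "DIGIT: %c", digit);
        state->done = PAD_DONE;
        return CB_BREAK;
    }
    return CB_SUCCESS;
}
```

Both paths return the "break" status, so the prompt stops in either case; only the DTMF path sets done, which is why the recognition is stopped as well.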

This might be OK for some people, but in my case there is another issue:
the termination of the speech detection is not reported to my app via ESL.

My use case is to allow the user to either speak a number sequence or
dial it. I am hoping to solve this by leaving the DTMF collection to the
MRCP server, but the MRCP server I'm currently using (UniMRCP) doesn't
report START-OF-INPUT for DTMF (I only get RECOGNITION-COMPLETE after
all digits have been entered).

So I will switch to using detect_speech instead of play_and_detect_speech.