[Freeswitch-users] How to implement TTS barge-in using FS ESL

Thu Nov 17 16:28:19 MSK 2011

Hi Christopher,

   After some more tests, I found the following:

   1. When testing from the dialplan, the log output is the literal string
"CRIT ${speech_detect_result}" rather than the recognition results.

   2. For the non-barge-in case, I first send the "speak" command and then send
play_and_detect_speech with the parameter:
       detect:unimrcp:nuance5-mrcp1-1 {start-input-timers=false,no-input-timeout=25000,recognition-timeout=25000}dudeYNNC_Nuance

       from which I removed the "say:" part.

       But very soon I receive the CHANNEL_EXECUTE_COMPLETE event for
play_and_detect_speech, before the text has finished playing,
       and speech_detect_result is null (actually, the event headers do
not contain this variable).

       I tried keeping the "say:" part with empty text, like "say:unimrcp:en-GB: ",
but it doesn't work; see issue 3 below.

   3. It seems that I cannot send a second "speak" command before the
first one finishes. In my case I run a one-port TTS server on one machine.

      If I send the command twice, FS gives me Synthesizer
Error/Invalid TTS Module.
      I thought the TTS request would be queued rather than
immediately looking for the TTS resource.

      If I send the second command to another TTS machine, no error
occurs, but I can only hear one utterance being spoken;
      it looks like one utterance was dropped somehow.
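
   For issue 3, a minimal sketch of a client-side workaround, assuming FS does
not queue "speak" requests itself: hold each request in the ESL app and send the
next one only after CHANNEL_EXECUTE_COMPLETE arrives for the previous "speak".
The SpeakQueue class and the send callable are hypothetical names for
illustration, and reading the application name from the "Application" event
header is an assumption about the event layout.

```python
from collections import deque


class SpeakQueue:
    """Send one "speak" at a time; hold the rest until the previous one completes."""

    def __init__(self, send):
        # send is a hypothetical callable(app_name, app_arg) that forwards an
        # "execute" request to FS, e.g. a thin wrapper around an ESL connection.
        self._send = send
        self._pending = deque()
        self._busy = False

    def speak(self, text):
        """Queue text; it is sent immediately only if nothing is in progress."""
        self._pending.append(text)
        self._pump()

    def on_event(self, headers):
        """Feed the headers (a dict) of every received event into this method."""
        # Assumption: the completed application's name is in the "Application" header.
        if (headers.get("Event-Name") == "CHANNEL_EXECUTE_COMPLETE"
                and headers.get("Application") == "speak"):
            self._busy = False
            self._pump()

    def _pump(self):
        # Send the next queued utterance if no "speak" is currently running.
        if not self._busy and self._pending:
            self._busy = True
            self._send("speak", self._pending.popleft())
```

   With this, calling speak() twice in a row sends only the first request; the
second goes out when the completion event for the first is passed to on_event().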

   Any ideas?
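
   For reference on issue 2, a rough sketch of how the play_and_detect_speech
argument can be assembled and how the result can be read from the completion
event. Both function names are hypothetical. The variable name
variable_detect_speech_result is the one mentioned in the quoted message below,
and the "Application" header name is an assumption.

```python
def build_detect_args(engine, grammar, params, prompt=None):
    """Assemble the argument string for play_and_detect_speech.

    prompt is the optional "say:..." or file part; it is omitted when None.
    """
    options = ",".join(f"{key}={value}" for key, value in params.items())
    detect = f"detect:{engine} {{{options}}}{grammar}"
    return f"{prompt} {detect}" if prompt else detect


def detect_result(headers):
    """Return the recognition result from an event, or None if it has none."""
    if headers.get("Event-Name") != "CHANNEL_EXECUTE_COMPLETE":
        return None
    # Assumption: the completed application's name is in the "Application" header.
    if headers.get("Application") != "play_and_detect_speech":
        return None
    # None here corresponds to the case where the header is missing entirely.
    return headers.get("variable_detect_speech_result")


args = build_detect_args(
    "unimrcp:nuance5-mrcp1-1",
    "dudeYNNC_Nuance",
    {
        "start-input-timers": "false",
        "no-input-timeout": 25000,
        "recognition-timeout": 25000,
    },
)
```

   Here args is the same string as the parameter shown in issue 2, and
detect_result() returning None on the completion event matches what I see.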

  Thanks!

Xing

On 16/11/11 16:51, xl127 wrote:
> Hi Christopher,
>
> The questions are clear to me now. Many thanks for your explanations!
>
> Best regards,
>
> Xing
>
>
> On 16/11/11 15:52, Christopher Rienzo wrote:
>>
>> Responses inline
>>
>>
>>     Now it works in my ESL app, though I am only able to do one
>>     dialogue (I need to add event catching for further dialogues).
>>
>>     I have a couple of questions here:
>>
>>       1. On the first try, my Nuance server somehow could not be
>>     reached (FS says the MRCP is not responding in 5000ms,
>>         or something like that), then FS says: [WARNING]
>>     rtsp_client.c:386 () Failed to Connect to RTSP Server
>>     99.185.85.31:554,
>>         later FS says:
>>          [ERR] mod_unimrcp.c:1860 (TTS-6) SYNTHESIZER channel error!
>>          [ERR] switch_ivr_play_say.c:2439 Invalid TTS module!
>>
>>        The SYNTHESIZER channel error and Invalid TTS module error are
>>     obvious.
>>
>>         What I don't understand is why it went to this strange
>>     address: 99.185.85.31:554?
>>
>>
>> Check your unimrcp configuration.  Make sure the default TTS and ASR 
>> profiles are set to actual servers.
>>
>>       2. I specified TTS engine in play_and_detect_speech as
>>              "say:unimrcp:nuance5-mrcp1-1: the text to speak"
>>          It works though I didn't specify the TTS voice.
>>
>>          How do I specify the TTS voice? In the MRCP profile (and if so,
>>     how?), or with something like:
>>              "say:unimrcp:nuance5-mrcp1-1:Serena: the text to speak"
>>     (this does not seem right)?
>>
>>
>> That won't work.  Set the tts_engine variable as I explained 
>> previously, or use say:unimrcp:voice:text to speak with the desired 
>> voice and the correct default TTS profile defined in 
>> unimrcp.conf.xml.  This is a limitation of the say: notation.  
>> Alternatively, the voice can be defined with the tts_voice channel 
>> variable.
>>
>>       3. The barge-in works well, thanks! Is the barge-in
>>     configurable? In some scenarios, we might not allow barge-in.
>>
>>
>> If you don't want to barge in, just do "playback (or speak)" first, 
>> then "play_and_detect_speech" with a silence prompt.
>>
>>
>>       4. How could I get the text that has been spoken to the user when
>>     barge-in occurs?
>>          Or could I get the time when barge-in occurs? If I know the
>>     barge-in time and the rough total time for the whole text
>>          to be spoken, I can figure out the spoken text by manually
>>     checking the recorded audio file later, which would be painful.
>>
>>
>> If this is necessary, you might want to use the lower-level functions 
>> instead to watch for the begin-speaking event.
>>
>>
>>       5. When I use the "speak" and "detect_speech" apps in ESL, I can
>>     catch the DETECTED_SPEECH event with speech-type: begin-speaking
>>          and "detected-speech", then I process the recognition
>>     results.
>>
>>         The new app play_and_detect_speech no longer seems to generate
>>     these events. The only way I can think of to get the results
>>         is to catch the CHANNEL_EXECUTE_COMPLETE event, check whether
>>     variable_current_application=play_and_detect_speech, and then get
>>         the results from variable_detect_speech_result.
>>
>>         Is this the proper way to get the results in an ESL app? Or will
>>     play_and_detect_speech later become consistent with detect_speech
>>         in terms of ASR events?
>>
>>
>> play_and_detect_speech is a higher level abstraction to simplify 
>> things.  If you want to have more control, go back to using the ESL 
>> events.  Reading the code in mod_dptools and switch_ivr_async will 
>> give you hints about how to do it correctly.
>>
>>
>>       6. I'd like to set start-input-timers=false in the initial
>>     request then start the recognition timers (start-input-timers=true)
>>          after the TTS finishes.
>>          How could I do this?
>>
>>
>> This is automatically done in the  
>> switch_ivr_play_and_detect_speech() function.  You just need to 
>> specify start-input-timers=false in the beginning.
>>
>>
>>
>> _________________________________________________________________________
>> Professional FreeSWITCH Consulting Services:
>> consulting at freeswitch.org
>> http://www.freeswitchsolutions.com
>>
>> 
>> 
>>
>> Official FreeSWITCH Sites
>> http://www.freeswitch.org
>> http://wiki.freeswitch.org
>> http://www.cluecon.com
>>
>> FreeSWITCH-users mailing list
>> FreeSWITCH-users at lists.freeswitch.org
>> http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
>> UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
>> http://www.freeswitch.org
>
>
> ------------------------------------------------------------------------
>
> Heriot-Watt University is the Sunday Times
>   Scottish University of the Year 2011-2012
>
>   Heriot-Watt University is a Scottish charity
>   registered under charity number SC000278.
>
>

