Responses inline

> Now it works in my ESL app, though I am just able to do one dialogue
> (I need to add the event catching for further dialogues).
>
> I have a couple of questions here:
>
> 1. In the first try, my Nuance server could not be accessed somehow
> (FS said the MRCP server was not responding within 5000 ms,
> something like that), then FS said:
> [WARNING] rtsp_client.c:386 () Failed to Connect to RTSP Server 99.185.85.31:554
> and later:
> [ERR] mod_unimrcp.c:1860 (TTS-6) SYNTHESIZER channel error!
> [ERR] switch_ivr_play_say.c:2439 Invalid TTS module!
>
> The SYNTHESIZER channel error and the Invalid TTS module error are obvious.
>
> What I don't understand is why it went to this strange address: 99.185.85.31:554?

Check your unimrcp configuration. Make sure the default TTS and ASR profiles are set to actual servers.

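For reference, a minimal sketch of the relevant part of unimrcp.conf.xml, assuming your profile file is named nuance5-mrcp1-1 (taken from your say: string); the other values are just the stock defaults:

  <configuration name="unimrcp.conf" description="UniMRCP Client">
    <settings>
      <!-- point both defaults at a profile that actually exists under mrcp_profiles/ -->
      <param name="default-tts-profile" value="nuance5-mrcp1-1"/>
      <param name="default-asr-profile" value="nuance5-mrcp1-1"/>
      <param name="log-level" value="DEBUG"/>
    </settings>
    <profiles>
      <X-PRE-PROCESS cmd="include" data="mrcp_profiles/*.xml"/>
    </profiles>
  </configuration>

If a default still points at a leftover or example profile, FS will try whatever server-ip/server-port that profile contains, which is one way to end up connecting to an address you never configured yourself.
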
> 2. I specified the TTS engine in play_and_detect_speech as
> "say:unimrcp:nuance5-mrcp1-1: the text to speak"
> It works, though I didn't specify the TTS voice.
>
> How do I specify the TTS voice? In the MRCP profile (how?)? Or
> something like:
> "say:unimrcp:nuance5-mrcp1-1:Serena: the text to speak"
> (this doesn't seem right.)

That won't work. Set the tts_engine variable as I explained previously, or use
say:unimrcp:<voice>:<text to speak> with the desired voice and the correct
default TTS profile defined in unimrcp.conf.xml. This is a limitation of the
say: notation. Alternatively, the voice can be defined with the tts_voice
channel variable.

> 3. The barge-in works well, thanks! Is the barge-in configurable?
> In some scenarios, we might not want to allow barge-in.

If you don't want barge-in, just do "playback" (or "speak") first, then "play_and_detect_speech" with a silence prompt.

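A sketch of that pattern; the Serena voice and hello.gram grammar are placeholders as above, and silence_stream:// is only used to give play_and_detect_speech an empty prompt:

  <!-- play the non-interruptible part first; no recognizer is listening yet -->
  <action application="playback" data="say:unimrcp:Serena:This part cannot be interrupted"/>
  <!-- then start recognition against a short silent prompt, so barge-in has nothing to cut off -->
  <action application="play_and_detect_speech"
          data="silence_stream://250 detect:unimrcp hello.gram"/>
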
> 4. How can I get the text that has already been spoken to the user when
> barge-in occurs?
> Or can I get the time at which barge-in occurs? If I know the
> barge-in time and the rough total time for the whole text
> to be spoken, I can figure out the spoken text by manually
> checking the recorded audio file later, which would be painful.

If this is necessary, you might want to use the lower-level functions instead and watch for the begin-speaking event.

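A sketch of that lower-level route, assuming your ESL app is already subscribed to DETECTED_SPEECH events as in your question 5 (the grammar name and path are placeholders): start detect_speech yourself and play the prompt separately instead of using play_and_detect_speech:

  <!-- start the recognizer first so it can fire DETECTED_SPEECH events during the prompt -->
  <action application="detect_speech" data="unimrcp hello /usr/local/freeswitch/grammar/hello.gram"/>
  <action application="playback" data="say:unimrcp:Serena:the text to speak"/>

When the caller barges in, the DETECTED_SPEECH event with Speech-Type begin-speaking carries an Event-Date-Timestamp header; subtract the time the playback started (e.g. from its CHANNEL_EXECUTE event) and you know how far into the prompt the caller interrupted.
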
> 5. When I use the "speak" and "detect_speech" apps in ESL, I can catch the
> DETECTED_SPEECH event with speech-type "begin-speaking"
> and "detected-speech", and then I do the recognition result
> processing.
>
> The new app play_and_detect_speech does not seem to generate these
> events any more. The way that I can think of to get the results
> is to catch the CHANNEL_EXECUTE_COMPLETE event, check whether
> variable_current_application=play_and_detect_speech, and then get
> the results from variable_detect_speech_result.
>
> Is this the proper way to get the results in an ESL app? Or will
> play_and_detect_speech later on be made consistent with detect_speech
> in terms of ASR events?

play_and_detect_speech is a higher-level abstraction to simplify things. If you want more control, go back to using the ESL events. Reading the code in mod_dptools and switch_ivr_async will give you hints about how to do it correctly.

> 6. I'd like to set start-input-timers=false in the initial request and
> then start the recognition timers (start-input-timers=true)
> after the TTS finishes.
> How can I do this?

This is done automatically in the switch_ivr_play_and_detect_speech() function. You just need to specify start-input-timers=false at the beginning.

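For example (hello.gram again being a placeholder), passing it in the parameter block of the grammar is enough; the function then starts the input timers for you once the prompt has finished playing:

  <action application="play_and_detect_speech"
          data="say:unimrcp:Serena:the text to speak detect:unimrcp {start-input-timers=false}hello.gram"/>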