<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 12 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Yeah, well, the thing is that it needs to be interactive (like an IVR system), so recording the voice and using a 3<sup>rd</sup>-party service to transcribe a file is not an option for me right now. That’s why I’m using MRCP to connect FreeSWITCH to the ASR engine.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>All I need is to figure out how to get the recognized text from a given ASR engine without constraining the user to talk in a specific way (through a grammar) and, if possible, without training SLMs. My guess is that it’s not possible, as I can’t find any resources on the web showing this feature, but I’m still hopeful of finding an ‘easy’ way of doing the transcription.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'> <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in'><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> Pehr Anderson [mailto:pehr@harqen.com] <br><b>Sent:</b> Wednesday, June 22, 2011 12:17 PM<br><b>To:</b> FreeSWITCH Users Help; Hector Geraldino<br><b>Subject:</b> Re: [Freeswitch-users] FS and ASR engine<o:p></o:p></span></p></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal style='margin-bottom:12.0pt'>You might check out <a href="http://Nexiwave.com">http://Nexiwave.com</a><br>They 
have been active at ClueCon and have a web API <br>that does fully hosted ASR on WAV or MP3 files. <br><br>They are innovating by running their ASR on GPU clusters,<br>which means their internal cost of operation is likely to be the lowest in <br>the industry. <br><br>Good ASR is always going to be computationally intensive,<br>so it is helpful to have somebody managing that for you.<br>It's going to be a lot easier than trying to juggle your own Sphinx training sets.<br><br><a href="http://nexiwave.com/index.php/pricing">http://nexiwave.com/index.php/pricing</a><br><br> --pehr<o:p></o:p></p><div><p class=MsoNormal>On Wed, Jun 22, 2011 at 9:54 AM, Hector Geraldino &lt;<a href="mailto:Hector.Geraldino@ip-soft.net">Hector.Geraldino@ip-soft.net</a>&gt; wrote:<o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Hi everyone,<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>I want to check with you guys to see if anyone has experience integrating FreeSWITCH with an ASR engine and using the engine merely as a transcriber of the conversation. <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>I’ve been playing for the past two weeks or so with Nuance Speech Server/Recognizer and PocketSphinx. Nuance is by far the better solution, but due to the lack of freely available documentation and my limited expertise in this subject, I haven’t been able to achieve my goal.<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>The communication between FS and the ASR engine works great using MRCP; my concern is with the ASR engine itself. 
I want to allow the user to speak freely, and get a transcription of what the user said. I don’t want or need to understand the meaning of the utterances (the engine definitely doesn’t need to do that). I also don’t need or want to write any complex grammar or SLM to get an interpretation of the spoken phrases; I just want the plain text of what has been said. No decisions will be made based on what the user said; this information will just be passed to a 3<sup>rd</sup>-party application.<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>I don’t know whether this can be achieved without developing grammars (not suitable for open-ended dialogs) or training statistical language models. What I do recall is using Dragon Speak in MS Word for dictation, without needing to do any training or develop grammars. That’s exactly what I’m pursuing: a simple plain-text transcription of the spoken words.<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Has any of you dealt with something like this by any chance?<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Thanks for your help. 
I apologize if this is not the right place to ask this type of question.<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Thanks again,<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Hector<o:p></o:p></p></div></div><p class=MsoNormal style='margin-bottom:12.0pt'><br>_______________________________________________<br>Join us at ClueCon 2011, Aug 9-11, Chicago<br><a href="http://www.cluecon.com" target="_blank">http://www.cluecon.com</a> 877-7-4ACLUE<br><br>FreeSWITCH-users mailing list<br><a href="mailto:FreeSWITCH-users@lists.freeswitch.org">FreeSWITCH-users@lists.freeswitch.org</a><br><a href="http://lists.freeswitch.org/mailman/listinfo/freeswitch-users" target="_blank">http://lists.freeswitch.org/mailman/listinfo/freeswitch-users</a><br>UNSUBSCRIBE: <a href="http://lists.freeswitch.org/mailman/options/freeswitch-users" target="_blank">http://lists.freeswitch.org/mailman/options/freeswitch-users</a><br><a href="http://www.freeswitch.org" target="_blank">http://www.freeswitch.org</a><o:p></o:p></p></div><p class=MsoNormal><br><br clear=all><br>-- <br>Pehr Anderson, VP Platform Technology<br>HarQen - <a href="http://HarQen.com" target="_blank">http://HarQen.com</a> <br>414-755-1962 x114<br><a href="mailto:pehr@harqen.com" target="_blank">pehr@harqen.com</a><o:p></o:p></p></div></body></html>