<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-compose;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-family:"Calibri","sans-serif";}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal">Hello Guys,<o:p></o:p></p>
<p class="MsoNormal">I know your time is valuable, but I am using this mailing list as a last resort. I have looked for answers everywhere else I could think of, including Confluence, the wiki, Google, and previous posts on this mailing list.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">So here’s my dilemma. I have a custom speech recognition engine that I want to use with FreeSWITCH. I do not want to use the built-in pocketsphinx module or an MRCP server. Right now, when a call is answered,
I record it to a file using record_session. A Java application reads from this file and sends the data to my speech engine. This would have been sufficient if FreeSWITCH did not write the audio in 4-second chunks. I verified
this with a small test: the recorded file is updated every 4 seconds with the previous 4 seconds of audio. That delay is not acceptable for my application.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
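<p class="MsoNormal">To illustrate the approach, here is a minimal sketch of a reader that follows a growing recording file. The class name RecordingTailer and the method drainTo are hypothetical names made up for this post, and the OutputStream sink only stands in for the connection to the speech engine. It is not the actual application code.<o:p></o:p></p>
<pre>
```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;

/** Hypothetical helper: forwards bytes that were appended to a growing file. */
final class RecordingTailer {
    private final String path;
    private long offset = 0; // number of bytes already forwarded

    RecordingTailer(String path) {
        this.path = path;
    }

    /**
     * Copies everything appended since the previous call into the sink.
     * Returns the number of bytes copied (0 if nothing new was written).
     */
    long drainTo(OutputStream sink) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            if (offset >= file.length()) {
                return 0; // no new data on disk yet
            }
            file.seek(offset);
            byte[] buffer = new byte[4096];
            long total = 0;
            int read;
            // read() returns -1 at end of file, which ends the loop
            while ((read = file.read(buffer)) > 0) {
                sink.write(buffer, 0, read);
                total += read;
            }
            offset += total;
            return total;
        }
    }
}
```
</pre>
<p class="MsoNormal">Calling drainTo in a loop with a short sleep (say 100 ms) forwards data as soon as it reaches the disk, but it cannot deliver audio any sooner than FreeSWITCH flushes it to the file, which is exactly the problem.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>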
<p class="MsoNormal">So I looked into other options for accessing the call audio directly. I know mod_shout can be used to set up an Icecast stream, but that’s not really what I want. The same goes for the telecast feature, since I would like to
use the raw audio stream rather than transcoded audio. I read in one Stack Overflow post that eavesdropping can be used to listen to the audio stream, but the post did not explain how to do this from an external application,
such as one written in Java. Searching for eavesdropping examples for FreeSWITCH didn’t help either.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">So I’d like to know: is it possible to access the audio stream directly from, say, a Java program using the FreeSWITCH library? If not, how do pocketsphinx and the other built-in speech engines actually read the audio? And if direct access
is not possible, is there any way to make the recording write to disk more frequently?<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Thanks.<o:p></o:p></p>
</div>
</body>
</html>