[Freeswitch-users] limits on text to speech output length

Daniel Neubert daniel.neubert at solomo.de
Thu Sep 23 21:41:27 PDT 2010


  It's not only good for having a more scalable infrastructure - it 
also gives you the freedom to change your TTS engine or add 
functionality like speech recognition or recording.

Daniel

On 24.09.2010 05:59, A. Lester Buck III wrote:
> Dear Daniel,
>
> Thanks so much for the pointer.  I'll certainly check that out.  I don't
> really have much of a scaling issue at the moment, but I sure hope I do
> someday!  Always good to keep the architecture as small parts loosely
> joined.
>
> Best regards,
>
> Lester
>
>
>> Hi,
>>
>> Instead of calling swift directly, you could call it via an 
>> MRCP server like UniMRCP.
>>
>> It's all free software and works like a charm in my setup. I'm using 
>> mod_unimrcp on our FreeSWITCH servers and am running UniMRCP as a 
>> server on a single node that is our dedicated speech server.
>>
>> That scenario also offloads the task of speech generation to another 
>> machine to save resources on the FreeSWITCH nodes. If you use 
>> Cepstral languages other than English, you need to manually patch the sources.
>>
>> Daniel
>>
>> On 23.09.2010 22:57, A. Lester Buck III wrote:
>>> Hi,
>>>
>>> I'm running Freeswitch built from trunk a week or so ago, 1.0.head.
>>>
>>> I am using a Cepstral voice for text to speech.  I generate a long
>>> string of text, including SSML breaks, that is rendered to speech by the
>>> swift utility packaged with Cepstral voices.  The test string I built
>>> renders as 1m:46s of speech, and a .wav file about 1.6MB, from the swift
>>> command line.
>>>
>>> I changed single quotes to double quotes and escaped them, as mentioned
>>> on the wiki page for mod_cepstral, and I now have a working text to
>>> speech path using ESL built for Ruby 1.9.2, using the sample Ruby code
>>> on the wiki.
>>>
>>> Everything basically works fine, even on my test machine which is a
>>> $5/mo Xen VPS with 10ms kernel timer resolution.  (lowendbox.com)
>>>
>>> My issue is that the speech cuts off midway through the call, when I
>>> guess some limit on generated output is being reached.  Before I dig
>>> into the guts of mod_cepstral and Freeswitch, could someone point me to
>>> what might be limiting the length of output?  Swift seems to generate a
>>> wav file of indefinite length, so at least the issue is somewhere in the
>>> open source parts of the server.
>>>
>>> Not having yet looked at the guts of mod_cepstral, I presume that a
>>> high-performance output module does not have the swift utility write a
>>> .wav file to disk and then play it back, but instead lets swift shove
>>> the bytes directly into a socket.  What is a good Freeswitch example of
>>> this architecture?
>>>
>>>
>>> Thanks,
>>>
>>> Lester
>>>
>>> _______________________________________________
>>> FreeSWITCH-users mailing list
>>> FreeSWITCH-users at lists.freeswitch.org
>>> http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
>>> UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
>>> http://www.freeswitch.org
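
Editorial note: the quoting change Lester describes in his original message (single quotes changed to double quotes and escaped) could be sketched roughly as below. This is only an illustration, not the procedure from the mod_cepstral wiki page: the function name is hypothetical, and it assumes that "escaped" means a backslash in front of each double quote and that only SSML attribute values need rewriting. Apostrophes in ordinary text are left untouched.

```python
import re

# Matches name='value' attribute pairs. Apostrophes in running text
# (for example "don't") are not preceded by "=", so they do not match.
_SINGLE_QUOTED_ATTR = re.compile(r"""(\w[\w:.-]*)\s*=\s*'([^']*)'""")


def requote_ssml_attributes(ssml: str) -> str:
    # Hypothetical helper: rewrite each single-quoted attribute value so it
    # is delimited by backslash-escaped double quotes instead.
    # Assumption: the receiving layer expects a backslash before each quote.
    return _SINGLE_QUOTED_ATTR.sub(
        lambda m: f'{m.group(1)}=\\"{m.group(2)}\\"', ssml
    )


if __name__ == "__main__":
    sample = "Hello. <break time='500ms'/> Don't hang up."
    print(requote_ssml_attributes(sample))
```

Which layer actually needs the escaping (the ESL command string, the dialplan, or mod_cepstral itself) is not stated in the thread, so the pattern would need adjusting to the real setup.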