[Freeswitch-users] limits on text to speech output length

A. Lester Buck III freeswitch-reg at compact.com
Thu Sep 23 13:57:12 PDT 2010


I'm running Freeswitch 1.0.head, built from trunk a week or so ago.

I am using a Cepstral voice for text to speech.  I generate a long
string of text, including SSML breaks, that is rendered to speech by the
swift utility packaged with Cepstral voices.  From the swift command
line, my test string renders as 1m:46s of speech in a .wav file of
about 1.6MB.
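
As a rough sanity check on those figures, the sketch below estimates the
size of an uncompressed PCM .wav file from its duration.  The 8 kHz,
16-bit, mono format is an assumption (it is not stated above), chosen
because it roughly reproduces the 1.6MB figure; wav_size_bytes is a
hypothetical helper, not part of swift or Freeswitch.

```python
# Estimate the size of an uncompressed PCM .wav file from its duration.
# Assumption: 8 kHz, 16-bit, mono audio; the actual voice format may differ.

WAV_HEADER_BYTES = 44  # canonical RIFF/WAVE header size for plain PCM


def wav_size_bytes(seconds, sample_rate=8000, bytes_per_sample=2, channels=1):
    """Return the approximate file size in bytes for PCM audio."""
    return WAV_HEADER_BYTES + seconds * sample_rate * bytes_per_sample * channels


duration = 1 * 60 + 46  # 1m:46s, from the swift test run
size = wav_size_bytes(duration)
print(duration, "seconds ->", size, "bytes,", round(size / 2**20, 2), "MiB")
```

Under the same assumption, audio is consumed at about 16,000 bytes per
second, so the time at which the speech cuts off gives a rough estimate
of how many bytes were delivered before the cutoff.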

I changed single quotes to double quotes and escaped them, as mentioned
on the wiki page for mod_cepstral, and I now have a working text to
speech path using ESL built for Ruby 1.9.2, using the sample Ruby code
on the wiki.
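
The quote change can be sketched roughly as follows.  This is only an
illustration in Python, not the Ruby code from the wiki: the helper name
escape_ssml_quotes is hypothetical, and backslash-escaping the double
quotes is an assumption about what the receiving side expects.

```python
import re

# Matches an SSML attribute written with single quotes, e.g. time='500ms'.
# Apostrophes in ordinary text (don't, it's) are not preceded by "name=",
# so they are left alone.
_SINGLE_QUOTED_ATTR = re.compile(r"([\w:-]+)='([^']*)'")


def escape_ssml_quotes(text):
    """Rewrite name='value' as name=\\"value\\" (assumed escaping style)."""
    def repl(match):
        return match.group(1) + '=\\"' + match.group(2) + '\\"'
    return _SINGLE_QUOTED_ATTR.sub(repl, text)


sample = "Hello. <break time='500ms'/> Please don't hang up."
print(escape_ssml_quotes(sample))
```

Only attribute values are rewritten, so apostrophes in the spoken text
survive unchanged.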

Everything basically works fine, even on my test machine, which is a
$5/mo Xen VPS (lowendbox.com) with 10ms kernel timer resolution.

My issue is that the speech cuts off midway through the call, which I
guess means some limit on generated output is being reached.  Before I
dig into the guts of mod_cepstral and Freeswitch, could someone point me
to what might be limiting the length of output?  Swift itself seems able
to generate a wav file of arbitrary length, so the limit is presumably
somewhere in the open source parts of the server.

Not having looked at the guts of mod_cepstral yet, I presume that a
high-performance output module does not have the swift utility write a
.wav file to disk and then play it back, but instead lets swift shove
the bytes directly into a socket.  What is a good Freeswitch example of
this architecture?
