[Freeswitch-dev] [Freeswitch-users] C SAY API - Issues

David Sugar dyfet at gnutelephony.org
Tue Dec 19 06:35:39 EST 2006

I actually spent close to 6 months on this very problem back in 1999
before we even did the initial release of ACS.  The solution we used in
that (and later in Bayonne) was a phrase rules parser/description
language which can manipulate sampled audio prompts (the phrasebook...)
and has a means of marking the context of each component.  The goal we were
working toward was "describe/code once, speak anywhere" by having the
rules parser execute against a plugin which would re-interpret the
meaning of the rules description used in the phrase based on the
language and location.  As Peter notes, there are many language-specific
cases involving gender, special-case numbering rules, etc.  Plural rules
can also be complex, as can ordering rules, which differ between
languages.  Consider even the very simple case of these three
possible prompts:

"You have no messages waiting"
"You have 1 message waiting"
"You have 2 messages waiting"

In this case we have both a zero-substitute rule ("no" for 0) and a
plural-form rule around "1", but even these may all be unique to
English...
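
As a rough illustration of the zero-substitute and plural-form rules
above, here is a minimal sketch in C of a per-language plugin lookup.
All names (phrase_plugin_t, en_messages_waiting, say_messages_waiting)
are hypothetical and are not taken from ACS, Bayonne, or FreeSWITCH, and
the sketch renders text where a real phrasebook would queue sampled
audio prompts.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-language plugin: each language supplies its own
 * interpretation of the "messages waiting" phrase. */
typedef struct {
    const char *lang;
    void (*messages_waiting)(unsigned count, char *buf, size_t size);
} phrase_plugin_t;

/* English rules: "no" substitutes for 0, singular noun only for 1. */
void en_messages_waiting(unsigned count, char *buf, size_t size)
{
    if (count == 0)
        snprintf(buf, size, "You have no messages waiting");
    else if (count == 1)
        snprintf(buf, size, "You have 1 message waiting");
    else
        snprintf(buf, size, "You have %u messages waiting", count);
}

/* Only English is registered here; other languages would add entries. */
static const phrase_plugin_t plugins[] = {
    { "en", en_messages_waiting },
};

/* Describe the phrase once; the plugin chosen by language tag decides
 * how it is actually spoken.  Returns 0 on success, -1 if no plugin
 * is registered for the language. */
int say_messages_waiting(const char *lang, unsigned count,
                         char *buf, size_t size)
{
    size_t i;

    assert(lang != NULL && buf != NULL && size > 0);
    for (i = 0; i < sizeof plugins / sizeof plugins[0]; i++) {
        if (strcmp(plugins[i].lang, lang) == 0) {
            plugins[i].messages_waiting(count, buf, size);
            return 0;
        }
    }
    return -1;
}
```

The caller only names the phrase and the count; a plugin for another
language could reorder the words or use different plural categories
without the caller changing.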

Peter Nixon wrote:
> On Sun 17 Dec 2006 08:00, Matt Porter wrote:
>> You are really going to need some feedback about this one.
>> It's been a terribly long time, and I have forgotten most of the issues we
>> faced, but when we ported a unified messaging product many years ago to
>> some of the eastern languages, it was almost impossible to abstract this
>> concept out.
>> Abstracting something like speaking a URL, phone number, or numeric string
>> is perfectly attainable, but constructing a useful working sentence is much
>> more complicated.
>> Given "You have 1000 messages" in English, you
>> may need to say "1000 messages you have" before it makes any sense in
>> Chinese or Piglatin or whatever.
> You are correct that the linguistic typology of a language is important. The 
> majority of the world's languages (at least many of the older ones) are SOV, 
> however a few important ones like English, Romance languages and Chinese are 
> SVO. VSO is much less common, although some variants of Arabic and those 
> crazy Irish use it.. (See http://en.wikipedia.org/wiki/Linguistic_typology 
> for more info)
> Unfortunately that is not the only issue however. Turkish for example (the 
> only language other than English that I am close to fluent in at present) 
> has no concept of gender. (You don't specify someone as "he" or "she", 
> simply as "that"). On the other hand French has a gender for every object, 
> not just living things! The structure of pronouncing numbers also varies of 
> course... Do you pronounce "13" as "thirteen" or as "ten three". What 
> about "113" and "1113"? (Actually English probably has the most insane rules 
> for pronouncing numbers of all the languages I have come across) Many 
> languages also (Turkish included) require that pronunciation and/or 
> spelling be changed depending on the preceding or following word to make 
> the pronunciation more "musical" or smoother. ("K" gets changed to "soft G" 
> for example to avoid "harshness" in the middle of a sentence).
> All of these little differences can crop up in surprisingly short, simple 
> phrases, making the job of such an API more complex than you might initially 
> imagine :-)
> Cheers