[Freeswitch-users] Language Handling: call for assistance

Steve Underwood steveu at coppice.org
Thu Jul 2 16:50:29 PDT 2009


Raul Fragoso wrote:
> On Fri, 2009-07-03 at 01:29 +0800, Steve Underwood wrote:
>   
>> Michael Collins wrote:
>>     
>>> On Thu, Jul 2, 2009 at 3:01 AM, Igor Neves <igor at 3gnt.net> wrote:
>>>
>>>     Hi,
>>>
>>>
>>>     Michael Collins wrote:
>>>       
>>>>     Hello all!
>>>>
>>>>     There's been some discussion lately on how to handle multiple
>>>>     languages, specifically with the *say* application. We would like
>>>>     some input from the community on how to handle multiple languages
>>>>     and sound files. Anthony notes that the say application needs to
>>>>     build the path to the sound files by using the ${sound_prefix}
>>>>     and ${lang} variables. Some have asked about countries or
>>>>     language variants, like European Portuguese vs. Brazilian
>>>>     Portuguese. These are good questions.
>>>>         
>>>     What is the problem with European vs. Brazilian Portuguese?
>>>
>>>     Can't we just use "PT_pt" and "PT_br" in ${lang}, just like a lot
>>>     of other software does?
>>>
>>>     What about ${sound_prefix} = ${lang}, since ${lang} should always
>>>     be unique, so that the paths are automatically organized by language?
>>>
>>>
>>> This is reasonable to me, but it would be nice to have a consensus, 
>>> just to be sure.
>>>  
>>>
>>>
>>>
>>>       
>>>>     From the community we need input. If you have experience with
>>>>     multiple languages in a telephony environment then please give us
>>>>     your suggestions. How would you like to see the say application
>>>>     handle various languages and dialects? Please give us your
>>>>     helpful suggestions.
>>>>
>>>>     Thanks,
>>>>     Michael
>>>>         
>>>     Sorry if I misunderstood something.
>>>     Cheers,
>>>
>>>
>>> Believe me, the moment we put this into place we will have someone 
>>> purporting to be an expert offering a completely new solution. That's 
>>> why we asked for input now, before Tony spends a lot of time working 
>>> on it. 
>>> -MC
>>>       
>> The PT_pt format is for written languages, rather than spoken languages. 
>> There is often a difference.
>>
>> The SSML 1.1 spec references http://www.ietf.org/rfc/bcp/bcp47.txt as a 
>> definition of how to identify a language and accent for speech. I'm not 
>> clear whether it really works, though.
>>
>> Steve
>>     
>
>
> I think that would be overkill. The usual way of using e.g. "pt-br" (two
> letters for the main language, a dash, and then two more letters for the
> dialect/variation) would be enough.
>   
If by "the usual way" you mean the standard 2 + 2 letter codes we are 
used to on computers, that just doesn't work. As I said before, those 
are for written languages, not spoken languages. There are no standard 
codes for many spoken languages. For example, the standard codes for 
Chinese are zh_cn for mainland China, zh_tw for Taiwan, zh_hk for Hong 
Kong. However, in Guangdong you will probably want to offer Cantonese as 
well as Mandarin voice prompts, so you will want a zh_gd, or something, 
which you won't find among the standard 2 + 2 letter codes. That's why 
the SSML people had a hard time coming up with a language scheme, and 
SSML 1.0 didn't even reference one. The more you look around the world, 
the more complex the issue of language variants becomes. If you don't 
face that at the beginning it just gets messier later on.
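
To make the point concrete, here is a rough sketch of how a prompt
lookup could work with BCP 47 style tags instead of the 2 + 2 codes.
It is only an illustration, not how the say application works: the
function names, the /sounds prefix and the one-directory-per-tag layout
are all made up for the example. It assumes the BCP 47 primary subtags
"yue" (Cantonese) and "cmn" (Mandarin), and it shortens a tag one
subtag at a time, modelled on the lookup scheme in RFC 4647, so that a
request for "yue-CN" can fall back to generic "yue" prompts.

```python
import posixpath


def fallback_chain(tag):
    """Return the tag followed by progressively shorter forms of it."""
    # Accept both "pt_BR" and "pt-BR" spellings; compare case-insensitively.
    parts = tag.replace("_", "-").lower().split("-")
    chain = []
    while parts:
        chain.append("-".join(parts))
        parts.pop()
        # A one-character subtag (such as "x") only introduces what follows,
        # so it is dropped together with the subtag it introduced.
        if parts and len(parts[-1]) == 1:
            parts.pop()
    return chain


def sound_dirs(sound_prefix, tag):
    """Hypothetical layout: one directory per tag under sound_prefix."""
    return [posixpath.join(sound_prefix, t) for t in fallback_chain(tag)]


if __name__ == "__main__":
    # Candidate directories for Cantonese prompts, most specific first.
    for directory in sound_dirs("/sounds", "yue-CN"):
        print(directory)
```

The caller would try each directory in order and use the first one that
actually holds the requested prompt. Note that Cantonese never falls
back to Mandarin here, which is the behaviour you want for speech even
though both are written as "zh".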

Steve