[Freeswitch-users] Proper prompt gain/level

Tue Jun 28 05:31:00 MSD 2011

On 06/28/2011 04:00 AM, Bryan Smart wrote:
> I don't want to maximize peaks all the way to 0DB. I just wondered why a low level like -16DB was used. When I'd previously created prompts on Asterisk systems, I used -6DB as a peak (50% of max gain for the channel).
The peaks of voice are generally something like 12 to 13dB above the short 
term RMS value. If the -16dB value you quoted is -16dBOv, it's 13dB below 
clipping. With the average at -16dB, your peaks should be close to clipping.
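
To put numbers on the peak versus RMS distinction, here is a minimal Python
sketch that measures both for a buffer of samples. It is only an
illustration, not something from FreeSWITCH: it assumes 16-bit signed linear
PCM, takes 0 dBov as the RMS of a full-scale square wave (so a full-scale
sine sits at about -3.01 dBov), and the function names and the test tone are
made up for the example.

```python
import math

FULL_SCALE = 32768.0  # assumption: 16-bit signed linear PCM


def rms_dbov(samples):
    """RMS level in dBov (a full-scale square wave would read 0 dBov)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square / (FULL_SCALE * FULL_SCALE))


def peak_dbfs(samples):
    """Largest absolute sample value, in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return float("-inf")
    return 20.0 * math.log10(peak / FULL_SCALE)


def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB (about 3.01 dB for a pure sine)."""
    return peak_dbfs(samples) - rms_dbov(samples)


# Hypothetical test tone: one second of 1 kHz sine at half of full scale,
# sampled at 8 kHz.
sine = [16384.0 * math.sin(2.0 * math.pi * 1000.0 * n / 8000.0)
        for n in range(8000)]

print(round(rms_dbov(sine), 2))         # -9.03
print(round(peak_dbfs(sine), 2))        # -6.02
print(round(crest_factor_db(sine), 2))  # 3.01
```

Running the same functions over the samples of a real prompt shows how far
its peaks sit above its average level, which for speech is usually much more
than the 3dB of a sine.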
> I wasn't aware of any regulations about prompt level. If there is a standardized level, that is what I'd like to use. Perhaps the louder systems are disregarding standards? Does anyone have links to such info? I've been unable to find anything definitive, only opinions.
In many places the regulation says the power on a PSTN line should not 
exceed -13dBm0. That's why -13dBm0 is the target power level for most 
PSTN modems.
> Listen to TellMe (+1-800-555-8355). It's at least 3X the gain of the default FS prompts. Is TellMe in error? I've called them through both FS and Asterisk, using Vitelity and Callcentric, so I'm fairly sure that I'm not being misled by a switch or ITSP boosting the gain. I always felt their level was strong and intelligible, without sounding overwhelming.
Not everyone bothers to obey regulations these days, and many people do 
love to blast sound into an overloaded highly distorted mess, because 
volume is king. It's not good for clarity, though. What the loudest 
people do is no measure of good engineering. Sadly, this behaviour might 
make people set their levels around the excessively loud signals, so a 
properly adjusted signal sounds too quiet. In the early days of my FAX 
modem work I received many recordings from people who could not get 
reliable results, where the audio was perpetually in clipping, and any 
speech through the channel would have sounded awful. They would 
generally insist that voice was "perfect" on their system. There is a 
serious lack of engineering in most VoIP work.
> One point that caught my attention, though, is that you said -16DB for both average and peak power. Average and peak power don't come out the same, though.
>
> So that we can talk about something concrete, consider conference/32000/conf-enter_conf_pin.wav. Its peak power is -15.7DB. However, its average power (RMS) is -31.3DB! -31DB is profoundly quiet. If its average power is boosted to -16DB, then the peak power is now around -2DB. As long as peak power is less than 0DB, then the audio won't clip, but it might be too loud for comfort. I previously used -6DB for a peak, as I couldn't find any real guidelines regarding levels, and -6 sounded good to me.
You still haven't said whether you are talking dBm0 or dBOv. It makes a 
6dB difference. Also, what do you mean by peak power? If the peaks of 
the short term RMS power are hitting 0dB, the peaks of the waveform will 
be far into clipping. If you are talking about dBOv, then -6dB is only 
3dB from the onset of clipping, and voice will clip a lot. If you are 
talking dBm0, -6dB is 9dB from clipping, and the voice will only clip a 
bit, and maybe not sound too bad. However, clipped voice tends to pass 
through low bit rate codecs worse than clean voice, so you might want to 
keep the clipping down to a really occasional event. Voice codecs have 
at least 12 bits of dynamic range. They are designed to allow a voice to 
bubble along at -30dB with good quality, and burst up to a much higher 
level in the loud bits.
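
The dBm0 versus dBOv arithmetic above can be sketched in a few lines of
Python. This is a rough illustration under stated assumptions: a full-scale
sine is taken as -3.01 dBov, and the sine clipping point in dBm0 is taken as
the G.711 A-law figure of +3.14 dBm0 (mu-law is +3.17 dBm0). The constant
and function names are invented for the example.

```python
SINE_CLIP_DBOV = -3.01  # full-scale sine relative to a full-scale square wave
SINE_CLIP_DBM0 = 3.14   # assumption: G.711 A-law overload point (mu-law: 3.17)


def dbm0_to_dbov(level_dbm0):
    """Convert a dBm0 level to dBov by lining up the sine clipping points."""
    return level_dbm0 - SINE_CLIP_DBM0 + SINE_CLIP_DBOV


def headroom_db(level, unit):
    """dB between a sine at `level` and the onset of clipping."""
    if unit == "dBov":
        return SINE_CLIP_DBOV - level
    if unit == "dBm0":
        return SINE_CLIP_DBM0 - level
    raise ValueError("unit must be 'dBov' or 'dBm0'")


print(round(headroom_db(-6.0, "dBov"), 2))   # 2.99, the "only 3dB" case
print(round(headroom_db(-6.0, "dBm0"), 2))   # 9.14, the "9dB" case
print(round(headroom_db(-16.0, "dBov"), 2))  # 12.99
print(round(dbm0_to_dbov(-13.0), 2))         # -19.15
```

The offset between the two scales comes out at about 6.15dB with these
constants, which is the 6dB difference mentioned above.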
> Maybe FS is lower than it should be. Maybe other services are louder than they should be. If FS should be louder, though, I'd like to help to change the levels up-stream, rather than locally reprocessing the prompts.
>
> So, is this a personal judgement case, or are there standards available that can be consulted?
>
> Bryan
>
> On Jun 27, 2011, at 9:48 AM, Steve Underwood wrote:
>
>> On 06/27/2011 07:11 AM, Bryan Smart wrote:
>>> I have tools to batch-process audio files. I just was not sure that regaining all of the prompt files was the best approach. I figured that the gain must have been reduced so dramatically for some sort of reason (to avoid clipping in some situation, to work better with the internal resampling, etc).
>>>
>>> What AGC do you mean? I know that AGC has recently been added to conferencing, but the level of the prompts is a system-wide situation. As far as I know, there isn't AGC that can be applied on every channel, and, even if there was, there would surely be a processing hit, so the goal would be to avoid needing it, right?
>>>
>>> The root problem, at least for me, is this. I need to add voice prompts and other audio for an IVR. I can't simply normalize all of my prompts to 0DB, as, even though they don't distort, they're so loud when compared to the stock prompts, they'll blow the phone out of my hand. To match them to the stock prompts, I must normalize them to around -16DB. I can do that, but it seems very wrong. At -16DB, nearly 85% of the potential gain of the channel is lost.
>> -16dBm0 or -16dBOv, and average or peak burst power? -16dBOv for the
>> average power is about where you want a voice prompt to be. In some
>> jurisdictions you could be in breach of a regulation or two if you set
>> the level higher than that on the PSTN. Why would you set a voice prompt
>> to 0dB? It will be clipping like crazy.
>>> Try this... With the demo IVR (5000), add this before the sleep command in the dialplan.
>>>
>>> <action application="set_audio_level" data="write 4"/>
>>>
>>> That is the max gain boost available for a channel. The prompts should be really clipping with that much amplification, but they don't clip at all. At -16DB, you could literally amplify them to 6 times their native level without distorting. Native level is too low. Once I realized this, it became clear to me why Freeswitch sounded more quiet than Asterisk, at least when working with recorded prompts.
>>>
>>> I suppose I could use set_audio_level on every last call, but I'm sure that real-time amplification, like AGC, is another processor drain that builds up with lots of calls. Besides, it seems weird to dramatically reduce the level of audio, and then waste cycles amplifying it back up in real-time.
>>>
>>> Bryan
>> Steve
>>
>>
>> _______________________________________________
>> Join us at ClueCon 2011, Aug 9-11, Chicago
>> http://www.cluecon.com 877-7-4ACLUE
>>
>> FreeSWITCH-users mailing list
>> FreeSWITCH-users at lists.freeswitch.org
>> http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
>> UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
>> http://www.freeswitch.org