[Freeswitch-users] Proper prompt gain/level

Wed Jun 29 21:07:26 MSD 2011

On 06/29/2011 03:26 AM, Bryan Smart wrote:
> I must be wrong about dBm0, then. I wasn't familiar with it, so I checked Wikipedia. Wikipedia isn't always right, of course.
>
> Wikipedia says...
> ----------
> dBm0 is an abbreviation for the power in dBm measured at a zero transmission level point.
>
> dBm0 is a concept used (amongst other areas) in audio/telephony processing since it allows a smooth integration of analog and digital chains. Notably, for
> A-law and μ-law codecs the standards define a sequence which has a 0 dBm0 output.
>
> ......
>
> Note 2: 0 dBm0 is often replaced by or used instead of digital milliwatt or zero transmission level point.
> ----------
>
> Where is my zero level transmission point? Data from a codec is not power being transmitted, only data for reproducing power at some later time. Isn't it all up to the D/A converter at the far end to determine what the power levels yielded by decoding the data will be represented relative to? If it is a hardware SIP phone, maybe the person has the volume cranked up, or turned down. Since the analog representation of the signal starts in the D/A that feeds the handset, what is considered 0? I have no idea. It sounds like this is a scale used for calibrating a D/A that feeds a PSTN circuit.
Look in the G.711 spec. It defines a specific digital signal that is 
0dBm0 for both u-law and A-law.
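
As a rough illustration of that reference signal, the sketch below
decodes what is, to the best of my knowledge, the 8-sample u-law
digital milliwatt sequence from G.711 and measures its RMS level. The
byte values should be checked against the spec before anyone relies on
them. The ulaw_to_linear helper is a common textbook expansion to
16-bit-scaled samples, not FreeSWITCH code, and every name here is made
up for the example.

```python
import math

# Assumed u-law digital milliwatt sequence (one cycle of 1 kHz at 8 kHz).
# Verify against the G.711 tables before relying on these bytes.
ULAW_DIGITAL_MILLIWATT = [0x1E, 0x0B, 0x0B, 0x1E, 0x9E, 0x8B, 0x8B, 0x9E]

def ulaw_to_linear(code):
    """Expand one 8-bit u-law code to a 16-bit-scaled linear sample."""
    code = ~code & 0xFF                        # u-law bytes are stored inverted
    magnitude = ((code & 0x0F) << 3) + 0x84    # mantissa plus bias
    magnitude <<= (code & 0x70) >> 4           # shift by the segment number
    magnitude -= 0x84                          # remove the bias again
    return -magnitude if code & 0x80 else magnitude

def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

samples = [ulaw_to_linear(c) for c in ULAW_DIGITAL_MILLIWATT]
level = rms(samples)
peak = max(abs(s) for s in samples)

print("decoded samples:", samples)
print("RMS: %.1f  peak: %d" % (level, peak))
print("RMS re 16-bit full scale: %.2f dB" % (20 * math.log10(level / 32768)))
```

With these bytes the decoded RMS comes out at roughly 16017, a little
over 6 dB below the largest value a 16-bit sample can hold, so the
0dBm0 reference is a property of the digital signal itself and does not
depend on any D/A converter downstream.
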
> I admit feeling frustrated by this discussion. I've mixed music and mastered CDs for nearly 15 years, and I've always felt that 0 dB in an entirely digital domain is a near universally understood concept. If I have a sample of 16-bit signed LPCM audio, then -32768 or 32767 represent the max amplitude that can be stored, and that is what everyone that I know in pro audio calls 0 dB when speaking strictly about a file, rather than a PA, in/out levels to an analog device like a tape machine, etc. If the gain of the encoded signal is boosted to a point where none of the samples are pushed beyond this range, then none of them clip. I thought that you'd think of dB in the same way, given we were talking about levels in files, but it feels like you're nit-picking or antagonizing me. It sounds like you're telling me that they can clip, even if the gain was never increased to a point where this range would be overflowed, but I have no idea how that is possible. It does not happen when I play such audio through the D/A on a computer's sound card, nor when I store it on a CD and play it back through a stereo.
This is highly inaccurate. Power is usually measured as an RMS value,
and this does not relate directly to the height of the samples. It
relates to the integral of the squared samples over time. As I said
before, the crest factor (basically the ratio between peak
instantaneous power and RMS power) is about 13dB for speech. It's about
the same for a voice singing. It's lower for most musical instruments.
Nonetheless, every musical signal has some peakiness, and 0dBov RMS
power will certainly clip badly. You might be referring to the
measurements on PPM or other forms of metering which try to track the
audio peaks. If those are working well, you can run up to 0dB on their
scale before clipping. The RMS power will be considerably lower,
though. Recording desks use peak power meters specifically because RMS
power doesn't give you much idea about how close to clipping you are.
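
The difference between the two kinds of measurement can be put into a
few lines of Python. This is only a sketch: the function names are
invented for the example, and it assumes 16-bit samples with the
overload point taken as 32768, so that 0dBov corresponds to the RMS of
a full-scale square wave.

```python
import math

FULL_SCALE = 32768.0  # assumed overload point for 16-bit signed samples

def rms_dbov(samples):
    """RMS level relative to the overload point."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square / FULL_SCALE ** 2)

def peak_db(samples):
    """Largest sample magnitude relative to the overload point."""
    return 20 * math.log10(max(abs(s) for s in samples) / FULL_SCALE)

def max_rms_without_clipping(crest_factor_db):
    """Highest RMS level (dBov) at which the peaks only just reach full scale."""
    return -crest_factor_db

# One second of a full-scale 1 kHz sine sampled at 8 kHz.
sine = [FULL_SCALE * math.sin(2 * math.pi * 1000 * n / 8000) for n in range(8000)]
crest = peak_db(sine) - rms_dbov(sine)

print("sine peak:   %.2f dB" % peak_db(sine))
print("sine RMS:    %.2f dBov" % rms_dbov(sine))
print("sine crest:  %.2f dB" % crest)
print("speech (13 dB crest) max RMS: %d dBov" % max_rms_without_clipping(13))
```

Even a pure sine, which is about as un-peaky as audio gets, has an RMS
level 3dB below its peak. A signal with a 13dB crest factor has to sit
at about -13dBov RMS or lower if its peaks are to fit.
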
> I take your point about headroom. Still, I don't feel that the current level matches caller speech. If I felt that the prompts blended well, I wouldn't have even put myself through this thread. I've connected a mix of hardware SIP phones, desktop SIP clients, and iOS SIP clients to a conference, with no modifications to the audio level of the channel, and the level of people speaking to each other in the conference is significantly louder than any prompts that are played through it. Maybe *all* of the clients are pushing audio too strongly. I first thought it was something to do with the conference, but I soon realized that the prompts were quiet everywhere, not just when played in a conference.
Gains in the PSTN are largely controlled in a professional manner. On
the internet it's currently chaos, with many voice signals clipping
horribly. Instead of a well balanced set of gains down the signal chain
you get people pumping up the gain at one point, then massively
attenuating it at another, with no regard for the distortion they are
introducing.
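
A toy example of that kind of unbalanced chain, as a sketch only: the
apply_gain helper and the 12dB figures are made up for illustration,
and the clipping model is a plain hard limit at the 16-bit range.

```python
import math

def apply_gain(samples, gain_db):
    """Scale samples by gain_db and hard-clip to the 16-bit signed range."""
    factor = 10 ** (gain_db / 20)
    return [max(-32768, min(32767, round(s * factor))) for s in samples]

# One cycle of a 1 kHz tone at 8 kHz, peaking about 6 dB below full scale.
tone = [round(16384 * math.sin(2 * math.pi * n / 8)) for n in range(8)]

boosted = apply_gain(tone, 12)        # one stage pumps the gain up ...
restored = apply_gain(boosted, -12)   # ... a later stage pulls it back down

clipped = sum(1 for s in boosted if s in (-32768, 32767))

print("original:", tone)
print("boosted: ", boosted)
print("restored:", restored)
print("samples clipped in the boosted stage:", clipped)
```

The net gain of the chain is 0dB, yet the restored tone is not the
original: the boost flattened the tops of the waveform, and attenuating
afterwards only makes the flattened signal quieter.
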
> I don't expect anything to change due to my personal preference, and I realize that a background in digital audio as applies to music and live recording doesn't mean that I'm not ignorant about many things that involve digital audio as it applies to telephony. I raised the issue here to see if I might be doing something wrong. If not, I wondered if the prompt levels are set based on some sort of standard by people that are wiser regarding the details than me? It seems, though, that there really isn't a standard, and the decision is someone else's personal preference. I'd rather that personal preferences never be a default. Since there isn't an official standard/guideline, I suppose that someone has to make a decision, and that decision will be influenced by their own preferences.
Preferences shouldn't be a factor. This is supposed to be engineered.
> I have a few suggestions to help improve Michael's sox script, but that's where I'll leave this issue. It isn't worth a big argument when I can fix it myself.
>
> I apologize to the list for any smoke, hints of flames, or other frustration that might have leaked through into my posts. The sound level thing matters to me, but it is really a small thing. I enjoy Freeswitch immensely, and really appreciate everyone's efforts in producing and updating it. I've always been fascinated with phones and any type of interactive phone app, and so Asterisk, and now Freeswitch, really spark my imagination. I'm 34, and the study of phones in my early teens was my first conceptual exposure to a large network. I phreaked a bit at the time. I was fortunate (at least in one regard) to live in the US deep south, where digital switching equipment wasn't common, and so all of the old 1970s techniques were still available. That didn't last long, but it piqued my curiosity. I used VXML for a project for an employer in 2002 or so, but didn't really feel any excitement about phones, in the way that I used to, until I ran across Asterisk. Asterisk was great for the time, but I wanted to use it more for apps than a PBX, and so was quite excited to discover the different design of Freeswitch. I rarely become excited about new environments and frameworks anymore, but Freeswitch has put me back into a fun mood of exploration and experimentation.
There is nothing wrong with what you are doing. It's good to see someone
care about gains.

Steve