[Freeswitch-users] Detecting the origin of voice activity using VAD

Andy Spitzer woof at nortel.com
Mon Mar 2 09:16:33 PST 2009


Woof!

On Sun, 01 Mar 2009 21:28:18 -0500, Brian West <brian at freeswitch.org> wrote:

> NO.  You want something that people THINK exists and works well...
> Reliable human/voice detection doesn't exist in ANY form.

I beg to differ.  See http://www.freepatentsonline.com/5521967.html for one way to do it.  It works rather well and can quickly discriminate between voice and tone.  I've no idea who owns that patent now (not me, for sure).

There is a simpler, less reliable way of differentiating voice from tone that, as far as I know, isn't patented.  If you compare the RMS power levels of sequential 40 ms periods, call progress tones will have very consistent power levels from period to period.  So if 5 or more consecutive 40 ms periods have about the same power measurement (within, say, 2%), it's a tone.  Voice will have dramatic power level differences over that same span.  This works very well in today's telephony environment, where tones are computer generated.  In the old days, when ringback tone was generated off the audio hum from the 20 Hz ring voltage generator...not so well.
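
As a rough illustration of the power-comparison idea above, here is a minimal Python sketch.  It is not taken from FreeSWITCH or from the patent.  The function names, the 8 kHz sample rate, the silence floor, and the choice to compare each period against the first period of the current run are all assumptions made for the example; only the 40 ms period, the 5-period count, and the 2% tolerance come from the description.

```python
import math


def rms(samples):
    """Root-mean-square level of one block of PCM samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def looks_like_tone(samples, sample_rate=8000, period_ms=40,
                    min_periods=5, tolerance=0.02, silence_floor=1.0):
    """Return True if min_periods consecutive periods have RMS levels
    within `tolerance` (as a fraction) of the first period in the run.

    silence_floor is an assumed guard: without it, digital silence
    (all zeros) would look perfectly "consistent" and count as a tone.
    """
    block = sample_rate * period_ms // 1000   # samples per period
    ref = None                                # RMS of first period in run
    run = 0                                   # periods in current run
    for start in range(0, len(samples) - block + 1, block):
        level = rms(samples[start:start + block])
        if level < silence_floor:
            ref, run = None, 0                # silence breaks any run
            continue
        if ref is not None and abs(level - ref) <= tolerance * ref:
            run += 1                          # still consistent
        else:
            ref, run = level, 1               # start a new run here
        if run >= min_periods:
            return True
    return False


def sine(freq_hz, amplitude, duration_ms, sample_rate=8000):
    """Hypothetical helper: generate a sine wave as a list of floats."""
    count = sample_rate * duration_ms // 1000
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(count)]


if __name__ == "__main__":
    steady = sine(440, 1000, 400)             # ten 40 ms periods, one level
    bursty = []                               # level changes every period
    for amp in (1000, 3000, 500, 4000, 1500, 2500, 800, 3500):
        bursty += sine(440, amp, 40)
    print(looks_like_tone(steady))
    print(looks_like_tone(bursty))
```

The steady sine stands in for a computer-generated call progress tone, and the burst sequence is only a crude stand-in for the level swings of speech.  Real audio would need the tolerance and period count tuned against actual recordings.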

--Woof!



