Sat May 2 10:13:00 PDT 2009

 In my comments on mod_pocketsphinx, I wasn't clear enough about it being "virtually unusable in real world scenarios." Also, the grammars I'm talking
about are either single words, like "yes/no" or more complex like
"leave a message." It doesn't matter how complex the grammar, the issue

My comments are meant in comparison to other ASRs and in everyday situations of background noise. I'm not talking about checking things out at a concert, race track, subway, construction project, etc.

When compared to my AT&T 411 service, AT&T's ASR has nowhere near the problems in dealing with the background noises I'm talking about and is very usable in the real world situations I'm talking about.
Moreover, when I compare against another vendor's ASR on my hardware, that vendor's ASR also has nowhere near the problems mod_pocketsphinx has and again is very usable in a real world situation. That's why I suggested using something other than mod_pocketsphinx.

I think that mod_pocketsphinx cannot deal with low signal-to-noise
ratios well enough to be used in telephony at all. At least that's the way it seems to me.
I don't know what else to say. That's been my experience with mod_pocketsphinx.


Hi Moiz,

I've been checking out mod_pocketsphinx against other ASRs on Windows with the same hardware.
No matter what settings one tries, mod_pocketsphinx is virtually unusable in real world scenarios.

I have used it and it works fine... I think your expectations are a bit high for it... Complex things like dictation are not what PocketSphinx is for. You should try Linux because I know it works great there.

One can play around with mod_pocketsphinx settings so that it picks voice up well, but then there had better not be any background noise, either from a bad connection or just everyday sounds.

There is no other ASR out there that doesn't get pissed off at background noise or any noise for that matter... have you called AT&T and Sprint lately? My dogs barking in the background really send theirs into fits and they paid tons of money for it.

It's just way too sensitive, and of course you'll notice this problem most with cell phones.

Same with commercial ASR. Granted, the acoustic model for PocketSphinx wasn't built from any files recorded from cell phones, from what I can tell. You can adapt the acoustic model as per the Sphinx wiki to make it more accurate for your needs... that takes time and effort but it works.

If you adjust the settings to try blocking out background noise, you'll find you don't succeed all that well, and then there are problems picking up the caller's voice.

Those settings are for telling when the person stopped talking... nothing more.
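
To illustrate what "telling when the person stopped talking" means, here is a minimal sketch of energy-based end-of-speech detection. It is not mod_pocketsphinx's actual code; the function name and the parameters `threshold` and `silence_hits` are assumptions made up for this example. It also shows why steady background noise above the threshold means the end of speech is never detected.

```python
def find_end_of_speech(frame_energies, threshold, silence_hits):
    """Return the index of the frame where speech is judged to have ended.

    frame_energies: per-frame audio energy values (hypothetical units).
    threshold:      energy at or above this counts as "voice".
    silence_hits:   consecutive quiet frames needed after voice was heard.
    Returns None if no end of speech is found.
    """
    heard_speech = False
    quiet_run = 0
    for i, energy in enumerate(frame_energies):
        if energy >= threshold:
            # Loud frame: treat it as voice and restart the quiet counter.
            heard_speech = True
            quiet_run = 0
        elif heard_speech:
            # Quiet frame after voice: count towards end of speech.
            quiet_run += 1
            if quiet_run >= silence_hits:
                return i
    return None


# Clean line: voice in frames 2-4, then quiet.
clean = [10, 12, 900, 950, 800, 15, 11, 9, 14]
# Noisy line: background noise stays above the threshold after the voice stops.
noisy = [10, 12, 900, 950, 800, 500, 450, 480, 520]

print(find_end_of_speech(clean, threshold=400, silence_hits=3))
print(find_end_of_speech(noisy, threshold=400, silence_hits=3))
```

Raising the threshold to get past the noise is exactly the trade-off discussed here: set it too high and a quiet caller never registers as voice at all.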

It looks like some kind of signal pre-processing is required that isn't in place yet, but we all know that this is a work in progress.

I'm not working on it... I run the pizza demo with PS and it works rather well from my Polycom. I would say it gets some things wrong, but it does score them low so you can verify it in your scripts.
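
As a rough sketch of what verifying a low score in a script could look like: the function name, the 0 to 100 score range, and the cutoff of 70 below are all assumptions for illustration, not values taken from mod_pocketsphinx. The idea is only that a low-scoring result gets confirmed with the caller instead of being acted on directly.

```python
def handle_result(text, score, min_score=70):
    """Decide what to do with a recognition result.

    text:      the recognized phrase (may be empty).
    score:     recognizer confidence, assumed here to run from 0 to 100.
    min_score: hypothetical cutoff below which the caller is asked to confirm.
    """
    if text and score >= min_score:
        # Confident enough: act on the result directly.
        return ("accept", text)
    # Empty or low-scoring result: ask the caller to confirm or repeat.
    return ("confirm", text)


print(handle_result("pepperoni", 92))
print(handle_result("pepperoni", 31))
```

The right cutoff depends on the grammar and the line quality, so it would need tuning against real calls.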

I don't know if ESL would make any difference. To use FS and an ASR/TTS I just bridge calls to another ASR application for now.



