[Freeswitch-users] Hung Channels (SVN Rev 10231)

Nik Middleton nik.middleton at noblesolutions.co.uk
Thu Mar 5 15:39:45 PST 2009


Well, if it's any consolation, I have a copy of SVN that is about 4 days
old and I have around 200 of these hung calls, though after an hour or so
they did seem to clear.

That said, FS made 138,330 call attempts today, not too shabby, and
throughout, the call quality was as good as on the first one.  Not sure
how to debug this one.

Version: FreeSWITCH Version 1.0.trunk (12276)

-----Original Message-----
From: freeswitch-users-bounces at lists.freeswitch.org
[mailto:freeswitch-users-bounces at lists.freeswitch.org] On Behalf Of Eric
Liedtke
Sent: 05 March 2009 23:23
To: freeswitch-users at lists.freeswitch.org
Subject: Re: [Freeswitch-users] Hung Channels (SVN Rev 10231)

Yup, as I mentioned to Brian, I didn't want to clog JIRA with a bug that's
been fixed, or report against a rev that is 2k+ revs behind. I was trying
to work through it as a learning exercise. And yeah, I actually added a
bunch of stuff to the list_sessions function to spit out a variety of
associated variables for each session, looking for a pattern somewhere to
clue me in to what might be happening.

No proxy or bypass media here, just defaults.

I will keep at it, and once we update the production systems, if the
problem persists I will open a bug in JIRA with all the necessary
goodies.

Thanks
-e

It seems fuzzy now but I think on Thu, Mar 05, 2009 at 05:55:33PM -0500,
Mathieu Rene said:
> Hi,
> 
> If you suspect a bug, the place to report it is JIRA. See:
> http://wiki.freeswitch.org/wiki/Reporting_Bugs
> This gives the whole team a way of following up on issues.
> 
> Also, can you upgrade to svn trunk? A lot of fixes get committed
> daily, so it's good to stay up to date.
> 
> As you seem familiar with GDB, you may symlink the .gdbinit file in  
> the support-d/ folder to your home directory.
> This will give you some FS-specific macros such as "list_sessions"  
> which will dump a list of uuids to session pointers.
> 
> In your JIRA report, make sure you include the output of
> "thread apply all bt", "list_sessions" and "show channels" (this one
> is run in FS, not GDB), but PLEASE update to svn trunk and test again
> to see if it still happens.
> 
> Also, are you using proxy/bypass media or just the default?
> 
> Math
> 
> On 5-Mar-09, at 5:38 PM, Eric Liedtke wrote:
> 
> > Greetings,
> >
> > I've been using FS in production on this rev (I realize it's pretty far
> > behind current) and it's been running well, save 1 issue.
> >
> > The basic setup is an SBC with 2 GigE ports, 1 public and 1 private. I have
> > 2 SIP profiles created, 1 per IP interface. This is being used to
> > terminate traffic to a provider, so calls go in only 1 direction. They come
> > into the private-side profile, get routed via dialplan to the gateway
> > defined in the external profile and on to the vendor. Pretty simple.
> >
> > I have noticed that under load (50 or so cps with ~800-900 bridged calls up)
> > some channels on the public side seem to get "stuck" over time. Due to
> > the nature of how this is being used, I would expect both SIP profiles to show
> > the same number of channels in use any time I do a 'sofia status' (or at least
> > be within a channel or 2 of each other). However, after a day of heavy use I had
> > a disparity of ~250 channels. These extra channels also seem to put some
> > continual load on the 'system cpu', as reported via top.
> >
> > Of course, due to the load on the box I have to keep logging turned way
> > down, so I've been trying to troubleshoot it as best I can.
> >
> > Last night I grabbed a core file and started in with GDB today. I found
> > the 120 or so threads that represented real active calls when I took the
> > core file. I also found ~250 threads that appeared to be stuck in the
> > CS_NEW state. The backtraces on all of them look the same, annotated below.
> >
> > I walked through the code path by hand, based on the bt's, and I don't see how
> > this could be happening unless it's a locking issue. But as far as I can tell,
> > each session has its own mutex defined in the switch_core_session_t struct,
> > so I wouldn't think they would be stepping on each other. I also would have
> > expected that if it were something of a deadlock nature, it would stop
> > processing calls altogether.
> >
> > I grabbed the commands from the .gdbinit (super handy btw!!) and have been
> > trolling through the variables to try to ascertain something about why these
> > threads seem to be stuck, but am not having much luck even coming up with a
> > scenario to try to replicate the issue.
> >
> > If anyone has any pointers as to where I might look next, it would be greatly
> > appreciated.
> >
> > We will be updating to the newest release soon; however, I was hoping to nail
> > down what is going on so I can systematically replicate it and verify by
> > testing in the lab that it is fixed, rather than just pushing the new release
> > to production and hoping.
> >
> > Thanks in advance for any tips/pointers anyone may have.
> >
> > -e
> >
> > ......bt and bt full for a single "hung" thread
> >
> >
> > #0  0xb7fd5410 in __kernel_vsyscall ()
> > #1  0xb7d14cb6 in nanosleep () from /lib/tls/i686/cmov/libc.so.6
> > #2  0xb7d4f1dc in usleep () from /lib/tls/i686/cmov/libc.so.6
> > #3  0xb7ee02cd in switch_sleep (t=1000) at src/switch_time.c:143
> > #4  0xb7e9da03 in switch_core_session_run (session=0x95fe270) at src/switch_core_state_machine.c:462
> > #5  0xb7e9c765 in switch_core_session_thread (thread=0x9ada840, obj=0x95fe270) at src/switch_core_session.c:853
> > #6  0xb7efd916 in dummy_worker (opaque=0x9ada840) at threadproc/unix/thread.c:138
> > #7  0xb7e034fb in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
> > #8  0xb7d55e5e in clone () from /lib/tls/i686/cmov/libc.so.6
> > (gdb) bt full
> > #0  0xb7fd5410 in __kernel_vsyscall ()
> > No symbol table info available.
> > #1  0xb7d14cb6 in nanosleep () from /lib/tls/i686/cmov/libc.so.6
> > No symbol table info available.
> > #2  0xb7d4f1dc in usleep () from /lib/tls/i686/cmov/libc.so.6
> > No symbol table info available.
> > #3  0xb7ee02cd in switch_sleep (t=1000) at src/switch_time.c:143
> > No locals.
> > #4  0xb7e9da03 in switch_core_session_run (session=0x95fe270) at src/switch_core_state_machine.c:462
> >        exception = 0 '\0'
> >        state = <value optimized out>
> >        endstate = CS_NEW
> >        endpoint_interface = <value optimized out>
> >        driver_state_handler = (const switch_state_handler_table_t *) 0xb73b1720
> >        application_state_handler = <value optimized out>
> >        thread_id = 3085554955
> >        env = {{__jmpbuf = {134603552, -1428248680, -1461722504, 9184, -1210273432, -1210014020}, __mask_was_saved = -1210034895, __saved_mask = {__val = {0, 3084988404, 3084937740, 3086469280, 9184, 1, 2976641592, 2833244792, 3086590960,
> >        168036728, 3084937740, 2833244808, 3085923728, 1, 3086590960, 2833244840, 3086590960, 0, 134564192, 2833244840, 3085923728, 134564244, 3086590960, 2833244872, 3085887870, 134564240, 168036728, 3085458203, 3086590960, 2976606624,
> >        134564192, 2833244904}}}}
> >        sig = <value optimized out>
> >        __func__ = "switch_core_session_run"
> >        __PRETTY_FUNCTION__ = "switch_core_session_run"
> > #5  0xb7e9c765 in switch_core_session_thread (thread=0x9ada840, obj=0x95fe270) at src/switch_core_session.c:853
> >        session = (switch_core_session_t *) 0x95fe270
> >        event = <value optimized out>
> >        event_str = 0x0
> >        val = <value optimized out>
> >        __func__ = "switch_core_session_thread"
> >        __PRETTY_FUNCTION__ = "switch_core_session_thread"
> > #6  0xb7efd916 in dummy_worker (opaque=0x9ada840) at threadproc/unix/thread.c:138
> > No locals.
> > #7  0xb7e034fb in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
> > No symbol table info available.
> > #8  0xb7d55e5e in clone () from /lib/tls/i686/cmov/libc.so.6
> >
> >
> > _______________________________________________
> > Freeswitch-users mailing list
> > Freeswitch-users at lists.freeswitch.org
> > http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
> >
> > UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
> > http://www.freeswitch.org



