[Freeswitch-users] High cps load causes weird cpu and memory starvation. Need suggestions on how to debug.

bratner bratner ratner2 at gmail.com
Sun Mar 10 01:40:56 MSK 2013


List, Steve

I will clarify what I'm asking here before I take Anthony's suggestion and
join a "computer tuning" club as a way to "move forward".
http://media.bestofmicro.com/gerbilpc-tuning-pc,S-L-252453-13.jpg

What is there to read on this subject? Links, textbook names - everything
is appreciated.
What are the tools that show useful data, and what can I do with FS to make
the work easier? Compile with some flags to get more info on running
threads?

Thanks,
Boris Ratner.

On Sat, Mar 9, 2013 at 12:51 AM, Steven Ayre <steveayre at gmail.com> wrote:

>> After stopping the load FS still hogs 22.1% of memory.
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 15995 freeswit  -2 -10 4677m 873m 5028 S    0 22.1 755:28.65 freeswitch
>>
>
>
> Until you test with the version you're building from master I would ignore
> the memory usage since you're running a version with known memory leaks.
>
> -Steve
>
>
>
>
> On 8 March 2013 18:15, bratner bratner <ratner2 at gmail.com> wrote:
> > Here is the sipp output and additional numbers for a test I ran with the
> > -nosql param.
> >
> > The test ran at 180 CPS for ~3500 seconds and at 210 CPS for the rest.
> >
> > Trouble (as in higher system CPU%) started to appear around 8591 seconds
> > into the test.
> > As you can see below, the problem started just before 9124 sec into the
> > test: 210 CPS with 5 sec calls should not give you a lot more than 1050
> > concurrent calls.
> >
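The 1050 figure is Little's law: calls in progress at steady state equal the arrival rate times the hold time. A minimal sketch of that arithmetic, assuming a constant call rate and calls that last exactly the configured 5 seconds (the function name is illustrative and not part of sipp or FreeSWITCH):

```python
def expected_concurrent_calls(calls_per_second: float, call_length_seconds: float) -> float:
    """Little's law: average calls in progress = arrival rate * hold time."""
    return calls_per_second * call_length_seconds


# The two load levels used in the test, with 5-second calls.
for cps in (180, 210):
    print(cps, "cps ->", expected_concurrent_calls(cps, 5), "concurrent calls")
```

If the observed number of open calls climbs well past this estimate, one plausible reading is that calls are taking longer than 5 seconds to complete, not that more calls are being offered.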
> > ------------------------------ Scenario Screen -------- [1-9]: Change Screen --
> >   Call-rate(length)   Port   Total-time  Total-calls  Remote-host
> > 210.0(5000 ms)/1.000s   5061    9157.32 s      1834024  192.96.201.164:5060(UDP)
> >
> >   0 new calls during 0.000 s period      0 ms scheduler resolution
> >   0 calls (limit 2000)                   Peak was 2000 calls, after 9124 s
> >   0 Running, 4640 Paused, 0 Woken up
> >   20 dead call msg (discarded)           0 out-of-call msg (discarded)
> >   1 open sockets
> >
> >                                  Messages  Retrans   Timeout   Unexpected-Msg
> >       INVITE ---------->         1834024   74        0
> >          100 <----------         1834024   0         0         0
> >          180 <----------         1834024   0         0         0
> >          183 <----------         0         0         0         0
> >          500 <----------         0         0         0         0
> >          502 <----------         0         0         0         0
> >          503 <----------         0         0         0         0
> >          408 <----------         0         0         0         0
> >          480 <----------         0         0         0         0
> >          200 <----------  E-RTD1 1834024   81        0         0
> >
> >          ACK ---------->         1834024   81
> >        Pause [   5000ms]         1834024                       0
> >          BYE ---------->         1834024   7646      0
> >          503 <----------         0         0         0         0
> >          200 <----------         1834024   0         0         0
> >
> > ------------------------------ Test Terminated --------------------------------
> >
> >
> > ----------------------------- Statistics Screen ------- [1-9]: Change Screen --
> >   Start Time             | 2013-03-08    15:22:18:204    1362756138.204833
> >   Last Reset Time        | 2013-03-08    17:54:55:535    1362765295.535214
> >   Current Time           | 2013-03-08    17:54:55:535    1362765295.535437
> > -------------------------+---------------------------+--------------------------
> >   Counter Name           | Periodic value            | Cumulative value
> > -------------------------+---------------------------+--------------------------
> >   Elapsed Time           | 00:00:00:000              | 02:32:37:330
> >   Call Rate              |    0.000 cps              |  200.279 cps
> > -------------------------+---------------------------+--------------------------
> >   Incoming call created  |        0                  |        0
> >   OutGoing call created  |        0                  |  1834024
> >   Total Call created     |                           |  1834024
> >   Current Call           |        0                  |
> > -------------------------+---------------------------+--------------------------
> >   Successful call        |        0                  |  1834024
> >   Failed call            |        0                  |        0
> > -------------------------+---------------------------+--------------------------
> >   Response Time 1        | 00:00:00:000              | 00:00:00:149
> >   Call Length            | 00:00:00:000              | 00:00:05:158
> > ------------------------------ Test Terminated --------------------------------
> >
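The cumulative call rate can be cross-checked against the other counters: total calls created divided by elapsed time. A small sketch, assuming the elapsed-time field is HH:MM:SS:mmm with milliseconds last (which matches the 9157.32 s total time on the scenario screen); the helper names are made up for illustration:

```python
def elapsed_seconds(hhmmssmmm: str) -> float:
    """Convert an HH:MM:SS:mmm string (assumed sipp format) to seconds."""
    hours, minutes, seconds, millis = (int(part) for part in hhmmssmmm.split(":"))
    return hours * 3600 + minutes * 60 + seconds + millis / 1000.0


def average_cps(total_calls: int, elapsed: str) -> float:
    """Average call rate over the whole run."""
    return total_calls / elapsed_seconds(elapsed)


# Values taken from the statistics screen.
rate = average_cps(1834024, "02:32:37:330")
print(f"{rate:.3f} cps")
```

An average near 200 cps is what one would expect from a run that spent part of its time at 180 cps and the rest at 210 cps.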
> >
> > After stopping the load FS still hogs 22.1% of memory.
> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > 15995 freeswit  -2 -10 4677m 873m 5028 S    0 22.1 755:28.65 freeswitch
> >
> >
> > The symptoms of the crash are the same; it just happens at a higher CPS
> > now and takes more time (more calls) before crashing.
> >
> > I will appreciate any suggestion.
> >
> > Regards,
> > Boris Ratner.
> >
> >
> >
> > On Fri, Mar 8, 2013 at 6:22 PM, bratner bratner <ratner2 at gmail.com> wrote:
> >>
> >> The original test was done on git master as of the date mentioned. The
> >> sqlite core.db file was on /run/shm, which is a tmpfs on Ubuntu 12.04.
> >> I will recompile from git master and test with -nosql.
> >>
> >> Testing my existing setup with -nosql seems more stable: it has now been
> >> running at 210 CPS for some time (500k calls already passed) with ~35%
> >> idle CPU. But free memory is slowly going down. I will let it run until
> >> the kernel kills it to see how many calls it can handle.
> >>
> >> During my tests I did not run FS with RT priority, but according to htop
> >> some of the threads are scheduled as RT.
> >> My setup is doing bypass-media, so FS handles only call establishment
> >> and teardown on both legs.
> >>
> >> cat /proc/<FS pid>/status
> >>
> >> Name:   freeswitch
> >> State:  S (sleeping)
> >> Tgid:   15995
> >> Pid:    15995
> >> PPid:   1
> >> TracerPid:      0
> >> Uid:    999     999     999     999
> >> Gid:    999     999     999     999
> >> FDSize: 64
> >> Groups:
> >> VmPeak:  5002808 kB
> >> VmSize:  5002088 kB
> >> VmLck:         0 kB
> >> VmPin:         0 kB
> >> VmHWM:    625900 kB
> >> VmRSS:    624156 kB  <-- this is going up
> >> VmData:  4855788 kB
> >> VmStk:       136 kB
> >> VmExe:        20 kB
> >> VmLib:     18288 kB
> >> VmPTE:      2352 kB
> >> VmSwap:        0 kB
> >> Threads:        1866
> >> SigQ:   0/18446744073709551615
> >> SigPnd: 0000000000000000
> >> ShdPnd: 0000000000000000
> >> SigBlk: 0000000000000000
> >> SigIgn: 0000000010003006
> >> SigCgt: 0000000180014209
> >> CapInh: 0000000000000000
> >> CapPrm: 0000000000000000
> >> CapEff: 0000000000000000
> >> CapBnd: ffffffffffffffff
> >> Cpus_allowed:   ffffff
> >> Cpus_allowed_list:      0-23
> >> Mems_allowed:   00000000,00000003
> >> Mems_allowed_list:      0-1
> >> voluntary_ctxt_switches:        1803
> >> nonvoluntary_ctxt_switches:     23
> >>
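To track VmRSS and Threads over time instead of reading the dump by hand, the status text can be parsed. A minimal sketch, assuming the Linux /proc/<pid>/status layout shown above; the helper names are hypothetical and the sample holds only three of the fields:

```python
def parse_proc_status(text: str) -> dict:
    """Parse the 'Key:   value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def rss_kb_and_threads(text: str) -> tuple:
    """Return (resident set size in kB, thread count)."""
    fields = parse_proc_status(text)
    rss_kb = int(fields["VmRSS"].split()[0])  # "624156 kB" -> 624156
    threads = int(fields["Threads"])
    return rss_kb, threads


# Three lines copied from the dump above.
sample = """Name:   freeswitch
VmRSS:    624156 kB
Threads:        1866
"""
print(rss_kb_and_threads(sample))
```

On a live system the same function could be fed the contents of /proc/<FS pid>/status at a fixed interval, which would show whether resident memory keeps growing while the thread count stays flat.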
> >>
> >> output of 'top -H' at 180CPS
> >>
> >>
> >> top - 15:27:00 up 2 days,  5:32,  5 users,  load average: 8.19, 91.07, 65.03
> >> Tasks: 2066 total,   3 running, 2063 sleeping,   0 stopped,   0 zombie
> >> Cpu(s): 50.1%us,  3.9%sy,  0.0%ni, 45.9%id,  0.0%wa,  0.0%hi,  0.2%si,  0.0%st
> >> Mem:   4038512k total,  2282260k used,  1756252k free,   114112k buffers
> >> Swap:        0k total,        0k used,        0k free,  1165868k cached
> >>
> >>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >> 16000 freeswit  RT -10 4885m 594m 4964 R   69 15.1   3:10.26 freeswitch
> >> 16009 freeswit  RT -10 4885m 594m 4964 S   33 15.1   1:26.20 freeswitch
> >> 16008 freeswit  RT -10 4885m 594m 4964 S   28 15.1   1:17.30 freeswitch
> >> 16007 freeswit  RT -10 4885m 594m 4964 S    4 15.1   0:10.80 freeswitch
> >> 16004 freeswit  RT -10 4885m 594m 4964 S    2 15.1   0:06.63 freeswitch
> >> 19171 root      20   0 18988 2948  944 R    2  0.1   0:00.64 top
> >> 18735 freeswit  -2 -10 4885m 594m 4964 S    1 15.1   0:00.29 freeswitch
> >> 16003 freeswit  -2 -10 4885m 594m 4964 S    1 15.1   0:01.61 freeswitch
> >> 16690 freeswit  -2 -10 4885m 594m 4964 S    1 15.1   0:00.42 freeswitch
> >> 16730 freeswit  -2 -10 4885m 594m 4964 S    1 15.1   0:00.42 freeswitch
> >> 16750 freeswit  -2 -10 4885m 594m 4964 S    1 15.1   0:00.45 freeswitch
> >> 16764 freeswit  -2 -10 4885m 594m 4964 S    1 15.1   0:00.44 freeswitch
> >> <more of the above>
> >> ....
> >> ....
> >>
> >>
> >> Thanks to all of you ,
> >> Boris Ratner.
> >>
> >> On Fri, Mar 8, 2013 at 4:22 AM, Dmitry Lysenko <dvl36.ripe.nick at gmail.com> wrote:
> >>>
> >>> I can't reproduce such a CPS load on my ARMv5TE system. )
> >>> bratner, please give us 'top -H'. I guess freeswitch is running at
> >>> realtime priority.
> >>>
> >>>
> >>> 2013/3/8 Ken Rice <krice at freeswitch.org>
> >>>>
> >>>> Sqlite is probably getting hammered... Trust me... Mount the fs db dir
> >>>> as tmpfs or use the -nosql flag when starting freeswitch
> >>>>
> >>>> I routinely run dialer traffic at much higher CPS than that
> >>>>
> >>>>
> >>>>
> >>>> On 3/7/13 7:58 PM, "Dmitry Lysenko" <dvl36.ripe.nick at gmail.com> wrote:
> >>>>
> >>>> The bi, bo and wa fields are low, so it seems it is not the disk subsystem.
> >>>>
> >>>>
> >>>> 2013/3/8 Ken Rice <krice at freeswitch.org>
> >>>>
> >>>> You are probably hammering the disk subsystem... Keep in mind that FS
> >>>> uses multiple sqlite databases by default... Mount the fs db dir as
> >>>> tmpfs and try again
> >>>>
> >>>>
> >>>>
> >>>> On 3/7/13 7:35 PM, "Dmitry Lysenko" <dvl36.ripe.nick at gmail.com> wrote:
> >>>>
> >>>> Hm... But what about the huge number of interrupts and context switches?
> >>>>
> >>>>
> >>>> ________________________________
> >>>>
> >>>>
> _________________________________________________________________________
> >>>> Professional FreeSWITCH Consulting Services:
> >>>> consulting at freeswitch.org
> >>>> http://www.freeswitchsolutions.com
> >>>>
> >>>> 
> >>>> 
> >>>>
> >>>> Official FreeSWITCH Sites
> >>>> http://www.freeswitch.org
> >>>> http://wiki.freeswitch.org
> >>>> http://www.cluecon.com
> >>>>
> >>>> FreeSWITCH-users mailing list
> >>>> FreeSWITCH-users at lists.freeswitch.org
> >>>> http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
> >>>> UNSUBSCRIBE:
> http://lists.freeswitch.org/mailman/options/freeswitch-users
> >>>> http://www.freeswitch.org
> >>>>
> >>>>
> >>>> --
> >>>> Ken
> >>>> http://www.FreeSWITCH.org
> >>>> http://www.ClueCon.com
> >>>> http://www.OSTAG.org
> >>>> irc.freenode.net #freeswitch
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >
> >
>