<HTML>
<HEAD>
<TITLE>Re: [Freeswitch-users] High cps load causes weird cpu and memory starvation. Need suggestions on how to debug.</TITLE>
</HEAD>
<BODY>
<FONT FACE="Monaco, Courier New"><SPAN STYLE='font-size:11pt'>You are probably hammering the disk subsystem... Keep in mind that FS uses multiple sqlite databases by default... Mount the fs db dir as tmpfs and try again<BR>
<BR>
<BR>
On 3/7/13 7:35 PM, "Dmitry Lysenko" <<a href="dvl36.ripe.nick@gmail.com">dvl36.ripe.nick@gmail.com</a>> wrote:<BR>
<BR>
</SPAN></FONT><BLOCKQUOTE><FONT FACE="Monaco, Courier New"><SPAN STYLE='font-size:11pt'>Hm... But what about huge interrupt and context switching number?<BR>
<BR>
<BR>
2013/3/7 Steven Ayre <<a href="steveayre@gmail.com">steveayre@gmail.com</a>><BR>
</SPAN></FONT><BLOCKQUOTE><FONT FACE="Monaco, Courier New"><SPAN STYLE='font-size:11pt'>Upgrade to the latest version.<BR>
<BR>
There are some memory leaks that have been found and fixed in 1.2.7.<BR>
<BR>
-Steve<BR>
<BR>
<BR>
<BR>
<BR>
On 7 March 2013 11:09, bratner bratner <<a href="ratner2@gmail.com">ratner2@gmail.com</a>> wrote:<BR>
</SPAN></FONT><BLOCKQUOTE><FONT FACE="Monaco, Courier New"><SPAN STYLE='font-size:11pt'>Hi!<BR>
<BR>
Please assist me to find a way to get some insight into this problem and file a proper bug report. Looks like 2 different problems, one is load related (like a race of some sort) the other is a mem leak. <BR>
<BR>
<BR>
I'm running FS debs compiled on ubuntu 12.04 (-n20130227T114536Z-1~presice+<BR>
1git).<BR>
The machine is ubuntu 12.04 server , CPU: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz, 4GB RAM and no swap. 3.1G RAM free after FS is started. <BR>
I disabled HT in BIOS to get proper CPU load readings from top and vmstat (no HT also allows for higher cps for some reason...).<BR>
<BR>
The FS setup has 2 SIP profiles , internal/external (functioning as incoming/outgoing ), XML Dialplan that runs a lua script. <BR>
Calls come in to profile 'internal', authed by ip acl , go to the dialplan and sent through profile 'external' to a uri string generated by the lua script. <BR>
<BR>
I bombard the machine with high call rate of short calls ,120-180cps , 5sec.<BR>
Everything stays fine for some time , the higher the load the faster the problems appear. <BR>
When i pressure the machine with 150cps and up , it works fine at first with 20%-30% idle CPU.<BR>
After some time passes (10-30minutes) system cpu usage jumps from 3%-7% to around 50%, CPU is 95% busy and vmstat reports hundereds and sometimes thousands of runnable tasks. <BR>
If i stop sending calls at this moment, wait for the CPU load to wind down and start calling again then the CPU load saturates right away. Even if i lower the CPS rate. <BR>
restarting freeswitch at this point brings the system to the original state and it will take some time to get saturated again. <BR>
<BR>
Another symptom is that during the test the available free mem is slowly allocated to freeswitch until it reaches 13.1Gb virtual image and then the kernel kills it.<BR>
<BR>
Here are some numbers:<BR>
<BR>
Running 120 calls per second , 5second calls:<BR>
<BR>
Starts like this: This is very good from my POV. <BR>
<BR>
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----<BR>
r b swpd free buff cache si so bi bo in cs us sy id wa<BR>
5 0 0 3255876 35444 96580 0 0 9 2 66 90 1 1 98 0<BR>
4 0 0 3253080 35444 96652 0 0 0 0 16048 20727 36 3 61 0<BR>
4 0 0 3247780 35444 96684 0 0 0 0 16521 21048 36 4 60 0<BR>
4 0 0 3242860 35444 96784 0 0 0 0 15695 19825 36 3 62 0<BR>
1 0 0 3237980 35452 96768 0 0 0 24 16072 19851 36 3 60 0<BR>
4 0 0 3233692 35452 96812 0 0 0 0 16299 20980 35 4 61 0<BR>
8 0 0 3230280 35452 96928 0 0 0 0 15034 18965 36 3 61 0<BR>
2 0 0 3227764 35452 96896 0 0 0 0 13956 20621 37 3 60 0<BR>
<BR>
After some (long) time it started having trouble and eat lots of free ram (note that the idle cpu% and system cpu% jumped up but there are still over 1G free mem available) :<BR>
<BR>
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----<BR>
r b swpd free buff cache si so bi bo in cs us sy id wa<BR>
22 0 0 1108588 100304 2034628 0 0 0 0 145350 183629 37 53 10 0<BR>
5 0 0 1108224 100304 2034668 0 0 0 0 151937 167249 38 57 5 0<BR>
11 0 0 1108268 100304 2034904 0 0 0 0 149931 176143 37 58 5 0<BR>
7 0 0 1108244 100312 2034752 0 0 0 32 138612 183583 40 47 14 0<BR>
3 0 0 1108056 100312 2034792 0 0 0 0 111446 162733 42 29 29 0<BR>
4 0 0 1108064 100312 2034836 0 0 0 0 112568 164187 41 30 29 0<BR>
14 0 0 1108040 100312 2034880 0 0 0 0 133372 174676 39 45 17 0<BR>
10 0 0 1107468 100312 2034920 0 0 0 0 157371 184888 37 58 5 0<BR>
<BR>
Short time after that is got into really serious trouble (3000 runnable tasks!? heading toward memory exhaustion):<BR>
<BR>
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----<BR>
r b swpd free buff cache si so bi bo in cs us sy id wa<BR>
<BR>
3336 0 0 146580 9248 1457016 0 0 0 0 306642 291819 28 67 5 0<BR>
1180 0 0 145060 9220 1430212 0 0 248 848 294654 285273 27 68 5 0<BR>
1028 0 0 148616 8996 1366844 0 0 0 0 421614 385342 32 63 4 0<BR>
1026 0 0 150964 8880 1298832 0 0 0 28 462825 416480 31 65 4 0<BR>
1338 0 0 146260 8688 1242440 0 0 0 0 442247 391139 27 69 5 0<BR>
1945 0 0 143148 8680 1197180 0 0 0 0 364361 329958 24 71 5 0<BR>
1381 0 0 154668 8596 1137912 0 0 0 0 359982 323543 23 72 5 0<BR>
<BR>
<BR>
After UAC completed transactions 5198242/ UAS completed transactions 5512619 calls FS crashed and it looked like this:<BR>
<BR>
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----<BR>
r b swpd free buff cache si so bi bo in cs us sy id wa<BR>
<BR>
3 2 0 92972 568 20492 0 0 156 8 117314 130220 7 40 38 15 <-- back in the day i would kill for 92Mb free mem.<BR>
10 5 0 3473328 208 9848 0 0 24344 48 104557 89011 5 62 15 17 <- Now the kernel kills FS for taking the rest of it.<BR>
8 0 0 3482436 340 14408 0 0 3208 0 2582 2046 0 95 4 1<BR>
9 0 0 3504792 348 14344 0 0 0 48 2312 1836 0 93 7 0<BR>
12 0 0 3532184 348 14348 0 0 0 0 2387 1984 0 94 6 0<BR>
6 0 0 3561792 348 14876 0 0 528 0 2523 2178 0 93 6 0<BR>
11 0 0 3592816 348 14876 0 0 0 0 2684 2248 0 95 5 0<BR>
3 0 0 3615424 348 14876 0 0 0 0 2820 2494 0 95 5 0<BR>
7 0 0 3633488 348 14876 0 0 0 0 2990 2805 0 94 6 0<BR>
7 0 0 3654916 348 14876 0 0 0 0 3350 3222 0 96 4 0<BR>
11 0 0 3680708 348 14876 0 0 0 0 3869 4018 0 94 6 0<BR>
0 0 0 3773532 476 16876 0 0 2344 0 6598 9501 0 74 24 2<BR>
0 0 0 3773836 476 16912 0 0 0 0 191 129 0 0 100 0 <- FS is not running any more, 100% idle CPU , lots of free ram<BR>
0 0 0 3773856 476 17108 0 0 0 0 91 63 0 0 100 0 <tel:91%C2%A0%C2%A0%2063%C2%A0%200%C2%A0%200%20100%C2%A0%200> <BR>
<BR>
<BR>
<BR>
dmesg of the crash is this:<BR>
<BR>
[48674.576126] Out of memory: Kill process 31552 (freeswitch) score 674 or sacrifice child<BR>
[48674.578188] Killed process 31552 (freeswitch) total-vm:10177480kB, anon-rss:2710092kB, file-rss:0kB<BR>
<BR>
fs_log switches to the dark side : (there was no disk space problem of course, i have no idea at which time point in the spiral down this popped on the console )<BR>
<BR>
2013-03-06 23:27:22.923640 [ERR] switch_core_sqldb.c:579 NATIVE SQL ERR [database or disk is full]<BR>
BEGIN EXCLUSIVE<BR>
2013-03-06 23:27:22.923640 [CRIT] switch_core_sqldb.c:1679 ERROR [database or disk is full]<BR>
2013-03-06 23:27:22.923640 [ERR] switch_core_sqldb.c:579 NATIVE SQL ERR [cannot commit - no transaction is active]<BR>
COMMIT<BR>
2013-03-06 23:27:22.963609 [CRIT] switch_core_session.c:1674 Thread Failure!<BR>
2013-03-06 23:27:22.963609 [CRIT] switch_core_session.c:1634 LUKE: I'm hit, but not bad.<BR>
2013-03-06 23:27:22.963609 [CRIT] switch_core_session.c:1635 LUKE'S VOICE: Artoo, see what you can do with it. Hang on back there....<BR>
Green laserfire moves past the beeping little robot as his head turns. After a few beeps and a twist of his mechanical arm,<BR>
Artoo reduces the max sessions to 13897 thus, saving the switch from certain doom.<BR>
<BR>
and thats all , FS is dead. <BR>
<BR>
Runing the test at 180 cps will significantly hasten the death. <BR>
It starts normal with ~40% idle cpu<BR>
then after circa 160k calls it starts to spiral:<BR>
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----<BR>
r b swpd free buff cache si so bi bo in cs us sy id wa<BR>
6 0 0 2901484 63204 161132 0 0 0 0 20586 33073 54 5 41 0<BR>
5 0 0 2901468 63212 160988 0 0 0 12 25359 41915 53 6 42 0<BR>
4 0 0 2901220 63212 161064 0 0 0 0 19650 31211 55 5 40 0<BR>
7 0 0 2901080 63212 161124 0 0 0 0 21148 33462 54 5 41 0<BR>
2 0 0 2900676 63212 161188 0 0 0 0 21767 35568 54 5 42 0<BR>
5 0 0 2900468 63212 161256 0 0 0 0 19398 30073 54 5 41 0<BR>
2 0 0 2900088 63220 161304 0 0 0 12 21315 34563 54 5 42 0<BR>
3 0 0 2900180 63220 161484 0 0 0 0 19858 31561 42 4 54 0 <-- this should not happen the load is constant and idle cpu% should stay at ~40%<BR>
3 0 0 2899768 63220 161520 0 0 0 0 29907 46883 39 4 57 0<BR>
11 0 0 2900056 63220 161504 0 0 0 0 41730 63440 41 5 54 0<BR>
9 0 0 2899536 63220 161768 0 0 0 0 48876 68159 40 5 54 0<BR>
6 0 0 2899512 63228 161620 0 0 0 12 59988 80421 41 6 53 0<BR>
49 0 0 2876072 63228 162004 0 0 0 0 193384 241453 57 23 21 0 <-- here we go, starting the memory hogging cpu eating spiral<BR>
40 0 0 2830720 63228 162080 0 0 0 0 308948 356178 51 49 0 0<BR>
156 0 0 2776576 63228 161984 0 0 0 0 301514 325415 48 48 5 0<BR>
178 0 0 2727044 63228 162260 0 0 0 0 290168 306158 44 51 5 0<BR>
70 0 0 2683252 63236 161924 0 0 0 48 291792 302632 37 58 5 0<BR>
5 0 0 2618432 63236 162204 0 0 0 0 316170 302886 35 59 7 0<BR>
2 0 0 2548092 63236 162364 0 0 0 0 340707 319105 38 56 5 0<BR>
2 0 0 2460528 63236 162464 0 0 0 1596 386953 348210 40 54 6 0<BR>
<BR>
Nothing on fs warn/err log at this point until memory is all full. <BR>
<BR>
<BR>
Please give me some ideas on how to get some useful info for a bug report.<BR>
<BR>
Kind Regards,<BR>
Boris Ratner.<BR>
<BR>
_________________________________________________________________________<BR>
Professional FreeSWITCH Consulting Services:<BR>
<a href="consulting@freeswitch.org">consulting@freeswitch.org</a><BR>
<a href="http://www.freeswitchsolutions.com">http://www.freeswitchsolutions.com</a><BR>
<BR>
FreeSWITCH-powered IP PBX: The CudaTel Communication Server<BR>
<a href="http://www.cudatel.com">http://www.cudatel.com</a><BR>
<BR>
Official FreeSWITCH Sites<BR>
<a href="http://www.freeswitch.org">http://www.freeswitch.org</a><BR>
<a href="http://wiki.freeswitch.org">http://wiki.freeswitch.org</a><BR>
<a href="http://www.cluecon.com">http://www.cluecon.com</a><BR>
<BR>
FreeSWITCH-users mailing list<BR>
<a href="FreeSWITCH-users@lists.freeswitch.org">FreeSWITCH-users@lists.freeswitch.org</a><BR>
<a href="http://lists.freeswitch.org/mailman/listinfo/freeswitch-users">http://lists.freeswitch.org/mailman/listinfo/freeswitch-users</a><BR>
UNSUBSCRIBE:<a href="http://lists.freeswitch.org/mailman/options/freeswitch-users">http://lists.freeswitch.org/mailman/options/freeswitch-users</a><BR>
<a href="http://www.freeswitch.org">http://www.freeswitch.org</a><BR>
<BR>
</SPAN></FONT></BLOCKQUOTE><FONT FACE="Monaco, Courier New"><SPAN STYLE='font-size:11pt'><BR>
<BR>
_________________________________________________________________________<BR>
Professional FreeSWITCH Consulting Services:<BR>
<a href="consulting@freeswitch.org">consulting@freeswitch.org</a><BR>
<a href="http://www.freeswitchsolutions.com">http://www.freeswitchsolutions.com</a><BR>
<BR>
FreeSWITCH-powered IP PBX: The CudaTel Communication Server<BR>
<a href="http://www.cudatel.com">http://www.cudatel.com</a><BR>
<BR>
Official FreeSWITCH Sites<BR>
<a href="http://www.freeswitch.org">http://www.freeswitch.org</a><BR>
<a href="http://wiki.freeswitch.org">http://wiki.freeswitch.org</a><BR>
<a href="http://www.cluecon.com">http://www.cluecon.com</a><BR>
<BR>
FreeSWITCH-users mailing list<BR>
<a href="FreeSWITCH-users@lists.freeswitch.org">FreeSWITCH-users@lists.freeswitch.org</a><BR>
<a href="http://lists.freeswitch.org/mailman/listinfo/freeswitch-users">http://lists.freeswitch.org/mailman/listinfo/freeswitch-users</a><BR>
UNSUBSCRIBE:<a href="http://lists.freeswitch.org/mailman/options/freeswitch-users">http://lists.freeswitch.org/mailman/options/freeswitch-users</a><BR>
<a href="http://www.freeswitch.org">http://www.freeswitch.org</a><BR>
<BR>
</SPAN></FONT></BLOCKQUOTE><FONT FACE="Monaco, Courier New"><SPAN STYLE='font-size:11pt'><BR>
<BR>
<HR ALIGN=CENTER SIZE="3" WIDTH="95%"></SPAN></FONT><FONT SIZE="2"><FONT FACE="Consolas, Courier New, Courier"><SPAN STYLE='font-size:10pt'>_________________________________________________________________________<BR>
Professional FreeSWITCH Consulting Services:<BR>
<a href="consulting@freeswitch.org">consulting@freeswitch.org</a><BR>
<a href="http://www.freeswitchsolutions.com">http://www.freeswitchsolutions.com</a><BR>
<BR>
FreeSWITCH-powered IP PBX: The CudaTel Communication Server<BR>
<a href="http://www.cudatel.com">http://www.cudatel.com</a><BR>
<BR>
Official FreeSWITCH Sites<BR>
<a href="http://www.freeswitch.org">http://www.freeswitch.org</a><BR>
<a href="http://wiki.freeswitch.org">http://wiki.freeswitch.org</a><BR>
<a href="http://www.cluecon.com">http://www.cluecon.com</a><BR>
<BR>
FreeSWITCH-users mailing list<BR>
<a href="FreeSWITCH-users@lists.freeswitch.org">FreeSWITCH-users@lists.freeswitch.org</a><BR>
<a href="http://lists.freeswitch.org/mailman/listinfo/freeswitch-users">http://lists.freeswitch.org/mailman/listinfo/freeswitch-users</a><BR>
UNSUBSCRIBE:<a href="http://lists.freeswitch.org/mailman/options/freeswitch-users">http://lists.freeswitch.org/mailman/options/freeswitch-users</a><BR>
<a href="http://www.freeswitch.org">http://www.freeswitch.org</a><BR>
</SPAN></FONT></FONT></BLOCKQUOTE><FONT SIZE="2"><FONT FACE="Consolas, Courier New, Courier"><SPAN STYLE='font-size:10pt'><BR>
</SPAN></FONT></FONT><FONT FACE="Monaco, Courier New"><SPAN STYLE='font-size:11pt'>-- <BR>
Ken<BR>
<FONT COLOR="#0000FF"><U><a href="http://www.FreeSWITCH.org">http://www.FreeSWITCH.org</a><BR>
<a href="http://www.ClueCon.com">http://www.ClueCon.com</a><BR>
<a href="http://www.OSTAG.org">http://www.OSTAG.org</a><BR>
</U></FONT>irc.freenode.net #freeswitch<BR>
</SPAN></FONT>
</BODY>
</HTML>