[Freeswitch-users] FreeSWITCH 1.6 - Where are the bottlenecks?
Michael Jerris
mike at jerris.com
Mon Jul 30 16:28:04 UTC 2018
I’d try out the latest 1.8 release we made at ClueCon. The segfault looks familiar, but I would have to dig for specifics to be sure.
> On Jul 26, 2018, at 4:44 AM, Shaun Stokes <shaun.stokes at itec-support.co.uk> wrote:
>
> Hi All,
>
> We've experienced a number of segmentation faults over time, with various errors in the core dumps and no clear indication of the exact cause. We suspect the problem is load related.
>
> FreeSWITCH is typically handling more than 100 concurrent sessions but rarely more than 10 sessions per second. These segmentation faults have even occurred in the early hours of the morning with fewer than 10 sessions active at the time, which leads us to believe the bottleneck is the number of simultaneous registrations/subscriptions. That is the only variable which is roughly the same at all hours, decreasing and increasing only slightly in the evenings and mornings.
>
> We've created more SIP profiles to distribute registrations, to no avail. We're only able to prevent segmentation faults entirely by distributing registrations across more servers. The problems seem to occur when we have a combination of:
> - 10,000 subscriptions and 400 registrations; reducing subscriptions to 5000 resolves the issue.
> - 1200 subscriptions and 950 registrations; reducing registrations to 700 resolves the issue.
>
> We have a few FreeSWITCH servers which are dedicated to call routing only; these process in excess of 400 concurrent sessions and never have any issues.
>
> Our server specs:
> Debian 8.9
> PostgreSQL 9.4 - Optimised for best performance
> FreeSWITCH 1.6.20 - Using PostgreSQL backend
> 32GB RAM
> Dual socket Intel Xeon X5670 2.93GHz - With Hyper-Threading enabled
>
> Our plan is to implement SBCs in the future to handle subscriptions and registrations, but in the short term we need to understand the limitations of FreeSWITCH: where the bottlenecks are, and whether there are any other solutions.
>
> Looking at SIP traces at the time of these segmentation faults, there are more than 30 SIP messages per second, mostly SUBSCRIBE, 202, NOTIFY, REGISTER and 200. If we look at all packets on the FreeSWITCH SIP port, including Application Data, ACK etc., there are hundreds of packets per second.
>
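One way to put numbers on the message mix described above is to bucket SIP start lines by second. The following is only a minimal sketch: the input format (pairs of epoch timestamp and start line) and the function names are assumptions for illustration, not the output of any particular capture tool. It relies only on the standard SIP start-line shapes, "METHOD uri SIP/2.0" for requests and "SIP/2.0 code reason" for responses.

```python
from collections import Counter, defaultdict


def classify_start_line(line):
    """Return the request method or response status code of a SIP start line.

    Returns None for anything that is not a start line (headers, body, etc.).
    """
    parts = line.strip().split()
    if not parts:
        return None
    if parts[0].startswith("SIP/"):
        # Response, e.g. "SIP/2.0 202 Accepted"
        return parts[1] if len(parts) > 1 else None
    if parts[-1].startswith("SIP/"):
        # Request, e.g. "SUBSCRIBE sip:1001@example.com SIP/2.0"
        return parts[0].upper()
    return None


def messages_per_second(events):
    """Count message kinds per whole second.

    events: iterable of (epoch_seconds, start_line) pairs (assumed format).
    """
    buckets = defaultdict(Counter)
    for ts, line in events:
        kind = classify_start_line(line)
        if kind is not None:
            buckets[int(ts)][kind] += 1
    return dict(buckets)


# Hypothetical sample data, not taken from the traces discussed in this thread.
sample = [
    (100.1, "SUBSCRIBE sip:1001@example.com SIP/2.0"),
    (100.4, "SIP/2.0 202 Accepted"),
    (100.9, "NOTIFY sip:1001@10.0.0.5 SIP/2.0"),
    (101.2, "REGISTER sip:example.com SIP/2.0"),
    (101.3, "SIP/2.0 200 OK"),
]
rates = messages_per_second(sample)
for second in sorted(rates):
    print(second, dict(rates[second]))
```

Comparing these per-second counts from just before a crash against a quiet period would show whether the SUBSCRIBE/NOTIFY share really is the variable that changes.
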
> We initially suspected the bottleneck could be the FreeSWITCH backend database but moving to PostgreSQL has had no effect.
>
> Has anyone come across similar issues before and/or is able to offer any advice?
>
> We're currently looking at two possibilities: a software bottleneck within FreeSWITCH, or a hardware bottleneck, possibly as a result of using dual socket CPUs. We are in the process of enabling node interleaving on the hardware so we use SUMA (Sufficiently Uniform Memory Access) instead of NUMA (Non-Uniform Memory Access), which should theoretically reduce traffic over the QPI interconnect slightly given that FreeSWITCH isn't optimised for NUMA.
>
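To confirm what memory topology the OS actually sees before and after the BIOS change, the NUMA nodes that Linux exposes through sysfs can be listed. This is a minimal sketch, assuming the usual /sys/devices/system/node layout; the function name and the base-path parameter are made up for illustration. With node interleaving enabled, the expectation would be a single node, though that is worth checking on the hardware in question rather than taking on trust.

```python
from pathlib import Path


def numa_nodes(base="/sys/devices/system/node"):
    """Map each NUMA node name to its CPU list string as exposed by sysfs.

    Returns an empty dict when the directory does not exist
    (non-Linux system, or a kernel built without NUMA support).
    """
    nodes = {}
    root = Path(base)
    if not root.is_dir():
        return nodes
    for entry in sorted(root.glob("node[0-9]*")):
        cpulist = entry / "cpulist"
        if cpulist.is_file():
            nodes[entry.name] = cpulist.read_text().strip()
    return nodes


if __name__ == "__main__":
    for name, cpus in numa_nodes().items():
        print(name, cpus)
```

Two entries on a dual socket machine would indicate the kernel still sees separate nodes; one entry would indicate interleaving is in effect.
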
> Here are examples of some of the segmentation faults we've experienced:
>
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 strlen () at ../sysdeps/x86_64/strlen.S:106
> 106 ../sysdeps/x86_64/strlen.S: No such file or directory.
> (gdb) bt
> #0 strlen () at ../sysdeps/x86_64/strlen.S:106
> #1 0x00007f92302f697e in __GI___strdup (s=0xecd42c18 <error: Cannot access memory at address 0xecd42c18>) at strdup.c:41
> #2 0x00007f9231849ea4 in switch_ivr_generate_xml_cdr (session=0x7f91dd8146a8, xml_cdr=0x7f900b6da748) at src/switch_ivr.c:2803
> #3 0x00007f91696fc7b0 in my_on_reporting (session=0x7f91dd8146a8) at mod_xml_cdr.c:228
> #4 0x00007f92317801d5 in switch_core_session_reporting_state (session=0x7f91dd8146a8) at src/switch_core_state_machine.c:938
> #5 0x00007f923177bf9c in switch_core_session_run (session=0x7f91dd8146a8) at src/switch_core_state_machine.c:609
> #6 0x00007f9231775737 in switch_core_session_thread (thread=0x7f8ed51f5170, obj=0x7f91dd8146a8) at src/switch_core_session.c:1648
> #7 0x00007f9231775b25 in switch_core_session_thread_pool_worker (thread=0x7f8ed51f5170, obj=0x7f8ed51f5000) at src/switch_core_session.c:1711
> #8 0x00007f9231a9bea5 in dummy_worker (opaque=0x7f8ed51f5170) at threadproc/unix/thread.c:151
> #9 0x00007f9230c85064 in start_thread (arg=0x7f900b6db700) at pthread_create.c:309
> #10 0x00007f923035d62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>
>
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 __GI___pthread_mutex_lock (mutex=0x0) at ../nptl/pthread_mutex_lock.c:66
> 66 ../nptl/pthread_mutex_lock.c: No such file or directory.
> (gdb) bt
> #0 __GI___pthread_mutex_lock (mutex=0x0) at ../nptl/pthread_mutex_lock.c:66
> #1 0x00007fd953bf9db8 in apr_thread_mutex_lock (mutex=0x19101fd8) at locks/unix/thread_mutex.c:92
> #2 0x00007fd95389e663 in switch_mutex_lock (lock=0x19101fd8) at src/switch_apr.c:293
> #3 0x00007fd9538a7f0c in switch_channel_test_flag (channel=0x7fd619101910, flag=CF_THREAD_SLEEPING) at src/switch_channel.c:1568
> #4 0x00007fd9538ed448 in switch_core_session_read_lock (session=0x7fd627d19eb8) at src/switch_core_rwlock.c:91
> #5 0x00007fd9538d78fc in switch_core_session_hupall_matching_var_ans (var_name=0x7fd8b70229b1 "cc_member_pre_answer_uuid", var_val=0x7fd891c86680 "e38c1d71-e760-44bf-af63-4b54dcbd01fc", cause=SWITCH_CAUSE_ORIGINATOR_CANCEL, type=(SHT_UNANSWERED | SHT_ANSWERED)) at src/switch_core_session.c:231
> #6 0x00007fd8b701eacd in callcenter_function (session=0x7fd90508dbc8, data=0x7fd6149e5a68 "Adjustments@customerdomain.com") at mod_callcenter.c:3018
> #7 0x00007fd9538df841 in switch_core_session_exec (session=0x7fd90508dbc8, application_interface=0x2836ff0, arg=0x7fd6149e5a68 "Adjustments@customerdomain.com") at src/switch_core_session.c:2802
> #8 0x00007fd9538defe8 in switch_core_session_execute_application_get_flags (session=0x7fd90508dbc8, app=0x7fd6149e5a58 "callcenter", arg=0x7fd6149e5a68 "Adjustments@customerdomain.com", flags=0x0) at src/switch_core_session.c:2672
> #9 0x00007fd9538e168b in switch_core_standard_on_execute (session=0x7fd90508dbc8) at src/switch_core_state_machine.c:353
> #10 0x00007fd9538e31c6 in switch_core_session_run (session=0x7fd90508dbc8) at src/switch_core_state_machine.c:650
> #11 0x00007fd9538db737 in switch_core_session_thread (thread=0x7fd9048e9370, obj=0x7fd90508dbc8) at src/switch_core_session.c:1648
> #12 0x00007fd9538dbb25 in switch_core_session_thread_pool_worker (thread=0x7fd9048e9370, obj=0x7fd9048e9200) at src/switch_core_session.c:1711
> #13 0x00007fd953c01ea5 in dummy_worker (opaque=0x7fd9048e9370) at threadproc/unix/thread.c:151
> #14 0x00007fd952deb064 in start_thread (arg=0x7fd891c87700) at pthread_create.c:309
> #15 0x00007fd9524c362d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>
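Since the core dumps show "various errors", it can help to reduce each backtrace to a short signature and count how often each one recurs. The following is a rough sketch, assuming gdb's default `bt` frame layout as in the two traces above; the regex and the function names are illustrative only, and wrapped continuation lines are simply ignored.

```python
import re

FRAME_RE = re.compile(
    r"^(?:>\s*)?#(?P<num>\d+)\s+"       # optional mail quote marker, then frame number
    r"(?:0x[0-9a-fA-F]+\s+in\s+)?"      # return address (absent on frame #0)
    r"(?P<func>[A-Za-z_][\w.]*)\s*\("   # function name up to the argument list
)


def frame_functions(backtrace_text):
    """Return function names ordered by frame number.

    The first occurrence of each frame number wins, so the crash-location
    line that gdb prints before the backtrace does not duplicate frame #0.
    """
    frames = {}
    for line in backtrace_text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frames.setdefault(int(m.group("num")), m.group("func"))
    return [frames[n] for n in sorted(frames)]


def signature(backtrace_text, depth=3):
    """Join the innermost `depth` frames into a one-line crash signature."""
    return " <- ".join(frame_functions(backtrace_text)[:depth])


# First frames of the first backtrace in this message.
sample = """\
> #0 strlen () at ../sysdeps/x86_64/strlen.S:106
> (gdb) bt
> #0 strlen () at ../sysdeps/x86_64/strlen.S:106
> #1 0x00007f92302f697e in __GI___strdup (s=0xecd42c18 <error: Cannot access memory at address 0xecd42c18>) at strdup.c:41
> #2 0x00007f9231849ea4 in switch_ivr_generate_xml_cdr (session=0x7f91dd8146a8, xml_cdr=0x7f900b6da748) at src/switch_ivr.c:2803
> #3 0x00007f91696fc7b0 in my_on_reporting (session=0x7f91dd8146a8) at mod_xml_cdr.c:228
"""
print(signature(sample))
```

Grouping all collected dumps by this signature would show whether the crashes cluster in a few code paths (for example CDR generation or mod_callcenter) or are spread evenly, which is useful detail to include in a bug report.
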