[Freeswitch-users] FreeSWITCH 1.6 - Where are the bottlenecks?

Anthony Minessale anthony.minessale at gmail.com
Mon Jul 30 19:23:14 UTC 2018


Hi,

It's hard to tell just from looking at those examples.
The best way to deal with segmentation faults is to report them with all
the details necessary to debug them and to try the latest version like Mike
suggests, to rule out things already fixed.
If you open a JIRA issue for each of them, with the trace like you showed
above and all the version details so we can match the code to the trace,
then sometimes we can diagnose from that alone.
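
As a rough sketch of collecting that information, the snippet below builds a
non-interactive gdb command that prints a full backtrace plus the backtraces
of all threads from a core file. The helper names and the binary and core
paths are hypothetical and only for illustration; adjust them to your install.

```python
import subprocess


def gdb_backtrace_command(binary, core):
    """Build a batch-mode gdb command line for a binary and its core file."""
    return [
        "gdb", "-batch",               # run the commands below, then exit
        "-ex", "bt full",              # crashing thread, with local variables
        "-ex", "thread apply all bt",  # backtrace of every thread
        binary, core,
    ]


def collect_backtrace(binary, core):
    """Run gdb and return its output as text (requires gdb to be installed)."""
    result = subprocess.run(
        gdb_backtrace_command(binary, core),
        capture_output=True, text=True, check=False,
    )
    return result.stdout


# Hypothetical paths, assumed for this example only:
# print(collect_backtrace("/usr/bin/freeswitch", "/tmp/core.1234"))
```

Attaching that output together with the exact FreeSWITCH version and OS
release makes it much easier to line the frames up with the source.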

If the open source help is not enough for your needs and you want a full
analysis of your usage footprint and assistance with large-scale commercial
deployments, you can also consider working with the pro services team at
https://freeswitch.com, since it sometimes takes additional effort to track
down things triggered by massive load.



On Thu, Jul 26, 2018 at 3:44 AM, Shaun Stokes <
shaun.stokes at itec-support.co.uk> wrote:

> Hi All,
>
>
> We've experienced a number of segmentation faults over time, with various
> errors in the core dumps and no clear indication of the exact cause of the
> problem. We suspect this is load related.
>
>
> FreeSWITCH is typically handling more than 100 concurrent sessions but
> rarely more than 10 sessions per second. These segmentation faults have
> even occurred in the early hours of the morning, with fewer than 10
> sessions active at the time. This leads us to believe the bottleneck is
> the number of simultaneous registrations/subscriptions, the only variable
> that stays roughly the same at all hours, decreasing and increasing only
> slightly in the evenings and mornings.
>
>
> We've created more SIP profiles to distribute registrations, to no avail;
> we're only able to prevent segmentation faults entirely by distributing
> registrations across more servers. The problems seem to occur when we have
> a combination of:
>
> - 10,000 subscriptions and 400 registrations; reducing subscriptions to
> 5,000 resolves the issue.
>
> - 1,200 subscriptions and 950 registrations; reducing registrations to 700
> resolves the issue.
>
>
> We have a few FreeSWITCH servers which are dedicated to call routing
> only; these are processing in excess of 400 concurrent sessions and never
> have any issues.
>
>
> Our server specs:
>
> Debian 8.9
>
> PostgreSQL 9.4 - Optimised for best performance
>
> FreeSWITCH 1.6.20 - Using PostgreSQL backend
>
> 32GB RAM
>
> Dual socket Intel Xeon X5670 2.93GHz - With Hyper-Threading enabled
>
>
> Our plan is to implement SBCs in the future to handle subscriptions and
> registrations but in the short term we need to understand the limitations
> of FreeSWITCH, where are the bottlenecks, and are there any other solutions.
>
>
> Looking at SIP traces at the time of these segmentation faults, there are
> more than 30 SIP messages per second, mostly SUBSCRIBE, 202, NOTIFY,
> REGISTER and 200. If we look at all packets on the FreeSWITCH SIP port,
> including application data, ACK etc., there are hundreds of packets per
> second.
>
>
> We initially suspected the bottleneck could be the FreeSWITCH backend
> database but moving to PostgreSQL has had no effect.
>
>
> Has anyone come across similar issues before and/or is able to offer any
> advice?
>
>
> We're currently looking at two possibilities: a software bottleneck
> within FreeSWITCH, or a hardware bottleneck, possibly as a result of
> using dual socket CPUs. We are in the process of enabling node interleaving
> on the hardware so we use SUMA (Sufficiently Uniform Memory Access)
> instead of NUMA (Non-Uniform Memory Access), which should
> theoretically reduce traffic on the QPI bridge slightly given that
> FreeSWITCH isn't optimised for NUMA.
>
>
> Here are some examples of the segmentation faults we've experienced:
>
>
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  strlen () at ../sysdeps/x86_64/strlen.S:106
> 106     ../sysdeps/x86_64/strlen.S: No such file or directory.
>
> (gdb) bt
> #0  strlen () at ../sysdeps/x86_64/strlen.S:106
> #1  0x00007f92302f697e in __GI___strdup (s=0xecd42c18 <error: Cannot access memory at address 0xecd42c18>) at strdup.c:41
> #2  0x00007f9231849ea4 in switch_ivr_generate_xml_cdr (session=0x7f91dd8146a8, xml_cdr=0x7f900b6da748) at src/switch_ivr.c:2803
> #3  0x00007f91696fc7b0 in my_on_reporting (session=0x7f91dd8146a8) at mod_xml_cdr.c:228
> #4  0x00007f92317801d5 in switch_core_session_reporting_state (session=0x7f91dd8146a8) at src/switch_core_state_machine.c:938
> #5  0x00007f923177bf9c in switch_core_session_run (session=0x7f91dd8146a8) at src/switch_core_state_machine.c:609
> #6  0x00007f9231775737 in switch_core_session_thread (thread=0x7f8ed51f5170, obj=0x7f91dd8146a8) at src/switch_core_session.c:1648
> #7  0x00007f9231775b25 in switch_core_session_thread_pool_worker (thread=0x7f8ed51f5170, obj=0x7f8ed51f5000) at src/switch_core_session.c:1711
> #8  0x00007f9231a9bea5 in dummy_worker (opaque=0x7f8ed51f5170) at threadproc/unix/thread.c:151
> #9  0x00007f9230c85064 in start_thread (arg=0x7f900b6db700) at pthread_create.c:309
> #10 0x00007f923035d62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>
>
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  __GI___pthread_mutex_lock (mutex=0x0) at ../nptl/pthread_mutex_lock.c:66
> 66      ../nptl/pthread_mutex_lock.c: No such file or directory.
>
> (gdb) bt
> #0  __GI___pthread_mutex_lock (mutex=0x0) at ../nptl/pthread_mutex_lock.c:66
> #1  0x00007fd953bf9db8 in apr_thread_mutex_lock (mutex=0x19101fd8) at locks/unix/thread_mutex.c:92
> #2  0x00007fd95389e663 in switch_mutex_lock (lock=0x19101fd8) at src/switch_apr.c:293
> #3  0x00007fd9538a7f0c in switch_channel_test_flag (channel=0x7fd619101910, flag=CF_THREAD_SLEEPING) at src/switch_channel.c:1568
> #4  0x00007fd9538ed448 in switch_core_session_read_lock (session=0x7fd627d19eb8) at src/switch_core_rwlock.c:91
> #5  0x00007fd9538d78fc in switch_core_session_hupall_matching_var_ans (var_name=0x7fd8b70229b1 "cc_member_pre_answer_uuid", var_val=0x7fd891c86680 "e38c1d71-e760-44bf-af63-4b54dcbd01fc", cause=SWITCH_CAUSE_ORIGINATOR_CANCEL, type=(SHT_UNANSWERED | SHT_ANSWERED)) at src/switch_core_session.c:231
> #6  0x00007fd8b701eacd in callcenter_function (session=0x7fd90508dbc8, data=0x7fd6149e5a68 "Adjustments at customerdomain.com") at mod_callcenter.c:3018
> #7  0x00007fd9538df841 in switch_core_session_exec (session=0x7fd90508dbc8, application_interface=0x2836ff0, arg=0x7fd6149e5a68 "Adjustments at customerdomain.com") at src/switch_core_session.c:2802
> #8  0x00007fd9538defe8 in switch_core_session_execute_application_get_flags (session=0x7fd90508dbc8, app=0x7fd6149e5a58 "callcenter", arg=0x7fd6149e5a68 "Adjustments at customerdomain.com", flags=0x0) at src/switch_core_session.c:2672
> #9  0x00007fd9538e168b in switch_core_standard_on_execute (session=0x7fd90508dbc8) at src/switch_core_state_machine.c:353
> #10 0x00007fd9538e31c6 in switch_core_session_run (session=0x7fd90508dbc8) at src/switch_core_state_machine.c:650
> #11 0x00007fd9538db737 in switch_core_session_thread (thread=0x7fd9048e9370, obj=0x7fd90508dbc8) at src/switch_core_session.c:1648
> #12 0x00007fd9538dbb25 in switch_core_session_thread_pool_worker (thread=0x7fd9048e9370, obj=0x7fd9048e9200) at src/switch_core_session.c:1711
> #13 0x00007fd953c01ea5 in dummy_worker (opaque=0x7fd9048e9370) at threadproc/unix/thread.c:151
> #14 0x00007fd952deb064 in start_thread (arg=0x7fd891c87700) at pthread_create.c:309
> #15 0x00007fd9524c362d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>
>
> Thanks,
>
> Shaun
> Shaun Stokes - Infrastructure Analyst
> T : 01453 700713
> E : shaun.stokes at itec-support.co.uk
> W : www.itec-support.co.uk
> Registered Address :- ITEC Support, Suite 2 Prospect House, Bath Road,
> Stroud, Gloucestershire GL5 3QF
> Company No. 06908001



-- 
Anthony Minessale II
Founder, FreeSWITCH.
http://freeswitch.com


https://youtu.be/l_hOxzCt6X4
https://www.youtube.com/watch?v=oAxXgyx5jUw
https://www.youtube.com/watch?v=9XXgW34t40s
https://www.youtube.com/watch?v=NLaDpGQuZDA