[Freeswitch-users] FreeSWITCH 1.6 - Where are the bottlenecks?

Shaun Stokes shaun.stokes at itec-support.co.uk
Tue Jul 31 07:13:21 UTC 2018


Hi Anthony, Mike,


Thanks for your advice.


We will look at implementing FreeSWITCH 1.8 and will raise a JIRA issue in the meantime.


Thanks,

Shaun

________________________________
From: FreeSWITCH-users <freeswitch-users-bounces at lists.freeswitch.org> on behalf of Anthony Minessale <anthony.minessale at gmail.com>
Sent: 30 July 2018 20:23
To: FreeSWITCH Users Help
Subject: Re: [Freeswitch-users] FreeSWITCH 1.6 - Where are the bottlenecks?

Hi,

It's hard to tell just from looking at those examples.
The best way to deal with segmentation faults is to report them with all the details necessary to debug them, and to try the latest version as Mike suggests, to rule out things that are already fixed.
If you open a JIRA issue for them with the trace like you showed above and all the version details, so we can match the code to the trace, then sometimes we can diagnose from that alone.

If the open source help is not enough for your needs and you want a full analysis of your usage footprint and assistance with large-scale commercial deployments, you can also consider working with the pro services team at https://freeswitch.com. Sometimes it takes additional effort to track down things triggered by massive load.



On Thu, Jul 26, 2018 at 3:44 AM, Shaun Stokes <shaun.stokes at itec-support.co.uk> wrote:

Hi All,


We've experienced a number of segmentation faults over time, with various errors in the core dump and no clear indication of the exact cause of the problem. We suspect this is load related.


FreeSWITCH is typically handling more than 100 concurrent sessions but rarely more than 10 sessions per second. These segmentation faults have even occurred in the early hours of the morning, with fewer than 10 sessions at the time, which leads us to believe the bottleneck is the number of simultaneous registrations/subscriptions. That is the only variable which is roughly the same at all hours, decreasing and increasing only slightly in the evenings and mornings.


We've created more SIP profiles to distribute registrations, to no avail; we're only able to prevent segmentation faults entirely by distributing registrations across more servers. The problems seem to occur when we have a combination of:

- 10,000 subscriptions and 400 registrations; reducing subscriptions to 5,000 resolves the issue.

- 1,200 subscriptions and 950 registrations; reducing registrations to 700 resolves the issue.
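
One way to read these two data points is to ask what they would imply if load were a simple weighted sum, subscriptions + k * registrations, with a crash above some threshold. The sketch below is only an illustration of that arithmetic: the linear model, the function name weight_bounds and the constant OBSERVATIONS are assumptions made for this example, and a segmentation fault may just as well be a bug that load makes more likely rather than a capacity limit.

```python
from fractions import Fraction

# Figures taken from the list above: (subscriptions, registrations, crashed?)
OBSERVATIONS = [
    (10_000, 400, True),
    (5_000, 400, False),
    (1_200, 950, True),
    (1_200, 700, False),
]


def weight_bounds(observations):
    """Bound k so that every crashing point has a higher assumed load
    (subs + k * regs) than every stable point.

    Returns (low, high); high is None if the data gives no upper bound.
    Pairs with equal registration counts do not constrain k and are
    skipped (this sketch does not check them for contradictions).
    """
    low, high = Fraction(0), None
    crashed = [(s, r) for s, r, c in observations if c]
    stable = [(s, r) for s, r, c in observations if not c]
    for cs, cr in crashed:
        for ss, sr in stable:
            # Require cs + k*cr > ss + k*sr, i.e. k*(cr - sr) > ss - cs.
            d_regs, d_subs = cr - sr, ss - cs
            if d_regs > 0:
                low = max(low, Fraction(d_subs, d_regs))
            elif d_regs < 0:
                bound = Fraction(d_subs, d_regs)
                high = bound if high is None else min(high, bound)
    return low, high


low, high = weight_bounds(OBSERVATIONS)
print(f"k is between {float(low):.1f} and {float(high):.1f}")
```

Under that assumed model, one registration would have to weigh somewhere between roughly 7 and 29 subscriptions for both observations to hold, which at least suggests registrations are the more expensive of the two.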


We have a few FreeSWITCH servers which are dedicated to call routing only; these process in excess of 400 concurrent sessions and never have any issues.


Our server specs:

Debian 8.9

PostgreSQL 9.4 - Optimised for best performance

FreeSWITCH 1.6.20 - Using PostgreSQL backend

32GB RAM

Dual socket Intel Xeon X5670 2.93GHz - With Hyper-Threading enabled


Our plan is to implement SBCs in the future to handle subscriptions and registrations, but in the short term we need to understand the limitations of FreeSWITCH: where the bottlenecks are, and whether there are any other solutions.


Looking at SIP traces at the time of these segmentation faults, there are more than 30 SIP messages per second, mostly SUBSCRIBE, 202, NOTIFY, REGISTER and 200. If we look at all packets on the FreeSWITCH SIP port, including application data, ACKs etc., there are hundreds of packets per second.
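
A per-second breakdown like the one described above can be produced from a trace with a few lines of Python. This is a minimal sketch under an assumed input format: each line is an epoch timestamp followed by the first line of the SIP message (request line or status line). That format, the function name sip_rate_by_second and the example.invalid addresses are hypothetical; a real capture would need to be exported into this shape first.

```python
from collections import Counter, defaultdict


def sip_rate_by_second(lines):
    """Count SIP messages per whole second, keyed by method or status code.

    Each input line is assumed to look like one of:
        1532590000.10 SUBSCRIBE sip:1001@example.invalid SIP/2.0
        1532590000.20 SIP/2.0 202 Accepted
    """
    buckets = defaultdict(Counter)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or truncated line
        try:
            second = int(float(parts[0]))
        except ValueError:
            continue  # first field is not a timestamp
        if parts[1].startswith("SIP/2.0") and len(parts) > 2:
            kind = parts[2]  # response: use the status code
        else:
            kind = parts[1]  # request: use the method name
        buckets[second][kind] += 1
    return buckets


sample = [
    "1532590000.10 SUBSCRIBE sip:1001@example.invalid SIP/2.0",
    "1532590000.20 SIP/2.0 202 Accepted",
    "1532590000.90 NOTIFY sip:1001@example.invalid SIP/2.0",
    "1532590001.05 REGISTER sip:example.invalid SIP/2.0",
    "1532590001.30 SIP/2.0 200 OK",
]
for second, counts in sorted(sip_rate_by_second(sample).items()):
    print(second, sum(counts.values()), dict(counts))
```

Comparing the busiest seconds against the timestamps of the core dumps would show whether the crashes line up with bursts of a particular message type.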


We initially suspected the bottleneck could be the FreeSWITCH backend database but moving to PostgreSQL has had no effect.


Has anyone come across similar issues before and/or is able to offer any advice?


We're currently looking at two possibilities: a software bottleneck within FreeSWITCH, or a bottleneck in the hardware, possibly as a result of using dual socket CPUs. We are in the process of enabling node interleaving on the hardware so we use SUMA (Sufficiently Uniform Memory Access) instead of NUMA (Non-Uniform Memory Access), which should theoretically reduce traffic on the QPI bridge slightly, given that FreeSWITCH isn't optimised for NUMA.
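
Whether the operating system still sees two memory nodes after a BIOS change like this can be checked from Linux sysfs. The sketch below reads /sys/devices/system/node, which is where the Linux kernel exposes NUMA nodes; the function name numa_nodes is made up for this example, and the expectation that node interleaving leaves a single node visible is an assumption that should be confirmed on the actual hardware.

```python
from pathlib import Path


def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Map NUMA node name (e.g. 'node0') to its CPU list string."""
    nodes = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return nodes  # not Linux, or sysfs not mounted
    for node_dir in sorted(root.glob("node[0-9]*")):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file():
            nodes[node_dir.name] = cpulist.read_text().strip()
    return nodes


if __name__ == "__main__":
    found = numa_nodes()
    print(f"{len(found)} NUMA node(s) visible to the OS")
    for name, cpus in found.items():
        print(f"  {name}: CPUs {cpus}")
```

Running it before and after the change gives a quick confirmation that the setting took effect.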


Here are examples of some of the segmentation faults we've experienced:


Program terminated with signal SIGSEGV, Segmentation fault.
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
106     ../sysdeps/x86_64/strlen.S: No such file or directory.

(gdb) bt
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
#1  0x00007f92302f697e in __GI___strdup (s=0xecd42c18 <error: Cannot access memory at address 0xecd42c18>) at strdup.c:41
#2  0x00007f9231849ea4 in switch_ivr_generate_xml_cdr (session=0x7f91dd8146a8, xml_cdr=0x7f900b6da748) at src/switch_ivr.c:2803
#3  0x00007f91696fc7b0 in my_on_reporting (session=0x7f91dd8146a8) at mod_xml_cdr.c:228
#4  0x00007f92317801d5 in switch_core_session_reporting_state (session=0x7f91dd8146a8) at src/switch_core_state_machine.c:938
#5  0x00007f923177bf9c in switch_core_session_run (session=0x7f91dd8146a8) at src/switch_core_state_machine.c:609
#6  0x00007f9231775737 in switch_core_session_thread (thread=0x7f8ed51f5170, obj=0x7f91dd8146a8) at src/switch_core_session.c:1648
#7  0x00007f9231775b25 in switch_core_session_thread_pool_worker (thread=0x7f8ed51f5170, obj=0x7f8ed51f5000) at src/switch_core_session.c:1711
#8  0x00007f9231a9bea5 in dummy_worker (opaque=0x7f8ed51f5170) at threadproc/unix/thread.c:151
#9  0x00007f9230c85064 in start_thread (arg=0x7f900b6db700) at pthread_create.c:309
#10 0x00007f923035d62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111


Program terminated with signal SIGSEGV, Segmentation fault.
#0  __GI___pthread_mutex_lock (mutex=0x0) at ../nptl/pthread_mutex_lock.c:66
66      ../nptl/pthread_mutex_lock.c: No such file or directory.
(gdb) bt
#0  __GI___pthread_mutex_lock (mutex=0x0) at ../nptl/pthread_mutex_lock.c:66
#1  0x00007fd953bf9db8 in apr_thread_mutex_lock (mutex=0x19101fd8) at locks/unix/thread_mutex.c:92
#2  0x00007fd95389e663 in switch_mutex_lock (lock=0x19101fd8) at src/switch_apr.c:293
#3  0x00007fd9538a7f0c in switch_channel_test_flag (channel=0x7fd619101910, flag=CF_THREAD_SLEEPING) at src/switch_channel.c:1568
#4  0x00007fd9538ed448 in switch_core_session_read_lock (session=0x7fd627d19eb8) at src/switch_core_rwlock.c:91
#5  0x00007fd9538d78fc in switch_core_session_hupall_matching_var_ans (var_name=0x7fd8b70229b1 "cc_member_pre_answer_uuid",
   var_val=0x7fd891c86680 "e38c1d71-e760-44bf-af63-4b54dcbd01fc", cause=SWITCH_CAUSE_ORIGINATOR_CANCEL, type=(SHT_UNANSWERED | SHT_ANSWERED)) at src/switch_core_session.c:231
#6  0x00007fd8b701eacd in callcenter_function (session=0x7fd90508dbc8, data=0x7fd6149e5a68 "Adjustments@customerdomain.com") at mod_callcenter.c:3018
#7  0x00007fd9538df841 in switch_core_session_exec (session=0x7fd90508dbc8, application_interface=0x2836ff0, arg=0x7fd6149e5a68 "Adjustments@customerdomain.com")
   at src/switch_core_session.c:2802
#8  0x00007fd9538defe8 in switch_core_session_execute_application_get_flags (session=0x7fd90508dbc8, app=0x7fd6149e5a58 "callcenter",
   arg=0x7fd6149e5a68 "Adjustments@customerdomain.com", flags=0x0) at src/switch_core_session.c:2672
#9  0x00007fd9538e168b in switch_core_standard_on_execute (session=0x7fd90508dbc8) at src/switch_core_state_machine.c:353
#10 0x00007fd9538e31c6 in switch_core_session_run (session=0x7fd90508dbc8) at src/switch_core_state_machine.c:650
#11 0x00007fd9538db737 in switch_core_session_thread (thread=0x7fd9048e9370, obj=0x7fd90508dbc8) at src/switch_core_session.c:1648
#12 0x00007fd9538dbb25 in switch_core_session_thread_pool_worker (thread=0x7fd9048e9370, obj=0x7fd9048e9200) at src/switch_core_session.c:1711
#13 0x00007fd953c01ea5 in dummy_worker (opaque=0x7fd9048e9370) at threadproc/unix/thread.c:151
#14 0x00007fd952deb064 in start_thread (arg=0x7fd891c87700) at pthread_create.c:309
#15 0x00007fd9524c362d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
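
When several core dumps need to be compared or attached to a bug report, it can help to reduce each backtrace to its function names and source locations. The following is a rough sketch that parses gdb "bt" output of the shape shown above, including frames that gdb wraps onto an indented second line. The function name parse_backtrace and the regular expression are written for this example only and will not cover every gdb output variant (frames without source information are skipped).

```python
import re

# "#N  [0xADDR in ]function (args) at file:line"
FRAME_RE = re.compile(
    r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\S+) \((.*)\) at (\S+):(\d+)$"
)


def parse_backtrace(text):
    """Return a list of (frame_no, function, source_file, line_no) tuples."""
    joined = []
    for raw in text.splitlines():
        if raw.startswith("#"):
            joined.append(raw.rstrip())
        elif raw[:1].isspace() and joined:
            # Indented line: continuation of a wrapped frame.
            joined[-1] += " " + raw.strip()
    frames = []
    for line in joined:
        m = FRAME_RE.match(line)
        if m:
            frames.append(
                (int(m.group(1)), m.group(2), m.group(4), int(m.group(5)))
            )
    return frames


sample = '''#0  __GI___pthread_mutex_lock (mutex=0x0) at ../nptl/pthread_mutex_lock.c:66
#1  0x00007fd953bf9db8 in apr_thread_mutex_lock (mutex=0x19101fd8) at locks/unix/thread_mutex.c:92
#5  0x00007fd9538d78fc in switch_core_session_hupall_matching_var_ans (var_name=0x7fd8b70229b1 "cc_member_pre_answer_uuid",
   var_val=0x7fd891c86680 "e38c1d71-e760-44bf-af63-4b54dcbd01fc", cause=SWITCH_CAUSE_ORIGINATOR_CANCEL, type=(SHT_UNANSWERED | SHT_ANSWERED)) at src/switch_core_session.c:231
'''
for number, function, source, line_no in parse_backtrace(sample):
    print(f"#{number} {function} ({source}:{line_no})")
```

Grouping dumps by their top few FreeSWITCH frames makes it easier to see whether the crashes share one code path or, as in the two traces above, come from different places (mod_xml_cdr reporting in one, mod_callcenter in the other).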



Thanks,

Shaun


Shaun Stokes - Infrastructure Analyst

T :     01453 700713
E :     shaun.stokes at itec-support.co.uk
W :     www.itec-support.co.uk

Registered Address :- ITEC Support, Suite 2 Prospect House, Bath Road, Stroud, Gloucestershire GL5 3QF
Company No. 06908001





--
Anthony Minessale II
Founder, FreeSWITCH.
http://freeswitch.com


https://youtu.be/l_hOxzCt6X4
https://www.youtube.com/watch?v=oAxXgyx5jUw
https://www.youtube.com/watch?v=9XXgW34t40s
https://www.youtube.com/watch?v=NLaDpGQuZDA