[Freeswitch-users] High Availability Cluster Module for FreeSWITCH

Marcin Gozdalik gozdal at gmail.com
Sun Feb 10 23:11:22 MSK 2013


Don't get me wrong, I'd love to fund a good HA module for FS, if for no
other reason than that I could benefit from it.
But having done a few installations of systems that were supposed to
be "HA" and seen them fail when real problems came, I know it ain't
easy.
Redundant networks are fine, but the following scenarios usually lead to
both machines replying to ARPs for the virtual IP, and the whole HA setup
falls apart:

1) FS stops responding (e.g. due to heavy swapping or a full disk), yet
the kernel still manages to reply to ARPs
2) the HA module fails (e.g. crashes) but FS keeps working
3) some firewall rule is activated that stops multicast traffic (all or some)

STONITH based on a separate technology (like a USB-USB connection
connected to some KVM-over-IP with control over power) is
indispensable in such scenarios.
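
What the three scenarios have in common is that one liveness signal says
"alive" while another says "dead". Below is a minimal sketch of a
decision rule for the standby node that never takes over the virtual IP
without fencing first. All names (NodeView, decide, duplicate_arp_risk)
and the probes they stand for are hypothetical; none of this is taken from
FreeSWITCH, mod_ha_cluster or Pacemaker.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"                                  # peer looks healthy
    FENCE_THEN_TAKE_OVER = "fence_then_take_over"  # power off peer, then claim the IP


@dataclass
class NodeView:
    """What the standby observes about the active node (hypothetical probes)."""
    answers_arp: bool       # kernel still replies to ARP for the virtual IP
    heartbeat_seen: bool    # HA module heartbeat (e.g. multicast) seen recently
    service_responds: bool  # application-level probe of FS was answered


def decide(view: NodeView) -> Action:
    """Take over only after fencing; ARP replies are deliberately ignored here
    because they come from the kernel and say nothing about FS itself."""
    if view.heartbeat_seen and view.service_responds:
        return Action.NONE
    return Action.FENCE_THEN_TAKE_OVER


def duplicate_arp_risk(view: NodeView) -> bool:
    """True when taking over WITHOUT fencing would leave two ARP responders."""
    return view.answers_arp and decide(view) is not Action.NONE


# Scenario 2: HA module crashed, FS still works, kernel still answers ARP.
scenario_2 = NodeView(answers_arp=True, heartbeat_seen=False, service_responds=True)
print(decide(scenario_2).value, duplicate_arp_risk(scenario_2))
```

In this sketch a missing heartbeat alone is never proof that the peer is
down, which is why the only takeover path goes through fencing.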

2013/2/10 Eliot Gable <egable+freeswitch at gmail.com>:
> On Sun, Feb 10, 2013 at 12:08 PM, Steven Ayre <steveayre at gmail.com> wrote:
>> That covers redundancy in case of a network card or cable failure, but isn't
>> what partitioning is about. Multiple NICs cannot prevent partitioning.
>>
>> As an example, partitioning might happen when a network switch between two
>> network segments fails so you have nodes A+B in segment 1 able to talk to
>> each other but unable to talk to nodes C+D in segment 2, while C+D can talk
>> to each other but not A+B.
>>
>> Pacemaker/corosync contain a lot of algorithms to fence off partitions
>> without quorum and can resort to things like STONITH if required to force a
>> node to shut down rather than risk it causing disruption to the cluster (for
>> example if it tries to take over traffic to a virtual IP you could end up in
>> a case where you have two servers sending ARP responses for the same IP).
>>
>
> Steve,
>
> As Avi pointed out, I mentioned having multiple physical networks as a
> guard against a network split / partition. If one network is split
> such that A and B can talk to each other over it and C and D can talk
> to each other over it, you would indeed have an issue if you only had
> one network. However, with two or more networks, all four nodes will
> still be able to talk to each other over the other network(s).
>
> Now, granted, if you have a network split in all networks, then you
> are still screwed. Pacemaker and other solutions deal with this, as
> you mentioned, using something called "quorum" where you need a
> majority of nodes to be able to see each other, and they fence the
> remaining nodes. As I documented on my wiki page for the module, I do
> have plans to eventually support such functionality. However, that is
> a bit further down the road as it will take some time to develop
> STONITH interfaces to various hardware or even to reuse the STONITH
> modules from Pacemaker or another project. In any case, I feel it is
> more important to get the base functionality developed and debugged as
> utilizing multiple networks is a good way to prevent network splits
> from being an issue.
>
> That being said, there are other issues to contend with when
> discussing network splits. For example, if A and B can see the
> Internet but C and D cannot, but C is a Master and B is a slave, you
> still have an issue to address. In this case, mod_ha_cluster must be
> able to determine that C and D cannot see the Internet. They need to
> perform very fast pings to some IP address, or have some external host
> sending them data in some way that they can detect when traffic
> to/from the Internet has stopped. I can place a media bug on the audio
> streams to make this determination fairly accurately. I can also rely
> on a ping mechanism to make the determination. Once the determination
> is made, mod_ha_cluster then has to promote B to master to take over
> from C.
>
> So, there are certainly still other issues to address when a network
> split occurs, but split-brain is easily avoided by simply adding
> redundant networks.
>

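For reference, the majority rule behind the "quorum" mentioned in the
quoted message can be sketched as follows. This is only an illustration of
the arithmetic, not how Pacemaker/corosync or mod_ha_cluster implement it;
has_quorum is a hypothetical helper.

```python
def has_quorum(visible_nodes: int, cluster_size: int) -> bool:
    """A partition has quorum only if it sees a strict majority of the cluster.

    visible_nodes counts the local node plus every peer it can still reach.
    """
    if cluster_size <= 0 or not 0 <= visible_nodes <= cluster_size:
        raise ValueError("visible_nodes must be between 0 and cluster_size")
    return 2 * visible_nodes > cluster_size


# The A+B / C+D split from the quoted example: each side sees 2 of 4 nodes,
# so neither side holds a majority.
print(has_quorum(2, 4))
# In a 5-node cluster, a partition of 3 nodes keeps quorum.
print(has_quorum(3, 5))
```

With an even node count, a clean half/half split leaves no partition with
a majority, so some tie-breaker or fencing is still needed to decide who
keeps the virtual IP.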


--
Marcin Gozdalik


