[Freeswitch-users] High Availability Cluster Module for FreeSWITCH

Avi Marcus avi at avimarcus.net
Sun Feb 10 23:40:21 MSK 2013


Steve:
Several reasons.
One is that mod_ha_cluster is an N+1 cluster, not an active/passive pair.
That way if any single node fails, there's something to pick up the slack.
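
To make the N+1 idea concrete, here is a minimal sketch of N active nodes sharing one standby. It is only an illustration of the concept: the `Cluster` class, its fields, and the node names are hypothetical and are not taken from mod_ha_cluster.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Cluster:
    """Hypothetical model: N active nodes plus one shared standby."""
    active: Set[str]
    standby: Optional[str]

    def fail(self, node: str) -> Optional[str]:
        """Remove a failed active node; return the node that takes over, if any."""
        if node not in self.active:
            raise ValueError(f"{node} is not an active node")
        self.active.remove(node)
        if self.standby is None:
            # The single spare is already in use, so this failure is not covered.
            return None
        replacement, self.standby = self.standby, None
        self.active.add(replacement)
        return replacement


cluster = Cluster(active={"A", "B", "C"}, standby="D")
took_over = cluster.fail("B")  # the standby D picks up B's load
```

With an active/passive pair, each active node would need its own idle partner; here one spare covers whichever active node fails first.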

Secondly, in order to recover live calls, you need a list of the calls.
That currently requires some sort of odbc (or postgres) with replication.
Again, that's abstracted as part of mod_ha_cluster.

Third: The docs mention a similar kind of pooling for registrations: you can
register to one server and you're registered on them all, without needing a DB
to sync everything.

Fourth, according to the docs: single configuration for all FS instances,
rather than manually ensuring each one has the same config.

Fifth: Voicemail clustering? Or we'll have to wait for mod_voicemail's APIs
to be rewritten for that, perhaps...

There's certainly *something* special possible with mod_ha_cluster that
can't be done with existing solutions cleanly, if at all...

-Avi

On Sun, Feb 10, 2013 at 10:16 PM, Steven Ayre <steveayre at gmail.com> wrote:

> Fair enough.
>
> I still don't see the need to reinvent the wheel just to avoid depending
> on a 3rd party well developed and extensively tested piece of software.
> Corosync provides an API that can be used for passing messages between
> nodes in a cluster.
>
> -Steve
>
>
>
> On 10 February 2013 18:14, Eliot Gable <egable+freeswitch at gmail.com> wrote:
>
>> On Sun, Feb 10, 2013 at 12:08 PM, Steven Ayre <steveayre at gmail.com> wrote:
>> > That covers redundancy in case of a network card or cable failure, but
>> > isn't what partitioning is about. Multiple NICs cannot prevent partitioning.
>> >
>> > As an example, partitioning might happen when a network switch between two
>> > network segments fails so you have nodes A+B in segment 1 able to talk to
>> > each other but unable to talk to nodes C+D in segment 2, while C+D can talk
>> > to each other but not A+B.
>> >
>> > Pacemaker/corosync contain a lot of algorithms to fence off partitions
>> > without quorum and can resort to things like STONITH if required to force a
>> > node to shutdown rather than risk it causing disruption to the cluster (for
>> > example if it tries to take over traffic to a virtual IP you could end up in
>> > a case where you have two servers sending ARP responses for the same IP).
>> >
>>
>> Steve,
>>
>> As Avi pointed out, I mentioned having multiple physical networks as a
>> guard against a network split / partition. If one network is split
>> such that A and B can talk to each other over it and C and D can talk
>> to each other over it, you would indeed have an issue if you only had
>> one network. However, with two or more networks, all four nodes will
>> still be able to talk to each other over the other network(s).
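
The redundant-network argument can be checked with a small sketch. It models each physical network as a list of segments (sets of nodes that can still talk to each other) and reports the node pairs that have no working path on any network. The function names and the `networks` layout are assumptions made for illustration only.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def can_reach(a: str, b: str, networks: Dict[str, List[Set[str]]]) -> bool:
    """True if a and b sit in the same segment on at least one network."""
    return any(
        a in segment and b in segment
        for segments in networks.values()
        for segment in segments
    )


def unreachable_pairs(
    nodes: Set[str], networks: Dict[str, List[Set[str]]]
) -> List[Tuple[str, str]]:
    """List every pair of nodes that cannot talk directly over any network."""
    return [
        (a, b)
        for a, b in combinations(sorted(nodes), 2)
        if not can_reach(a, b, networks)
    ]


nodes = {"A", "B", "C", "D"}

# One network, split between A+B and C+D: the cluster is partitioned.
single = {"net1": [{"A", "B"}, {"C", "D"}]}

# Same split on net1, but a second, intact network still connects everyone.
redundant = {"net1": [{"A", "B"}, {"C", "D"}], "net2": [{"A", "B", "C", "D"}]}

split_pairs = unreachable_pairs(nodes, single)
healed_pairs = unreachable_pairs(nodes, redundant)
```

With only `net1`, every A/B node loses every C/D node; adding the intact `net2` leaves no unreachable pair.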
>>
>> Now, granted, if you have a network split in all networks, then you
>> are still screwed. Pacemaker and other solutions deal with this, as
>> you mentioned, using something called "quorum" where you need a
>> majority of nodes to be able to see each other, and they fence the
>> remaining nodes. As I documented on my wiki page for the module, I do
>> have plans to eventually support such functionality. However, that is
>> a bit further down the road as it will take some time to develop
>> STONITH interfaces to various hardware or even to reuse the STONITH
>> modules from Pacemaker or another project. In any case, I feel it is
>> more important to get the base functionality developed and debugged as
>> utilizing multiple networks is a good way to prevent network splits
>> from being an issue.
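
For reference, the quorum rule described above comes down to a strict-majority test. The sketch below is a generic illustration, not Pacemaker's or mod_ha_cluster's actual logic, and the function names are hypothetical.

```python
from typing import List, Set


def has_quorum(visible_nodes: int, cluster_size: int) -> bool:
    """Strict majority: a partition must see more than half of all nodes."""
    return 2 * visible_nodes > cluster_size


def partitions_without_quorum(partitions: List[Set[str]]) -> List[Set[str]]:
    """Return the partitions that would have to be fenced or stop service."""
    cluster_size = sum(len(p) for p in partitions)
    return [p for p in partitions if not has_quorum(len(p), cluster_size)]


# Five nodes split 3/2: the side with three nodes keeps quorum.
uneven = partitions_without_quorum([{"A", "B", "C"}, {"D", "E"}])

# Four nodes split 2/2: neither side has a majority.
even = partitions_without_quorum([{"A", "B"}, {"C", "D"}])
```

Note the even split: with four nodes divided two and two, no side reaches a majority, which is one reason odd cluster sizes or a tie-breaker are commonly used.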
>>
>> That being said, there are other issues to contend with when
>> discussing network splits. For example, if A and B can see the
>> Internet but C and D cannot, but C is a Master and B is a slave, you
>> still have an issue to address. In this case, mod_ha_cluster must be
>> able to determine that C and D cannot see the Internet. They need to
>> perform very fast pings to some IP address, or have some external host
>> sending them data in some way that they can detect when traffic
>> to/from the Internet has stopped. I can place a media bug on the audio
>> streams to make this determination fairly accurately. I can also rely
>> on a ping mechanism to make the determination. Once the determination
>> is made, mod_ha_cluster then has to promote B to master so it can take over
>> for C.
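
A rough sketch of the ping-based variant of this check follows. The probe is injected as a plain callable so no real network access is needed; `ReachabilityMonitor`, `pick_promotion`, and the three-failure threshold are all assumptions for illustration and do not describe how mod_ha_cluster is implemented.

```python
from typing import Callable, Dict, Optional, Tuple


class ReachabilityMonitor:
    """Counts consecutive failed probes toward some external IP address."""

    def __init__(self, probe: Callable[[], bool], max_failures: int = 3):
        self.probe = probe              # returns True when the probe got a reply
        self.max_failures = max_failures
        self.failures = 0

    def tick(self) -> bool:
        """Run one probe; return True while the Internet still counts as reachable."""
        if self.probe():
            self.failures = 0
        else:
            self.failures += 1
        return self.failures < self.max_failures


def pick_promotion(
    roles: Dict[str, str], reachable: Dict[str, bool]
) -> Optional[Tuple[str, str]]:
    """Return (cut-off master, slave to promote), or None if nothing to do."""
    for node in sorted(roles):
        if roles[node] == "master" and not reachable[node]:
            for candidate in sorted(roles):
                if roles[candidate] == "slave" and reachable[candidate]:
                    return (node, candidate)
    return None


# The scenario from the text: A and B see the Internet, C and D do not.
roles = {"A": "master", "B": "slave", "C": "master", "D": "slave"}
reachable = {"A": True, "B": True, "C": False, "D": False}
decision = pick_promotion(roles, reachable)
```

Requiring several consecutive failures before declaring the Internet unreachable trades a little detection speed for fewer false promotions on a single lost packet.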
>>
>> So, there are certainly still other issues to address when a network
>> split occurs, but split-brain is easily avoided by simply adding
>> redundant networks.
>>
>> _________________________________________________________________________
>> Professional FreeSWITCH Consulting Services:
>> consulting at freeswitch.org
>> http://www.freeswitchsolutions.com
>>
>> 
>> 
>>
>> Official FreeSWITCH Sites
>> http://www.freeswitch.org
>> http://wiki.freeswitch.org
>> http://www.cluecon.com
>>
>> FreeSWITCH-users mailing list
>> FreeSWITCH-users at lists.freeswitch.org
>> http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
>> UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
>> http://www.freeswitch.org
>>