[Freeswitch-users] High Availability Cluster Module for FreeSWITCH

Eliot Gable egable+freeswitch at gmail.com
Sun Feb 10 18:51:43 MSK 2013


On Sun, Feb 10, 2013 at 10:27 AM, Eliot Gable
<egable+freeswitch at gmail.com> wrote:
> You use multiple NICs in the systems and send heartbeats out all of
> them. There is no other way to do it. Two NICs are required, and 3 are
> recommended.

Just to clarify a little, this is something that needs to be handled
mostly by the person / organization deploying the HA module. The HA
module supports multiple heartbeat NICs. The entity deploying the
module needs to design the physical layer of their network to ensure a
full network partition can never occur. You do that in general by
deploying 2 or more physical networks. This means you need redundant
power, battery backups, multiple physical switches, etc. You need to
design and deploy your physical network to ensure that no matter what
fails (power, wiring, switch ports, switches, NICs, etc.), the systems
using the module always have an alternative method of communicating
with each other. Typically, two physical networks are sufficient for
most users. However, carrier deployments that want to offer 99.999% or
better uptime might want to go with three physical networks. This
means placing three NICs in each system (one on each network) and
configuring mod_ha_cluster to send and receive messages on all three
NICs.
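
To make the idea concrete, here is a rough sketch in Python of sending
the same heartbeat out every heartbeat network. It is only an
illustration under stated assumptions, not how mod_ha_cluster is
implemented: the function names, the JSON payload, and the "node:seq"
message ID format are all made up for the example, and it assumes each
physical network has its own subnet so that binding to a local address
picks the matching NIC.

```python
import json
import socket


def open_heartbeat_sockets(local_addrs):
    """Open one UDP socket per local address (one address per heartbeat NIC)."""
    socks = []
    for addr in local_addrs:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # Binding fixes the source address. Assumption: one subnet per
        # physical network, so routing uses the NIC that owns this address.
        s.bind((addr, 0))
        socks.append(s)
    return socks


def send_heartbeat(socks, peer_addrs, node, seq):
    """Send the same heartbeat on every network; return how many sends succeeded."""
    # Every copy carries the same ID so the receiver can drop duplicates.
    payload = json.dumps({"id": f"{node}:{seq}", "node": node}).encode()
    sent = 0
    for s, peer in zip(socks, peer_addrs):
        try:
            s.sendto(payload, peer)
            sent += 1
        except OSError:
            # A failure on one network must not stop the others.
            continue
    return sent
```

The point of the try/except is that a dead NIC or switch on one network
only lowers the count; the heartbeat still goes out on the remaining
networks.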

The module keeps several seconds of cached messages in a hash table to
eliminate / ignore duplicate messages. The first copy received is
used, and its message ID is stored in the hash table. If another copy
arrives before that entry is pruned from the cache, the additional
copy is ignored.
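
A minimal sketch of that duplicate filter, in Python rather than the
module's C, might look like the following. The class name, the 5 second
retention window, and the injectable clock are assumptions for the
example; the real module's cache size and pruning policy may differ.

```python
import time
from collections import OrderedDict


class DuplicateFilter:
    """Remember recently seen message IDs for a short window (hypothetical helper)."""

    def __init__(self, ttl_seconds=5.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.seen = OrderedDict()  # message ID -> time first seen, oldest first

    def _prune(self, now):
        # Drop entries that have aged out of the retention window.
        while self.seen:
            msg_id, first_seen = next(iter(self.seen.items()))
            if now - first_seen < self.ttl:
                break
            self.seen.popitem(last=False)

    def accept(self, msg_id):
        """Return True for the first copy of a message, False for a duplicate."""
        now = self.clock()
        self._prune(now)
        if msg_id in self.seen:
            return False
        self.seen[msg_id] = now
        return True
```

With three heartbeat networks, the receiver would call accept() once
per arriving copy and process the message only when it returns True, so
the two later copies are dropped.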

In addition, you can deploy multiple clusters distributed
geographically to ensure that if one entire cluster goes offline, your
services do not fail entirely.

There is nothing magical or "hard" per se about detecting and
preventing a network split from screwing with the cluster. The hardest
part is simply educating people on how to design and deploy the
physical network and configure the module so it can do the detection.
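
As an illustration of why redundant networks make the detection
straightforward, here is a small Python sketch that classifies a peer
from the last time a heartbeat was heard on each network. The function
name, the state names, and the 2 second timeout are assumptions for the
example and are not taken from mod_ha_cluster.

```python
def classify_peer(last_heard, now, timeout=2.0):
    """Classify a peer from per-network heartbeat times.

    last_heard maps a network name to the time of the last heartbeat
    received on that network, or None if nothing was ever received.
    """
    if not last_heard:
        return "unreachable"
    alive = [
        net for net, t in last_heard.items()
        if t is not None and now - t <= timeout
    ]
    if len(alive) == len(last_heard):
        return "healthy"
    if alive:
        # Some links are down, but the peer is still talking to us,
        # so this is a link failure, not a dead node.
        return "degraded"
    # Silent on every network at once.
    return "unreachable"
```

With a single network, "degraded" can never be observed: a lost link
and a dead peer look identical. With two or more independent networks,
a peer only looks unreachable when every path has failed at the same
time, which is exactly what the physical design is supposed to rule
out.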

Also, please do not confuse the difficulty of writing a general
purpose HA system like Pacemaker with the relative simplicity of
writing one for a single specific application. When you write
something general purpose like Pacemaker, your task is vastly more
difficult due to the multitude of software configurations you need to
support. For mod_ha_cluster, there is a very specific design in mind.
Namely, you set a master-to-slave ratio and bring nodes online, and the
module is designed to bring those nodes into service in a specific,
well-defined way. There is really only "one way" that this
module supports bringing up the cluster. This vastly simplifies the
code and the entire project.
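
To show how little logic a fixed master-to-slave ratio needs, here is a
hypothetical Python sketch that splits a node count into masters and
slaves. The rule used here (round masters down, but always keep at
least one) is an assumption made up for the example; it is not the
algorithm mod_ha_cluster actually uses.

```python
def plan_roles(node_count, masters=3, slaves=1):
    """Split node_count nodes into (masters, slaves) for a masters:slaves ratio.

    Hypothetical rule for illustration only; not taken from mod_ha_cluster.
    """
    if masters < 1 or slaves < 0:
        raise ValueError("ratio must have at least one master and no negative slaves")
    if node_count < 1:
        return (0, 0)
    # Integer math only, so every node computes the same answer.
    n_masters = max(1, (node_count * masters) // (masters + slaves))
    return (n_masters, node_count - n_masters)
```

With the default 3:1 ratio, a single node runs as a master, two nodes
split into one master and one slave, and four nodes split into three
masters and one slave.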


