[Freeswitch-users] High Availability Cluster Module for FreeSWITCH

Mon Feb 11 17:06:59 MSK 2013

On Mon, Feb 11, 2013 at 7:36 AM, Marcin Gozdalik <gozdal at gmail.com> wrote:

> +1
>
> I do not doubt mod_ha is necessary inside of FS and it may be
> better/simpler than writing a Pacemaker resource agent, but writing
> yet-another-cluster-communication-engine is IMHO the wrong way to go
> and using Corosync for communication will give a lot of value from
> a mature codebase.
>
>
I understand what you are saying, but what I am trying to get across is
that I am not writing yet-another-cluster-communication-engine. All I am
really doing is combining a multicast messaging API written by Tony and the
event API in FS to broadcast existing state information between multiple FS
nodes, as well as adding a tiny amount of logic on top of that to
coordinate call fail over and recovery. That's probably a little
over-simplified, but it gets the point across. The network communication
code is already in FS and well tested. The event system is already in FS
and well tested. I have already written the code to the point that it
parses the configuration files and starts sending heartbeats out all of the
interfaces configured. I have also already written a lot of the code that
deals with the state transitions. All I am talking about doing is
implementing a tiny little finite state machine. It's a pretty trivial
programming task. In fact, I think it was covered in my first year at
Carnegie Mellon University. Of course, I had already figured out how to
write such things in high school, I just did not know what it was called at
that point. My point is that this is not
yet-another-cluster-communication-engine. It is a very specific and small
finite state machine designed solely with the goal in mind of making FS
have just enough information to coordinate call fail over internally. If I
recall correctly, a lot of people also said writing yet-another-VoIP-server
was a waste of time, but now we have FreeSWITCH, and it was obviously worth
the effort. And I am not even trying to do something as complex as that. If
you think this is yet-another-cluster-communication-engine, you are missing
the point. It is not. It never will be.
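
To make the "tiny little finite state machine" concrete, here is a minimal
sketch in Python. It is not the mod_ha code. The state names, the
"recover_calls" action, and the 3-second timeout are all assumptions chosen
for illustration; the only ideas taken from the above are that nodes
exchange heartbeats and that a small state machine decides when to take
over calls.

```python
from enum import Enum


class State(Enum):
    STANDBY = "standby"   # peer is healthy, this node only listens
    ACTIVE = "active"     # this node has taken over call handling


class FailoverFSM:
    """Hypothetical two-node failover state machine (illustration only)."""

    def __init__(self, timeout=3.0, now=0.0):
        self.timeout = timeout        # assumed: seconds of silence before takeover
        self.state = State.STANDBY
        self.last_heartbeat = now     # time the peer was last heard from

    def on_heartbeat(self, now):
        """Record that a heartbeat from the peer arrived at time `now`."""
        self.last_heartbeat = now

    def on_tick(self, now):
        """Periodic check. Returns an action name when a transition fires."""
        silent_for = now - self.last_heartbeat
        if self.state is State.STANDBY and silent_for > self.timeout:
            self.state = State.ACTIVE
            return "recover_calls"    # assumed action: start call recovery
        return None


fsm = FailoverFSM(timeout=3.0, now=0.0)
fsm.on_heartbeat(1.0)
first = fsm.on_tick(3.0)    # peer silent for 2.0 s, still within the timeout
second = fsm.on_tick(4.5)   # peer silent for 3.5 s, takeover fires
```

A real implementation would also need to deal with the peer coming back
and with both nodes believing they are active at once; the sketch leaves
those cases out on purpose.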

Look at Sonus, Genband, Broadsoft, Veraz, etc. All the big-name
carrier-grade telecom providers have a built-in solution for automatic call
fail over. The only way FreeSWITCH will ever compete with such solutions is
if it also has that feature. Pacemaker and Corosync are overkill just to
get FS to handle single node failures and provide call recovery. It took me
a full 3 months of working with them every day to really understand how to
deploy them properly in conjunction with FreeSWITCH and Postgres to provide
a carrier-grade hot-standby solution which was robust enough to handle 99%
of the failures I could throw at it. Granted, this was back when the
configuration still needed to be written by hand in XML and prior to the
existence of any resource agent for FreeSWITCH. But, even with those
changes, deploying Pacemaker and Corosync is not a simple task. If that is
the requirement for FS to have HA, it will never truly stand a chance
against commercial offerings.