Fair enough.<div><br></div><div>I still don't see the need to reinvent the wheel just to avoid depending on a well-developed, extensively tested third-party piece of software. Corosync provides an API that can be used for passing messages between nodes in a cluster.</div>
<div><div><div><br></div><div>-Steve</div><div><br></div><div><br><br><div class="gmail_quote">On 10 February 2013 18:14, Eliot Gable <span dir="ltr">&lt;<a href="mailto:egable+freeswitch@gmail.com" target="_blank">egable+freeswitch@gmail.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On Sun, Feb 10, 2013 at 12:08 PM, Steven Ayre &lt;<a href="mailto:steveayre@gmail.com">steveayre@gmail.com</a>&gt; wrote:<br>
> That covers redundancy in case of a network card or cable failure, but isn't<br>
> what partitioning is about. Multiple NICs cannot prevent partitioning.<br>
><br>
> As an example, partitioning might happen when a network switch between two<br>
> network segments fails so you have nodes A+B in segment 1 able to talk to<br>
> each other but unable to talk to nodes C+D in segment 2, while C+D can talk<br>
> to each other but not A+B.<br>
><br>
> Pacemaker/corosync contain a lot of algorithms to fence off partitions<br>
> without quorum and can resort to things like STONITH if required to force a<br>
> node to shut down rather than risk it causing disruption to the cluster (for<br>
> example if it tries to take over traffic to a virtual IP you could end up in<br>
> a case where you have two servers sending ARP responses for the same IP).<br>
><br>
<br>
</div>Steve,<br>
<br>
As Avi pointed out, I mentioned having multiple physical networks as a<br>
guard against a network split / partition. If one network is split<br>
such that A and B can talk to each other over it and C and D can talk<br>
to each other over it, you would indeed have an issue if you only had<br>
one network. However, with two or more networks, all four nodes will<br>
still be able to talk to each other over the other network(s).<br>
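<br>
A minimal sketch of the argument above, assuming each network is described simply as a list of segments (sets of nodes that can still reach each other over it). The names `can_talk` and `unreachable_pairs` are made up for illustration and are not part of mod_ha_cluster:<br>

```python
from itertools import combinations
from typing import List, Set, Tuple

# A network is modelled as its current segments: each segment is the set of
# nodes that can still reach each other over that network (an assumption
# made for this illustration only).
Network = List[Set[str]]


def can_talk(a: str, b: str, networks: List[Network]) -> bool:
    """True if at least one network has a segment containing both nodes."""
    return any(a in seg and b in seg for net in networks for seg in net)


def unreachable_pairs(nodes: Set[str], networks: List[Network]) -> List[Tuple[str, str]]:
    """List every pair of nodes with no working path on any network."""
    return [
        (a, b)
        for a, b in combinations(sorted(nodes), 2)
        if not can_talk(a, b, networks)
    ]


nodes = {"A", "B", "C", "D"}
split: Network = [{"A", "B"}, {"C", "D"}]   # switch between segments failed
healthy: Network = [{"A", "B", "C", "D"}]   # second, independent network

print(unreachable_pairs(nodes, [split]))           # one network only
print(unreachable_pairs(nodes, [split, healthy]))  # redundant network added
print(unreachable_pairs(nodes, [split, split]))    # every network split
```

With only the split network, A and B are cut off from C and D. Adding the healthy second network leaves no unreachable pairs, and splitting every network the same way brings the problem back.<br>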
<br>
Now, granted, if you have a network split in all networks, then you<br>
are still screwed. Pacemaker and other solutions deal with this, as<br>
you mentioned, using something called "quorum": the partition whose<br>
nodes form a majority of the cluster keeps running and fences the<br>
remaining nodes. As I documented on my wiki page for the module, I do<br>
have plans to eventually support such functionality. However, that is<br>
a bit further down the road as it will take some time to develop<br>
STONITH interfaces to various hardware or even to reuse the STONITH<br>
modules from Pacemaker or another project. In any case, I feel it is<br>
more important to get the base functionality developed and debugged<br>
first, since using multiple networks is already a good way to keep<br>
network splits from being an issue.<br>
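<br>
The quorum rule mentioned above can be sketched as follows. This is only an illustration of the strict-majority idea, not how Pacemaker or mod_ha_cluster implement it, and the function names are hypothetical:<br>

```python
from typing import Set


def has_quorum(visible_nodes: int, cluster_size: int) -> bool:
    """Strict majority: a partition needs more than half of the cluster.

    visible_nodes counts the local node plus every peer it can still see.
    """
    return visible_nodes > cluster_size // 2


def nodes_to_fence(partition: Set[str], cluster: Set[str]) -> Set[str]:
    """Nodes a partition would fence: everyone outside it, but only if the
    partition itself holds quorum. A partition without quorum fences nobody."""
    if not has_quorum(len(partition), len(cluster)):
        return set()
    return cluster - partition


five = {"A", "B", "C", "D", "E"}
print(nodes_to_fence({"A", "B", "C"}, five))  # majority side fences D and E
print(nodes_to_fence({"D", "E"}, five))       # minority side fences nobody

four = {"A", "B", "C", "D"}
print(nodes_to_fence({"A", "B"}, four))       # even split: no side has quorum
```

The even split in the last call is the awkward case: with four nodes divided two and two, neither side has a majority, so a plain majority rule leaves nobody entitled to fence.<br>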
<br>
That being said, there are other issues to contend with when<br>
discussing network splits. For example, if A and B can see the<br>
Internet but C and D cannot, and C is a master while B is a slave, you<br>
still have an issue to address. In this case, mod_ha_cluster must be<br>
able to determine that C and D cannot see the Internet. They need to<br>
perform very fast pings to some IP address, or have some external host<br>
sending them data in such a way that they can detect when traffic<br>
to/from the Internet has stopped. I can place a media bug on the audio<br>
streams to make this determination fairly accurately. I can also rely<br>
on a ping mechanism to make the determination. Once the determination<br>
is made, mod_ha_cluster then has to promote B to master so that it can<br>
take over for C.<br>
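<br>
A rough sketch of the detection and promotion logic described above, assuming probe results (from pings or from watching media traffic) are fed in from elsewhere. `UplinkMonitor`, `choose_promotion`, and the threshold of three missed probes are all hypothetical and only illustrate the idea:<br>

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class UplinkMonitor:
    """Tracks consecutive failed Internet probes for one node."""
    max_misses: int = 3   # assumed threshold, not taken from mod_ha_cluster
    misses: int = 0

    def record(self, probe_ok: bool) -> None:
        # A successful probe resets the counter; a failure increments it.
        self.misses = 0 if probe_ok else self.misses + 1

    @property
    def uplink_down(self) -> bool:
        return self.misses >= self.max_misses


def choose_promotion(slaves: List[str],
                     monitors: Dict[str, UplinkMonitor]) -> Optional[str]:
    """Return the first slave whose uplink still looks healthy, else None."""
    for name in slaves:
        if not monitors[name].uplink_down:
            return name
    return None


monitors = {name: UplinkMonitor() for name in ("B", "C", "D")}
for _ in range(3):
    monitors["B"].record(True)    # B still sees the Internet
    monitors["C"].record(False)   # master C has lost its uplink
    monitors["D"].record(False)   # slave D has lost it too

if monitors["C"].uplink_down:
    print(choose_promotion(["D", "B"], monitors))
```

Here C is treated as the master that lost its uplink, and B is picked over D because D's probes are failing as well. How fast the probes run and how many misses to tolerate is a trade-off between failover speed and false alarms.<br>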
<br>
So, there are certainly still other issues to address when a network<br>
split occurs, but split-brain is easily avoided by simply adding<br>
redundant networks.<br>
<div class="HOEnZb"><div class="h5"><br>
FreeSWITCH-users mailing list<br>
<a href="mailto:FreeSWITCH-users@lists.freeswitch.org">FreeSWITCH-users@lists.freeswitch.org</a><br>
<a href="http://lists.freeswitch.org/mailman/listinfo/freeswitch-users" target="_blank">http://lists.freeswitch.org/mailman/listinfo/freeswitch-users</a><br>
</div></div></blockquote></div><br></div></div></div>