ZMQ also does not work with a fork, which is needed in order to execute any system commands (like iptables or anything else which has no programming API). That pretty much eliminates ZMQ as a possibility. <div><br></div><div>
I did some research this weekend, and of all the possibilities I could find, the one that caught my attention the most was Spread: <a href="http://www.spread.org/">http://www.spread.org/</a></div><div><br></div><div>There are some drawbacks to it, namely:</div>
<div><br></div><div>1) It requires any marketing material mentioning FreeSWITCH or any project / solution utilizing FreeSWITCH to also include a prepared statement about the use of the Spread toolkit. This is a fairly major licensing issue, as all commercial solutions utilizing FreeSWITCH as a core component would then also need to mention the use of the Spread toolkit within FreeSWITCH. It is essentially a "viral clause," which would be very tedious for people to make sure they honor correctly.</div>
<div><br></div><div>2) The toolkit uses a stand-alone daemon which would then need to be monitored separately from FreeSWITCH and would be another point of failure, adding complexity to the system. Basically, if that daemon were to crash, FreeSWITCH would need to know about it and would need to either respawn it or shut down. This would bring the need for Pacemaker or something similar back into the picture for any viable HA solution. Alternatively, we could write some code similar to daemontools into FreeSWITCH which respawns the daemon if it dies, but we would have to test the impact of this respawning on the overall cluster to determine if it impacts visibility of the node at any point in time. Another alternative would be to wrap the daemon into a thread inside FreeSWITCH such that if the daemon caused a segfault or something, it would force FreeSWITCH to terminate, as well. I have not done a code review of the daemon yet to determine if this is a viable alternative, but assuming they have not coded it on LSD or something, it is more than likely possible. <br>
<br>3) They boast about how it can handle up to 8,000 1KB messages per second. I don't consider that boast-worthy. When I worked at Broadvox a few years ago, I had a FS pair which ran around 380 calls per second (760 sessions per second). Each call generates dozens of events. That hardware was getting dated when I left Broadvox, and today's hardware along with the performance improvements done to FS since then means we could conceivably have a single node which runs over 1k calls per second firing dozens of events per call. That means a single box could completely consume the message bandwidth of the entire Spread network. Imagine trying to have 64 such boxes running. We are really in need of a solution which boasts hundreds of thousands of messages per second. Spread seems like it might be off by an order of magnitude and then some.</div>
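<div><br></div><div>To put rough numbers on item 3, here is a back-of-the-envelope sketch in Python. The 8,000 messages per second, 1,000 calls per second and 64 nodes come from the text above; the figure of 24 events per call is only an assumption standing in for "dozens", so treat the output as an estimate, not a measurement.</div>

```python
# Figures quoted in the message above.
SPREAD_MSGS_PER_SEC = 8_000      # advertised 1KB messages per second
CALLS_PER_SEC_PER_NODE = 1_000   # conceivable rate for one modern node
NODES = 64

# Assumption: "dozens of events per call" is taken as 24 here.
EVENTS_PER_CALL = 24


def events_per_second(calls_per_sec, events_per_call, nodes=1):
    """Total events per second generated by `nodes` identical boxes."""
    return calls_per_sec * events_per_call * nodes


single_node = events_per_second(CALLS_PER_SEC_PER_NODE, EVENTS_PER_CALL)
cluster = events_per_second(CALLS_PER_SEC_PER_NODE, EVENTS_PER_CALL, NODES)

print(f"one node:  {single_node:,} events/s "
      f"({single_node / SPREAD_MSGS_PER_SEC:.1f}x the advertised rate)")
print(f"{NODES} nodes: {cluster:,} events/s "
      f"({cluster / SPREAD_MSGS_PER_SEC:.0f}x the advertised rate)")
```

<div>Under that assumption, a single node already exceeds the advertised rate several times over, which is why pushing every event across Spread does not look workable.</div>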
<div><br></div><div>Despite these issues, Spread still seems to come closer to our needs than any other solution I found. FYI, I also looked at the Corosync IPC system, and was not at all impressed. On paper, Spread exceeds Corosync's capabilities by a fair margin.</div>
<div><br></div><div>There are some strategies for mitigating issue #3 with Spread, as well. For example, we could limit messages across the Spread network to things like heartbeats and / or other HA and synchronization related messages. Basically, think of it like a D-channel on a PRI. For sending high packet-per-second streams of messages, we could do standard unicast connections or even try straight up multicasting to all nodes on the LAN. Sending heartbeats every 10ms across the Spread network would put a 64-node cluster at 6,400 messages per second just with heartbeats. That would still leave a decent amount of message bandwidth available for other types of negotiation messages and should still allow for sub-second fail-over detection and reaction. Of course, this is all assuming we can actually get 8,000 1KB messages per second out of a 64-node cluster. </div>
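<div><br></div><div>The heartbeat budget works out as follows in a small Python sketch. The 10ms interval, 64 nodes and 8,000 messages per second are the figures from the paragraph above; the rule of declaring a node dead after three missed heartbeats is purely an assumption for illustration, and network delay is ignored.</div>

```python
SPREAD_MSGS_PER_SEC = 8_000   # advertised figure, not yet verified for 64 nodes
NODES = 64
HEARTBEAT_INTERVAL_MS = 10

# Assumption: a node is declared dead after this many missed heartbeats.
MISSED_BEFORE_DEAD = 3


def heartbeat_load(nodes, interval_ms):
    """Messages per second if every node sends one heartbeat per interval."""
    return nodes * 1000 // interval_ms


def detection_time_ms(interval_ms, missed):
    """Time to notice a dead node, ignoring network and scheduling delay."""
    return interval_ms * missed


load = heartbeat_load(NODES, HEARTBEAT_INTERVAL_MS)
headroom = SPREAD_MSGS_PER_SEC - load
detect_ms = detection_time_ms(HEARTBEAT_INTERVAL_MS, MISSED_BEFORE_DEAD)

print(f"heartbeat load: {load:,} msgs/s")
print(f"headroom:       {headroom:,} msgs/s for everything else")
print(f"detection time: about {detect_ms} ms")
```

<div>Lengthening the interval trades detection time for headroom: at 50ms the same cluster would send 1,280 heartbeats per second and, with the same three-miss rule, still detect a failure in well under a second.</div>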
<div><br></div><div>There are likely lots of things that impact how many messages per second Spread can handle. A lot of it has to do with network latency and CPU power. Spread uses acknowledgements and message reordering to ensure delivery in a way that accounts for things like node membership changes during the time the message is in transit and whether the message has been received by all nodes in the cluster. Network latency is probably one of the biggest factors in how many messages per second it can handle. On a faster network link, the messages per second would be higher and on a slower network, it would be lower. CPU processing time and scheduling are important, as well. If one system is extremely overloaded and the Spread daemon is being starved for CPU resources, that will add extra latency in processing and also reduce message throughput. This could also impact whether the node is seen as visible. So, this is one more argument for why we would need to try to run the daemon as a thread under FreeSWITCH. Then it has the same scheduling priority as FreeSWITCH and cannot be starved for resources by FreeSWITCH itself. It also would exhibit the same amount of resource starvation FreeSWITCH experiences on the node, so it would more accurately reflect the state of FreeSWITCH on the node. </div>
<div><br></div><div>If anyone has any suggestions other than Spread, I would like to hear them. Also, some feedback on item #1 would be great, as I cannot really judge for everyone else how willing they are to accept such a licensing clause.</div>
<div><br></div><div><br><br><div class="gmail_quote">On Tue, Feb 12, 2013 at 8:48 PM, João Mesquita <span dir="ltr">&lt;<a href="mailto:jmesquita@freeswitch.org" target="_blank">jmesquita@freeswitch.org</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">I have used ZeroMQ in the past for these sorts of things, but it really won't be able to detect failures very fast. It is not made for this. Maybe we can gather the requirements for such a message bus? Zmq, for example, provides you with this cool interface to build messaging protocols on top of it, but it does not provide reliability when it comes to endpoint-to-endpoint connections without a heartbeat implemented on the user end. Can this be used for FS as well? Anyhow, just throwing some ideas...</div>
<div class="gmail_extra"><br clear="all"><div>João Mesquita<br>FreeSWITCH Solutions<br></div>
<br><br><div class="gmail_quote"><div><div class="h5">On Tue, Feb 12, 2013 at 5:42 PM, Dave R. Kompel <span dir="ltr">&lt;<a href="mailto:drk@drkngs.net" target="_blank">drk@drkngs.net</a>&gt;</span> wrote:<br></div></div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">
<u></u>
<div>
<div>I've done a few experiments using both Redis and the evil "Microsoft Azure Service Bus" (the on-premises server version) to extend the eventing system to have global PUB/SUB. This way, things like registrations and Limit data could be made global. </div>
<div> </div>
<div>I'm looking for a way, in my carrier switch implementation, to implement both HA failover and scale-out clustering.</div>
<div> </div>
<div>--Dave</div><br>
<blockquote style="BORDER-LEFT:#0000ff 2px solid;PADDING-LEFT:5px;MARGIN-LEFT:5px;MARGIN-RIGHT:0px">
<hr>
<b>From:</b> Eliot Gable [mailto:<a href="mailto:egable%2Bfreeswitch@gmail.com" target="_blank">egable+freeswitch@gmail.com</a>]<br><b>To:</b> FreeSWITCH Users Help [mailto:<a href="mailto:freeswitch-users@lists.freeswitch.org" target="_blank">freeswitch-users@lists.freeswitch.org</a>]<br>
<b>Sent:</b> Tue, 12 Feb 2013 05:49:13 -0800<br><b>Subject:</b> [Freeswitch-users] FreeSWITCH Message Bus / Shared Key Value Store<div><br><br>Tony and Mike and I had a discussion last night about FreeSWITCH with regards to implementing some form of core message bus or shared key-value store. We discussed a few different options, but did not really settle on anything. If you are writing modules or using FreeSWITCH in a multi-node setting, please share what features / functionality you would like to see implemented in this regard, how you would use it, and why you want to see the specific mechanism of your choice rather than some alternative. Also, please consider and mention whether "cluster awareness" is something that factors into your use case. By this, I mean having each FS node have some idea about the state / status of each other node in terms of taking calls vs acting as a standby or slave node, etc. <br clear="all">
<div><br></div>-- <br>Eliot Gable<br><br></div></blockquote>
<div> </div>
<div> </div></div><br></div></div>_________________________________________________________________________<br>
Professional FreeSWITCH Consulting Services:<br>
<a href="mailto:consulting@freeswitch.org" target="_blank">consulting@freeswitch.org</a><br>
<a href="http://www.freeswitchsolutions.com" target="_blank">http://www.freeswitchsolutions.com</a><br>
<br>
FreeSWITCH-powered IP PBX: The CudaTel Communication Server<br>
<a href="http://www.cudatel.com" target="_blank">http://www.cudatel.com</a><br>
<br>
Official FreeSWITCH Sites<br>
<a href="http://www.freeswitch.org" target="_blank">http://www.freeswitch.org</a><br>
<a href="http://wiki.freeswitch.org" target="_blank">http://wiki.freeswitch.org</a><br>
<a href="http://www.cluecon.com" target="_blank">http://www.cluecon.com</a><br>
<br>
FreeSWITCH-users mailing list<br>
<a href="mailto:FreeSWITCH-users@lists.freeswitch.org" target="_blank">FreeSWITCH-users@lists.freeswitch.org</a><br>
<a href="http://lists.freeswitch.org/mailman/listinfo/freeswitch-users" target="_blank">http://lists.freeswitch.org/mailman/listinfo/freeswitch-users</a><br>
UNSUBSCRIBE:<a href="http://lists.freeswitch.org/mailman/options/freeswitch-users" target="_blank">http://lists.freeswitch.org/mailman/options/freeswitch-users</a><br>
<a href="http://www.freeswitch.org" target="_blank">http://www.freeswitch.org</a><br>
<br></blockquote></div><br></div>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br>Eliot Gable<br><br>"We do not inherit the Earth from our ancestors: we borrow it from our children." ~David Brower <br><br>"I decided the words were too conservative for me. We're not borrowing from our children, we're stealing from them--and it's not even considered to be a crime." ~David Brower<br>
<br>"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not live to eat.) ~Marcus Tullius Cicero
</div>