[Freeswitch-users] FreeSWITCH HA + Loadbalancing

Raimund Sacherer rs at runsolutions.com
Sat Aug 29 07:33:15 PDT 2009


Thinking about it, maybe we can create a solution, if some of us work  
together:

My strengths are in virtualization, Linux, development, databases,
integration, etc.
What I do not know much about is how SIP (and, for that matter,
everything else in the voice world) works under the hood, and how it
is implemented in FS.

I know that the state information for a call has to be stored
somewhere and retrieved somehow; I just do not know that part. What I
do know is that it has to be doable to store all the stream
information (IPs, ports, current states, etc.) in a very fast database
(my idea would be memcached, for example) so that another FS could
just take this information and take over the call. Maybe you lose a
second of voice, maybe you lose the recorded call file or a part of
it, but that should be it. (SipFoundry has a boxed open-source PBX
which, of course, is not as flexible as FreeSWITCH or Asterisk, but
has Call Live Migration and Call Live Failover integrated!)
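
To make the idea concrete, here is a rough sketch in Python of what
such a state store could look like. Everything in it is an assumption
for illustration: the CallStateStore class, the field names and the
timeout are made up, a plain dictionary stands in for memcached, and
none of this reflects how FreeSWITCH really keeps its call state.

```python
import json
import time


class CallStateStore:
    """In-memory stand-in for a fast key-value store such as memcached."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}  # call id -> (expiry time, JSON string)

    def put(self, call_id, state):
        # Serialize so the record could cross the network unchanged.
        self._data[call_id] = (self._clock() + self._ttl, json.dumps(state))

    def get(self, call_id):
        entry = self._data.get(call_id)
        if entry is None:
            return None
        expires_at, payload = entry
        if self._clock() >= expires_at:
            # Stale record: the call probably ended without cleanup.
            del self._data[call_id]
            return None
        return json.loads(payload)


store = CallStateStore()

# The active node writes the state of a call it handles (hypothetical fields).
store.put("call-1", {
    "state": "answered",
    "caller_media": {"ip": "192.0.2.10", "port": 16384},
    "callee_media": {"ip": "192.0.2.20", "port": 16386},
})

# After a failure, the standby node reads the record to resume the call.
recovered = store.get("call-1")
print(recovered["state"], recovered["caller_media"]["port"])
```

The expiry time is only there so that records of calls which ended
without cleanup disappear on their own; the active node would have to
refresh each record while the call is up.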

What I want is for my company to be able to sell a PBX with 99.99%
uptime (we do mostly call-center related stuff), which scales well and
can grow with the company without lots of hassle. My dream would be:

To begin with:

One hardware node with the essential hardware (Digium cards, for
example).
On this node are OpenVZ virtualized containers:
[VirtCnt1: FS which only talks to the hardware and forwards
everything] = could be replaced with a hardware media gateway, etc.
[VirtCnt2: FS which handles the PBX] \___ load-balanced, with ODBC or
[VirtCnt3: FS which handles the PBX] /    XML, failover, live takeover
[VirtCnt4: database for state information] (maybe something as
resource-friendly as memcached? or a more resource-heavy database?)

With this we can achieve all this:

Problem with VirtCnt2 (e.g. crash, lock, ...)
* VirtCnt3 can take over.
-> You are free to investigate the problem without stress; you can
debug and analyze while the machine is still running
-> You can also create a machine-state dump of the virtual container,
dump the container as well, copy the data to your lab and restore the
machine to the state it was in when the problem occurred, so you can
investigate it live in the lab (some prerequisites apply, but it is
easily doable)
-> Just think about the possibility of better bug reports, because
someone can take the time to read out all the data with GDB to
investigate the actual cause of a machine lock-up!
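
The takeover decision itself could be sketched roughly like this in
Python. It assumes a simple heartbeat scheme in which VirtCnt2
regularly signals VirtCnt3 that it is alive; the FailoverMonitor class
and the 3-second timeout are made up for illustration and are not an
existing FreeSWITCH or heartbeat API.

```python
class FailoverMonitor:
    """Runs on the standby node and watches heartbeats from the active node."""

    def __init__(self, timeout_seconds=3.0):
        self.timeout = timeout_seconds
        self.last_heartbeat = None
        self.role = "standby"

    def heartbeat(self, now):
        # Called whenever a heartbeat from the active node arrives.
        self.last_heartbeat = now

    def check(self, now):
        # Promote this node once the active node has been silent too long.
        if self.role == "standby" and self.last_heartbeat is not None:
            if now - self.last_heartbeat > self.timeout:
                self.role = "active"
        return self.role


monitor = FailoverMonitor(timeout_seconds=3.0)
monitor.heartbeat(now=0.0)
print(monitor.check(now=2.0))  # still inside the timeout
print(monitor.check(now=3.5))  # silent for more than 3 seconds
```

A locked-up machine stops sending heartbeats just like a crashed one,
so the standby takes over in both cases while the broken container
stays untouched for debugging.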

You want to upgrade to a new FreeSWITCH version?
* Take VirtCnt2 out of the load-balancing scheme
* Stop it, clone it
* Upgrade FreeSWITCH in the cloned container
* Start the cloned container
* If something is wrong, stop it and restart the original VirtCnt2
-> No problem at all: you can test on the live hardware, with part of
the live users (maybe a low-volume queue), to be sure everything works
fine before you activate the full load balancing
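
The upgrade steps above can be sketched as a small simulation in
Python. The containers are plain dictionaries and the health check is
a caller-supplied function standing in for the test with a low-volume
queue; upgrade_with_rollback and the "-clone" naming are assumptions
for illustration, not OpenVZ commands.

```python
def upgrade_with_rollback(pool, containers, name, new_version, health_check):
    """Upgrade a clone of `name`; fall back to the original if it misbehaves."""
    original = containers[name]
    pool.discard(name)             # take it out of the load-balancing scheme
    original["running"] = False    # stop it
    clone_name = name + "-clone"
    # Clone it, upgrade the clone and start the clone.
    clone = dict(original, version=new_version, running=True)
    containers[clone_name] = clone
    if health_check(clone):
        pool.add(clone_name)       # the upgraded clone now takes calls
        return clone_name
    clone["running"] = False       # something is wrong: stop the clone
    original["running"] = True     # and restart the original
    pool.add(name)
    return name


containers = {
    "VirtCnt2": {"version": "old", "running": True},
    "VirtCnt3": {"version": "old", "running": True},
}
pool = {"VirtCnt2", "VirtCnt3"}

serving = upgrade_with_rollback(
    pool, containers, "VirtCnt2", "new",
    health_check=lambda c: c["version"] == "new",
)
print(serving, sorted(pool))
```

VirtCnt3 keeps taking calls during the whole procedure, which is the
point of doing the upgrade on a clone.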

The server on its own can't handle the load?
* Buy a new machine
* Set up the hardware node
* Live-migrate VirtCnt3 (no downtime)

Now the first server, with VirtCnt1 and VirtCnt2, also has too much
load?
* Buy a new machine
* Set up the hardware node
* Live-migrate VirtCnt2 (no downtime)
-> Now you have a 3-server solution (1 media proxy, 2 load-balanced /
failover PBXes) grown out of the first box you bought, without
headaches, because the system was built for it from the beginning!

The database drains too many resources?
* Buy a new machine
* Set up the hardware node
* Live-migrate the database container VirtCnt4 (no downtime)

You want to upgrade the hardware/kernel in hardware node 1?
* Live-migrate VirtCnt2 to a hot-standby machine, or to the other PBX
machine, upgrade the hardware, then live-migrate the containers back
(no downtime)
* OR just break the load balancing, wait until all current calls are
torn down correctly, upgrade the machine, re-enable the load balancer
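
The second variant (break the load balancing and wait for the calls to
end) can be sketched like this in Python. The drain function and its
parameters are hypothetical; how the number of active calls is really
obtained from FreeSWITCH is left open here, so it is passed in as a
function.

```python
def drain(node, pool, active_calls, wait, max_polls=100):
    """Stop sending new calls to `node`, then wait for its calls to end."""
    pool.discard(node)       # break the load balancing for this node
    for _ in range(max_polls):
        if active_calls(node) == 0:
            return True      # all calls torn down: safe to upgrade
        wait()               # in real use, e.g. time.sleep between polls
    return False             # calls are still up: do not touch the node


# Simulation only: pretend that one call hangs up per poll.
remaining = {"VirtCnt2": 3}


def active_calls(node):
    return remaining[node]


def wait():
    remaining["VirtCnt2"] -= 1


pool = {"VirtCnt2", "VirtCnt3"}
print(drain("VirtCnt2", pool, active_calls, wait), sorted(pool))
```

The poll limit is there so that a single very long call cannot block
the maintenance window forever without anybody noticing.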

You want an exact copy of the first server for hardware HA?
* Buy a new machine
* Set up the hardware node
* Buy a hardware PRI switchover box
* Clone VirtCnt1 - VirtCnt4 to the new machine
* Make a basic failover configuration


-> the sky's the limit, as the saying goes ...


So, I can do all the OpenVZ stuff and the integration with database /
memcached / heartbeat / whatever is needed here. Is anyone willing to
work with me on this on the FreeSWITCH side, or at least to provide me
with the necessary information about what is needed / how to talk to
FreeSWITCH / which states to get from it?

I know this seems very ambitious, but if this could be made into a
package that is relatively easy to set up, with good documentation, it
would be a boost for FreeSWITCH, I am sure, because after all this is
what everyone has grown accustomed to from the good old phone
companies and the good old PBXs: carrier-grade uptimes ...

Thanks to everyone who has read this far,
all the best,

Ray



-- 
Raimund Sacherer
-
RunSolutions
     Open Source It Consulting
-

Parc Bit - Centro Empresarial Son Espanyol
Edificio Estel - Local 3D
07121 -  Palma de Mallorca
Baleares

On Aug 29, 2009, at 3:17 PM, Raimund Sacherer wrote:

> Oh yeah, that would be so helpful for my situation, as my client
> now *demands* a solution where he can press a big red button and
> everything fails over to another box. He is totally scared because
> of the lockups in Asterisk, which under specific situations
> involving AMI, automated call setup, and Murphy led to a lockup of
> the entire machine; no console was working anymore, only a cold
> reset could fix it.
>
> So, IF there is the possibility of live takeover / failover etc., I
> would love to hear how it has been done.
>
> I am very experienced with OpenVZ and for about two years now have
> used only OpenVZ virtualization servers for everything, because of
> live migration etc. But as I am new in this company, we could not
> adopt this until now.
>
> So please, Ken, if you can, describe what needs to be done to get a
> failover / takeover working (an outline would be enough)
>
> Thanks in advance
>
> On Aug 29, 2009, at 11:58 AM, Steve Kurzeja wrote:
>
>> On Sat, Aug 29, 2009 at 2:34 PM, Diego Viola  
>> <diego.viola at gmail.com> wrote:
>> Yes, FreeSWITCH is a system that you can trust 100%. I have  
>> switched my Asterisk servers to FreeSWITCH and have peace now.
>>
>> If I were you I would get rid of Asterisk and use FreeSWITCH, FS  
>> will handle all what you want very well.
>>
>> And I agree with David, fail-over is kinda irrelevant since the FS  
>> doesn't crash like Asterisk does.
>>
>>
>>
>> You still have hardware failures and fail-over is also useful for  
>> hit-less maintenance on boxes.
>>
>> I'd be interested to know how Brian West was approaching his live  
>> migration work.
>>
>> Steve
>> _______________________________________________
>> FreeSWITCH-users mailing list
>> FreeSWITCH-users at lists.freeswitch.org
>> http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
>> UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
>> http://www.freeswitch.org
>
