ServiceProviderFederation/CLARIN IdP/AnyCast – CLARIN Trac

You asked me for some information on our use of anycast for load balancing and automatic failover. I gave a presentation on that many years ago, and things still work the same way:

http://www.switch.ch/aai/support/presentations/infoday-2005/AAI-ID05-50-SWITCHwayf.pdf

We are using an approach that is very similar to the one described here:

https://www.usenix.org/legacy/events/lisa10/tech/full_papers/Weiden.pdf

Basically, we have two instances, one in Zurich and one in Lausanne. Each has its own unique IP, but they also share a common anycast IP, configured on a local loopback interface:

lo:0      Link encap:Local Loopback
          inet addr:130.59.31.249  Mask:255.255.255.255
          UP LOOPBACK RUNNING  MTU:16436  Metric:1

The interface lo:0 is bound to the anycast IP (130.59.31.249).

The routing daemon Quagga is installed on both instances. Quagga announces the route to lo:0 to its neighbour routers. If lo:0 is disabled (ifdown lo:0), Quagga stops announcing the route for that instance, which means that all requests go to the other instance. This is useful for maintenance.

Normally, both instances are active and receive requests (one a bit more than the other). If an instance suffers a power loss or network disruption, its routing daemon can no longer announce the route to the anycast IP. Within a few seconds (I think 3s or 4s), the route to that instance is deleted from the routers in our network thanks to the OSPF dead timer. This is part of the automatic failover.
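To make the dead-timer behaviour concrete, here is a rough model of how a neighbouring router decides that an instance is gone. It is only a sketch: the function name and the 4-second default are assumptions for illustration (the actual interval in our setup is the "3s or 4s" mentioned above), and a real router additionally has to tear down the adjacency and recompute its routes.

```python
def neighbour_is_dead(last_hello, now, dead_interval=4.0):
    """Return True once no OSPF hello has been seen for dead_interval seconds.

    last_hello and now are timestamps in seconds. The 4.0 default is an
    assumed value for illustration, not a figure taken from the real routers.
    """
    return now - last_hello >= dead_interval


# The last hello arrived at t=10s, then the instance lost power.
print(neighbour_is_dead(last_hello=10.0, now=12.0))  # still inside the interval
print(neighbour_is_dead(last_hello=10.0, now=14.0))  # interval expired
```

Once the neighbour is considered dead, the routers drop the route via that instance and the anycast traffic follows the remaining announcement.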

As the presentation explains, we also have some monitoring in place to ensure the host is available. A local monitoring script will also withdraw the route to the local instance if the web server is not running properly or its response does not contain the specific content we expect. In such a case, the script executes "ifdown lo:0".
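A check of this kind could look roughly like the sketch below. This is not our actual script: the URL, the expected text and all function names are placeholders chosen for illustration. Only the "ifdown lo:0" step is taken from the description above.

```python
import http.client
import subprocess
import urllib.request

# Hypothetical values: the real URL and expected text are not given here.
CHECK_URL = "http://127.0.0.1/"
EXPECTED_TEXT = "OK"


def response_is_healthy(status, body, expected_text=EXPECTED_TEXT):
    """True if the status is 200 and the body contains the expected text."""
    return status == 200 and expected_text in body


def fetch(url, timeout=5.0):
    """Return (status, body) for url, or (None, "") if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        # Covers connection errors, timeouts, HTTP error statuses, bad URLs.
        return None, ""


def check_once(fetcher=fetch, run=subprocess.run):
    """Check the local web server once; take lo:0 down if it is unhealthy."""
    status, body = fetcher(CHECK_URL)
    if response_is_healthy(status, body):
        return True
    # With lo:0 down, the routing daemon stops announcing the anycast route.
    run(["ifdown", "lo:0"], check=False)
    return False


if __name__ == "__main__":
    check_once()
```

Such a script would typically be run periodically (for example from cron) with enough privileges to bring the interface down. The fetcher and run parameters exist only so the logic can be tried out without touching the network or the interface.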

Last modified on 03/17/14 10:54:30