Changes between Initial Version and Version 1 of ServiceProviderFederation/CLARIN IdP/AnyCast


Timestamp:
03/17/14 10:53:54
Author:
Dieter Van Uytvanck
Comment:

--

  • ServiceProviderFederation/CLARIN IdP/AnyCast

    v1 v1  
     1You asked me for some information on our use of [http://en.wikipedia.org/wiki/Anycast AnyCast] for
     2load-balancing/automatic failover. I gave a presentation on that many
     3years ago and things still work the same way:
     4
     5http://www.switch.ch/aai/support/presentations/infoday-2005/AAI-ID05-50-SWITCHwayf.pdf
     6
     7We are using an approach that is very similar to the one described here:
     8
     9https://www.usenix.org/legacy/events/lisa10/tech/full_papers/Weiden.pdf
     10
     11Basically, we have two instances, one in Zurich, one in Lausanne.
     12Both of them have a unique IP, but they also share a common anycast
     13IP via a local loopback interface:
     14{{{
     15lo:0      Link encap:Local Loopback
     16          inet addr:130.59.31.249  Mask:255.255.255.255
     17          UP LOOPBACK RUNNING  MTU:16436  Metric:1
     18}}}
     19
     20The interface lo:0 is bound to the anycast IP (130.59.31.249).
     21
     22The routing daemon Quagga is installed on both instances. Quagga
     23announces the route to lo:0 to its neighbour routers. If lo:0 is disabled
     24(ifdown lo:0), Quagga stops announcing the route of that instance, which
     25means that all requests go to the other instance. This is useful for maintenance.
     26
     27Normally, both instances are active and receive requests (one a bit more
     28than the other). If an instance has a power loss or network disruption,
     29the routing daemon can no longer announce the route to the anycast IP.
     30Within a few seconds (I think 3s or 4s) the route to that instance is
     31deleted from the routers in our network thanks to the OSPF dead timer.
     32This then is part of the automatic failover.
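
The dead-timer behaviour described above can be modelled in a few lines. This is only a simplified sketch, not how Quagga or a router implements OSPF: the `Neighbour` class, the `reachable_instances` helper and the 4-second default are assumptions made for illustration (the text above only says "3s or 4s").

```python
from dataclasses import dataclass


@dataclass
class Neighbour:
    """Hypothetical view a router keeps of one anycast instance."""
    last_hello: float           # time (seconds) the last OSPF hello was seen
    dead_interval: float = 4.0  # assumed value; the text says "3s or 4s"

    def is_alive(self, now: float) -> bool:
        # The neighbour is declared down once no hello has arrived
        # for a full dead interval.
        return now - self.last_hello < self.dead_interval


def reachable_instances(neighbours: dict, now: float) -> list:
    """Return the names of instances whose route is still installed."""
    return sorted(name for name, n in neighbours.items() if n.is_alive(now))
```

With one instance that last sent a hello at t=10 and another at t=7, both are still reachable at t=10.5, but at t=12 only the first one remains and all anycast traffic goes there.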
     33
     34As the presentation explains, we also have some monitoring in place to
     35ensure the host is available. A local monitoring script also
     36withdraws the route to the local instance if the web server is not
     37running properly or its response does not contain the expected content.
     38In such a case, the script executes "ifdown lo:0".
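
A local check of this kind could look roughly like the sketch below. It is not the actual SWITCH monitoring script: the URL, the expected marker string and all function names are assumptions chosen for illustration. Only the interface name lo:0 and the "ifdown lo:0" command come from the description above.

```python
import http.client
import subprocess
import urllib.request

CHECK_URL = "http://127.0.0.1/"   # assumed: local web server to probe
EXPECTED_MARKER = "OK"            # assumed: string a healthy page contains
ANYCAST_IFACE = "lo:0"            # interface bound to the anycast IP


def is_healthy(status, body, marker=EXPECTED_MARKER):
    """A response counts as healthy if it is a 200 containing the marker."""
    return status == 200 and marker in body


def probe(url=CHECK_URL, timeout=5.0):
    """Fetch the page; return (status, body) or (None, "") on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # Covers connection errors, timeouts and HTTP error statuses.
        return None, ""


def check_once(probe_fn=probe, run=subprocess.run):
    """Probe once; take the anycast interface down if the check fails."""
    status, body = probe_fn()
    if is_healthy(status, body):
        return True
    # Disabling lo:0 makes the routing daemon stop announcing the route.
    run(["ifdown", ANYCAST_IFACE], check=False)
    return False
```

Calling `check_once()` periodically, for example from cron, would give the behaviour described above. Passing `probe_fn` and `run` as parameters keeps the decision logic testable without touching the network or the interface.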