Server + services monitoring
- nagios: http://www.nagios.org/
- icinga (nagios fork, same format for plugins etc): https://www.icinga.org/
AAI monitoring
- AAI eye: http://www.csc.fi/english/institutions/haka/instructions/services-tech/aaieye
- RAPTOR: http://iam.cf.ac.uk/trac/RAPTOR
CLARIN-D monitoring requirements
The monitoring requirements of CLARIN-D are pretty modest. A simple icinga or nagios installation would be sufficient to fulfill all of them:
- regular checks if hosts are up and reachable (ping)
- regular checks if certain network services are working (e.g. http)
- Each center can provide nagios plugins to assess the state of that center's services.
- Each center can register one or more contact persons. They will get an email with a warning if a host or service is not working correctly.
- The server running the monitoring software is also monitored itself.
- Users (not only the center administrators) should be able to see the status of each center and its service(s) on a website.
- For access to the web interface of icinga/nagios, authentication and authorization via Shibboleth would be nice.
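A center-provided plugin of the kind mentioned above could look roughly like the following sketch. It relies only on the general Nagios plugin conventions (exit codes 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN, one status line on stdout, optional performance data after `|`). The script name, the thresholds and the function names are assumptions made for illustration, not something CLARIN-D has agreed on.

```python
#!/usr/bin/env python3
"""Illustrative Nagios/Icinga-style HTTP check (hypothetical check_center_http.py)."""
import sys
import time
import urllib.error
import urllib.request

# Exit codes used by the Nagios plugin conventions.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(http_status, elapsed, warn_seconds=2.0, crit_seconds=5.0):
    """Map an HTTP status code and a response time to a plugin state.

    The thresholds are assumed defaults; each center would pick its own.
    """
    if http_status is None or http_status >= 400:
        return CRITICAL
    if elapsed >= crit_seconds:
        return CRITICAL
    if elapsed >= warn_seconds:
        return WARNING
    return OK


def format_output(state, http_status, elapsed):
    """Build the status line; the part after '|' is performance data."""
    return "HTTP {} - status {} in {:.3f}s | time={:.3f}s".format(
        STATE_NAMES[state], http_status, elapsed, elapsed)


def check_http(url, timeout=10.0):
    """Fetch url once and return (state, status_line)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            http_status = response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        http_status = err.code
    except (urllib.error.URLError, OSError, ValueError):
        # No usable answer: unreachable host, timeout or malformed URL.
        http_status = None
    elapsed = time.monotonic() - start
    state = classify(http_status, elapsed)
    return state, format_output(state, http_status, elapsed)


def main(argv):
    """Print the status line and return the exit code for the monitoring server."""
    if len(argv) != 2:
        print("HTTP UNKNOWN - usage: check_center_http.py URL")
        return UNKNOWN
    state, line = check_http(argv[1])
    print(line)
    return state

# When installed as a plugin, the script would end with: sys.exit(main(sys.argv))
```

Keeping the threshold logic in `classify` separate from the network call means a center can test its own limits without contacting the service.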
Service Types / Tests | ping | http | disk space | load | free mem | shib func | query func | performance |
---|---|---|---|---|---|---|---|---|
AAI Service Providers (SP) | * | * | | | | | | |
AAI Identity Providers (IdP) | * | * | | | | | | |
AAI Where are you From (WAYF) | * | * | | | | | | |
REST-Webservices (WebLicht?) | * | | | | | | | |
Federated Content Search Endpoints (FCS) | * | * | * | | | | | |
Federated Content Search Aggregator | * | * | * | | | | | |
Repositories | * | * | * | | | | | |
OAI-PMH Gateway | * | * | | | | | | |
Handle Servers | * | * | * | | | | | |
local handles resolvable | * | * | | | | | | |
Center Registry | * | * | | | | | | |
WebLicht webserver | * | * | | | | | | |
VLO webserver | * | * | | | | | | |
other webservers | * | * | | | | | | |
Nagios servers crosscheck | * | * | | | | | | |
Workspaces server | * | * | * | | | | | |
Visualisation for CLARIN-D: a map of Germany at http://de.clarin.eu/status with traffic-light alarm indication (?), maybe like this (?)
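One way the traffic-light indication could be derived is to reduce each center's service states to a single colour, taking the worst state as decisive. The sketch below assumes the usual Nagios state numbers; the colour rules (including showing UNKNOWN as yellow and "no data" as grey) and the center names in the usage example are illustrative assumptions only.

```python
# Nagios/Icinga service states.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def traffic_light(states):
    """Reduce the service states of one center to a single colour."""
    if not states:
        return "grey"      # no check results available for this center
    if CRITICAL in states:
        return "red"       # at least one service is down
    if WARNING in states or UNKNOWN in states:
        return "yellow"    # degraded, or a check could not be evaluated
    return "green"         # every service reported OK


def center_colours(status_by_center):
    """Map each center name to the colour to draw on the status map."""
    return {center: traffic_light(states)
            for center, states in status_by_center.items()}
```

For example, with hypothetical centers, `center_colours({"center-a": [OK, OK], "center-b": [OK, CRITICAL]})` marks `center-a` green and `center-b` red.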
Open questions: Do we need graphs to see how long or how often services have been unavailable in the past? Perhaps via a link to nagios?
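If such history were wanted, both figures ("how often" and "how long") could be computed from a log of state changes. The following sketch assumes a hypothetical list of `(timestamp, is_up)` events sorted by time; it does not read any actual nagios or icinga log format.

```python
from datetime import datetime, timedelta


def outage_summary(events, end):
    """Return (number_of_outages, total_downtime) for one service.

    events: list of (timestamp, is_up) tuples sorted by timestamp.
    end:    end of the reporting window; an outage still open at
            that point is counted up to this time.
    """
    outages = 0
    downtime = timedelta(0)
    down_since = None
    for timestamp, is_up in events:
        if not is_up and down_since is None:
            # Service went down: a new outage starts here.
            down_since = timestamp
            outages += 1
        elif is_up and down_since is not None:
            # Service recovered: close the current outage.
            downtime += timestamp - down_since
            down_since = None
    if down_since is not None:
        downtime += end - down_since
    return outages, downtime
```

A graph of past availability would then only need these two numbers per service and reporting period.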