Version 63 (modified by kzimmer, 11 years ago)

server + services monitoring

AAI monitoring

MPI services monitoring

CLARIN-D monitoring requirements

The monitoring requirements of CLARIN-D are fairly modest. A simple Icinga or Nagios installation would be sufficient to fulfill all of them:

  • Regular checks whether hosts are up and reachable (ping).
  • Regular checks whether certain network services are working (e.g. http).
  • Each center can provide Nagios plugins to assess the state of that center's services.
  • Each center can register one or more contact persons. They will receive a warning by email if a host or service is not working correctly.
  • The server running the monitoring software is itself monitored as well.
  • Users (not only the center administrators) should be able to see the status of each center and its service(s) on a website.
  • Authentication and authorization via Shibboleth would be nice for access to the Icinga/Nagios web interface.
  • As a visualisation: a map of Germany at http://de.clarin.eu/status (currently here, for Joomla users) via the NagVis plugin
    • with traffic-light alarm indication (?), maybe like this (?)
    • with graphs showing how long or how often services have been unavailable in the past? Via a link to Nagios?
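
The http check and the per-center plugins mentioned in the list above could look roughly like the following. This is only a minimal sketch, not an official plugin: the function names, the script name check_http_sketch.py, and the warn/crit thresholds of 2 and 5 seconds are assumptions made for illustration. The exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the "status text | performance data" output line follow the usual Nagios plugin conventions.

```python
"""Sketch of a Nagios/Icinga-style HTTP check (hypothetical, for illustration)."""
import sys
import time
import urllib.error
import urllib.request

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(status, elapsed, warn=2.0, crit=5.0):
    """Map an HTTP status code and a response time in seconds to a plugin state.

    The warn/crit thresholds are illustrative assumptions.
    """
    if status is None:
        return CRITICAL, "no HTTP response"
    if status >= 400:
        return CRITICAL, f"HTTP {status}"
    if elapsed >= crit:
        return CRITICAL, f"HTTP {status} in {elapsed:.2f}s"
    if elapsed >= warn:
        return WARNING, f"HTTP {status} in {elapsed:.2f}s"
    return OK, f"HTTP {status} in {elapsed:.2f}s"


def probe(url, timeout=10.0):
    """Fetch the URL once and return (status code or None, elapsed seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        status = err.code
    except OSError:
        # Covers URLError, connection failures and timeouts.
        status = None
    return status, time.monotonic() - start


def main(argv):
    """Run the check for the URL in argv[1] and return the plugin exit code."""
    if len(argv) != 2:
        print("UNKNOWN: usage: check_http_sketch.py URL")
        return UNKNOWN
    status, elapsed = probe(argv[1])
    state, text = evaluate(status, elapsed)
    # Text before '|' is the status line, text after it is performance data.
    print(f"HTTP {LABELS[state]}: {text} | time={elapsed:.3f}s")
    return state

# To use this as a plugin, end the script with:
#     sys.exit(main(sys.argv))
```

Keeping evaluate() separate from probe() means the threshold logic can be tested without network access, and the measured time doubles as a simple "query duration time" value.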
Service types (rows) / Tests (columns): ping, http, disk space, load, free mem, users, functional check, query duration time
AAI Service Providers (SP) # # #(IDS probe?)
AAI Identity Providers (IdP) # # * * #(IDS probe?)
AAI Where are you From (WAYF) # # #(MPI discojuice probe?)
REST-Webservices (WebLicht) # #
Federated Content Search Endpoints (SRU/CQL) # # #(MPI probe?)
Federated Content Search Aggregator # # #
Repositories # # * #(test for a Fedora content model?)
OAI-PMH Gateway # #(MPI probe?)
Handle Servers # #(EUDAT/Jülich probe?) #
resolve a sample PID for each repository # #
Center Registry # #
WebLicht webserver # #
VLO webserver # #
TLA webserver # #
other webservers # #
Nagios servers (selfcheck) # # #(check_nagios plugin)
Nagios servers crosscheck (from other center) # #(check_nagios plugin)
Workspaces server (not yet) n.a. n.a. n.a.

# = mandatory; * = optional/useful
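
The functional check "resolve a sample PID for each repository" from the table above might be sketched as follows. Everything here is an assumption for illustration: the function names, the use of the public Handle proxy at https://hdl.handle.net with its /api/handles/ JSON interface, the reading of responseCode 1 as "resolved", and the choice to report WARNING when a handle exists but carries no URL value. Each center would supply its own sample PID.

```python
"""Sketch of a PID resolution check (hypothetical, for illustration)."""
import json
import urllib.error
import urllib.parse
import urllib.request

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_handle_reply(body):
    """Interpret the JSON reply of a Handle proxy for a single handle.

    Assumption: responseCode 1 means the handle resolved; any other
    code is treated as a failed resolution.
    """
    if body is None:
        return CRITICAL, "no reply from proxy"
    try:
        reply = json.loads(body)
    except (TypeError, ValueError):
        return UNKNOWN, "reply is not valid JSON"
    if not isinstance(reply, dict):
        return UNKNOWN, "unexpected reply structure"
    code = reply.get("responseCode")
    if code != 1:
        return CRITICAL, f"handle not resolved (responseCode={code})"
    # Look for the first URL-typed value attached to the handle.
    for value in reply.get("values") or []:
        if isinstance(value, dict) and value.get("type") == "URL":
            data = value.get("data")
            if isinstance(data, dict) and data.get("value"):
                return OK, f"resolves to {data['value']}"
    return WARNING, "handle exists but has no URL value"


def fetch_handle_reply(handle, proxy="https://hdl.handle.net", timeout=10.0):
    """Return the raw reply body for a handle, or None if the proxy is unreachable."""
    url = f"{proxy}/api/handles/{urllib.parse.quote(handle)}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # An error status may still carry a JSON body worth evaluating.
        return err.read().decode("utf-8", errors="replace")
    except OSError:
        return None
```

A plugin would call evaluate_handle_reply(fetch_handle_reply(sample_pid)), print the returned message, and exit with the returned state.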