Changes between Version 11 and Version 12 of SystemAdministration/Monitoring/Icinga
- Timestamp: 05/28/21 13:17:58 (3 years ago)
SystemAdministration/Monitoring/Icinga
--- Version 11
+++ Version 12
@@ -12,23 +12,17 @@
 2. Statistics/impressions of overall availability/quality-of-service can be gathered.
 
-Our [https://www.icinga.org Icinga] monitoring system does this by periodically launching '''checks''' against all of our endpoints and hosts. Icinga's configuration is managed through a Git repo [https://github.com/clarin-eric/monitoring on GitHub]. Checks are parametrized '''probes'''. Probes are a small set of executable commands, i.e. some basic builtin command (e.g., one that checks whether an HTTP endpoint is up) or some monitoring plugin that can be executed as a command-line tool. Currently, almost all probes we use are handled by the [https://curl.haxx.se curl] utility, controlled via [https://github.com/clarin-eric/monitoring/blob/master/probes/probe_curl.sh a central script].
+Our [https://www.icinga.org Icinga] monitoring system does this by periodically launching '''checks''' against all of our endpoints and hosts. Icinga's configuration is managed through a Git repo [https://github.com/clarin-eric/monitoring on GitHub]. Checks are parametrized '''probes'''. Probes are a small set of executable commands, i.e. some basic builtin command (e.g., one that checks whether an HTTP endpoint is up) or some monitoring plugin that can be executed as a command-line tool.
 
 ## Roles ##
 
-The system is primarily administered by the ASV in the person of [[mailto:"Thomas Hynek" <hynek@informatik.uni-leipzig.de>]]. CLARIN sysops maintain an advisory, secondary role on the technical level. The ASV system administrator(s) and the CLARIN sysops (Sander, Dieter, Willem) have full administrative access over SSH to the monitoring host for emergency maintenance. This includes being able to schedule downtime in Icinga.
-
-All requests and issues relating to monitoring should be processed through the Trac component Monitoring (see ‘Tickets’ below).
+The system is primarily administered by the ASV. CLARIN sysops maintain an advisory, secondary role on the technical level. The ASV system administrator(s) and the CLARIN sysops (Sander, Dieter, Willem) have full administrative access over SSH to the monitoring host for emergency maintenance. This includes being able to schedule downtime in Icinga.
 
 ## Links ##
 
-For internal discussion, Benedikt (FZJ), Dirk (ASV), Dieter (CLARIN), Thomas Eckart (ASV), Thomas Hynek (ASV) and Sander (MPI-PL) are currently reachable via monitoring@clarin.eu.
+For internal discussion, Dieter (CLARIN), Thomas Eckart (ASV), Nathanael Philipp (ASV) and Sander (MPI-PL) are reachable via monitoring@clarin.eu.
 
-To view the current monitoring state, browse the [https://fsd-cloud22.fz-juelich.de/icinga/ Icinga frontend].
+The public link to the monitoring is [https://monitoring.clarin.eu/]. If you have justified reasons for accessing the monitoring, your CLARIN IdP account can be added to the configuration.
 
-The official, public link to it is [https://monitoring.clarin.eu/].
-
-[wiki:SystemAdministration/Hosts/fsd-cloud22.zam.kfa-juelich.de About the monitoring host fsd-cloud22].
-
-To see how changes to the Git repo have been propagated to the monitoring host, see [https://fsd-cloud22.fz-juelich.de:7011/logs/ Icinga configuration GitHub-host sync logs].
+[wiki:SystemAdministration/Hosts/fsd-cloud22.zam.kfa-juelich.de Some more details].
 
 # Activities #
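
To illustrate the distinction the page text draws between a probe (an executable command) and a check (a probe with concrete parameters), here is a minimal Python sketch of an HTTP probe. It is not taken from the CLARIN monitoring repository. The names `http_probe`, `fetch_status` and `main` are hypothetical, and treating an unexpected status as WARNING is an arbitrary choice made for this example. The exit codes follow the common monitoring-plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import urllib.error
import urllib.request

# Exit codes in the common monitoring-plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def fetch_status(url, timeout):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status (4xx/5xx).
        return error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def http_probe(url, expected_status=200, timeout=10.0, fetch=fetch_status):
    """Hypothetical probe: returns (exit_code, one-line status message).

    A check is this probe called with concrete parameters,
    for example one particular endpoint URL.
    """
    status = fetch(url, timeout)
    if status is None:
        return CRITICAL, f"CRITICAL - {url} unreachable"
    if status != expected_status:
        return WARNING, f"WARNING - {url} returned {status}, expected {expected_status}"
    return OK, f"OK - {url} returned {status}"


def main(argv):
    """Command-line entry point: print the status line, return the exit code."""
    if len(argv) != 2:
        print("UNKNOWN - usage: http_probe.py URL")
        return UNKNOWN
    code, message = http_probe(argv[1])
    print(message)
    return code
```

To use the sketch as a command-line tool, one would end the file with `sys.exit(main(sys.argv))` after importing `sys`. The `fetch` parameter exists only so the probe logic can be exercised without network access.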