Changes between Version 11 and Version 12 of SystemAdministration/Monitoring/Icinga


Timestamp:
05/28/21 13:17:58 (3 years ago)
Author:
teckart@informatik.uni-leipzig.de
Comment:

Update

  • SystemAdministration/Monitoring/Icinga

    v11 v12  
    12  12  2. Statistics/impressions of overall availability/quality-of-service can be gathered.
    13  13
    14      Our [https://www.icinga.org Icinga] monitoring system does this by periodically launching '''checks''' against all of our endpoints and hosts. Icinga's configuration is managed through a Git repo [https://github.com/clarin-eric/monitoring on GitHub]. Checks are parametrized '''probes'''. Probes are a small set of executable commands, i.e. some basic builtin command (e.g., one that checks whether an HTTP endpoint is up) or some monitoring plugin that can be executed as a command line tool. Currently, almost all probes we use are handled by the [https://curl.haxx.se curl] utility, controlled via [https://github.com/clarin-eric/monitoring/blob/master/probes/probe_curl.sh a central script].
        14  Our [https://www.icinga.org Icinga] monitoring system does this by periodically launching '''checks''' against all of our endpoints and hosts. Icinga's configuration is managed through a Git repo [https://github.com/clarin-eric/monitoring on GitHub]. Checks are parametrized '''probes'''. Probes are a small set of executable commands, i.e. some basic builtin command (e.g., one that checks whether an HTTP endpoint is up) or some monitoring plugin that can be executed as a command line tool.
    15  15
    16  16  ## Roles ##
    17  17
    18      The system is primarily administered by the ASV in the person of [[mailto:"Thomas Hynek" <hynek@informatik.uni-leipzig.de>]]. CLARIN sysops maintain an advisory, secondary role on the technical level. The ASV system administrator(s) and the CLARIN sysops (Sander, Dieter, Willem) have full administrative access over SSH to the monitoring host for emergency maintenance. This includes being able to schedule downtime in Icinga.
    19
    20      All requests and issues relating to monitoring should be processed through the Trac component Monitoring (see ‘Tickets’ below).
        18  The system is primarily administered by the ASV. CLARIN sysops maintain an advisory, secondary role on the technical level. The ASV system administrator(s) and the CLARIN sysops (Sander, Dieter, Willem) have full administrative access over SSH to the monitoring host for emergency maintenance. This includes being able to schedule downtime in Icinga.
    21  19
    22  20  ## Links ##
    23  21
    24      For internal discussion, Benedikt (FZJ), Dirk (ASV), Dieter (CLARIN), Thomas Eckart (ASV), Thomas Hynek (ASV) and Sander (MPI-PL) are currently reachable via monitoring@clarin.eu.
        22  For internal discussion, Dieter (CLARIN), Thomas Eckart (ASV), Nathanael Philipp (ASV) and Sander (MPI-PL) are reachable via monitoring@clarin.eu.
    25  23
    26      To view the current monitoring state, browse the [https://fsd-cloud22.fz-juelich.de/icinga/ Icinga frontend].
        24  The public link to the monitoring system is [https://monitoring.clarin.eu/]. If you have a justified reason for accessing the monitoring system, your CLARIN IdP account can be added to the configuration.
    27  25
    28      The official, public link to it is [https://monitoring.clarin.eu/].
    29
    30      [wiki:SystemAdministration/Hosts/fsd-cloud22.zam.kfa-juelich.de About the monitoring host fsd-cloud22].
    31
    32      To see how changes to the Git repo have been propagated to the monitoring host, see [https://fsd-cloud22.fz-juelich.de:7011/logs/ Icinga configuration GitHub-host sync logs].
        26  [wiki:SystemAdministration/Hosts/fsd-cloud22.zam.kfa-juelich.de Some more details].
    33  27
    34  28  # Activities #