action #75445

closed

unknown dashboards for "linux-fwcx" and "localhost" reappearing on monitor.qa

Added by okurz over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
Start date: 2020-10-28
Due date:
% Done: 0%
Estimated time:

Description

Observation

https://stats.openqa-monitor.qa.suse.de/alerting/list?state=not_ok
shows many paused alerts for "linux-fwcx" and "localhost", e.g.

linux-fwcx: Memory usage alert (UNKNOWN for 4 days)
linux-fwcx: Minion Jobs alert (UNKNOWN for 4 days)
linux-fwcx: NTP offset alert (UNKNOWN for 4 days)
linux-fwcx: OpenQA Ping time alert (UNKNOWN for 4 days)
linux-fwcx: partitions usage (%) alert (UNKNOWN for 4 days)
localhost: Disk I/O time alert (UNKNOWN for 5 days)
localhost: Memory usage alert (UNKNOWN for 5 days)
localhost: Minion Jobs alert (UNKNOWN for 5 days)
localhost: NTP offset alert (UNKNOWN for 5 days)
localhost: OpenQA Ping time alert (UNKNOWN for 5 days)
localhost: partitions usage (%) alert (UNKNOWN for 5 days)

I already tried to delete the corresponding dashboards manually, but they reappear. What I did on monitor.qa:

sudo su
cd /var/lib/grafana/dashboards
rm worker-linux-fwcx.json worker-localhost.json
systemctl restart grafana-server

Acceptance criteria

Suggestions

  • Find out who created these dashboards and which machines they belong to. Maybe they come from experiments on "staging" or on the staging worker machines?
  • Prevent the same monitoring instance from being reconfigured from elsewhere
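
To help with the first suggestion, here is a minimal sketch for spotting dashboards that belong to unexpected hosts. It assumes the dashboard files follow the worker-<hostname>.json naming seen in the rm command above; the function name unknown_dashboards and the known-hosts file (one expected hostname per line) are hypothetical and not part of the existing setup.

```shell
#!/bin/sh
# Hypothetical helper: print the hostname of every worker dashboard
# whose host is not listed in a file of expected hostnames.
# Usage: unknown_dashboards DASHBOARD_DIR KNOWN_HOSTS_FILE
unknown_dashboards() {
    dir=$1
    known=$2
    for f in "$dir"/worker-*.json; do
        # Skip the literal pattern if the glob matched nothing
        [ -e "$f" ] || continue
        # worker-linux-fwcx.json -> linux-fwcx
        host=$(basename "$f" .json)
        host=${host#worker-}
        # -x: match the whole line, -F: fixed string, -q: no output
        grep -qxF "$host" "$known" || printf '%s\n' "$host"
    done
}
```

Running something like `unknown_dashboards /var/lib/grafana/dashboards known-hosts.txt` after each cleanup would show whether the entries for "linux-fwcx" and "localhost" have come back. It does not reveal what recreates them.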

Related issues: 2 (0 open, 2 closed)

Related to openQA Infrastructure - action #76783: research how hostnames with systemd work and make them static for all OSD related machines (Resolved, okurz, 2020-10-29)

Copied to openQA Infrastructure - action #76786: Configure static hostnames with salt for all salt nodes (Resolved, okurz)
