action #176265
o3: Cron <munin@ariel> test -x /usr/bin/munin-cron && /usr/bin/munin-cron size:S
Description
Observation
From o3 cron emails
[FATAL] There is nothing to do here, since there are no nodes with any plugins. Please refer to http://munin-monitoring.org/wiki/FAQ_no_graphs at /usr/lib/munin/munin-html line 38.
Acceptance criteria
- AC1: o3 is consistently monitored without fatal error messages about "nothing to do"
Suggestions
- Call find /etc/munin on o3 to find out what is currently there for munin, e.g. /etc/munin/plugins/openqa_minion_jobs_hook_rc_failed
- Read https://web.archive.org/web/20200711220239/http://munin-monitoring.org/wiki/FAQ_no_graphs and the comments in this ticket about what was already tried
- Considering the error refers to a non-existing URL - maybe update Munin?
- Understand the current configuration of munin on o3 which shouldn't be too complicated
- Also check https://github.com/munin-monitoring/munin/issues/1497
- Verify that the correct hostname is used e.g. openqa.opensuse.org
- Do we need/actively use Munin?
- We also have Zabbix running here
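For the hostname check suggested above, a minimal sketch that lists the host sections defined in the munin master configuration, so they can be compared with the name the node reports. The function name list_master_hosts is hypothetical, and the config paths in the usage comment are assumptions about the layout on o3, not something confirmed in this ticket.

```shell
#!/bin/sh
# Sketch: list the [host] section names found in munin master config text.
# Reads config text on stdin, so it can be tried without touching the system.
list_master_hosts() {
    # Print the text between the brackets of lines like "[openqa.opensuse.org]".
    # Commented-out sections (lines starting with "#") are not matched.
    sed -n 's/^[[:space:]]*\[\([^]]*\)\][[:space:]]*$/\1/p'
}

# Example call (paths are assumptions, adjust to what find /etc/munin shows):
#   cat /etc/munin/munin.conf /etc/munin/munin-conf.d/* 2>/dev/null | list_master_hosts
```

If the list is empty, or the section name differs from the host name the node announces, that would be a candidate explanation for the "no nodes with any plugins" message, to be confirmed against the FAQ.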
Updated by tinita 4 months ago
- Status changed from In Progress to Feedback
I couldn't find many hints, except https://debianforum.de/forum/viewtopic.php?t=131809 where the OP says that setting host_name in munin-node.conf helped.
I did that now and set host_name openqa.opensuse.org in /etc/munin/munin-node.conf.
Updated by tinita 4 months ago · Edited
- Status changed from Resolved to Workable
- Assignee deleted (tinita)
Happened again today (Date: Thu, 6 Feb 2025 09:55:22 +0000 (UTC)). No idea why.
I think we can just ignore it, but it would be good to have an automatic way to ignore this and not receive an email.
See also https://github.com/munin-monitoring/munin/issues/1497 and https://web.archive.org/web/20200711220239/http://munin-monitoring.org/wiki/FAQ_no_graphs
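One possible way to ignore this automatically is to filter the known message out of the cron job's output, so cron only sends mail when something else is printed. This is a sketch only: the function name filter_known_noise and the idea of wrapping munin-cron in a small script are assumptions, and it hides the symptom rather than fixing the cause.

```shell
#!/bin/sh
# Sketch: drop the known "nothing to do" line from munin-cron output.
# The message text is copied from the cron email quoted in this ticket.
KNOWN_NOISE='There is nothing to do here, since there are no nodes with any plugins'

filter_known_noise() {
    # -v inverts the match, -F treats the pattern as a fixed string.
    # grep exits non-zero when every line was filtered out, so "|| true"
    # keeps the pipeline from being reported as a failure in that case.
    grep -vF -- "$KNOWN_NOISE" || true
}

# Hypothetical use inside a wrapper script called from the crontab:
#   /usr/bin/munin-cron 2>&1 | filter_known_noise
```

Any other output still passes through unchanged, so unrelated munin errors would continue to produce an email.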
Updated by tinita 4 months ago
Hmm. I tried something else now.
Maybe I hadn't restarted the service when I did the config change last time. So I set host_name openqa.opensuse.org again and then restarted munin-node.service.
Then:
telnet localhost 4949
Trying ::1...
Connected to localhost.
Escape character is '^]'.
# munin node at openqa.opensuse.org
quit
Connection closed by foreign host.
If I understand the FAQ correctly, this should now be enough to let munin believe the hostname is openqa.opensuse.org, which should be the displayed title in the graphs.
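The manual telnet check above could be scripted roughly as follows, which may help when re-verifying after the next occurrence. The function name banner_host is hypothetical, and the usage comment assumes nc is installed on o3, which this ticket does not confirm.

```shell
#!/bin/sh
# Sketch: extract the host name from a munin-node banner line on stdin.
# The banner format is taken from the telnet session shown in this comment:
#   "# munin node at openqa.opensuse.org"
banner_host() {
    # Print only the part after "# munin node at ", first match only.
    sed -n 's/^# munin node at \(.*\)$/\1/p' | head -n 1
}

# Hypothetical use against the local node (nc availability is an assumption):
#   printf 'quit\n' | nc localhost 4949 | banner_host
```

The printed name can then be compared with the expected value openqa.opensuse.org.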
Updated by gpuliti about 2 months ago
- Status changed from Resolved to Workable
Happened again: Fri, 18 Apr 2025
Updated by livdywan about 1 month ago
- Status changed from Workable to New
This wasn't estimated, hence moving to New.
Updated by okurz about 1 month ago
- Tags changed from reactive work, o3, alert to reactive work, o3, alert, infra
- Description updated (diff)
Refined and agreed to move to "infra" as this is about low-level, OS-level monitoring
Updated by okurz about 1 month ago
- Subject changed from o3: Cron <munin@ariel> test -x /usr/bin/munin-cron && /usr/bin/munin-cron to o3: Cron <munin@ariel> test -x /usr/bin/munin-cron && /usr/bin/munin-cron size:S
- Description updated (diff)
- Status changed from New to Workable