action #128969

closed

[alert][grafana] Failed systemd services alert (except openqa.suse.de) Salt (Uk02cifVkz)

Added by tinita 12 months ago. Updated 12 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
Start date: 2023-05-08
Due date: 2023-05-24
% Done: 0%
Estimated time:
Tags:

Description

Observation

Multiple alert emails were received on 2023-05-09.

https://stats.openqa-monitor.qa.suse.de/alerting/grafana/Uk02cifVkz/view?orgId=1

Actions #1

Updated by okurz 12 months ago

  • Tags set to infra
Actions #2

Updated by osukup 12 months ago

looks like a broken package from security-sensor.repo ...

==> Suppress alert and wait for the newer working build?

Actions #3

Updated by livdywan 12 months ago

osukup wrote:

looks like a broken package from security-sensor.repo ...

==> Suppress alert and wait for the newer working build?

There's a Go stack trace. I'd guess we need to report this somewhere, or we may not get a fix. Not sure where, though.

Actions #4

Updated by livdywan 12 months ago

  • Status changed from New to In Progress
  • Assignee set to livdywan

Well, let's see if I can find out what to do here. Asking in Slack for now.

Btw, since I was asked, here's how I double-checked that these are all Leap 15.4 machines:

sudo salt -C '*' cmd.run 'grep PRETTY /etc/os-release'
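With many minions, the output of that check could be condensed into a count per distinct OS name. A minimal sketch, assuming Salt's default output layout (a minion ID line followed by a space-indented result line); `summarize_os` is a hypothetical helper name, not part of Salt:

```shell
# Hypothetical helper: reads Salt's default cmd.run output on stdin and
# prints "<count> <PRETTY_NAME line>" for each distinct OS name.
# Assumes result lines are indented with spaces, as in Salt's default output.
summarize_os() {
    awk '/PRETTY_NAME=/ { sub(/^ +/, ""); count[$0]++ }
         END { for (name in count) printf "%d %s\n", count[name], name }' | sort
}

# Intended usage (needs a Salt master, so it is left commented out here):
# sudo salt -C '*' cmd.run 'grep PRETTY /etc/os-release' | summarize_os
```

If every machine is on the same release, this prints a single line; any additional line points at a machine that differs.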

Actions #5

Updated by openqa_review 12 months ago

  • Due date set to 2023-05-24

Setting due date based on mean cycle time of SUSE QE Tools

Actions #6

Updated by livdywan 12 months ago

  • Status changed from In Progress to Feedback

cdywan wrote:

Well, let's see if I can find out what to do here. Asking in Slack for now.

Btw, since I was asked, here's how I double-checked that these are all Leap 15.4 machines:

sudo salt -C '*' cmd.run 'grep PRETTY /etc/os-release'

Apparently a fix for a regression in libbpfgo was applied, and after restarting the service it seems to run fine again:

sudo salt -C '*' cmd.run 'systemctl restart velociraptor-client'
storage.oqa.suse.de:
[...]
sudo salt -C '*' cmd.run 'systemctl is-active velociraptor-client'
storage.oqa.suse.de:
    active
[...]
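
Since the full output is long, the `is-active` result could be filtered down to the minions that are not reporting `active`. A rough sketch, assuming the default Salt output layout shown above (minion ID ending in a colon, then a space-indented result); `inactive_minions` is a hypothetical helper name:

```shell
# Hypothetical helper: reads Salt's default cmd.run output on stdin and prints
# the ID of every minion whose indented result line is anything but "active".
inactive_minions() {
    awk '
        # Unindented line ending in a colon: remember it as the current minion ID.
        /^[^ ].*:$/ { minion = substr($0, 1, length($0) - 1); next }
        # Indented line: first word is the reported state (or other text).
        /^ +[^ ]/   { if ($1 != "active") print minion }
    '
}

# Intended usage (needs a Salt master, so it is left commented out here):
# sudo salt -C '*' cmd.run 'systemctl is-active velociraptor-client' | inactive_minions
```

An empty result would mean every responding minion reported `active`. Any other indented text, such as an error message, also causes the minion to be listed.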
Actions #7

Updated by okurz 12 months ago

I followed the discussion. That's the benefit of having the main developer in the house :D

I suggest you crosscheck all velociraptor-client statuses on all OSD salt maintained machines, should be an easy salt exercise for you ;)

Actions #8

Updated by livdywan 12 months ago

okurz wrote:

I suggest you crosscheck all velociraptor-client statuses on all OSD salt maintained machines, should be an easy salt exercise for you ;)

I did. See above. Or do you mean something not covered by cmd.run?

Actions #10

Updated by okurz 12 months ago

  • Status changed from Feedback to Resolved