action #116752

closed

[alert] powerqaworker-qam-1: host up alert

Added by okurz about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
Start date: 2022-09-19
Due date:
% Done: 0%
Estimated time:
Tags:


Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure - action #116722: openqa.suse.de is not reachable 2022-09-18, no ping response, postgreSQL OOM and kernel panics size:M (Resolved, mkittler, 2022-09-18)
Copied from openQA Infrastructure - action #116746: [alert] openqaworker9: host up alert (Resolved, nicksinger, 2022-09-19)

Actions #1

Updated by okurz about 2 years ago

  • Copied from action #116746: [alert] openqaworker9: host up alert added
Actions #2

Updated by okurz about 2 years ago

  • Related to action #116722: openqa.suse.de is not reachable 2022-09-18, no ping response, postgreSQL OOM and kernel panics size:M added
Actions #3

Updated by nicksinger about 2 years ago

  • Status changed from New to In Progress
  • Assignee set to nicksinger
Actions #4

Updated by nicksinger about 2 years ago

No SOL output after connecting to it. The Grafana dashboard doesn't show anything unusual at the time of failure (2022-09-18, 05:45). After a reboot the machine came up normally. Rebooting 3 times to see if any issue can be reproduced or if we can consider the machine stable.
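A repeated-reboot check like the one described above could be scripted roughly as follows. This is only a minimal sketch, not the procedure actually used in this ticket: `reboot` and `is_up` are hypothetical callables (for example an IPMI power cycle and a ping probe), and the timeout and interval values are assumptions.

```python
import time
from typing import Callable, List


def wait_until_up(is_up: Callable[[], bool], timeout: float,
                  interval: float = 10.0,
                  sleep: Callable[[float], None] = time.sleep,
                  clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll is_up() until it returns True or `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        if is_up():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def reboot_rounds(reboot: Callable[[], None], is_up: Callable[[], bool],
                  rounds: int = 3, timeout: float = 900.0,
                  **wait_kwargs) -> List[bool]:
    """Trigger `rounds` reboots and record whether the host came back each time."""
    results = []
    for _ in range(rounds):
        reboot()  # hypothetical: e.g. an IPMI power cycle
        results.append(wait_until_up(is_up, timeout, **wait_kwargs))
    return results
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting. A generous `timeout` is chosen on purpose, since a slow boot loader can otherwise be mistaken for a failed boot.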

Actions #5

Updated by nicksinger about 2 years ago

After rebooting several times, it seems that petitboot sometimes fails to find the OS. "rescan devices" makes them appear and everything works fine afterwards. Will have a look at whether I can regenerate the grub entries (most likely not; my past research shows that petitboot just "probes" for present kernels).

Actions #6

Updated by nicksinger about 2 years ago

  • Status changed from In Progress to Resolved

Force-reinstalled "kernel-default" because it triggers the right $magic (regenerating initrd, writing grub files, etc.). After that the machine rebooted 3 times without problems. I might have been too impatient with my previous attempts, because petitboot probes for quite some time but eventually finds the disk/OS and boots it on its own.
Alert is resumed now.
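
For reference, a forced reinstall of that package could look roughly like the sketch below. The ticket does not record the exact command, so the use of `zypper` with `--non-interactive install --force` is an assumption, and `force_reinstall_command` is a hypothetical helper that only builds the command line without running it.

```python
import shlex
from typing import List


def force_reinstall_command(package: str = "kernel-default") -> List[str]:
    """Build a zypper command line that reinstalls `package` even if it is already installed."""
    # Assumption: zypper is the package manager on the worker host.
    return ["zypper", "--non-interactive", "install", "--force", package]


if __name__ == "__main__":
    # Print the command instead of executing it; running it needs root on the worker.
    print(shlex.join(force_reinstall_command()))
```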
