action #152811

closed

ada.qe.suse.de is not responding to salt commands

Added by livdywan 4 months ago. Updated about 2 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
Start date:
2023-12-14
Due date:
% Done:

0%

Estimated time:
Tags:

Description

Observation

ada.qe.suse.de: Minion did not return. [Not connected]

Rollback steps

  • ssh osd 'sudo salt-key -y -a ada.qe.suse.de'

Related issues 2 (0 open, 2 closed)

Related to QA - action #132617: Move of selected LSG QE machines NUE1 to PRG2e size:M (Resolved, okurz)

Copied from openQA Infrastructure - action #152673: [alert] `systemctl status iscsid.socket` failed on `s390zl12.oqa.prg2.suse.org` size:S (Resolved, livdywan, 2023-12-14)

Actions #1

Updated by livdywan 4 months ago

  • Copied from action #152673: [alert] `systemctl status iscsid.socket` failed on `s390zl12.oqa.prg2.suse.org` size:S added
Actions #2

Updated by okurz 4 months ago

  • Related to action #132617: Move of selected LSG QE machines NUE1 to PRG2e size:M added
Actions #3

Updated by okurz 4 months ago

Same as #152813 for ada+openqaw5-xen we need to wait for #132617 . Your observation is correct and I wonder if some gitlab CI pipelines or monitoring shouldn't have failed until we remove those hosts from salt. I wonder, how did you find this issue?

Actions #4

Updated by livdywan 4 months ago

okurz wrote in #note-3:

Same as #152813 for ada+openqaw5-xen we need to wait for #132617 . Your observation is correct and I wonder if some gitlab CI pipelines or monitoring shouldn't have failed until we remove those hosts from salt. I wonder, how did you find this issue?

I was executing salt commands on all machines and these did not respond. It also surprised me that monitoring didn't fail. If they're not expected to be usable, they surely shouldn't be in salt?
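
For anyone who wants to script the same check, here is a minimal sketch that picks non-responding minions out of captured salt output. It assumes the single-line form shown in the Observation (`<minion>: Minion did not return. ...`); real salt output may be laid out differently depending on the outputter, and the function name `unresponsive_minions` is made up for this illustration.

```python
import re

# Matches lines such as "ada.qe.suse.de: Minion did not return. [Not connected]".
# Assumption: the minion ID and the message are on the same line, as in the ticket.
NOT_RETURNED = re.compile(r"^(?P<minion>\S+):\s+Minion did not return\.")


def unresponsive_minions(salt_output: str) -> list[str]:
    """Return the minion IDs reported as not returning, in the order seen."""
    minions = []
    for line in salt_output.splitlines():
        match = NOT_RETURNED.match(line.strip())
        if match:
            minions.append(match.group("minion"))
    return minions


if __name__ == "__main__":
    sample = "ada.qe.suse.de: Minion did not return. [Not connected]\n"
    print(unresponsive_minions(sample))
```

The output of a salt command run against all machines could be piped into such a helper, so that hosts which are still in salt but unreachable show up as an explicit list.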

Actions #5

Updated by okurz 4 months ago

livdywan wrote in #note-4:

okurz wrote in #note-3:

Same as #152813 for ada+openqaw5-xen we need to wait for #132617 . Your observation is correct and I wonder if some gitlab CI pipelines or monitoring shouldn't have failed until we remove those hosts from salt. I wonder, how did you find this issue?

I was executing salt commands on all machines and these did not respond. It also surprised me that monitoring didn't fail.

Found it: #151588

If they're not expected to be usable, they surely shouldn't be in salt?

Correct. For those we should follow https://progress.opensuse.org/projects/openqav3/wiki/#Take-machines-out-of-salt-controlled-production

Actions #6

Updated by okurz 4 months ago

  • Description updated (diff)
  • Status changed from New to Blocked
  • Assignee set to okurz
  • Priority changed from High to Normal
  • Target version changed from Ready to Tools - Next

Removed the salt key and added a rollback step to the description. Blocked on #132617.

Actions #7

Updated by okurz about 2 months ago

  • Status changed from Blocked to Resolved
  • Target version changed from Tools - Next to Ready

#132617 resolved. ada is properly part of salt again. Removed salt-key for ada.qe.suse.de with sudo salt-key -y -d ada.qe.suse.de
