action #175740

closed

coordination #161414: [epic] Improved salt based infrastructure management

[alert] deploy pipeline for salt-states-openqa failed, multiple hosts run into salt error "Not connected" or "No response"

Added by jbaier_cz about 1 month ago. Updated about 1 month ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Start date: 2025-01-16
Due date:
% Done: 0%
Estimated time:

Description

Observation

See https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/3677962

openqa-piworker.qe.nue2.suse.org:
    Minion did not return. [No response]
    The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:

    salt-run jobs.lookup_jid 20250117171554566571
tumblesle.qe.nue2.suse.org:
    Minion did not return. [No response]
    The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:

    salt-run jobs.lookup_jid 20250117171554566571
backup-vm.qe.nue2.suse.org:
    Minion did not return. [No response]
    The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:

    salt-run jobs.lookup_jid 20250117171554566571
petrol.qe.nue2.suse.org:
    Minion did not return. [No response]
    The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:

    salt-run jobs.lookup_jid 20250117171554566571
mania.qe.nue2.suse.org:
    Minion did not return. [No response]
    The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:

    salt-run jobs.lookup_jid 20250117171554566571
monitor.qe.nue2.suse.org:
    Minion did not return. [No response]
    The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:

    salt-run jobs.lookup_jid 20250117171554566571
grenache-1.oqa.prg2.suse.org:
    Minion did not return. [No response]
    The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:

    salt-run jobs.lookup_jid 20250117171554566571
ERROR: Minions returned with non-zero exit code
backup-qam.qe.nue2.suse.org:
    Minion did not return. [No response]
    The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:

    salt-run jobs.lookup_jid 20250117171554566571
diesel.qe.nue2.suse.org:
    Minion did not return. [Not connected]

Similar failures can also be found in https://gitlab.suse.de/openqa/osd-deployment/-/pipelines/1520049
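To see at a glance which hosts fall into "No response" and which into "Not connected", the job output above can be grouped by reason. The following is only a minimal sketch, assuming the output keeps the shape shown above (a minion name followed by a colon on its own line, then an indented "Minion did not return. [reason]" line). The function name `group_unresponsive_minions` and the shortened `SAMPLE` text are hypothetical and not part of salt or the pipeline.

```python
import re
from collections import defaultdict

# A minion header line, e.g. "diesel.qe.nue2.suse.org:"
MINION_RE = re.compile(r"^(\S+):\s*$")
# The indented failure line; the reason is the text inside the brackets
REASON_RE = re.compile(r"^\s+Minion did not return\. \[(.+?)\]\s*$")


def group_unresponsive_minions(log_text):
    """Return {reason: [minion, ...]} for minions that did not return.

    Hypothetical helper: assumes the log layout shown in this ticket.
    """
    by_reason = defaultdict(list)
    current = None
    for line in log_text.splitlines():
        header = MINION_RE.match(line)
        if header:
            current = header.group(1)
            continue
        failure = REASON_RE.match(line)
        if failure and current is not None:
            by_reason[failure.group(1)].append(current)
            current = None
    return dict(by_reason)


# Shortened excerpt of the job output, for illustration only
SAMPLE = """\
petrol.qe.nue2.suse.org:
    Minion did not return. [No response]
    The minions may not have all finished running.
diesel.qe.nue2.suse.org:
    Minion did not return. [Not connected]
"""

if __name__ == "__main__":
    for reason, minions in group_unresponsive_minions(SAMPLE).items():
        print(f"{reason}: {', '.join(minions)}")
```

Feeding the full job log into the function instead of `SAMPLE` should give one list per reason, which makes it easier to tell hosts that are not connected at all from hosts that were too slow to answer.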


Related issues 3 (0 open, 3 closed)

Related to openQA Infrastructure (public) - action #167164: osd-deployment | Minions returned with non-zero exit code (qesapworker-prg5.qa.suse.cz) size:M (Resolved, ybonatakis)

Blocked by openQA Infrastructure (public) - action #175407: salt state for machine monitor.qe.nue2.suse.org was broken for almost 2 months, nothing was alerting us size:S (Resolved, okurz)

Copied from openQA Infrastructure (public) - action #175629: diesel+petrol (possibly all ppc64le OPAL machines) often run into salt error "Not connected" or "No response" due to wireguard services failing to start on boot size:S (Resolved, nicksinger, 2025-01-16)

Actions #1

Updated by jbaier_cz about 1 month ago

  • Copied from action #175629: diesel+petrol (possibly all ppc64le OPAL machines) often run into salt error "Not connected" or "No response" due to wireguard services failing to start on boot size:S added
Actions #2

Updated by jbaier_cz about 1 month ago

  • Related to action #167164: osd-deployment | Minions returned with non-zero exit code (qesapworker-prg5.qa.suse.cz) size:M added
Actions #3

Updated by jbaier_cz about 1 month ago

  • Description updated (diff)
Actions #4

Updated by okurz about 1 month ago

  • Related to action #175407: salt state for machine monitor.qe.nue2.suse.org was broken for almost 2 months, nothing was alerting us size:S added
Actions #5

Updated by okurz about 1 month ago

  • Related to deleted (action #175407: salt state for machine monitor.qe.nue2.suse.org was broken for almost 2 months, nothing was alerting us size:S)
Actions #6

Updated by okurz about 1 month ago

  • Blocked by action #175407: salt state for machine monitor.qe.nue2.suse.org was broken for almost 2 months, nothing was alerting us size:S added
Actions #7

Updated by okurz about 1 month ago

  • Status changed from New to Blocked
  • Assignee set to okurz
Actions #8

Updated by okurz about 1 month ago

  • Status changed from Blocked to Resolved

We have solved the problems for non-wireguard hosts within #175407. For wireguard hosts there is still #175629.
