action #162734

open

openQA Project - coordination #112862: [saga][epic] Future ideas for easy multi-machine handling: MM-tests as first-class citizens

coordination #161735: [epic] Better error detection on GRE tunnel misconfiguration

Simple script detecting gre_tunnel_preup.sh with only empty remote_ip= statements during salt CI pipelines size:M

Added by okurz 26 days ago. Updated 8 days ago.

Status: Workable
Priority: Normal
Assignee: -
Category: Feature requests
Target version:
Start date: 2024-06-21
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Motivation

See #161735-4, where nicksinger explains that the salt mine sometimes seems to be empty. When that happens, /etc/wicked/scripts/gre_tunnel_preup.sh ends up "empty" again, i.e. it contains only lines with an empty options:remote_ip= value, e.g. "worker36 (offline at point of file generation)".

Acceptance criteria

  • AC1: Every gre_tunnel_preup.sh script is verified to contain at least one valid remote_ip= statement
  • AC2: All remote_ip= statements represent relevant peers, e.g. the currently online TAP worker hosts of the same architecture
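
A check for AC1 could look roughly like the following sketch. The function name has_valid_remote_ip is hypothetical, and the sketch assumes that a "valid" statement is remote_ip= immediately followed by an IPv4-looking address; neither is confirmed by this ticket, and IPv6 peers would need a wider pattern.

```shell
#!/bin/sh
# Hypothetical helper, not part of salt-states-openqa.
# Succeeds (exit 0) if the given file contains at least one remote_ip=
# statement whose value looks like an IPv4 address.
# Assumption: an "empty" statement has nothing address-like after the "=".
has_valid_remote_ip() {
    grep -Eq 'remote_ip=[0-9]{1,3}(\.[0-9]{1,3}){3}' "$1"
}
```

On a worker this might be called as has_valid_remote_ip /etc/wicked/scripts/gre_tunnel_preup.sh, guarded by test -e as in the salt command under Suggestions, because grep also fails when the file does not exist.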

Suggestions

  • Look into https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/master/openqa/openvswitch.sls#L122
  • Start with sudo salt --no-color -C 'G@roles:worker' cmd.run 'test -e /etc/wicked/scripts/gre_tunnel_preup.sh && grep remote_ip /etc/wicked/scripts/gre_tunnel_preup.sh'
  • The task can be solved by ensuring non-empty entries during generation, or retroactively in a post-deploy monitoring step of the CI pipeline: for example, a separate script that finds the workers currently online and connected to salt and uses that list as a filter against https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls
  • Consider the case of an island cluster where actually no peers are expected
  • Let this new script make a diff between the old and new version of gre_tunnel_preup.sh and do a sanity check on the diff (e.g. reject the change if too many lines have been removed)
  • Check if nicksinger already disabled the grain-cache and if that helped
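
The diff sanity check suggested above could be sketched as follows. The function name diff_is_sane and the max_lost threshold parameter are hypothetical; what counts as "too many" removed lines is not defined in this ticket, and an island cluster with no expected peers would need its own exception.

```shell
#!/bin/sh
# Hypothetical helper, not part of salt-states-openqa.
# Usage: diff_is_sane OLD NEW MAX_LOST
# Succeeds if at most MAX_LOST lines of OLD were removed or changed in NEW.
diff_is_sane() {
    old=$1 new=$2 max_lost=$3
    # Refuse to judge if either file is missing or unreadable.
    [ -r "$old" ] && [ -r "$new" ] || return 2
    # In default diff output, lines only present in OLD start with "<".
    # grep -c prints the count; "|| true" hides the non-zero exit status
    # that the pipeline reports when nothing matched.
    lost=$(diff "$old" "$new" | grep -c '^<' || true)
    [ "$lost" -le "$max_lost" ]
}
```

A deployment step might keep a backup of the previous script and only install the newly generated one if diff_is_sane accepts it. Note that a line whose remote_ip= value changed counts as lost here, since diff reports it as removed from the old version.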

Related issues 1 (0 open, 1 closed)

Related to openQA Infrastructure - action #162455: Secondary TAP worker class instead of "tap_poo…" on closed tickets size:S (Resolved, okurz, 2024-06-18)

Actions #1

Updated by okurz 14 days ago

  • Subject changed from Simple script detecting "empty" gre_tunnel_preup.sh during salt CI pipelines to Simple script detecting gre_tunnel_preup.sh with only empty remote_ip= statements during salt CI pipelines
  • Description updated (diff)
  • Status changed from New to Feedback
  • Assignee set to okurz

We failed to estimate this. Will need to consult with nicksinger first.

Actions #2

Updated by okurz 14 days ago

  • Description updated (diff)
Actions #3

Updated by okurz 14 days ago

  • Related to action #162455: Secondary TAP worker class instead of "tap_poo…" on closed tickets size:S added
Actions #4

Updated by okurz 14 days ago

  • Status changed from Feedback to Blocked
Actions #5

Updated by okurz 8 days ago

  • Status changed from Blocked to New
  • Assignee deleted (okurz)

#162455 is done. We can go ahead here.

Actions #6

Updated by livdywan 8 days ago

  • Subject changed from Simple script detecting gre_tunnel_preup.sh with only empty remote_ip= statements during salt CI pipelines to Simple script detecting gre_tunnel_preup.sh with only empty remote_ip= statements during salt CI pipelines size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #7

Updated by okurz 8 days ago

  • Target version changed from Ready to Tools - Next