action #99267

Improve validate_multipath module

Added by geor 8 months ago. Updated 4 months ago.

Status: Closed
Priority: Normal
Assignee:
Target version:
Start date: 2021-09-24
Due date:
% Done: 0%
Estimated time:

Description

After some recent changes in the zfcp connectivity, the validate_multipath module is failing.

The main reason for the failure is that, after the changes, we have two wwid entries instead of one.
Compare the listed wwids (output at the bottom) from this job to this one.
As a result, the $wwid value stored in the validate_multipath module contains two lines with two different worldwide identifiers.

For the same reason, the output of multipath -ll now contains two device-mapper devices instead of one.
This can be seen here in contrast to this old job.

With this setup, two commands in the module can fail:

  1. The first is assert_matches(qr/$wwid dm-0 $ven_pro_rev/, $topology_output, 'General topology info are not displayed properly');. This fails because $wwid contains two lines of identifiers, one for each of the two multipath maps, so the pattern can never match a single topology line. This is the failure currently seen in our zfcp jobs.
  2. The second is validate_script_output("multipath -d -v3 | grep ^$wwid", qr/$wwid.*$name/);. If the first command did not exist, the jobs would fail here instead, because $wwid no longer contains a single WWID and because the additional WWID relates to a different disk.

Observation

openQA test in scenario sle-15-SP4-Online-s390x-zfcp@s390x-zfcp fails in
validate_multipath

Test suite description

Testsuite maintained at https://gitlab.suse.de/qa-maintenance/qam-openqa-yml. Maintainer: QE Yast, mgriessmeier

Installation-only test configuring an s390x ZFCP storage.

Reproducible

Fails since (at least) Build 29.1

Expected result

Last good: (unknown) (or more recent)

Further details

Always latest result in this scenario: latest

Suggestion

A good middle-ground solution would be to store in $wwid only one of the identifiers, corresponding to a single multipath map.
Both commands 1. and 2. listed in the description above would then pass.
This is not a compromise, since we will still test the validity of one multipath map (as was the case for 15-SP3, before this change had occurred).
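
A minimal sketch of this suggestion, written as plain Perl outside the openQA test API. The sample WWIDs, the vendor string and the helper name first_wwid are hypothetical, and the topology line only imitates the first line of a multipath -ll map; the real module gets these values from the system under test.

```perl
use strict;
use warnings;

# Hypothetical sample: two WWIDs, one per multipath map, as the module
# now receives them in a single two-line string.
my $wwid_output = "36005076307ffd3b30000000000000001\n36005076307ffd3b30000000000000002";
my $ven_pro_rev = 'IBM,2107900';

# Hypothetical topology line for the first map.
my $topology_output = "36005076307ffd3b30000000000000001 dm-0 IBM,2107900\n";

# Keep only the first non-empty line, so that $wwid names exactly one map.
sub first_wwid {
    my ($output) = @_;
    my ($first) = grep { /\S/ } split /\n/, $output;
    $first =~ s/^\s+|\s+$//g if defined $first;
    return $first;
}

my $wwid = first_wwid($wwid_output);

# \Q...\E quotes the values so that they are matched literally.
my $matches_before = $topology_output =~ /\Q$wwid_output\E dm-0 \Q$ven_pro_rev\E/ ? 1 : 0;
my $matches_after  = $topology_output =~ /\Q$wwid\E dm-0 \Q$ven_pro_rev\E/ ? 1 : 0;

print "two-line wwid matches: $matches_before\n";
print "single wwid matches: $matches_after\n";
```

With the two-line value the pattern cannot match any single topology line, while the single WWID does. Which of the two identifiers to keep is a choice this sketch makes arbitrarily (the first one).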

History

#1 Updated by geor 8 months ago

  • Description updated (diff)

#2 Updated by geor 8 months ago

  • Description updated (diff)

#3 Updated by geor 8 months ago

  • Description updated (diff)

#4 Updated by geor 8 months ago

  • Description updated (diff)

#5 Updated by geor 8 months ago

  • Description updated (diff)

#6 Updated by JERiveraMoya 8 months ago

  • Target version set to Current

#7 Updated by openqa_review 6 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: zfcp@s390x-zfcp
https://openqa.suse.de/tests/7642580

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

#8 Updated by oorlov 6 months ago

  • Tags deleted (qe-yast-refinement)
  • Status changed from New to Workable

#9 Updated by geor 6 months ago

  • Status changed from Workable to In Progress
  • Assignee set to geor

#10 Updated by JERiveraMoya 6 months ago

As discussed, instead of deactivating a multipath device just to make the test green (because the test was designed for a single device), we plan to rewrite the validation. We now know what to expect from the underlying infrastructure: 1 or 2 channels, 1 switch, and 2 disks represented by two different LUNs, each of which has two connectors. The idea is simple: given a fixed infrastructure, create a test module that validates it, so that we are sure the right number of multipath devices is created with the right paths associated with them. In other words, it validates that, regardless of the path taken, your data ends up on the right disk and that multipath groups the paths properly.
Regexes with named captures are preferred over bash pipelines because they are more readable and easier to maintain. If there is a file to read the configuration from, that is even better than using an upstream command. If there is not, a command that gives us the result directly, without piping, is preferred. Failing that, we can manipulate that command's output in Perl, preferably with grep or map rather than loops. The important thing is to avoid testing the system with the system itself: we should not take values from one command and feed them to another command, which makes errors very difficult to dissect. This type of validation could be a good precedent for future validations if all of this is taken into account but, of course, it is not a strict guide.
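
The approach described above could be sketched roughly as follows, in plain Perl outside the openQA test API. The sample text only imitates the layout of multipath -ll output, and the WWIDs, sizes, device names and the expectation of two maps with two paths each are assumptions made for illustration, not values taken from the real worker. The sketch also assumes that maps are listed by WWID, without aliases.

```perl
use strict;
use warnings;

# Hypothetical sample imitating `multipath -ll` output for two maps.
my $topology_output = <<'END';
36005076307ffd3b30000000000000001 dm-0 IBM,2107900
size=40G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
  |- 0:0:0:1073741824 sda 8:0  active ready running
  `- 1:0:0:1073741824 sdb 8:16 active ready running
36005076307ffd3b30000000000000002 dm-1 IBM,2107900
size=200G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
  |- 0:0:0:1073807360 sdc 8:32 active ready running
  `- 1:0:0:1073807360 sdd 8:48 active ready running
END

# Split the output into one block per map (each block starts with a WWID
# line), then parse every block with named captures.
my @maps = map {
    my $block = $_;
    $block =~ /^(?<wwid>[0-9a-f]+)\s+(?<dm>dm-\d+)\s+(?<ven_pro_rev>\S+)/
      or die "unexpected block: $block";
    my %map = %+;    # copy the named captures before the next match
    # Collect the device name of every path line in this block.
    $map{paths} = [$block =~ /^\W+\d+:\d+:\d+:\d+\s+(sd[a-z]+)\s/mg];
    \%map;
} split /^(?=[0-9a-f]+\s+dm-\d+)/m, $topology_output;

my %paths_of = map { ($_->{wwid}, $_->{paths}) } @maps;

# The expected values come from the known, fixed infrastructure, not from
# another command run on the system under test.
my $expected_maps  = 2;
my $expected_paths = 2;

die "expected $expected_maps maps, got " . scalar(@maps) unless @maps == $expected_maps;
for my $wwid (sort keys %paths_of) {
    my @paths = @{ $paths_of{$wwid} };
    die "$wwid has " . scalar(@paths) . " paths" unless @paths == $expected_paths;
    print "$wwid: @paths\n";
}
```

The expectations are hard-coded here on purpose, so that a failure points at the system rather than at a second command whose output was used as the reference.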

#11 Updated by geor 5 months ago

  • Status changed from In Progress to Closed

I will close this ticket, which I treated as an investigation ticket.
I am documenting my findings on https://confluence.suse.com/display/~geor/Mainframe+Musings%3A+Playing+around+with+FCP

Later I will also create a new ticket with a specific description and acceptance criteria for the current multipath infrastructure on the zfcp worker we run our multipath test on.

#12 Updated by geor 5 months ago

The module validate_multipath has been failing in the zfcp testsuite due to changes in the FCP infrastructure on the z/VM hypervisor.
After the changes we have two disks attached to the system (an extra 200 GB disk has been added), which results in two WWIDs being displayed in the multipath config. This causes the module to fail because the $wwid parameter is supposed to only contain one value.

PR unscheduling validate_multipath from the zfcp testsuite: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/13909

#13 Updated by geor 4 months ago

A followup Epic has been created: https://progress.opensuse.org/issues/105437
