action #99267 (closed)
Improve validate_multipath module
Description
After some recent changes in the zfcp connectivity, the validate_multipath module is failing.
The main reason for the failure is that, after the changes, we have two WWID entries instead of one.
Compare the listed wwids (output at the bottom) from this job to this one.
As a result, the $wwid output stored in the validate_multipath module contains two lines with two different worldwide identifiers.
Similarly, the output of multipath -ll now contains two device mapper devices instead of one.
This can be seen here in contrast to this old job.
This setup results in two commands being susceptible to failure.
- The first is
assert_matches(qr/$wwid dm-0 $ven_pro_rev/, $topology_output, 'General topology info are not displayed properly');
This fails because $wwid contains two lines of identifiers corresponding to the two multipath maps. This is the failure currently seen in our zfcp jobs.
- The second one is
validate_script_output("multipath -d -v3 | grep ^$wwid", qr/$wwid.*$name/);
If the previous command did not exist, the jobs would fail here because, once more, the $wwid parameter no longer contains only one WWID, and also because it relates to a different disk.
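The first failure can be reproduced outside openQA with a small sketch. It is written in Python rather than the module's Perl, and the WWIDs and the vendor string are invented sample values, not output from the failing job.

```python
import re

# Invented sample values; real WWIDs and vendor strings will differ.
WWID_A = "36005076307ffd3b30000000000000100"
WWID_B = "36005076307ffd3b30000000000000101"
VEN_PRO_REV = "IBM,2107900"

# Shortened stand-in for the map header lines printed by `multipath -ll`.
topology_output = f"{WWID_A} dm-0 {VEN_PRO_REV}\n{WWID_B} dm-1 {VEN_PRO_REV}\n"


def topology_matches(wwid: str) -> bool:
    """Mimic qr/$wwid dm-0 $ven_pro_rev/ with the value interpolated as-is."""
    pattern = f"{wwid} dm-0 {VEN_PRO_REV}"
    return re.search(pattern, topology_output) is not None


single = WWID_A                      # what the module expected to store
two_lines = f"{WWID_A}\n{WWID_B}"    # what it stores after the zfcp change

print(topology_matches(single))      # True
print(topology_matches(two_lines))   # False: the embedded newline breaks the pattern
```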
Observation
openQA test in scenario sle-15-SP4-Online-s390x-zfcp@s390x-zfcp fails in validate_multipath
Test suite description
Testsuite maintained at https://gitlab.suse.de/qa-maintenance/qam-openqa-yml. Maintainer: QE Yast, mgriessmeier
Installation-only test configuring an s390x ZFCP storage.
Reproducible
Fails since (at least) Build 29.1
Expected result
Last good: (unknown) (or more recent)
Further details
Always latest result in this scenario: latest
Suggestion
A good middle-ground solution would be to store in the $wwid parameter only one of the identifiers, corresponding to a single multipath map.
Both commands 1. and 2. listed at the top of the description would then pass.
This does not reduce coverage, since we would still test the validity of one multipath map (as was the case for 15-SP3, before this change occurred).
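A minimal sketch of this suggestion, in Python rather than the module's Perl. The helper name pick_single_wwid is hypothetical and the WWIDs are invented; it only illustrates reducing a multi-line listing to one identifier before interpolating it into the two checks.

```python
def pick_single_wwid(raw_output: str) -> str:
    """Return the first non-empty line of a possibly multi-line WWID listing."""
    for line in raw_output.splitlines():
        candidate = line.strip()
        if candidate:
            return candidate
    raise ValueError("no WWID found in output")


# Invented two-line listing, as stored after the zfcp change.
raw = "36005076307ffd3b30000000000000100\n36005076307ffd3b30000000000000101\n"
wwid = pick_single_wwid(raw)
print(wwid)  # 36005076307ffd3b30000000000000100
```

Taking the first line is an arbitrary choice; any rule that deterministically selects one map would serve, as long as $name and the expected dm device refer to the same map.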
Updated by openqa_review about 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: zfcp@s390x-zfcp
https://openqa.suse.de/tests/7642580
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234
Updated by oorlov about 3 years ago
- Tags deleted (qe-yast-refinement)
- Status changed from New to Workable
Updated by geor about 3 years ago
- Status changed from Workable to In Progress
- Assignee set to geor
Updated by JERiveraMoya about 3 years ago
As discussed, instead of deactivating a multipath device just to make the test green (because the test was designed for a single device), we plan to rewrite the validation. We now know what to expect from the underlying infrastructure: 1 or 2 channels, 1 switch and 2 disks represented by two different LUNs, each of which has two connectors. The idea is simple: given a fixed infrastructure, create a test module that validates that the right number of multipath devices is created with the right paths associated with them. In other words, it validates that, regardless of the path taken, your data ends up on the right disk and multipath groups the paths properly.
Regex with named captures is preferred over bash pipelines because it is more readable and easier to maintain. If there is a configuration file to read the information from, that is even better than using an upstream command. If there isn't, a command that gives us the result directly without piping is also preferred. Failing that, we can manipulate the output of the command in Perl, preferably with grep or map rather than loops. The important thing is to avoid testing the system with the system itself: we should not get values from one command and feed them to another command, which makes errors very difficult to dissect. This type of validation could be a good precedent for future validations if all of this is taken into account, but of course it is not a strict guide.
Updated by geor almost 3 years ago
- Status changed from In Progress to Closed
I will close this ticket, which I treated as an investigation ticket.
I am documenting my findings on https://confluence.suse.com/display/~geor/Mainframe+Musings%3A+Playing+around+with+FCP
Later I will also create a new ticket with a specific description and acceptance criteria for the current multipath infrastructure on the zfcp worker where we run our multipath test.
Updated by geor almost 3 years ago
The module validate_multipath has been failing in the zfcp testsuite due to changes in the FCP infrastructure on the z/VM hypervisor.
After the changes we have two disks attached to the system (an extra 200 GB disk has been added), which results in two WWIDs being displayed in the multipath config. This causes the module to fail because the $wwid parameter is supposed to contain only one value.
PR unscheduling validate_multipath from the zfcp testsuite: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/13909
Updated by geor almost 3 years ago
A followup Epic has been created: https://progress.opensuse.org/issues/105437