action #96296

coordination #96302: [qe-core][QU] Quarterly update failures

[qem] test fails in disk_activation

Added by hurhaj about 2 months ago. Updated 14 days ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Bugs in existing tests
Target version:
Start date: 2021-07-29
Due date:
% Done: 100%
Estimated time:
Difficulty:

Description

Observation

  • no devices are in the list

openQA test in scenario sle-15-SP3-Online-QR-s390x-zfcp@s390x-zfcp fails in disk_activation

Test suite description

Maintainer: QE Yast, mgriessmeier

Installation-only test configuring an s390x ZFCP storage.

Reproducible

Fails since (at least) Build 188.11

Expected result

Last good: (unknown) (or more recent)

Further details

Always latest result in this scenario: latest


Related issues

Related to openQA Tests - action #96680: [qe-core] s390qa102.qa.suse.de zfcp issue (New, 2021-08-09)

History

#1 Updated by szarate about 2 months ago

  • Parent task set to #96302

Setting the parent task so that we have a common tracker.

#2 Updated by oorlov about 2 months ago

This failure is not related to any test code or schedule files; no changes were introduced there.

It looks like a product bug or an infrastructure issue. Matthias suggested setting up the machine manually and checking the issue.

I'm not an expert in s390x setup, so this ticket is better handled by someone who is.

#3 Updated by geor about 2 months ago

  • Assignee set to geor

#4 Updated by geor about 2 months ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 20

#5 Updated by geor about 1 month ago

  • % Done changed from 20 to 90

It seems that s390qa101.qa.suse.de has no issues with zfcp, whereas s390qa102.qa.suse.de does.
Whenever the job ends up running on the second worker instance (s390qa102.qa.suse.de), the disk_activation module fails.

Bringing the FCP device online with chccwdev -e 0.0.fa00 on s390qa101 results in the respective LUNs getting attached:

# lsluns
Scanning for LUNs on adapter 0.0.fa00
        at port 0x500507630703d3b3:
                0x4001405000000000
        at port 0x500507630708d3b3:
                0x4001405000000000

Bringing the FCP device online with chccwdev -e 0.0.fa00 on s390qa102 results in the following:

# lsluns
lsluns: No port on any adapter exists

# lszfcp -PHD
0.0.fa00 host0
Error: No fcp devices found.

# cat /proc/scsi/scsi  (shows no SCSI attached devices)
Attached devices:
# 
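
The difference between the two workers can also be checked mechanically. Below is a minimal Python sketch that extracts (adapter, port, LUN) triples from `lsluns` output, assuming the output looks exactly like the two transcripts above. The function name `parse_lsluns` and the sample constants are hypothetical and not part of any openQA test code.

```python
import re

# Patterns are assumptions based only on the lsluns output quoted in this ticket.
ADAPTER_RE = re.compile(r"^Scanning for LUNs on adapter (\S+)\s*$")
PORT_RE = re.compile(r"^\s*at port (0x[0-9a-fA-F]{16}):\s*$")
LUN_RE = re.compile(r"^\s*(0x[0-9a-fA-F]{16})\s*$")


def parse_lsluns(output):
    """Return a list of (adapter, port, lun) tuples found in lsluns output."""
    triples = []
    adapter = None
    port = None
    for line in output.splitlines():
        match = ADAPTER_RE.match(line)
        if match:
            adapter, port = match.group(1), None
            continue
        match = PORT_RE.match(line)
        if match:
            port = match.group(1)
            continue
        match = LUN_RE.match(line)
        if match and adapter and port:
            triples.append((adapter, port, match.group(1)))
    return triples


# Sample outputs copied from the two workers (s390qa101 and s390qa102).
GOOD_OUTPUT = """Scanning for LUNs on adapter 0.0.fa00
        at port 0x500507630703d3b3:
                0x4001405000000000
        at port 0x500507630708d3b3:
                0x4001405000000000
"""
BAD_OUTPUT = "lsluns: No port on any adapter exists\n"

if __name__ == "__main__":
    print(parse_lsluns(GOOD_OUTPUT))
    print(parse_lsluns(BAD_OUTPUT))
```

An empty result corresponds to the s390qa102 case, where no port and therefore no LUN is visible behind 0.0.fa00.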

Also, on s390qa101 the output of lszdev shows the two SCSI devices (each with both of its device names):

zfcp-lun     0.0.fa00:0x500507630703d3b3:0x4001405000000000  yes  no    sdb sg1
zfcp-lun     0.0.fa00:0x500507630708d3b3:0x4001405000000000  yes  no    sda sg0

whereas on s390qa102 it does not.
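
For comparing the two workers, the `zfcp-lun` lines can be split into their parts. This is a rough Python sketch, assuming the whitespace-separated column order seen in the two lines above (type, ID, then two yes/no columns, then device names). The interpretation of the yes/no columns as "online" and "persistent" is an assumption, and `parse_zfcp_lun_line` is a hypothetical helper name.

```python
def parse_zfcp_lun_line(line):
    """Parse one 'zfcp-lun' line of lszdev output into a dict, or return None.

    Assumed column order (from the lines quoted in this ticket):
    TYPE  ID  ON  PERS  NAMES...
    """
    fields = line.split()
    if len(fields) < 4 or fields[0] != "zfcp-lun":
        return None
    id_parts = fields[1].split(":")
    if len(id_parts) != 3:
        # ID is expected to be <adapter>:<port WWPN>:<LUN>
        return None
    adapter, port, lun = id_parts
    return {
        "adapter": adapter,
        "port": port,
        "lun": lun,
        "online": fields[2] == "yes",      # assumed meaning of the first yes/no column
        "persistent": fields[3] == "yes",  # assumed meaning of the second yes/no column
        "names": fields[4:],
    }


if __name__ == "__main__":
    sample = "zfcp-lun     0.0.fa00:0x500507630703d3b3:0x4001405000000000  yes  no    sdb sg1"
    print(parse_zfcp_lun_line(sample))
```

On a worker in the s390qa102 state there would be no such lines to parse, so a check built on this helper would simply see an empty set of LUNs.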

After consulting with Matthias on this, it appears there may be a mapping discrepancy: 0.0.fa00 may point to different devices on the two s390 instances, which would explain the inconsistency. Further investigation is needed to pinpoint the exact root cause. For now, the workaround is to schedule the zfcp test suites only on worker instance 1.

#7 Updated by geor about 1 month ago

  • Related to action #96680: [qe-core] s390qa102.qa.suse.de zfcp issue added

#8 Updated by geor about 1 month ago

  • Status changed from In Progress to Resolved
  • % Done changed from 90 to 100

Resolving this issue.

Added a related issue to track fixing s390qa102. For now, disk_activation will not fail because it will always run on s390qa101.

#9 Updated by szarate 14 days ago

  • Target version set to QE-Core: Ready
