action #115742

open

[qe-sap][s390x] test fails in setup_hosts_and_luns on SLE15SP5 HA: 'mount -t nfs 1c119.qa.suse.de:/srv/nfs/ha_ss /support_fs' timed out

Added by llzhao over 1 year ago. Updated 10 months ago.

Status: Feedback
Priority: Normal
Assignee: -
Category: Bugs in existing tests
Target version: -
Start date: 2022-08-25
Due date:
% Done: 100%
Estimated time:
Difficulty:

Description

Observation

openQA test in scenario sle-15-SP5-Online-s390x-migration_offline_scc_verify_sle15sp3_ha_alpha_node02@s390x-kvm-sle12 fails in setup_hosts_and_luns

Test suite description

The base test suite is used for job templates defined in YAML documents. It has no settings of its own.

Reproducible

Fails since (at least) Build 15.2 (current job)

Expected result

Last good: (unknown) (or more recent)

Further details

Always latest result in this scenario: latest

Actions #1

Updated by llzhao over 1 year ago

Tried on x86: the mount succeeds there.
FYI:

# mkdir /testdir
# mount -t nfs 1c119.qa.suse.de:/srv/nfs/ha_ss /testdir
# echo $?
0
# mount | grep testdir
1c119.qa.suse.de:/srv/nfs/ha_ss on /testdir type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.67.19.61,local_lock=none,addr=10.162.31.119)
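
To compare this working x86 mount with what an s390x worker negotiates, the option list in the mount output above can be split into fields. This is only a minimal Python sketch: parse_mount_line is a hypothetical helper, not part of openQA or os-autoinst, and the sample line is copied from the output above.

```python
import re

# Sample line copied from the x86 mount output in this comment.
MOUNT_LINE = (
    "1c119.qa.suse.de:/srv/nfs/ha_ss on /testdir type nfs4 "
    "(rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,"
    "proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.67.19.61,"
    "local_lock=none,addr=10.162.31.119)"
)

# "<source> on <target> type <fstype> (<comma-separated options>)"
MOUNT_RE = re.compile(
    r"(?P<source>\S+) on (?P<target>\S+) type (?P<fstype>\S+) \((?P<opts>[^)]*)\)"
)


def parse_mount_line(line):
    """Hypothetical helper: split one line of `mount` output into its parts."""
    match = MOUNT_RE.match(line)
    if match is None:
        raise ValueError("not a mount output line: %r" % line)
    options = {}
    for item in match.group("opts").split(","):
        key, sep, value = item.partition("=")
        # Flags such as "rw" or "hard" carry no value; store them as True.
        options[key] = value if sep else True
    return match.group("source"), match.group("target"), match.group("fstype"), options


source, target, fstype, options = parse_mount_line(MOUNT_LINE)
print(fstype, options["vers"], options["proto"], options["addr"])
```

Running the same helper on the mount output of an s390x job (when the mount does complete) would make differences in NFS version, protocol or server address easy to spot.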
Actions #2

Updated by llzhao over 1 year ago

  • Subject changed from [qe-sap] test fails in setup_hosts_and_luns on SLE15SP5 HA: 'mount -t nfs 1c119.qa.suse.de:/srv/nfs/ha_ss /support_fs' timed out to [qe-sap][s390x] test fails in setup_hosts_and_luns on SLE15SP5 HA: 'mount -t nfs 1c119.qa.suse.de:/srv/nfs/ha_ss /support_fs' timed out
Actions #3

Updated by rbranco over 1 year ago

  • Assignee set to llzhao
Actions #4

Updated by llzhao over 1 year ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

Closing it for now, as it passes in the latest build, sle-15-SP5 Build19.1:
https://openqa.suse.de/tests/9416248#step/setup_hosts_and_luns/6

Actions #5

Updated by llzhao over 1 year ago

  • Status changed from Resolved to In Progress
Actions #6

Updated by llzhao over 1 year ago

Many unexpected messages showed up on the serial console instead of the expected marker (for example "hZjFX-0-").
For example: http://openqa.nue.suse.com/tests/9458224#step/setup_hosts_and_luns/4

# wait_serial expected: qr/hZjFX-\d+-/u
# Result:

2022-09-05T09:43:37.146963-04:00 alpha-node01 kernel: [   72.189078][    C0] sd 0:0:0:2: Power-on or device reset occurred
2022-09-05T09:43:46.309375-04:00 alpha-node01 kernel: [   81.350585][    C0] sd 0:0:0:3: Power-on or device reset occurred
2022-09-05T09:43:57.695729-04:00 alpha-node01 kernel: [   92.737476][    C0] sd 0:0:0:4: Power-on or device reset occurred
2022-09-05T09:44:09.768084-04:00 alpha-node01 kernel: [  104.809796][    C0] sd 0:0:0:5: Power-on or device reset occurred
2022-09-05T09:44:23.817178-04:00 alpha-node01 kernel: [  118.858736][    C0] sd 0:0:0:6: Power-on or device reset occurred
[FAILED] Failed to start Shared-storage based fencing daemon.
2022-09-05T09:44:26.715690-04:00 alpha-node01 systemd[1]: Failed to start Shared-storage based fencing daemon.
2022-09-05T09:44:38.247451-04:00 alpha-node01 kernel: [  133.288556][    C0] sd 0:0:0:7: Power-on or device reset occurred
2022-09-05T09:45:02.757571-04:00 alpha-node01 kernel: [  157.798254][    C0] sd 0:0:0:8: Power-on or device reset occurred
2022-09-05T09:45:33.211785-04:00 alpha-node01 kernel: [  188.249707][    C0] sd 0:0:0:9: Power-on or device reset occurred
2022-09-05T09:46:06.471824-04:00 alpha-node01 kernel: [  221.511902][    C0] sd 0:0:0:10: Power-on or device reset occurred
2022-09-05T09:46:07.463144-04:00 alpha-node01 systemd-udevd[522]: 0:0:0:0: Worker [1564] processing SEQNUM=911 killed
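
The result above can be reproduced offline by applying the expected pattern to the captured lines. This is a minimal Python sketch, not the os-autoinst implementation of wait_serial: find_exit_code is a hypothetical helper, and the marker string hZjFX is simply the one from this job.

```python
import re

# Same pattern as the one wait_serial reports: the marker, the exit code, a dash.
MARKER = re.compile(r"hZjFX-(\d+)-")

# A few of the lines that actually arrived on the serial console (copied from above).
captured = [
    "2022-09-05T09:43:37.146963-04:00 alpha-node01 kernel: [   72.189078][    C0] sd 0:0:0:2: Power-on or device reset occurred",
    "[FAILED] Failed to start Shared-storage based fencing daemon.",
    "2022-09-05T09:46:07.463144-04:00 alpha-node01 systemd-udevd[522]: 0:0:0:0: Worker [1564] processing SEQNUM=911 killed",
]


def find_exit_code(serial_lines):
    """Hypothetical helper: return the exit code from the first marker line, or None."""
    for line in serial_lines:
        match = MARKER.search(line)
        if match:
            return int(match.group(1))
    return None


# None of the captured lines carries the marker, so the wait can only time out.
print(find_exit_code(captured))
# If the command had finished, a line such as "hZjFX-0-" would have been echoed.
print(find_exit_code(captured + ["hZjFX-0-"]))
```

The sketch only shows that the console noise never matches the pattern; it says nothing about whether the mount itself hung or the echoed marker was lost on the way to the serial device.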

The same in http://openqa.nue.suse.com/tests/9458224/logfile?filename=autoinst-log.txt:

[2022-09-05T15:43:27.043889+02:00] [debug] <<< testapi::assert_script_run(cmd="mount -t nfs 1c119.qa.suse.de:/srv/nfs/ha_ss /support_fs", quiet=undef, timeout=90, fail_message="")
[2022-09-05T15:43:27.044021+02:00] [debug] tests/ha/setup_hosts_and_luns.pm:46 called testapi::assert_script_run
[2022-09-05T15:43:27.044170+02:00] [debug] <<< testapi::type_string(string="mount -t nfs 1c119.qa.suse.de:/srv/nfs/ha_ss /support_fs", max_interval=250, wait_screen_change=0, wait_still_screen=0, timeout=30, similarity_level=47)
[2022-09-05T15:43:28.986833+02:00] [debug] tests/ha/setup_hosts_and_luns.pm:46 called testapi::assert_script_run
[2022-09-05T15:43:28.987078+02:00] [debug] <<< testapi::type_string(string="; echo hZjFX-\$?- > /dev/ttysclp0\n", max_interval=250, wait_screen_change=0, wait_still_screen=0, timeout=30, similarity_level=47)
[2022-09-05T15:43:30.201450+02:00] [debug] tests/ha/setup_hosts_and_luns.pm:46 called testapi::assert_script_run
[2022-09-05T15:43:30.201677+02:00] [debug] <<< testapi::wait_serial(regexp=qr/hZjFX-\d+-/u, no_regex=0, buffer_size=undef, timeout=90, expect_not_found=0, record_output=undef, quiet=undef)
2022-09-05T09:43:37.146963-04:00 alpha-node01 kernel: [   72.189078][    C0] sd 0:0:0:2: Power-on or device reset occurred
2022-09-05T09:43:46.309375-04:00 alpha-node01 kernel: [   81.350585][    C0] sd 0:0:0:3: Power-on or device reset occurred
2022-09-05T09:43:57.695729-04:00 alpha-node01 kernel: [   92.737476][    C0] sd 0:0:0:4: Power-on or device reset occurred
2022-09-05T09:44:09.768084-04:00 alpha-node01 kernel: [  104.809796][    C0] sd 0:0:0:5: Power-on or device reset occurred
2022-09-05T09:44:23.817178-04:00 alpha-node01 kernel: [  118.858736][    C0] sd 0:0:0:6: Power-on or device reset occurred
[FAILED] Failed to start Shared-storage based fencing daemon.
2022-09-05T09:44:26.715690-04:00 alpha-node01 systemd[1]: Failed to start Shared-storage based fencing daemon.
2022-09-05T09:44:38.247451-04:00 alpha-node01 kernel: [  133.288556][    C0] sd 0:0:0:7: Power-on or device reset occurred
2022-09-05T09:45:02.757571-04:00 alpha-node01 kernel: [  157.798254][    C0] sd 0:0:0:8: Power-on or device reset occurred
2022-09-05T09:45:33.211785-04:00 alpha-node01 kernel: [  188.249707][    C0] sd 0:0:0:9: Power-on or device reset occurred
2022-09-05T09:46:06.471824-04:00 alpha-node01 kernel: [  221.511902][    C0] sd 0:0:0:10: Power-on or device reset occurred
2022-09-05T09:46:07.463144-04:00 alpha-node01 systemd-udevd[522]: 0:0:0:0: Worker [1564] processing SEQNUM=911 killed
[2022-09-05T15:46:31.388275+02:00] [debug] >>> testapi::wait_serial: (?^u:hZjFX-\d+-): fail
[2022-09-05T15:46:31.389694+02:00] [info] ::: basetest::runtest: # Test died: command 'mount -t nfs 1c119.qa.suse.de:/srv/nfs/ha_ss /support_fs' timed out at /usr/lib/os-autoinst/testapi.pm line 919.
    testapi::_handle_script_run_ret(undef, "mount -t nfs 1c119.qa.suse.de:/srv/nfs/ha_ss /support_fs", "quiet", undef, "timeout", 90, "fail_message", "") called at /usr/lib/os-autoinst/testapi.pm line 960
    testapi::assert_script_run("mount -t nfs 1c119.qa.suse.de:/srv/nfs/ha_ss /support_fs") called at sle/tests/ha/setup_hosts_and_luns.pm line 46
    setup_hosts_and_luns::run(setup_hosts_and_luns=HASH(0x55ce23b0e868)) called at /usr/lib/os-autoinst/basetest.pm line 328
    eval {...} called at /usr/lib/os-autoinst/basetest.pm line 322
    basetest::runtest(setup_hosts_and_luns=HASH(0x55ce23b0e868)) called at /usr/lib/os-autoinst/autotest.pm line 367
    eval {...} called at /usr/lib/os-autoinst/autotest.pm line 367
    autotest::runalltests() called at /usr/lib/os-autoinst/autotest.pm line 243
    eval {...} called at /usr/lib/os-autoinst/autotest.pm line 243
    autotest::run_all() called at /usr/lib/os-autoinst/autotest.pm line 294
    autotest::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0x55ce244d9660)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
    eval {...} called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
    Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0x55ce244d9660), CODE(0x55ce252b5ef0)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 488
    Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0x55ce244d9660)) called at /usr/lib/os-autoinst/autotest.pm line 296
    autotest::start_process() called at /usr/bin/isotovideo line 273

[2022-09-05T15:46:31.462961+02:00] [debug] ||| finished setup_hosts_and_luns ha (runtime: 187 s)
[2022-09-05T15:46:31.463075+02:00] [debug] ||| post fail hooks runtime: 0 s
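
The worker log uses +02:00 timestamps while the guest (alpha-node01) logs in -04:00, so the two are easier to correlate after normalising them. A minimal Python sketch using only timestamps copied from the log above; seconds_between is a hypothetical helper:

```python
from datetime import datetime

# Worker-side timestamps (+02:00), copied from autoinst-log.txt above.
wait_start = datetime.fromisoformat("2022-09-05T15:43:30.201677+02:00")
wait_fail = datetime.fromisoformat("2022-09-05T15:46:31.388275+02:00")

# Guest-side timestamps (-04:00), copied from the serial output above.
first_kernel_msg = datetime.fromisoformat("2022-09-05T09:43:37.146963-04:00")
fencing_failed = datetime.fromisoformat("2022-09-05T09:44:26.715690-04:00")


def seconds_between(start, end):
    """Hypothetical helper: offset-aware difference in seconds."""
    return (end - start).total_seconds()


# How long wait_serial waited before reporting "fail".
print(round(seconds_between(wait_start, wait_fail), 1))
# How soon after the wait started the first device-reset message arrived.
print(round(seconds_between(wait_start, first_kernel_msg), 1))
# How soon after the wait started the fencing daemon failed to start.
print(round(seconds_between(wait_start, fencing_failed), 1))
```

By these timestamps the device-reset messages start about 7 seconds into the wait, and the wait ends about 181 seconds after it began even though timeout=90 is logged; the excerpt itself does not explain that difference.
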
Actions #7

Updated by llzhao over 1 year ago

[FAILED] Failed to start Shared-storage based fencing daemon.
2022-09-05T09:44:26.715690-04:00 alpha-node01 systemd[1]: Failed to start Shared-storage based fencing daemon.
Actions #8

Updated by llzhao over 1 year ago

  • Status changed from In Progress to Closed

Closing this; https://progress.opensuse.org/issues/116287 (poo#116287 - SSH serial terminal connection issues on s390x workers) is used to track related failures.

Actions #9

Updated by openqa_review over 1 year ago

  • Status changed from Closed to Feedback

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: migration_offline_dvd_verify_sle15sp1_ltss_ha_alpha_node01
https://openqa.suse.de/tests/9563238#step/setup_hosts_and_luns/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.

Actions #10

Updated by openqa_review over 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: migration_offline_scc_verify_sle15sp3_ha_alpha_node01
https://openqa.suse.de/tests/9757563#step/setup_hosts_and_luns/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 64 days if nothing changes in this ticket.

Actions #11

Updated by llzhao over 1 year ago

  • Status changed from Feedback to Closed
Actions #12

Updated by llzhao over 1 year ago

Closing it for now. It can be reopened as needed.

Actions #13

Updated by openqa_review over 1 year ago

  • Status changed from Closed to Feedback

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: migration_offline_dvd_verify_sle15sp3_ha_alpha_node01
https://openqa.suse.de/tests/10218307#step/setup_hosts_and_luns/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.

Actions #14

Updated by openqa_review about 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: migration_offline_dvd_verify_sle15sp3_ha_alpha_node01
https://openqa.suse.de/tests/10339345#step/setup_hosts_and_luns/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 56 days if nothing changes in this ticket.

Actions #15

Updated by llzhao about 1 year ago

  • Assignee deleted (llzhao)

Removed myself as assignee since I have no further ideas on this issue and currently don't have time to work on it.
Also, this is an "Operational" ticket.

Actions #16

Updated by openqa_review about 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: migration_online_verify_sle15sp3_ha_alpha_node02
https://openqa.suse.de/tests/10541174#step/setup_hosts_and_luns/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.

Actions #17

Updated by openqa_review about 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: migration_offline_dvd_verify_sle15sp3_ha_alpha_node02
https://openqa.suse.de/tests/10846965#step/setup_hosts_and_luns/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 56 days if nothing changes in this ticket.

Actions #18

Updated by emiura about 1 year ago

The issue currently happens only on s390x: https://openqa.suse.de/tests/10940605

Actions #19

Updated by openqa_review 11 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: migration_offline_scc_verify_sle15sp3_ha_alpha_node01
https://openqa.suse.de/tests/11140366#step/setup_hosts_and_luns/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 56 days if nothing changes in this ticket.

Actions #20

Updated by openqa_review 10 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: migration_offline_dvd_verify_sle12sp4_ltss_ha_alpha_node01
https://openqa.suse.de/tests/11182419#step/setup_hosts_and_luns/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 112 days if nothing changes in this ticket.
