action #67558

closed

testapi::wait_serial: can not get correct msg with pvm_hmc backend

Added by coolgw almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2020-06-02
Due date:
% Done: 0%
Estimated time:
Description

This issue happens on the pvm_hmc backend:
http://openqa.nue.suse.com/tests/4288223#step/zypper_migration/2
According to http://openqa.nue.suse.com/tests/4288223/file/autoinst-log.txt, executing "script_output("zypper migration", proceed_on_failure => 1)" produces the error message "[2020-05-27T12:11:37.959 CEST] [debug] >>> testapi::wait_serial: SCRIPT_FINISHEDt3nSs-\d+-: fail".

However, http://openqa.nue.suse.com/tests/4288223/file/serial0.txt already shows the string "SCRIPT_FINISHEDt3nSs" in the serial log.

So I suspect that something is wrong in the wait_serial function at os-autoinst/backend/baseclass.pm:829; maybe it cannot handle some specific string produced by the new pvm_hmc backend.

HEI4V-0-
cDJHr-0-
rci3z-0-
2vdJ6-0-
HxwnK-0-
Z2Aej-0-
mkfifo: cannot create fifo '/dev/sshserial': File exists
LVZ8_
rollback-helper-1.0+git20181218.5394d6e-4.3.1.noarch
yast2-migration-4.1.2-7.3.2.noarch
zypper-migration-plugin-0.12.1580220831.7102be8-6.4.1.noarch
SCRIPT_FINISHEDLVZ8_-0-
INwHk-0-
t3nSs

Executing 'zypper patch-check --updatestack-only'

Refreshing service 'Basesystem_Module_15_SP1_ppc64le'.
Refreshing service 'SUSE_Linux_Enterprise_Server_15_SP1_ppc64le'.
Refreshing service 'Server_Applications_Module_15_SP1_ppc64le'.
Loading repository data...
Reading installed packages...

0 patches needed (0 security patches)

Executing 'zypper refresh'

Repository 'SLE-Module-Basesystem15-SP1-Pool' is up to date.
Repository 'SLE-Module-Basesystem15-SP1-Updates' is up to date.
Repository 'SLES15-SP1-15.1-0' is up to date.
Repository 'SLE-Product-SLES15-SP1-Pool' is up to date.
Repository 'SLE-Product-SLES15-SP1-Updates' is up to date.
Repository 'SLE-Module-Server-Applications15-SP1-Pool' is up to date.
Repository 'SLE-Module-Server-Applications15-SP1-Updates' is up to date.
All repositories have been refreshed.
Available migrations:

1 | SUSE Linux Enterprise Server 15 SP2 ppc64le
    Basesystem Module 15 SP2 ppc64le
    Python 2 Module 15 SP2 ppc64le
    Server Applications Module 15 SP2 ppc64le

[num/q]: [num/q]: [num/q]: [num/q]:
Standard input seems to be closed, please use '--non-interactive' option
SCRIPT_FINISHEDt3nSs-1- <================== this already shows "SCRIPT_FINISHEDt3nSs"
GEhX~
SCRIPT_FINISHEDGEhX~-1-
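
As a quick cross-check of the claim above, the pattern from the failing wait_serial log line can be applied to the tail of the serial excerpt. This is only a minimal Python sketch, not the os-autoinst implementation: the helper name find_marker is hypothetical, and matching the complete text after the fact says nothing about timing or about which part of the serial stream wait_serial was actually reading when it gave up.

```python
import re

# Pattern copied from the failing wait_serial line in autoinst-log.txt.
PATTERN = re.compile(r"SCRIPT_FINISHEDt3nSs-\d+-")

# Last lines of the serial0.txt excerpt quoted in the description.
SERIAL_EXCERPT = """\
[num/q]: [num/q]: [num/q]: [num/q]:
Standard input seems to be closed, please use '--non-interactive' option
SCRIPT_FINISHEDt3nSs-1-
GEhX~
SCRIPT_FINISHEDGEhX~-1-
"""


def find_marker(serial_text, pattern=PATTERN):
    """Return the first matching marker, or None if nothing matches.

    Hypothetical helper for illustration; it is not part of os-autoinst.
    """
    match = pattern.search(serial_text)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(find_marker(SERIAL_EXCERPT))  # prints SCRIPT_FINISHEDt3nSs-1-
```

Since the complete log matches, the failure is more likely related to when or where the backend looks for the marker than to the regular expression itself.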

Actions #1

Updated by coolgw almost 4 years ago

The result log above no longer exists; you can check the error and log of the following job instead:
https://openqa.nue.suse.com/tests/4305648#step/zypper_migration/3

Actions #2

Updated by coolgw almost 4 years ago

Oliver: I suggest you crosscheck the job where it failed against other test scenarios on the same backend and see if you can find non-migration related working examples of script_run or assert_script_output on pvm_hmc.

Actions #3

Updated by coolgw almost 4 years ago

barry:~/:[0]# /usr/share/openqa/script/openqa-clone-custom-git-refspec https://github.com/coolgw/os-autoinst-distri-opensuse/tree/spvm http://openqa.suse.de/tests/4330029
Cloning dependencies of sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk
Created job #4330070: sle-15-SP1-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm_pre@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4330070
Created job #4330071: sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4330071
barry:~/:[0]#
A strange failure happened during the first job.

Actions #4

Updated by coolgw almost 4 years ago

Log of a successful zypper migration on x86 (for comparison):
https://openqa.nue.suse.com/tests/4328526/file/autoinst-log.txt

Actions #5

Updated by coolgw almost 4 years ago

barry:~/:[0]# /usr/share/openqa/script/openqa-clone-custom-git-refspec https://github.com/coolgw/os-autoinst-distri-opensuse/tree/hpvm http://openqa.suse.de/tests/4330029 -c " --parental-inheritance "
Cloning dependencies of sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk
Created job #4330230: sle-15-SP1-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm_pre@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4330230
Created job #4330231: sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4330231

Actions #6

Updated by coolgw almost 4 years ago

An extra space was added before wait_serial.
barry:~/:[0]# /usr/share/openqa/script/openqa-clone-custom-git-refspec https://github.com/coolgw/os-autoinst-distri-opensuse/tree/hpvm http://openqa.suse.de/tests/4330029 -c " --parental-inheritance "
Cloning dependencies of sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk
Created job #4335193: sle-15-SP1-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm_pre@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4335193
Created job #4335194: sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4335194

Actions #7

Updated by okurz almost 4 years ago

  • Category set to Regressions/Crashes
  • Assignee set to coolgw

@coolgw I have reserved one worker instance for you with https://gitlab.suse.de/openqa/salt-pillars-openqa/-/commit/fff6d3734fda628cad5666e3ddcfec90472872b0 by using special worker classes which are not used by production test jobs. So the instance will be free to use with manual invocations of either openQA jobs using the special worker class "hmc_ppc64le_debug_poo67558" or what I suggest: Run isotovideo locally against that instance, e.g. use a vars.json file based on one of the failed jobs and use the instance specific configuration variables, i.e.

HMC_MACHINE_NAME: redcurrant
LPAR_ID: 12
SUT_IP: redcurrant-8.qa.suse.de

As soon as you have found the problem we can bring the worker back into production.
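
One possible way to prepare such a vars.json is sketched below in Python. It assumes the file is a flat JSON object of test variables; the helper name build_debug_vars and the sample base values are hypothetical, and storing LPAR_ID as a string is an assumption as well. Only the three override values are taken from the list above.

```python
import json

# Instance-specific values quoted in the comment above.
# Assumption: values are stored as strings, including LPAR_ID.
DEBUG_OVERRIDES = {
    "HMC_MACHINE_NAME": "redcurrant",
    "LPAR_ID": "12",
    "SUT_IP": "redcurrant-8.qa.suse.de",
}


def build_debug_vars(base_vars, overrides=DEBUG_OVERRIDES):
    """Return a copy of base_vars with the debug-instance values applied.

    Hypothetical helper for illustration; base_vars is left unmodified.
    """
    merged = dict(base_vars)
    merged.update(overrides)
    return merged


if __name__ == "__main__":
    # Placeholder standing in for the vars.json of one of the failed jobs.
    base = {"BACKEND": "pvm_hmc", "SUT_IP": "placeholder"}
    print(json.dumps(build_debug_vars(base), indent=2, sort_keys=True))
```

In practice the base dictionary would be loaded from the vars.json of a failed job and the merged result written back before running isotovideo.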

Actions #8

Updated by coolgw almost 4 years ago

barry:~/:[0]# /usr/share/openqa/script/openqa-clone-custom-git-refspec https://github.com/coolgw/os-autoinst-distri-opensuse/tree/hpvm http://openqa.suse.de/tests/4330029 -c " --parental-inheritance "
Cloning dependencies of sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk
Created job #4348933: sle-15-SP1-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm_pre@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4348933
Created job #4348934: sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4348934
barry:~/:[0]#

Actions #10

Updated by okurz almost 4 years ago

  • Target version set to future

Actions #11

Updated by okurz over 3 years ago

  • Subject changed from testapi::wait_serial: can not get correct msg with pvm_hmc backend. to testapi::wait_serial: can not get correct msg with pvm_hmc backend
  • Status changed from New to Feedback

@coolgw PR https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/10516 was merged 2020-06-16 and I have not heard from you since then. Can you comment on the current status? Please keep in mind that the worker "redcurrant-8.qa.suse.de" is still reserved for your debug work and should be brought back into production when you are done e.g. with a revert of https://gitlab.suse.de/openqa/salt-pillars-openqa/-/commit/fff6d3734fda628cad5666e3ddcfec90472872b0

Actions #12

Updated by coolgw over 3 years ago

@Oliver, thanks for your support on this ticket, and sorry for the late reply :) The current status is good, so the issue is fixed! You can set it to resolved now.
Also, I suppose we can release the worker.

Actions #13

Updated by okurz over 3 years ago

  • Assignee changed from coolgw to okurz

ok, good. I created an MR to re-enable the worker again and track it: https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/252

EDIT: Using https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/253 instead as the MR from origin to origin erroneously triggers tests which fail.

Actions #14

Updated by okurz over 3 years ago

  • Status changed from Feedback to Resolved
  • Assignee changed from okurz to coolgw

The worker is active for production again. Setting back the original assignee and setting the status to "Resolved".
