action #67558
closed
testapi::wait_serial: can not get correct msg with pvm_hmc backend
0%
Description
This issue happens on the pvm_hmc backend:
http://openqa.nue.suse.com/tests/4288223#step/zypper_migration/2
According to http://openqa.nue.suse.com/tests/4288223/file/autoinst-log.txt, executing "script_output("zypper migration", proceed_on_failure => 1)" produces the error message "[2020-05-27T12:11:37.959 CEST] [debug] >>> testapi::wait_serial: SCRIPT_FINISHEDt3nSs-\d+-: fail".
However, http://openqa.nue.suse.com/tests/4288223/file/serial0.txt already contains the string "SCRIPT_FINISHEDt3nSs" in the serial log output.
So I suspect that the wait_serial function in os-autoinst/backend/baseclass.pm:829 is at fault; it may be unable to handle some specific string emitted by the new pvm_hmc backend.
HEI4V-0-
cDJHr-0-
rci3z-0-
2vdJ6-0-
HxwnK-0-
Z2Aej-0-
mkfifo: cannot create fifo '/dev/sshserial': File exists
LVZ8_
rollback-helper-1.0+git20181218.5394d6e-4.3.1.noarch
yast2-migration-4.1.2-7.3.2.noarch
zypper-migration-plugin-0.12.1580220831.7102be8-6.4.1.noarch
SCRIPT_FINISHEDLVZ8_-0-
INwHk-0-
t3nSs
Executing 'zypper patch-check --updatestack-only'
Refreshing service 'Basesystem_Module_15_SP1_ppc64le'.
Refreshing service 'SUSE_Linux_Enterprise_Server_15_SP1_ppc64le'.
Refreshing service 'Server_Applications_Module_15_SP1_ppc64le'.
Loading repository data...
Reading installed packages...
0 patches needed (0 security patches)
Executing 'zypper refresh'
Repository 'SLE-Module-Basesystem15-SP1-Pool' is up to date.
Repository 'SLE-Module-Basesystem15-SP1-Updates' is up to date.
Repository 'SLES15-SP1-15.1-0' is up to date.
Repository 'SLE-Product-SLES15-SP1-Pool' is up to date.
Repository 'SLE-Product-SLES15-SP1-Updates' is up to date.
Repository 'SLE-Module-Server-Applications15-SP1-Pool' is up to date.
Repository 'SLE-Module-Server-Applications15-SP1-Updates' is up to date.
All repositories have been refreshed.
Available migrations:
1 | SUSE Linux Enterprise Server 15 SP2 ppc64le
Basesystem Module 15 SP2 ppc64le
Python 2 Module 15 SP2 ppc64le
Server Applications Module 15 SP2 ppc64le
[num/q]: [num/q]: [num/q]: [num/q]:
Standard input seems to be closed, please use '--non-interactive' option
SCRIPT_FINISHEDt3nSs-1- <================== the serial log already shows "SCRIPT_FINISHEDt3nSs"
GEhX~
SCRIPT_FINISHEDGEhX~-1-
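The observation above can be checked offline against a saved copy of serial0.txt. The following is a minimal sketch, not part of os-autoinst: the helper name serial_has_marker is hypothetical, the sample lines are copied from the log quoted above, and the pattern from the error message (SCRIPT_FINISHEDt3nSs-\d+-) is respelled with [0-9]+ because grep -E has no \d. It only shows whether the marker is present in the saved file; it says nothing about what the backend had received at the moment wait_serial gave up.

```shell
#!/bin/sh
# Hypothetical helper (not part of os-autoinst): succeed if a saved serial
# log contains the end marker that wait_serial was waiting for.
#   $1 = path to a saved copy of serial0.txt
#   $2 = random marker suffix, e.g. t3nSs (assumed to contain no regex
#        metacharacters, since it is interpolated unescaped)
serial_has_marker() {
    # Equivalent of SCRIPT_FINISHED<suffix>-\d+- in POSIX ERE syntax.
    grep -E -q -- "SCRIPT_FINISHED$2-[0-9]+-" "$1"
}

# Sample lines copied from the serial log quoted in this ticket.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' \
    't3nSs' \
    'Available migrations:' \
    'SCRIPT_FINISHEDt3nSs-1-' \
    'GEhX~' \
    'SCRIPT_FINISHEDGEhX~-1-' > "$sample"

if serial_has_marker "$sample" t3nSs; then
    echo "marker present in the saved log"
else
    echo "marker missing from the saved log"
fi
```

If the marker is present in the file but wait_serial still reported a failure, the reading side (timing, buffering, or how the backend collects the serial output) is a more likely suspect than the command run on the SUT.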
Updated by coolgw over 4 years ago
The result log above no longer exists; see the following job for the error and log instead:
https://openqa.nue.suse.com/tests/4305648#step/zypper_migration/3
Updated by coolgw over 4 years ago
Oliver: I suggest you cross-check the failed job against other test scenarios on the same backend and see whether you can find working, non-migration-related examples of script_run or assert_script_output on pvm_hmc.
Updated by coolgw over 4 years ago
barry:~/:[0]# /usr/share/openqa/script/openqa-clone-custom-git-refspec https://github.com/coolgw/os-autoinst-distri-opensuse/tree/spvm http://openqa.suse.de/tests/4330029
Cloning dependencies of sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk
Created job #4330070: sle-15-SP1-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm_pre@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4330070
Created job #4330071: sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4330071
barry:~/:[0]#
A strange failure happened during the first job.
Updated by coolgw over 4 years ago
Log of a passing zypper migration run on x86 (for comparison):
https://openqa.nue.suse.com/tests/4328526/file/autoinst-log.txt
Updated by coolgw over 4 years ago
barry:~/:[0]# /usr/share/openqa/script/openqa-clone-custom-git-refspec https://github.com/coolgw/os-autoinst-distri-opensuse/tree/hpvm http://openqa.suse.de/tests/4330029 -c " --parental-inheritance "
Cloning dependencies of sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk
Created job #4330230: sle-15-SP1-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm_pre@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4330230
Created job #4330231: sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4330231
Updated by coolgw over 4 years ago
An extra space was added before wait_serial.
barry:~/:[0]# /usr/share/openqa/script/openqa-clone-custom-git-refspec https://github.com/coolgw/os-autoinst-distri-opensuse/tree/hpvm http://openqa.suse.de/tests/4330029 -c " --parental-inheritance "
Cloning dependencies of sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk
Created job #4335193: sle-15-SP1-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm_pre@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4335193
Created job #4335194: sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4335194
Updated by okurz over 4 years ago
- Category set to Regressions/Crashes
- Assignee set to coolgw
@coolgw I have reserved one worker instance for you with https://gitlab.suse.de/openqa/salt-pillars-openqa/-/commit/fff6d3734fda628cad5666e3ddcfec90472872b0 by using special worker classes which are not used by production test jobs. The instance is therefore free for manual use, either through openQA jobs with the special worker class "hmc_ppc64le_debug_poo67558" or, as I suggest, by running isotovideo locally against that instance, e.g. with a vars.json file based on one of the failed jobs plus the instance-specific configuration variables, i.e.
HMC_MACHINE_NAME: redcurrant
LPAR_ID: 12
SUT_IP: redcurrant-8.qa.suse.de
As soon as you have found the problem we can bring the worker back into production.
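As a rough sketch of the local setup suggested above, the instance-specific variables could be written into a vars.json fragment like this. Only values named in this ticket are used; the remaining settings would have to be copied from one of the failed jobs and are not reproduced here. Writing LPAR_ID as a string and the scratch directory layout are assumptions.

```shell
#!/bin/sh
# Sketch: create a scratch directory holding a vars.json fragment with the
# instance-specific settings from this ticket. A complete vars.json needs
# many more settings taken from the failed job; they are omitted here.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/vars.json" <<'EOF'
{
    "BACKEND": "pvm_hmc",
    "HMC_MACHINE_NAME": "redcurrant",
    "LPAR_ID": "12",
    "SUT_IP": "redcurrant-8.qa.suse.de"
}
EOF

# isotovideo picks up vars.json from its working directory; it is
# deliberately not started by this sketch.
echo "wrote $workdir/vars.json"
```

Keeping the reserved instance's values in a separate fragment makes it easy to see which settings differ from the production job once the rest of the variables are merged in.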
Updated by coolgw over 4 years ago
barry:~/:[0]# /usr/share/openqa/script/openqa-clone-custom-git-refspec https://github.com/coolgw/os-autoinst-distri-opensuse/tree/hpvm http://openqa.suse.de/tests/4330029 -c " --parental-inheritance "
Cloning dependencies of sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk
Created job #4348933: sle-15-SP1-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm_pre@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4348933
Created job #4348934: sle-15-SP2-Migration-from-SLE15-SPX-to-SLE15-SP2-PowerVM-ppc64le-Build205.1-gw-online_sles15sp1_pscc_basesys-srv_all_full_zypper_spvm@ppc64le-hmc-single-disk -> http://openqa.suse.de/t4348934
barry:~/:[0]#
Updated by coolgw over 4 years ago
Updated by okurz over 4 years ago
- Subject changed from testapi::wait_serial: can not get correct msg with pvm_hmc backend. to testapi::wait_serial: can not get correct msg with pvm_hmc backend
- Status changed from New to Feedback
@coolgw PR https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/10516 was merged 2020-06-16 and I have not heard from you since then. Can you comment on the current status? Please keep in mind that the worker "redcurrant-8.qa.suse.de" is still reserved for your debug work and should be brought back into production when you are done e.g. with a revert of https://gitlab.suse.de/openqa/salt-pillars-openqa/-/commit/fff6d3734fda628cad5666e3ddcfec90472872b0
Updated by coolgw over 4 years ago
@Oliver, thanks for your support on this ticket, and sorry for the late reply :) The current status is good, so the issue is fixed! You can set it to resolved now.
Also, I suppose we can release the worker.
Updated by okurz over 4 years ago
- Assignee changed from coolgw to okurz
OK, good. I created an MR to re-enable the worker and to track it: https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/252
EDIT: Using https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/253 instead as the MR from origin to origin erroneously triggers tests which fail.
Updated by okurz over 4 years ago
- Status changed from Feedback to Resolved
- Assignee changed from okurz to coolgw
The worker is active for production again. Setting the assignee back to the original one and the status to "Resolved".