action #64938
closed
'+ISO=' in test suite breaks a number of tests
0%
Description
'+ISO=' has been added to a number of test suites, but it breaks a number of tests:
- boot_to_snapshot: https://openqa.opensuse.org/tests/1215562
- gnuhealth: https://openqa.opensuse.org/tests/1215557
Error log is:
[2020-03-27T13:53:49.151 CET] [debug] running /usr/bin/qemu-img info --output=json /var/lib/openqa/pool/13/openqa1-opensuse
[2020-03-27T13:53:49.175 CET] [debug] qemu-img: Could not open '/var/lib/openqa/pool/13/openqa1-opensuse': A regular file was expected by the 'file' driver, but something else was given
[2020-03-27T13:53:49.175 CET] [debug] Backend process died, backend errors are reported below in the following lines:
malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "qemu-img: Could not ...") at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/JSON.pm line 39.
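A minimal sketch of this failure mode, in Python rather than the Perl used by os-autoinst: the tool is pointed at a directory, prints a plain-text error, and the caller then tries to parse that text as JSON. The helper names `looks_like_image_file` and `parse_tool_output` are hypothetical and not part of os-autoinst; they only illustrate checking for a regular file and handling non-JSON output.

```python
import json
import os
import tempfile


def looks_like_image_file(path: str) -> bool:
    """Return True only for a non-empty path that points at a regular file."""
    return bool(path) and os.path.isfile(path)


def parse_tool_output(text: str):
    """Parse tool output as JSON; return None for plain-text error messages."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


# A temporary directory stands in for the pool directory that ISO pointed at.
pool_dir = tempfile.mkdtemp()
print(looks_like_image_file(pool_dir))
print(looks_like_image_file(""))
print(parse_tool_output("qemu-img: Could not open ..."))
```

With a check like this, a directory or an empty value would be rejected before any image tool is invoked, instead of surfacing as a JSON parse error.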
`vars.json` shows:
"ISO" : "/var/lib/openqa/pool/6/openqa1-opensuse"
Updated by ggardet_arm over 4 years ago
- Related to action #59394: [qe-core][functional] Overwrite empty ISO variable everywhere where not needed, i.e. `+ISO=`, to prevent useless ISO downloading and storage added
Updated by okurz over 4 years ago
- Project changed from openQA Tests (public) to openQA Project (public)
- Category set to Regressions/Crashes
- Status changed from New to In Progress
- Assignee set to okurz
- Priority changed from Normal to High
I will look into this. I suspect a regression from https://github.com/os-autoinst/openQA/pull/2861
Updated by okurz over 4 years ago
- Status changed from In Progress to Feedback
Reproduced with https://openqa.opensuse.org/tests/1215844. Did `snapper rollback 585` on aarch64, rebooted and retriggered; https://openqa.opensuse.org/tests/1215847 passed. Hypothesis of a regression due to https://github.com/os-autoinst/openQA/pull/2861 accepted; revert https://github.com/os-autoinst/openQA/pull/2874 prepared and merged. Waiting for fixed packages.
Updated by okurz over 4 years ago
- Status changed from Feedback to Resolved
Fixed packages are deployed on all o3 workers.
Updated by mlin7442 over 4 years ago
still failing https://openqa.opensuse.org/tests/1218258
Updated by Xiaojing_liu over 4 years ago
- Status changed from Resolved to Feedback
I compared the os-autoinst.log of a successful job and a failed job. In both jobs `+ISO` was turned into `ISO=` in the job settings. The difference is that in the failed job `ISO` was rewritten to `/var/lib/openqa/pool/8/openqa1-opensuse`, and this rewrite seems to have happened while caching assets. In the successful job, with `ISO=`, there is no ISO download message in the os-autoinst log, but the failed job still downloads the ISO. I also ran a test in my local environment with the cache service disabled: with `ISO=` the job fails too, because the function `locate_local_assets` in isotovideo.pm rewrites `ISO` to a directory under /var/lib/openqa/pool.
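The rewrite described above can be sketched as follows. This is an illustration in Python under stated assumptions, not the actual `locate_local_assets` code (which is Perl): `ASSET_DIR`, `resolve_assets_naive` and `resolve_assets_guarded` are hypothetical names, and the directory is simply the value the log shows `ISO` being rewritten to.

```python
import posixpath

# Assumption: the directory that the failed job's ISO ended up pointing at.
ASSET_DIR = "/var/lib/openqa/pool/8/openqa1-opensuse"


def resolve_assets_naive(settings: dict, asset_keys=("ISO",)) -> dict:
    """Rewrite every asset setting to a path under ASSET_DIR, even empty ones."""
    resolved = dict(settings)
    for key in asset_keys:
        if key in resolved:
            # Joining an empty name and normalizing yields the directory itself.
            joined = posixpath.join(ASSET_DIR, resolved[key])
            resolved[key] = posixpath.normpath(joined)
    return resolved


def resolve_assets_guarded(settings: dict, asset_keys=("ISO",)) -> dict:
    """Same rewrite, but leave missing or empty asset settings untouched."""
    resolved = dict(settings)
    for key in asset_keys:
        if resolved.get(key):
            joined = posixpath.join(ASSET_DIR, resolved[key])
            resolved[key] = posixpath.normpath(joined)
    return resolved


print(resolve_assets_naive({"ISO": ""}))
print(resolve_assets_guarded({"ISO": ""}))
```

In the naive variant an empty `ISO` becomes the directory path, which matches the symptom in `vars.json`; the guarded variant keeps the empty value so no asset lookup happens for it.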
Updated by Xiaojing_liu over 4 years ago
I also ran a test in my local environment after reverting PR #2860, and the test passed. It seems the problem is caused by this modification: https://github.com/os-autoinst/openQA/pull/2860/files#diff-daeb812b7eb46c12f6ec790a2ec2d399L85.
Updated by Xiaojing_liu over 4 years ago
Attempted fix: https://github.com/os-autoinst/openQA/pull/2877
Updated by okurz over 4 years ago
- Related to action #63565: The extra setting is added to the new job when cloning a job added
Updated by okurz over 4 years ago
- Status changed from Feedback to In Progress
Xiaojing_liu, thanks for taking a look. Please handle your follow-up code changes in #59394 or #63565 so that this ticket can focus on handling the problems in production with reverts and workarounds.
EDIT: openqa_clone_job_o3 --skip-chained-deps 1218657 ISO=''
Created job #1218661: opensuse-15.2-DVD-ppc64le-Build190.2-boot_to_snapshot@ppc64le -> https://openqa.opensuse.org/t1218661
Also x86_64 is affected.
[30/03/2020 12:24:09] <DimStar> okurz: do you have some ETA for https://progress.opensuse.org/issues/64938 ?
[30/03/2020 12:24:10] <|Anna|> '+ISO=' in test suite breaks a number of tests in openQA Project (action for okurz) [In Progress] Created on: 2020-03-27 | 0% done.
[30/03/2020 12:25:41] <okurz> DimStar: ETA 1-2 days. I do not yet understand why after my work three days ago there should still be problems . At the time I retriggered tests and they were fine. So what's the current impact?
[30/03/2020 12:26:25] <DimStar> okurz: I see two tests in TW incomplete on that - so gnuhealth and boot_to_snapshot are untested for a few days already
[30/03/2020 12:26:39] <DimStar> e.g https://openqa.opensuse.org/tests/1218368
[30/03/2020 12:27:18] <DimStar> [info] [#5429] Downloading "." from "http://openqa1-opensuse/tests/1218368/asset/iso/."
[30/03/2020 12:28:31] <okurz> alright. I cloned them now with the explicit `ISO=''` that should help for the current jobs
openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org/1218664 ISO=''
openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org/1218665 ISO=''
Created job #1218681: opensuse-Tumbleweed-DVD-x86_64-Build20200329-boot_to_snapshot@64bit -> https://openqa.opensuse.org/t1218681
Created job #1218682: opensuse-Tumbleweed-DVD-x86_64-Build20200329-gnuhealth@64bit -> https://openqa.opensuse.org/t1218682
Both passed.
I hotpatched openqaworker7 with https://github.com/os-autoinst/openQA/pull/2877, restarted the only free worker instance (`systemctl openqa-worker@5`) and retriggered a run:
$ openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org/1218665 TEST=okurz_poo64938_boot_to_snapshot_openqaworker7_hotpatched_openQA_2877 BUILD=poo64938 WORKER_CLASS=openqaworker7
Created job #1218700: opensuse-Tumbleweed-DVD-x86_64-Build20200329-gnuhealth@64bit -> https://openqa.opensuse.org/t1218700
The test is fine, so the fix helps. I am not sure about other impacts, but we can move ahead nevertheless. I will merge the PR.
Updated by Xiaojing_liu over 4 years ago
okurz wrote:
openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org/1218664 ISO=''
openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org/1218665 ISO=''
Created job #1218681: opensuse-Tumbleweed-DVD-x86_64-Build20200329-boot_to_snapshot@64bit -> https://openqa.opensuse.org/t1218681
Created job #1218682: opensuse-Tumbleweed-DVD-x86_64-Build20200329-gnuhealth@64bit -> https://openqa.opensuse.org/t1218682
both passed
When `openqa-clone-job` is used with `ISO=''`, the job setting `ISO=` is removed, so the job passes. This is different from jobs created by `isos post`: all the failed jobs have `ISO=` in their settings. To verify this fix we should use `isos post`, or `openqa-clone-job` without specifying `ISO=` (the command okurz gave above).
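The distinction can be modelled with two small functions. This is a sketch in Python based only on the behaviour described in this comment, not on the openQA implementation; `clone_settings` and `settings_from_schedule` are hypothetical names.

```python
def clone_settings(original: dict, overrides: dict) -> dict:
    """Model of cloning: an override with an empty value drops the setting."""
    merged = dict(original)
    for key, value in overrides.items():
        if value == "":
            merged.pop(key, None)
        else:
            merged[key] = value
    return merged


def settings_from_schedule(product: dict, test_suite: dict) -> dict:
    """Model of scheduling: '+KEY=' in the test suite keeps KEY with an empty value."""
    merged = dict(product)
    for key, value in test_suite.items():
        merged[key.lstrip("+")] = value
    return merged


cloned = clone_settings({"ISO": "some.iso", "TEST": "gnuhealth"}, {"ISO": ""})
scheduled = settings_from_schedule({"ISO": "some.iso"}, {"+ISO": ""})
print("ISO" in cloned)
print(scheduled)
```

A cloned job therefore never exercises the empty-`ISO` code path, while a scheduled job does, which is why only the latter is a valid check of the fix.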
Updated by okurz over 4 years ago
- Status changed from In Progress to Feedback
Yes, I am aware that clone-job with `ISO=''` is not the same. I did that as a short-term remedy, knowing that the ISO is still there even though it is not actively used by the test.
Triggering an out-of-ordinary upgrade to the fixed package with
for i in aarch64 openqaworker1 openqaworker4 openqaworker7 power8 imagetester rebel; do echo $i && ssh root@$i "(transactional-update -n dup || zypper -n dup) && reboot" ; done
and will monitor tests on o3. https://openqa.opensuse.org/tests/1218947 passed fine now.
Updated by okurz over 4 years ago
- Status changed from Feedback to Resolved