action #114523

closed

Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64 size:M

Added by okurz over 2 years ago. Updated 3 months ago.

Status: Resolved
Priority: Low
Assignee:
Category: Regressions/Crashes
Start date: 2022-06-03
Due date:
% Done: 0%
Estimated time:
Description

Observation

Same as on #111992 but now on aarch64 OSD workers, crosscheck o3 worker aarch64

Steps to reproduce

Find jobs referencing this ticket with the help of
https://raw.githubusercontent.com/os-autoinst/scripts/master/openqa-query-for-job-label, call openqa-query-for-job-label poo#114523

Suggestions

  • DONE Ensure to have workaround on all aarch64 OSD workers for #111992, e.g.
sudo zypper -n in --oldpackage http://download.opensuse.org/ports/aarch64/distribution/leap/15.3/repo/oss/noarch/qemu-uefi-aarch64-202008-10.8.1.noarch.rpm
  • Now try again to remove the zypper lock and trigger the corresponding openQA tests to crosscheck on both osd and o3

Rollback steps

  • Remove zypper lock for qemu-uefi-aarch64 on all workers

Related issues 6 (3 open, 3 closed)

Related to openQA Project (public) - action #113794: Use prepared OVMF image with expected settings size:M (Resolved, tinita, 2022-06-03)
Related to openQA Tests (public) - action #114493: [qe-core][aarch64][installation]test fails in bootloader_start, needle mismatch on installer boot memu (Resolved, okurz, 2022-07-22)
Related to openQA Tests (public) - action #114550: [qe-core] Ignored warnings about too many needles and detected stalls, in particular when checking grub2 (New, 2022-07-22)
Related to openQA Project (public) - action #114769: Have jobs fail if screen checks take too long, e.g. if there are "two many needles" after warning about it (New)
Blocks openQA Tests (public) - action #115919: [security] test fails in tpm2_measured_boot (Blocked, 2022-08-29)
Copied from openQA Project (public) - action #111992: Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl size:M (Resolved, tinita, 2022-06-03)

Actions #1

Updated by okurz over 2 years ago

  • Copied from action #111992: Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl size:M added
Actions #2

Updated by okurz over 2 years ago

  • Subject changed from Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64 to Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64, auto_review:"(?s)aarch64.*uefi.*Stall detected.*no candidate needle.*bootloader-shim-import-prompt.*grub2.*inst-bootmenu"
Actions #3

Updated by okurz over 2 years ago

  • Description updated (diff)
Actions #4

Updated by okurz over 2 years ago

  • Description updated (diff)
Actions #5

Updated by okurz over 2 years ago

  • Subject changed from Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64, auto_review:"(?s)aarch64.*uefi.*Stall detected.*no candidate needle.*bootloader-shim-import-prompt.*grub2.*inst-bootmenu" to Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64, auto_review:"(?s)aarch64.*uefi.*Stall detected.*no candidate needle.*bootloader-shim-import-prompt.*grub2.*inst-bootmenu":retry

Deployed the old package onto all aarch64 OSD workers with:

sudo salt --no-color --state-output=changes -C 'G@roles:worker and G@osarch:aarch64' cmd.run 'sudo zypper -n in --oldpackage http://download.opensuse.org/ports/aarch64/distribution/leap/15.3/repo/oss/noarch/qemu-uefi-aarch64-202008-10.8.1.noarch.rpm'

I triggered a test https://openqa.suse.de/tests/9196630# and realized that people went crazy creating new needles already for the wrong resolution. I pointed that out to mgrifalconi in https://suse.slack.com/archives/C02CANHLANP/p1658478957406679?thread_ts=1658478131.778769&cid=C02CANHLANP

Now, after applying the workaround of installing the old package, that seems to work:

openqa-clone-job --skip-chained-deps --within-instance https://openqa.suse.de/tests/9192234 _GROUP=0 BUILD= TEST=okurz_poo_111992_workaround_downgraded_qemu-uefi-aarch64 SCHEDULE=tests/installation/bootloader_start WORKER_CLASS=openqaworker-arm-2

->
Created job #9196670: sle-15-SP3-Server-DVD-Updates-aarch64-Build20220721-1-qam-gnome@aarch64-virtio -> https://openqa.suse.de/t9196670

Actions #6

Updated by okurz over 2 years ago

  • Subject changed from Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64, auto_review:"(?s)aarch64.*uefi.*Stall detected.*no candidate needle.*bootloader-shim-import-prompt.*grub2.*inst-bootmenu":retry to Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64, auto_review:"(?s)aarch64.*uefi.*no candidate needle.*grub":retry

And called

for i in openqa.opensuse.org openqa.suse.de; do export host=$i; bash -ex ./openqa-monitor-investigation-candidates | bash -e ./openqa-label-known-issues; done
Actions #7

Updated by okurz over 2 years ago

  • Related to action #113794: Use prepared OVMF image with expected settings size:M added
Actions #8

Updated by okurz over 2 years ago

  • Status changed from In Progress to Feedback

I assume that with auto-review+retry and the workaround we are good for now. Keeping this open until #111992 or #113794 is resolved.

Actions #9

Updated by okurz over 2 years ago

  • Related to action #114493: [qe-core][aarch64][installation]test fails in bootloader_start, needle mismatch on installer boot memu added
Actions #10

Updated by okurz over 2 years ago

  • Related to action #114550: [qe-core] Ignored warnings about too many needles and detected stalls, in particular when checking grub2 added
Actions #11

Updated by okurz over 2 years ago

Richard Fan in https://suse.slack.com/archives/C02CANHLANP/p1658892342572309 brought up that more tests are again affected. I might have missed the lock or a machine wasn't reachable during the operation? I executed

sudo salt --no-color --state-output=changes -C 'G@roles:worker and G@osarch:aarch64' cmd.run 'zypper -n in --oldpackage http://download.opensuse.org/ports/aarch64/distribution/leap/15.3/repo/oss/noarch/qemu-uefi-aarch64-202008-10.8.1.noarch.rpm && zypper al qemu-uefi-aarch64'

but openqaworker-arm-3 just crashed again right now. It might come back online again with the wrong version.
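
A worker that was unreachable during the rollout could be spotted by comparing its installed package against the pinned one. This is a minimal sketch, assuming the default `rpm -q` output format (name-version-release.arch). The helper name `is_pinned_ovmf` is hypothetical and the expected string is derived from the RPM file name used in the workaround:

```shell
#!/bin/bash
# Expected `rpm -q` output for the downgraded package, derived from the
# RPM file name in the workaround (assumption: default query format).
expected='qemu-uefi-aarch64-202008-10.8.1.noarch'

# Hypothetical helper: succeeds only if the given `rpm -q` output equals
# the pinned package version.
is_pinned_ovmf() {
    [ "$1" = "$expected" ]
}

# On a worker one could call (not executed here):
#   is_pinned_ovmf "$(rpm -q qemu-uefi-aarch64)" || echo "wrong version on $(hostname)"
```

Run on each worker after it comes back online, such a check would show whether the downgrade still needs to be applied there.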

Actions #12

Updated by okurz over 2 years ago

  • Related to action #114769: Have jobs fail if screen checks take too long, e.g. if there are "two many needles" after warning about it added
Actions #13

Updated by okurz over 2 years ago

openqaworker-arm-3 was reachable again, so I could roll back the package qemu-uefi-aarch64. I have applied workarounds regarding OVMF usage to all OSD machines as far as they are available. In case you still encounter failed jobs from the past with the "wrong resolution" for the initial bootup screen, retrigger them. I also recommend deleting any needles you might have created recently in the wrong resolution. Otherwise matching the bootloader screen becomes inefficient due to the number of needles, so that eventually jobs would miss the bootloader screen and fail with annoying random failures.

Actions #14

Updated by okurz over 2 years ago

  • Status changed from Feedback to Blocked
  • Priority changed from Urgent to High

The workaround should still be effective on all workers. Waiting for #113794.

Actions #15

Updated by openqa_review over 2 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: yast2_gui
https://openqa.opensuse.org/tests/2526840#step/yast2_control_center/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.

Actions #16

Updated by tinita over 2 years ago

See #113794 for current progress and what to do

Actions #17

Updated by okurz over 2 years ago

  • Description updated (diff)
Actions #18

Updated by mkittler over 2 years ago

The regex in the ticket title isn't specific enough, see note on #116614.

Actions #19

Updated by tinita over 2 years ago

  • Subject changed from Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64, auto_review:"(?s)aarch64.*uefi.*no candidate needle.*grub":retry to Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64, auto_review:"(?s)aarch64.*uefi.*no candidate needle[^\n]*grub":retry

I changed the title, assuming that "no candidate needle" and "grub" should appear on the same line, hence no candidate needle[^\n]*grub.
This worked for me as expected on an example logfile.
#116614
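
The difference between the two title regexes can be sketched as follows, assuming GNU grep with PCRE support (-P). The two log excerpts are made up for illustration and the helper name `matches` is hypothetical:

```shell
#!/bin/bash
# Made-up log excerpts for illustration, not taken from a real job.
same_line=$'aarch64 uefi\nno candidate needle with tag(s) grub2 matched\n'
other_line=$'aarch64 uefi\nno candidate needle with tag(s) foo matched\ngrub started\n'

old_re='(?s)aarch64.*uefi.*no candidate needle.*grub'
new_re='(?s)aarch64.*uefi.*no candidate needle[^\n]*grub'

# Hypothetical helper: succeeds if REGEX matches anywhere in TEXT.
# -z treats the whole input as one record, so a match can span lines.
matches() {  # usage: matches REGEX TEXT
    printf '%s' "$2" | grep -qPzo -- "$1"
}

for re in "$old_re" "$new_re"; do
    for sample in same_line other_line; do
        if matches "$re" "${!sample}"; then result='match'; else result='no match'; fi
        printf '%s | %s | %s\n' "$re" "$sample" "$result"
    done
done
```

With (?s) the final .* of the old regex also crosses newlines, so any later "grub" in the log is enough for a match. [^\n]* stops at the end of the line that contains "no candidate needle".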

Actions #20

Updated by jlausuch over 2 years ago

Actions #21

Updated by tinita about 2 years ago

@jlausuch can you help me get the connection with the resolution issue? I see that in your linked jobs needles don't match, but the screenshots are totally different, so it doesn't seem to be a resolution issue. Or am I missing something?

Actions #22

Updated by openqa_review about 2 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: xfstests_xfs-dangrous-tests
https://openqa.suse.de/tests/9679280#step/run/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 32 days if nothing changes in this ticket.

Actions #24

Updated by openqa_review about 2 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: install_only@aarch64-uefi_http_boot
https://openqa.opensuse.org/tests/2885807#step/bootloader_uefi/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 48 days if nothing changes in this ticket.

Actions #25

Updated by okurz almost 2 years ago

  • Priority changed from High to Normal
Actions #26

Updated by okurz almost 2 years ago

  • Tags set to reactive work
Actions #28

Updated by openqa_review almost 2 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test:
https://openqa.suse.de/tests/10357436#step/boot_without_secureboot/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.

Actions #29

Updated by tinita almost 2 years ago

The regex in the ticket title exceeds PCRE's backtrack limit, I see on osd:

grep failed: cmd=>grep -qPzo '(?s)aarch64.*uefi.*no candidate needle[^\n]*grub' '/tmp/tmp.HKYSbhPnjQ'< output='grep: exceeded PCRE's backtracking limit'
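
A caller can tell such a failure apart from a plain non-match by looking at grep's exit status, which is 0 for a match, 1 for no match and 2 for errors. This is a minimal sketch, assuming the backtracking failure is reported as an error status as the "grep failed" line above suggests. The wrapper name `classify_grep` is hypothetical and not part of the os-autoinst scripts:

```shell
#!/bin/bash
# Hypothetical wrapper: report grep's three possible outcomes separately,
# so that an error is not mistaken for "no match".
classify_grep() {  # usage: classify_grep REGEX FILE
    grep -qPzo -- "$1" "$2" 2>/dev/null
    case $? in
        0) echo 'match' ;;
        1) echo 'no match' ;;
        *) echo 'error' ;;
    esac
}
```

For example, `classify_grep '(?s)aarch64.*uefi.*no candidate needle[^\n]*grub' autoinst-log.txt` would print one of the three words, where the file name is only a placeholder.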
Actions #30

Updated by favogt almost 2 years ago

  • Subject changed from Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64, auto_review:"(?s)aarch64.*uefi.*no candidate needle[^\n]*grub":retry to Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64

Let's just drop the autoreview regex, there are too many false positives anyway

Actions #31

Updated by openqa_review almost 2 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: online_sles15sp2_ltss_pscc_basesys-srv-desk-dev-contm-lgm-py2-tsm-wsm-pcm_all_full
https://openqa.suse.de/tests/10437121#step/check_os_release/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.

Actions #32

Updated by openqa_review almost 2 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: jeos-containers-podman
https://openqa.suse.de/tests/10653531#step/bootloader_uefi/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 56 days if nothing changes in this ticket.

Actions #33

Updated by okurz over 1 year ago

  • Tags changed from reactive work to reactive work, infra
  • Status changed from Blocked to New
  • Assignee deleted (okurz)

#111992 was resolved, the updated packages can be tried on aarch64 for now as well.

Actions #34

Updated by okurz over 1 year ago

  • Priority changed from Normal to Low
Actions #35

Updated by okurz over 1 year ago

  • Subject changed from Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64 to Deal with QEMU and OVMF default resolution being 1280x800, affecting (at least) qxl, but on aarch64 size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #36

Updated by tjyrinki_suse over 1 year ago

  • Blocks action #115919: [security] test fails in tpm2_measured_boot added
Actions #37

Updated by openqa_review over 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: jeos-filesystem
https://openqa.suse.de/tests/11050552#step/bootloader_uefi/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.

Actions #38

Updated by okurz over 1 year ago

  • Target version changed from Ready to future
Actions #39

Updated by okurz 3 months ago · Edited

  • Category set to Regressions/Crashes
  • Status changed from Workable to In Progress
  • Assignee set to okurz
  • Target version changed from future to Ready

This was brought up in https://suse.slack.com/archives/C02CANHLANP/p1726731665437149 by Jose Lausuch. Apparently we don't have the downgraded package on newer arm workers. I will just remove the package lock and upgrade the package on openqaworker-arm-1.

Did sudo salt --no-color -C 'G@roles:worker and G@osarch:aarch64' cmd.run 'zypper rl qemu-uefi-aarch64'

openqa-query-for-job-label poo#114523 does not return any matches anymore.

Actions #40

Updated by okurz 3 months ago

  • Status changed from In Progress to Resolved

No more problematic jobs encountered. I also followed the Slack conversation and no more problems were mentioned there either. https://bugzilla.opensuse.org/show_bug.cgi?id=1204067 was already VERIFIED FIXED before.
