action #25496

closed

coordination #23340: [sle][functional][epic]Adaption to new yast storage stack

[sle][functional][sle15][demo] test fails in partitioning_raid - again adaptions are needed

Added by okurz over 6 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Urgent
Category: Bugs in existing tests
Start date: 2017-09-21
Due date: 2018-01-30
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

openQA test in scenario sle-15-Leanos-DVD-x86_64-RAID0@64bit fails in
partitioning_raid

Acceptance criteria

  • AC1: The tests for RAID that fail are labeled with a proper product bug.
  • AC2: Create the necessary tickets for tasks to work around and soft-fail known bugs.
  • AC3: Create necessary tickets for tasks about fixing test bugs.

Reproducible

Fails since (at least) Build 93.17

Expected result

Last good: (unknown) (or more recent)

Further details

Always latest result in this scenario: latest

Impact is considered high, as at least all RAID tests are blocked by this.

Related


Related issues 2 (0 open, 2 closed)

Related to openQA Tests - coordination #26912: [sle][functional][epic] test fails in partitioning - Different hotkey in same build, same arch and same elements | Resolved | 2017-10-20
Related to openQA Tests - action #26920: [sle][functional] test fails in partitioning_raid - Workaround for bsc#1063844 causes needle to fail | Resolved | riafarov | 2017-10-20 | 2017-11-08

Actions #1

Updated by okurz over 6 years ago

  • Status changed from New to In Progress
  • Assignee set to jorauch
Actions #3

Updated by okurz over 6 years ago

  • Target version set to Milestone 11
Actions #4

Updated by okurz over 6 years ago

  • Due date set to 2017-10-11
Actions #5

Updated by okurz over 6 years ago

The test now fails even earlier, on the partition to be added: the hex value is missing from the partition type. A trivial needle update is the first step.

Actions #6

Updated by SLindoMansilla over 6 years ago

  • Description updated (diff)
Actions #7

Updated by riafarov over 6 years ago

Actions #8

Updated by jorauch over 6 years ago

The test causes serious qemu problems locally and in production.
The creation of disks now works, and the test fails on a product bug:
https://bugzilla.suse.com/show_bug.cgi?id=1061865
PR:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/3666
Needles:
https://gitlab.suse.de/openqa/os-autoinst-needles-sles/merge_requests/516

Actions #9

Updated by jorauch over 6 years ago

Now trying to fix the RAID disk creation which works with 288.8
Qemu problems are getting worse despite the workaround.

Actions #10

Updated by okurz over 6 years ago

jorauch wrote:

Qemu problems are getting worse despite the workaround.

Where do you encounter the qemu problems? On which machines? Make sure you are not running into https://bugzilla.suse.com/show_bug.cgi?id=1059369 , a qemu bug. On openSUSE Leap 42.2, qemu-2.6.2-31.6.2 or 2.6.2-29.4 should be used; 2.6.2-31.3.3 is the faulty one.
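
A minimal sketch of such a version check, assuming the installed version-release string has already been obtained separately (for example from the package manager). The function and constant names are hypothetical; only the version numbers come from this comment.

```python
# Version-release strings named in this comment.
FAULTY_QEMU_VERSIONS = {"2.6.2-31.3.3"}
KNOWN_GOOD_QEMU_VERSIONS = {"2.6.2-31.6.2", "2.6.2-29.4"}


def classify_qemu_version(version_release: str) -> str:
    """Classify a qemu version-release string.

    Returns "faulty" for the version affected by bsc#1059369,
    "good" for the versions recommended here, and "unknown"
    for anything this comment does not mention.
    """
    if version_release in FAULTY_QEMU_VERSIONS:
        return "faulty"
    if version_release in KNOWN_GOOD_QEMU_VERSIONS:
        return "good"
    return "unknown"
```

A version that is in neither set is reported as "unknown" rather than "good", because this comment says nothing about other qemu releases.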

Actions #11

Updated by jorauch over 6 years ago

I am using qemu-x86_64 version 2.9.92 which is not mentioned in the ticket.

Actions #12

Updated by okurz over 6 years ago

jorauch and SLindoMansilla will ensure the ticket is done by tomorrow or some flogging should be considered ;)

Actions #13

Updated by jorauch over 6 years ago

Updated PRs, now failing shortly before finishing

Actions #14

Updated by jorauch over 6 years ago

  • Subject changed from [sle][functional][sle15]test fails in partitioning_raid - again adaptions are needed to [sle][functional][sle15][demo] test fails in partitioning_raid - again adaptions are needed

Best run so far:
http://10.160.65.204/tests/713
Example for early fail:
http://10.160.65.204/tests/738

Actions #15

Updated by jorauch over 6 years ago

Actions #16

Updated by okurz over 6 years ago

  • Assignee changed from jorauch to SLindoMansilla

As jorauch seems to have a lot of problems on his computer, another person should try to execute this scenario. @jorauch: Do not work on this alone anymore.

Actions #17

Updated by okurz over 6 years ago

  • Due date changed from 2017-10-11 to 2017-10-25
Actions #18

Updated by sebchlad over 6 years ago

This is an urgent issue.
Should we have more focus on this or is there anything preventing us from that?

Actions #19

Updated by SLindoMansilla over 6 years ago

Impediments: Resources. We were not able to run the test on the hardware we currently have.

  • My machines hit other issues, so the test never reaches the point we need to work on.
  • My machines use Leap, and Loewe (the shared worker machine offered by Nick) uses a version that is not compatible with Leap.
Actions #20

Updated by okurz over 6 years ago

While I see these as impediments, I don't see them as blockers. This was already discussed with jorauch and SLindoMansilla in person, but as it seems we are still not progressing fast enough, I am repeating it in writing:

If the partitioning_raid test module is unusable now, as in unable to complete within 30m, then either your adaptions are inefficient or the product itself has an important issue. Even if it is just a usability issue, such as RAID partitions needing 3-4 times as many clicks/keypresses to set up as in SLE12SP3, this should be reported as a product bug and escalated accordingly. The labeled jobs currently show up as "QA sucks", because RAID is at the start of the alphabet and therefore shows up first in the openQA test overview, and the jobs are labeled with test issues. It is very urgent that these jobs no longer fail on openQA test issues; otherwise PO+PM+RM will decide that the tests do not show product issues and will therefore not block any milestone builds, so the tests would not provide anything useful.

Actions #21

Updated by riafarov over 6 years ago

Actions #26

Updated by SLindoMansilla over 6 years ago

Some needles were recreated because they are needed by SLE-12: https://gitlab.suse.de/openqa/os-autoinst-needles-sles/merge_requests/533

Actions #27

Updated by okurz over 6 years ago

merged

Actions #29

Updated by SLindoMansilla over 6 years ago

PPC raid PR merged.

Actions #30

Updated by SLindoMansilla over 6 years ago

Changes verified on OSD

aarch64

Notice: Expected to fail on bsc#1060993
Not possible to workaround.


Verifying other architectures.

Actions #32

Updated by SLindoMansilla over 6 years ago

  • Description updated (diff)
  • Status changed from In Progress to Resolved

Acceptance criteria updated. The goal of this ticket was to no longer have RAID tests in the openQA results overview failing because of test bugs.
At least one bug was found for each arch. For some of them a workaround is applied.

Further work can/should be done in the following tickets:

We consider this done.

Actions #33

Updated by SLindoMansilla over 6 years ago

  • Related to coordination #26912: [sle][functional][epic] test fails in partitioning - Different hotkey in same build, same arch and same elements added
Actions #34

Updated by SLindoMansilla over 6 years ago

  • Related to action #26920: [sle][functional] test fails in partitioning_raid - Workaround for bsc#1063844 causes needle to fail added
Actions #35

Updated by dimstar over 6 years ago

SLindoMansilla wrote:

PR to fix raid 10 test on x86_64: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/3751

Merged

Broke Tumbleweed: https://openqa.opensuse.org/tests/508708

Actions #37

Updated by okurz over 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: RAID0
https://openqa.suse.de/tests/1241681

Actions #38

Updated by SLindoMansilla over 6 years ago

okurz wrote:

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: RAID0
https://openqa.suse.de/tests/1241681

That is actually bug bsc#1060993; the job comment has been updated.

Actions #39

Updated by okurz over 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: lvm-full-encrypt
https://openqa.suse.de/tests/1258091

Actions #40

Updated by okurz over 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: lvm+RAID1
https://openqa.suse.de/tests/1293122

Actions #41

Updated by jorauch over 6 years ago

  • Status changed from Resolved to Feedback
  • Target version changed from Milestone 11 to Milestone 12

Test still fails and was labeled with this ticket:
https://openqa.suse.de/tests/1293122#step/partitioning_raid/233

Actions #42

Updated by okurz over 6 years ago

@jorauch, @SLindoMansilla this is mainly a process issue rather than a technical issue. We need to improve our test meta-review. You have to help me: when you get a notification like the ones in #25496#note-40, please react to it, because I myself do not get this notification. This ticket was resolved correctly, but test failures are still labeled with this issue. I suggest the following: in the most recent SLE15 build, find all job failures which are labeled with this ticket and update the labels, e.g. find another open ticket or create new ones, then close this ticket again.
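
The "find all job failures labeled with this ticket" step could look roughly like the sketch below. It assumes the job data has already been fetched into plain dictionaries with `id`, `result` and `comments` keys, and that a label is a comment containing a reference of the form `poo#<ticket number>`. The record layout, the function name and the sample data are all hypothetical; nothing here calls openQA itself.

```python
import re


def failed_jobs_labeled_with(jobs, ticket_id):
    """Return ids of failed jobs whose comments reference poo#<ticket_id>.

    `jobs` is a list of dicts with the assumed keys "id", "result"
    and "comments" (a list of comment strings).
    """
    # The trailing \b keeps poo#25496 from matching inside poo#254960.
    pattern = re.compile(r"\bpoo#%d\b" % ticket_id)
    return [
        job["id"]
        for job in jobs
        if job["result"] == "failed"
        and any(pattern.search(comment) for comment in job["comments"])
    ]


# Made-up sample records for illustration only.
sample_jobs = [
    {"id": 1, "result": "failed", "comments": ["label: poo#25496"]},
    {"id": 2, "result": "failed", "comments": ["bsc#1060993"]},
    {"id": 3, "result": "passed", "comments": ["poo#25496"]},
    {"id": 4, "result": "failed", "comments": ["poo#254960"]},
]
relabel_candidates = failed_jobs_labeled_with(sample_jobs, 25496)
```

Each id in the result would then need its label changed to another open ticket or a newly created one before this ticket is closed again.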

Actions #43

Updated by SLindoMansilla over 6 years ago

  • Due date changed from 2017-10-25 to 2018-01-16

Making this visible for sprint 8.

Warning: this ticket is not about fixing raid tests, it is about proper job labeling and creation of atomic tickets that can be planned for a future sprint. Pay attention to the acceptance criteria.

Actions #44

Updated by okurz over 6 years ago

  • Due date changed from 2018-01-16 to 2018-01-30
  • Target version changed from Milestone 12 to Milestone 13

mass-shift of tickets to next sprint due to training on sprint review day

Actions #45

Updated by SLindoMansilla over 6 years ago

  • Status changed from Feedback to Resolved

RAID tests are not failing on any RAID-related test problem, and all current bugs that make them fail are already labeled: https://openqa.suse.de/tests/overview?distri=sle&version=15&build=422.1&groupid=110

This ticket can be considered done.
