action #39059

[sle][functional][y] detect "openSUSE sucks bug" about btrfs balance and record_soft_fail (was: yast2_gui tests modules as application could not start up)

Added by zluo over 4 years ago. Updated over 4 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Enhancement to existing tests
Target version:
SUSE QA - Milestone 19
Start date:
2018-08-01
Due date:
2018-10-09
% Done:

0%

Estimated time:
5.00 h
Difficulty:

Description

Motivation

Tests seem to be unstable and some YaST modules do not start up in time. We don't know whether the SUT is under load at that point, but all of these failures have the same symptoms,
so we should come up with a scalable solution, which may include disabling these test modules. -> The "loadavg" log shows that the SUTs are under high load, e.g. https://openqa.suse.de/tests/1898401/file/yast2_software_management-loadavg.txt, pointing to https://bugzilla.suse.com/show_bug.cgi?id=1063638

Please list in this ticket every issue you encounter where a module does not show up in time.
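
A check like the "loadavg" one mentioned above could be automated roughly as follows. This is only a sketch: it assumes the uploaded file uses the same layout as /proc/loadavg (three load averages followed by further fields), and the function names, the cpu_count parameter and the threshold factor are hypothetical, not part of any existing test code.

```python
def parse_loadavg(text):
    """Parse /proc/loadavg-style text into (1 min, 5 min, 15 min) floats."""
    fields = text.split()
    if len(fields) < 3:
        raise ValueError("expected at least three load average fields")
    one, five, fifteen = (float(value) for value in fields[:3])
    return one, five, fifteen


def is_under_high_load(text, cpu_count, factor=1.0):
    """Return True if the 1-minute load average exceeds cpu_count * factor.

    The threshold is an assumption for illustration; pick one that matches
    the worker setup.
    """
    one_minute, _, _ = parse_loadavg(text)
    return one_minute > cpu_count * factor
```

For example, is_under_high_load("7.52 5.10 2.33 3/412 9876", cpu_count=2) reports high load, while the same call with "0.20 0.18 0.12 1/80 11206" does not.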

Acceptance criteria

Observation

openQA test in scenario sle-12-SP4-Server-DVD-x86_64-yast2_gui@64bit fails in
yast2_network_settings

openQA test in scenario sle-12-SP4-Server-DVD-x86_64-yast2_gui@64bit fails in
yast2_users

Suggestions

As this problem keeps hitting us, investigate possible solutions to mitigate it.

Expected result

Last good: 0305 (or more recent)

Further details

Always latest result in this scenario: latest

https://fate.suse.com/325532


Related issues

Related to openQA Tests - action #39011: [functional] Low performance on openqa production server (Resolved, 2018-08-01)

Related to openQA Tests - action #38621: [functional][y] test fails in welcome - "Module is not signed with expected PKCS#7 message" (bsc#1093659) - Use serial exception catching feature from openQA to make sure the jobs reference the bug, e.g. as label (Resolved, 2018-05-23)

Related to openQA Tests - action #36100: [functional][u] Better explicit system performance test (Resolved, 2018-05-11, 2018-06-19)

Related to openQA Tests - action #42446: [qe-core][functional] many opensuse tests fail in desktop_runner or gimp or other modules in what I think is boo#1105691 – can we detect this bug from the journal and track as soft-fail? (New, 2018-10-13)

Copied to openQA Tests - action #41459: [sle][functional][u] Explicit test module for btrfs snapshots cleanup performance (Rejected, 2018-08-01)

History

#1 Updated by riafarov over 4 years ago

  • Subject changed from [sle][functional][y] test fails in yast2_network_settings - application could not start up to [sle][functional][y] yast2_gui tests modules as application could not start up
  • Description updated (diff)

#2 Updated by okurz over 4 years ago

  • Related to action #39011: [functional] Low performance on openqa production server added

#3 Updated by okurz over 4 years ago

  • Status changed from New to Rejected
  • Assignee set to okurz

Sorry, to me this ticket doesn't seem to provide value. If a module does not start up in time, that should be solved with a longer wait. If the module never starts up, it is a product problem and needs to be investigated and reported. We should be able to cover this with #39011 already.

#4 Updated by okurz over 4 years ago

  • Related to action #38621: [functional][y] test fails in welcome - "Module is not signed with expected PKCS#7 message" (bsc#1093659) - Use serial exception catching feature from openQA to make sure the jobs reference the bug, e.g. as label added

#5 Updated by okurz over 4 years ago

  • Subject changed from [sle][functional][y] yast2_gui tests modules as application could not start up to [sle][functional][y] detect "openSUSE sucks bug" about btrfs balance and record_soft_fail (was: yast2_gui tests modules as application could not start up)
  • Due date set to 2018-09-25
  • Category changed from Bugs in existing tests to Enhancement to existing tests
  • Status changed from Rejected to New
  • Assignee deleted (okurz)
  • Priority changed from Normal to High
  • Target version set to Milestone 19

Just found https://openqa.suse.de/tests/1898401#step/yast2_software_management/6 and the logs clearly show that btrfs balance and scrub are running in the background, so it is https://bugzilla.suse.com/show_bug.cgi?id=1063638, the "openSUSE/SLE sucks" bug, which is not moving forward.

We reference this bug in force_scheduled_tasks. It could be that our recent rework of that module does not actually work for SLE12SP4, as subsequent modules seem to be more severely affected by btrfs balance runs, or that the btrfs balance run is actually still running.

At best we can detect this bug and prevent reviewer confusion, e.g. again with the "serial exception handling feature". This should follow #38621.
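
To illustrate the detection idea, here is a minimal sketch of scanning collected log text for btrfs balance or scrub activity and producing a message that references the bug. It is not the openQA serial exception handling feature itself and not the implementation used in the tests; the regular expressions, function names and message wording are assumptions for illustration only.

```python
import re

# Hypothetical patterns: the exact wording of btrfs maintenance entries in a
# journal or process list is an assumption, not taken from the ticket.
MAINTENANCE_PATTERNS = {
    "balance": re.compile(r"\bbtrfs\s+balance\b", re.IGNORECASE),
    "scrub": re.compile(r"\bbtrfs\s+scrub\b", re.IGNORECASE),
}

BUG_REFERENCE = "bsc#1063638"


def detect_btrfs_maintenance(log_text):
    """Return the sorted names of btrfs maintenance jobs mentioned in log_text."""
    found = set()
    for line in log_text.splitlines():
        for name, pattern in MAINTENANCE_PATTERNS.items():
            if pattern.search(line):
                found.add(name)
    return sorted(found)


def soft_fail_message(log_text):
    """Build a message referencing the bug, or return None if nothing matched."""
    jobs = detect_btrfs_maintenance(log_text)
    if not jobs:
        return None
    return f"{BUG_REFERENCE} - btrfs {' and '.join(jobs)} running in background"
```

A test module could call soft_fail_message on the collected logs when an application fails to start and, if a message comes back, record it as a soft failure so that reviewers see the bug reference instead of an unexplained timeout.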

#6 Updated by okurz over 4 years ago

  • Description updated (diff)

#7 Updated by okurz over 4 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: yast2_gui
https://openqa.suse.de/tests/2009680

#8 Updated by riafarov over 4 years ago

  • Description updated (diff)
  • Status changed from New to Workable

#9 Updated by riafarov over 4 years ago

  • Estimated time set to 5.00 h

#10 Updated by riafarov over 4 years ago

  • Status changed from Workable to In Progress
  • Assignee set to riafarov

#11 Updated by riafarov over 4 years ago

  • Status changed from In Progress to Feedback

#12 Updated by okurz over 4 years ago

  • Related to action #36100: [functional][u] Better explicit system performance test added

#13 Updated by okurz over 4 years ago

  • Copied to action #41459: [sle][functional][u] Explicit test module for btrfs snapshots cleanup performance added

#14 Updated by riafarov over 4 years ago

Keeping this open so that we can add the SOFTFAIL_BSC1063638 variable in all places where it is needed.

#16 Updated by riafarov over 4 years ago

  • Due date changed from 2018-09-25 to 2018-10-09

#17 Updated by riafarov over 4 years ago

  • Status changed from Feedback to Resolved

#18 Updated by okurz over 4 years ago

  • Related to action #42446: [qe-core][functional] many opensuse tests fail in desktop_runner or gimp or other modules in what I think is boo#1105691 – can we detect this bug from the journal and track as soft-fail? added
