action #25554

[functional][bsc#1063638][u] soft fail in force_cron_run is too strict

Added by okurz over 2 years ago. Updated about 2 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Enhancement to existing tests
Start date:
2017-09-25
Due date:
2018-05-22
% Done:

0%

Estimated time:
Difficulty:
Duration: 172

Description

Observation

openQA test in scenario sle-15-Leanos-DVD-Staging:G-x86_64-gnome@64bit-staging fails in force_cron_run, but the bug referenced in the soft fail should already be fixed.

Problem

Probably the check is too strict.

[25 Sep 2017 09:59:15] <okurz> coolo: https://openqa.opensuse.org/tests/491791#step/force_cron_run/6 looks to me as if it won't get any better
[25 Sep 2017 10:22:17] <coolo> okurz: one way could be settling load *before* the crons?
[25 Sep 2017 10:22:31] <coolo> but having a load of 0.3 for unused X sounds like a bug - a different one though :)
[25 Sep 2017 10:27:19] <okurz> hm, settle load before crons, execute them, check again, that could work. I wonder though if you have a good test scenario in mind which one could trigger 500 times to check if the bug is really solved or another one is stopping us now

Further details

Always latest result in this scenario: latest


Related issues

Related to openQA Tests - action #19476: [migration]test fails in force_cron_run by assert_script_run failed | Rejected | 2017-06-01

Related to openQA Tests - action #21798: [functional]test fails in force_cron_run, idle threshold to high | Resolved | 2017-08-06

Related to openQA Tests - action #23556: [sle][functional]test fails in force_cron_run because of missing file /usr/lib/cron/run-crons | Rejected | 2017-08-23

Related to openQA Tests - action #28364: [sle][functional] remove reference to "btrfs cron jobs kills system responsiveness"-bug and workaround in console/force_cron_run | Resolved | 2017-11-24

Related to openQA Tests - action #27597: [sle][functional] test fails in force_cron_run - Wrong bug ticket assigned | Resolved | 2017-11-09 | 2018-01-16

Related to openQA Tests - action #31351: [functional][u][medium] force_cron_run does not actually run any crons (occasionally) | Resolved | 2018-02-03 | 2018-07-03

History

#1 Updated by okurz over 2 years ago

  • Related to action #19476: [migration]test fails in force_cron_run by assert_script_run failed added

#2 Updated by okurz over 2 years ago

  • Related to action #21798: [functional]test fails in force_cron_run, idle threshold to high added

#3 Updated by okurz over 2 years ago

  • Related to action #23556: [sle][functional]test fails in force_cron_run because of missing file /usr/lib/cron/run-crons added

#4 Updated by okurz over 2 years ago

  • Related to action #28364: [sle][functional] remove reference to "btrfs cron jobs kills system responsiveness"-bug and workaround in console/force_cron_run added

#5 Updated by okurz over 2 years ago

  • Related to action #27597: [sle][functional] test fails in force_cron_run - Wrong bug ticket assigned added

#6 Updated by okurz over 2 years ago

  • Subject changed from soft fail in force_cron_run is too strict to [functional]soft fail in force_cron_run is too strict
  • Due date set to 2018-01-16
  • Target version set to Milestone 13

#7 Updated by okurz over 2 years ago

  • Due date changed from 2018-01-16 to 2018-01-30

mass-shift of tickets to next sprint due to training on sprint review day

#8 Updated by okurz over 2 years ago

  • Due date changed from 2018-01-30 to 2018-02-13

#9 Updated by okurz over 2 years ago

  • Assignee set to okurz

#10 Updated by okurz over 2 years ago

  • Subject changed from [functional]soft fail in force_cron_run is too strict to [functional][bsc#1063638]soft fail in force_cron_run is too strict
  • Due date changed from 2018-02-13 to 2018-05-15
  • Status changed from New to Blocked
  • Target version changed from Milestone 13 to Milestone 15

Now test tracks bsc#1063638, will set ticket to blocked and review progress on bug later.

#11 Updated by StefanBruens over 2 years ago

  • Related to action #31351: [functional][u][medium] force_cron_run does not actually run any crons (occasionally) added

#12 Updated by StefanBruens over 2 years ago

Currently settle_load reports any time above 30 seconds as a softfailure, which is too short IMHO:

  • load average (first value) is a 1 minute sliding window average. Even for a completely idle system this value can be arbitrarily large for any time < 1 minute.
  • force_cron_run is executed immediately after boot, so some large values (>>= 1) in the window are to be expected
  • starting top will likely contribute to the load (once), as processes waiting for disk I/O contribute to the load value.

A timeout of 70 seconds (60 seconds window plus startup) is likely more appropriate.
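
The reasoning in this comment can be checked with a small model. The following is a minimal Python sketch, not openQA or kernel code. It assumes the 1-minute load average behaves like an exponentially damped average that is updated every 5 seconds on an otherwise idle system. The function name `seconds_until_below` and the threshold values are hypothetical and are not taken from settle_load.

```python
import math

SAMPLE_INTERVAL = 5   # seconds between load samples (assumption of this model)
WINDOW = 60           # time constant of the 1-minute load average, in seconds


def seconds_until_below(initial_load, threshold):
    """Return how long a fully idle system needs until the modelled
    1-minute load average drops below `threshold`.

    Hypothetical helper for illustration only; it ignores any load
    caused by the measurement itself (e.g. starting top).
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    # Each idle sample multiplies the previous average by this factor.
    decay = math.exp(-SAMPLE_INTERVAL / WINDOW)
    load = initial_load
    elapsed = 0
    while load >= threshold:
        load *= decay
        elapsed += SAMPLE_INTERVAL
    return elapsed
```

Under these assumptions, `seconds_until_below(1.0, 0.5)` returns 45: a system that merely starts at load 1.0 and then does nothing at all already needs longer than 30 seconds to fall below a threshold of 0.5. That supports the point that waiting times above 30 seconds do not by themselves indicate a bug.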

#13 Updated by okurz over 2 years ago

That sounds very reasonable. We should do that, switch to 70 seconds.

#14 Updated by okurz about 2 years ago

  • Due date changed from 2018-05-15 to 2018-05-08

#15 Updated by mgriessmeier about 2 years ago

  • Subject changed from [functional][bsc#1063638]soft fail in force_cron_run is too strict to [functional][bsc#1063638][u]soft fail in force_cron_run is too strict

#16 Updated by okurz about 2 years ago

  • Subject changed from [functional][bsc#1063638][u]soft fail in force_cron_run is too strict to [functional][bsc#1063638][u] soft fail in force_cron_run is too strict
  • Status changed from Blocked to Workable
  • Assignee deleted (okurz)
  • Target version changed from Milestone 15 to Milestone 16

#17 Updated by zluo about 2 years ago

  • Assignee set to zluo

take over

#19 Updated by zluo about 2 years ago

I found that bsc#1063638 is not fixed yet.

record_soft_failure 'bsc#1063638' if (time - $before) > (is_jeos() ? 180 : 30);  

#20 Updated by zluo about 2 years ago

  • Status changed from Workable to In Progress

#21 Updated by zluo about 2 years ago

http://e13.suse.de/tests/2258

change:
record_soft_failure 'bsc#1063638' if (time - $before) > (is_jeos() ? 180 : 70);
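
To compare the old and the new limit over several runs, the condition can be modelled outside the test. This is a minimal Python sketch that mirrors only the comparison in the Perl line above. The names `soft_fails` and `soft_fail_rate` are hypothetical, and any durations passed in are expected to come from real job logs, not from this ticket.

```python
OLD_LIMIT = 30    # seconds, non-JeOS limit quoted in comment #19
NEW_LIMIT = 70    # seconds, non-JeOS limit tried in comment #21
JEOS_LIMIT = 180  # seconds, JeOS limit, identical in both versions


def soft_fails(elapsed, jeos=False, limit=NEW_LIMIT):
    """Mirror of `(time - $before) > (is_jeos() ? 180 : limit)`."""
    return elapsed > (JEOS_LIMIT if jeos else limit)


def soft_fail_rate(durations, jeos=False, limit=NEW_LIMIT):
    """Fraction of runs whose settle time would record a soft failure."""
    if not durations:
        raise ValueError("need at least one duration")
    hits = sum(soft_fails(d, jeos=jeos, limit=limit) for d in durations)
    return hits / len(durations)
```

Feeding the measured settle times of a batch of jobs into `soft_fail_rate` once with `limit=OLD_LIMIT` and once with `limit=NEW_LIMIT` shows how many soft failures the change would remove. A run that takes exactly 70 seconds does not soft-fail, because the comparison is strict.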

#22 Updated by zluo about 2 years ago

It looks good, no soft failures now.

#23 Updated by okurz about 2 years ago

Please collect more statistics on this, e.g. run at least 20 jobs

#25 Updated by zluo about 2 years ago

My test runs show mixed results: some passed, some still soft-failed.

http://e13.suse.de/tests

#26 Updated by mgriessmeier about 2 years ago

  • Due date changed from 2018-05-08 to 2018-05-22
  • Status changed from In Progress to Feedback

#27 Updated by zluo about 2 years ago

  • Status changed from Feedback to Resolved
