action #30805: [functional][opensuse][leap][medium][u] first test after reboot fails in krunner, potential system overload (was: test fails in inkscape - typing too fast?) - openQA Tests (public) - openSUSE Project Management Tool

#1

Updated by okurz almost 7 years ago

Subject changed from test fails in inkscape to [functional][opensuse][leap]test fails in inkscape - typing too fast?
Due date set to 2018-03-13
Target version set to Milestone 14

#2

Updated by riafarov almost 7 years ago

Description updated (diff)
Status changed from New to Workable

#3

Updated by mgriessmeier almost 7 years ago

Subject changed from [functional][opensuse][leap]test fails in inkscape - typing too fast? to [functional][opensuse][leap][medium][research] test fails in inkscape - typing too fast?

#4

Updated by jorauch almost 7 years ago

Assignee set to jorauch

Has not been seen since then.
Current build: https://openqa.opensuse.org/tests/621710
Triggered 50 runs on pinky to generate statistics -> http://pinky.arch.suse.de/tests

#5

Updated by jorauch almost 7 years ago

Assignee deleted (~~jorauch~~)

#6

Updated by jorauch almost 7 years ago

Status changed from Workable to In Progress
Assignee set to jorauch

Appeared a few times; I would suggest reducing the typing limit in ensure_installed / x11_start_program (not sure which one applies here).

#7

Updated by mgriessmeier almost 7 years ago

Due date changed from 2018-03-13 to 2018-03-27
Target version changed from Milestone 14 to Milestone 15

#8

Updated by jorauch almost 7 years ago

Add the inkscape test to a suite that boots from an image (extratests in kde) and trigger it 50 times to reproduce.
The cause is most likely in x11_start_program.

#9

Updated by okurz almost 7 years ago

Related to action #33283: [opensuse][functional][u][sporadic][medium] test fails in kontact - typing string loosing characters added

#10

Updated by jorauch almost 7 years ago

Might be related to https://progress.opensuse.org/issues/31687 ?

#11

Updated by mgriessmeier almost 7 years ago

Related to action #31687: [opensuse][functional][medium][u] x11_start_program does not care if program can not be called in desktop runner even after three 'ret' presses with "valid" and "target_match" added

#12

Updated by jorauch almost 7 years ago

Status changed from In Progress to Blocked

Setting to blocked by:
https://progress.opensuse.org/issues/31687
Reason:
We have no real check that the entered text is actually what we wanted to enter, and even if we knew it was wrong, x11_start_program lacks proper error handling.
I will try to find a proper solution in the blocker ticket; this ticket should then become obsolete.
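
To illustrate the kind of check and error handling that is missing, here is a minimal shell sketch of "act, verify, retry". It is not the code from the blocker ticket or from x11_start_program; `retry_with_check` and the command names passed to it are hypothetical and only show the control flow.

```shell
# Hypothetical helper: run an action, then a check, and retry the action
# until the check succeeds or the attempts are used up.
# Usage: retry_with_check MAX_TRIES ACTION_CMD CHECK_CMD
retry_with_check() {
    max_tries=$1
    action=$2
    check=$3
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # Perform the input step (for example, typing the program name).
        $action
        # Verify the visible result before committing to it.
        if $check; then
            echo "verified on attempt $try"
            return 0
        fi
        try=$((try + 1))
    done
    # Fail loudly instead of silently continuing with wrong input.
    echo "giving up after $max_tries attempts" >&2
    return 1
}
```

Called as `retry_with_check 3 type_name name_is_visible` (both names are placeholders for whatever performs and verifies the input), the helper returns non-zero when the text never shows up, so the caller can fail at the step that actually went wrong.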

#13

Updated by jorauch almost 7 years ago

Created PR with code:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/4715

Currently the needle editor does not work for me, so the needles and the verification run still need to be done.

#14

Updated by mgriessmeier almost 7 years ago

Due date changed from 2018-03-27 to 2018-04-10

#15

Updated by okurz almost 7 years ago

jorauch wrote:

Currently the needle editor does not work for me, so the needles and the verification run still need to be done.

mkittler already provided patches. You could try these.

#16

Updated by mgriessmeier over 6 years ago

Due date changed from 2018-04-10 to 2018-04-24

#17

Updated by mgriessmeier over 6 years ago

Due date changed from 2018-04-24 to 2018-05-08
Status changed from Blocked to Workable
Target version changed from Milestone 15 to Milestone 16

blocker has been resolved, moving to next sprint as workable

#18

Updated by okurz over 6 years ago

Subject changed from [functional][opensuse][leap][medium][research] test fails in inkscape - typing too fast? to [functional][opensuse][leap][medium][u] test fails in inkscape - typing too fast?

The expected result is clearly that the test runs stably, so I am removing the [research] tag. The work might involve some research, but research alone is not enough.

#19

Updated by jorauch over 6 years ago

Status changed from Workable to In Progress

I see two options here:

we merge the WIP PR and create a needle for the typed text
we are happy that it did not happen for a long time and close this without merging the PR

#20

Updated by jorauch over 6 years ago

Status changed from In Progress to Feedback
Assignee changed from jorauch to okurz

What do you think?

#21

Updated by okurz over 6 years ago

Status changed from Feedback to In Progress
Assignee changed from okurz to jorauch

jorauch wrote:

I see two options here:

we merge the WIP PR and create a needle for the typed text

We will merge the PR only after it is not WIP anymore, obviously. The question there still holds: What makes inkscape special?

we are happy that it did not happen for a long time and close this without merging the PR

That is not true because I could easily find a recent failure: https://openqa.opensuse.org/tests/664611#step/inkscape/4

#22

Updated by jorauch over 6 years ago

Status changed from In Progress to Feedback
Assignee changed from jorauch to okurz

jorauch wrote:

I see two options here:

we merge the WIP PR and create a needle for the typed text

okurz wrote:

We will merge the PR only after it is not WIP anymore, obviously. The question there still holds: What makes inkscape special?

Apparently it's special because it's failing regularly and we have a ticket for it.
When comparing the recent failure you posted with the initial issue, we can see that they fail in different steps, but both are caused by x11_start_program.
Maybe we should take a step back and try to harden x11_start_program before we fix symptoms all around our tests?

#23

Updated by jorauch over 6 years ago

Related to action #35877: [functional][u] Find out in post-fail-hook if system is I/O-busy added

#24

Updated by jorauch over 6 years ago

Status changed from Feedback to In Progress
Assignee changed from okurz to jorauch

Talked with okurz.

Our assumption: The upgrade systems have a "dirty" filesystem causing e.g. btrfs maintenance tasks to slow down the systems more than clean installation jobs. Then random test modules fail, e.g. inkscape.

To separate the concerns we have the following ideas:

Add a new test suite "dirty system test" with normal timeout and an explicit check for system responsiveness, e.g. starting krunner very often and checking for the suggestions to pop up (best directly after reboot)
update tests with TIMEOUT_SCALE=3 to exclude workload problems

#25

Updated by lnussel over 6 years ago

the force_cron_run test is meant to take care of triggering all cron jobs so they don't disturb later.

#26

Updated by okurz over 6 years ago

Status changed from In Progress to Feedback
Assignee changed from jorauch to okurz

lnussel wrote:

the force_cron_run test is meant to take care of triggering all cron jobs so they don't disturb later.

That's true, but we still see that the update cases are more prone to fail for reasons yet unknown. It could also be that krunner itself needs more time to rebuild some cache or similar. This is why I think #35877 could help. Don't you think?

#27

Updated by jorauch over 6 years ago

Status changed from Feedback to In Progress
Assignee changed from okurz to jorauch

Additionally the force_cron_run is useless if we reboot in the meantime

#28

Updated by mgriessmeier over 6 years ago

Subject changed from [functional][opensuse][leap][medium][u] test fails in inkscape - typing too fast? to [functional][opensuse][leap][medium][u] first test after reboot fails in krunner, potential system overload (was: test fails in inkscape - typing too fast?)
Due date changed from 2018-05-08 to 2018-05-22

#29

Updated by jorauch over 6 years ago

We could either:

run force_cron_run after every reboot
change the complete order so there is no reboot in between
add a force_cron_run or wait to the reboot module

I'd prefer to just change the order since the reboot is not necessary for any of the following tests, but we would lose a snapshot.
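
For the "wait" variant of the third option, one conceivable shape is a bounded poll on system load instead of a fixed sleep. The sketch below is only an illustration under assumptions: `wait_for_low_load`, its threshold and its polling interval are made up here, it reads the standard Linux `/proc/loadavg`, and it says nothing about how such a check would be wired into the reboot module.

```shell
# Hypothetical helper: wait until the 1-minute load average drops below a
# threshold, or give up after a fixed number of polls.
# Usage: wait_for_low_load THRESHOLD MAX_POLLS [INTERVAL_SECONDS] [LOADAVG_FILE]
wait_for_low_load() {
    threshold=$1
    max_polls=$2
    interval=${3:-5}
    loadavg_file=${4:-/proc/loadavg}
    poll=1
    while [ "$poll" -le "$max_polls" ]; do
        # The first field of /proc/loadavg is the 1-minute load average.
        read -r load _rest < "$loadavg_file"
        # awk does the floating-point comparison that plain sh cannot.
        if awk -v l="$load" -v t="$threshold" 'BEGIN { exit !(l + 0 < t + 0) }'; then
            echo "load $load below $threshold after $poll poll(s)"
            return 0
        fi
        poll=$((poll + 1))
        sleep "$interval"
    done
    echo "load still at $load after $max_polls polls" >&2
    return 1
}
```

The upper bound on polls matters: an unbounded wait would hide a permanently overloaded system behind a timeout somewhere else in the schedule.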

#30

Updated by okurz over 6 years ago

Related to action #33571: [opensuse][functional][u][medium] test fails in shutdown - emoticon settings are opened added

#31

Updated by jorauch over 6 years ago

As discussed with okurz, we should move the reboot before the shutdown.
The schedule for this is located in main_common.pm.

#32

Updated by jorauch over 6 years ago

Status changed from In Progress to Feedback

PR created:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5000

Waiting for merge and consequences in production

#33

Updated by okurz over 6 years ago

validation test was fine, PR merged. Triggered 200 test jobs with

for i in {1..100} ; do openqa_clone_job_o3 --skip-chained-deps 671977 TEST=okurz_poo30805_$i BUILD=241.1:poo30805 _GROUP="Development Leap" ; done
for i in {1..100} ; do openqa_clone_job_o3 --skip-chained-deps 672079 TEST=okurz_poo30805_$i BUILD=241.1:poo30805 _GROUP="Development Leap" ; done

-> https://openqa.opensuse.org/tests/overview?build=241.1%3Apoo30805&version=15.0&distri=opensuse&groupid=39

Please check statistics.

#34

Updated by jorauch over 6 years ago

There are a lot of obsoleted jobs; I am not sure what this status means.
In the overview there are no inkscape failures, at least.
Can we close this, or are the obsoleted jobs a problem?

#35

Updated by okurz over 6 years ago

Related to action #35688: [opensuse][functional][u][sporadic][bsc#1091353][medium] Various unstable tests on o3 - inkscape added

#36

Updated by okurz over 6 years ago

I would not close this bug yet. I am working on #36117 and my idea is to create an explicit test module calling the desktop runner, see https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5089
Afterwards we should again run some more tests to check statistics.

btw, "obsoleted" means that a new build was triggered and that this canceled further execution of jobs in older builds. For our purposes you can ignore these and we can look at passed vs. failed. However, many tests failed in modules like yast2_lan, which might be helpful in the future to make tests more stable, but let's focus on the x11 test modules for now.

So I suggest to wait for #36117 first and then come back to this one here.
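
The "passed vs. failed, ignore obsoleted" counting described above could be sketched like this. The input format (one `job_id result` pair per line) is an assumption for illustration, not something the openQA overview exports as such, and `summarize_results` is a hypothetical name.

```shell
# Hypothetical input format: one "job_id result" pair per line on stdin.
# Prints passed/failed counts and the failure rate; every other result,
# such as "obsoleted", is ignored.
summarize_results() {
    awk '
        $2 == "passed" { passed++ }
        $2 == "failed" { failed++ }
        END {
            total = passed + failed
            rate = (total > 0) ? 100 * failed / total : 0
            printf "passed=%d failed=%d failure_rate=%.1f%%\n", passed, failed, rate
        }
    '
}
```

With made-up input such as `printf '%s\n' '1 failed' '2 passed' '3 obsoleted' '4 passed' '5 passed' | summarize_results`, the obsoleted job is left out of the denominator and the output is `passed=3 failed=1 failure_rate=25.0%`.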

#37

Updated by mgriessmeier over 6 years ago

Blocked by action #36117: [functional][u][sporadic] test fails in xterm to show "xterm" (needle tag desktop-runner-plasma-suggestions) in krunner - system slower just after login? added

#38

Updated by mgriessmeier over 6 years ago

Due date changed from 2018-05-22 to 2018-06-05
Status changed from Feedback to Blocked

#39

Updated by mgriessmeier over 6 years ago

Due date changed from 2018-06-05 to 2018-06-19
Status changed from Blocked to Workable
Target version changed from Milestone 16 to Milestone 17

blocker resolved, moving to workable into next sprint (to be revisited in planning meeting)

#40

Updated by mgriessmeier over 6 years ago

Due date deleted (~~2018-06-19~~)

#41

Updated by okurz over 6 years ago

Due date set to 2018-07-03

Please try to check statistics again.

#42

Updated by okurz over 6 years ago

Target version changed from Milestone 17 to Milestone 17

#43

Updated by okurz over 6 years ago

Due date changed from 2018-07-03 to 2018-08-14
Status changed from Workable to Blocked
Assignee changed from jorauch to okurz
Target version changed from Milestone 17 to Milestone 18

No reaction. Feel free to unassign sooner; it is no problem to give tickets back to the backlog. As with other tickets, this one is blocked by #31351.

#44

Updated by okurz over 6 years ago

Status changed from Blocked to Workable
Assignee deleted (~~okurz~~)

With blockers resolved within #35685, I did some statistical analysis in #35685#note-37 and found four out of 100 jobs failing in shutdown; there could be more, as some are still running.

https://openqa.opensuse.org/tests/700813#step/shutdown/18 failed after the previous module, reboot, failed, so let us discard that one for now, as well as https://openqa.opensuse.org/tests/700863#step/shutdown/19 for the same reason. https://openqa.opensuse.org/tests/700814#step/shutdown/4 as well as https://openqa.opensuse.org/tests/700864#step/shutdown/4 fail with what looks like the same error: xterm does not open from the plasma desktop runner.

My hypothesis for the root cause is higher system load after the bootup caused by the reboot module. This is why we originally moved it further back in the schedule. Somehow we need to ensure that the desktop runner is handled more gracefully in the shutdown module after reboot, but waiting longer for the desktop runner within the shutdown module itself sounds wrong, as this would make the module more dependent on the previous module. Maybe call the code from "desktop_runner" in reboot after the bootup to ensure the desktop runner is responsive in follow-up modules?

#45

Updated by okurz over 6 years ago

Status changed from Workable to In Progress
Assignee set to okurz

#46

Updated by okurz over 6 years ago

https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5365

#47

Updated by okurz over 6 years ago

Status changed from In Progress to Resolved

Merged and stable. Missing failures are handled elsewhere, e.g. gnucash in #38387, chromium in #36304.
