action #47321
closed[functional][u][opensuse][sporadic] openSUSE Leap 15.1 fails updates_packagekit_gpk test
Description
Observation
openQA test in scenario opensuse-15.0-DVD-Updates-x86_64-gnome@64bit-2G fails in
updates_packagekit_gpk
On Feb 7, updates_packagekit_gpk started to fail in the Leap 15.0 tests.
It seems there is a GNOME screen and it is no longer getting detected correctly.
Actually, updates_packagekit_gpk had already failed earlier, e.g. 20 days ago:
https://openqa.opensuse.org/tests/838957#step/updates_packagekit_gpk/3
on 2019-01-26
And we can find even older failures in the same modules but that could be different steps.
Reproducible
Often but not always
Expected result
Last good: 20190207-3; however, a lot of tests before that failed in a prerequisite
Last good: 20190124-1
Suggestions
- Change the assert_screen to increase the timeout to 60 seconds for the second loop, and add a soft failure already there
- [easy] Apply a solution similar to: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6825
- Only focus on failures in the initial steps of updates_packagekit_gpk
- Report product regression bug or fix test regression
- Bisect test changes as well as product changes with statistical investigation
- Optional: Increase stability of test module or scenario, e.g. by reordering test module schedule
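The first suggestion can be modelled roughly as follows. This is a minimal sketch in Python, not the actual Perl test module: `wait_for_desktop`, the `check_screen` callable and the 30-second first timeout are assumptions made for illustration. Only the 60-second timeout for the second loop and the early soft failure come from the suggestion itself. Logging the attempt counter follows the request made in the comments.

```python
from typing import Callable, Sequence


def wait_for_desktop(
    check_screen: Callable[[int], bool],
    timeouts: Sequence[int] = (30, 60),  # 30 s is assumed, 60 s is the suggested value
    log: Callable[[str], None] = print,
) -> dict:
    """Try to detect the desktop once per entry in `timeouts`.

    `check_screen(timeout)` is a hypothetical stand-in for a needle match
    that returns True on success. Returns the attempt that matched and
    whether a soft failure was recorded; raises if no attempt matched.
    """
    soft_failure = False
    for attempt, timeout in enumerate(timeouts, start=1):
        # Log the counter so a failed job shows which loop it was in.
        log(f"attempt {attempt}/{len(timeouts)}, timeout {timeout}s")
        if check_screen(timeout):
            return {"attempt": attempt, "soft_failure": soft_failure}
        if not soft_failure:
            # Record the slow desktop early instead of failing the job outright.
            soft_failure = True
            log("soft failure: desktop not detected on first attempt")
    raise RuntimeError(f"desktop not detected after {len(timeouts)} attempts")


# Simulated run: the desktop is missed once, then detected on the second loop.
answers = iter([False, True])
result = wait_for_desktop(lambda timeout: next(answers))
```

In this model a job that only passes on the second loop still succeeds, but carries a soft-failure marker that makes the slow GNOME startup visible in the results.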
Further details
Always latest result in this scenario: latest
Workaround
Retrigger failing jobs
Updated by okurz almost 6 years ago
- Subject changed from openSUSE Leap 15.0 started failing updates_packagekit_gpk test to [functional][u] openSUSE Leap 15.0 started failing updates_packagekit_gpk test
- Category set to Bugs in existing tests
- Target version set to Milestone 22
@msmeissn for easier ticket creation you can use the openQA built-in reporting feature, see https://wiki.microfocus.net/index.php?title=RD-OPS_QA/openQA_review#Workflow_for_SLES.2C_SLED_and_HA_as_an_example for an example with screenshot
Updated by okurz almost 6 years ago
- Subject changed from [functional][u] openSUSE Leap 15.0 started failing updates_packagekit_gpk test to [functional][u][sporadic] openSUSE Leap 15.0 started failing updates_packagekit_gpk test
- Description updated (diff)
- Status changed from New to Workable
Updated by jorauch almost 6 years ago
For me this looks like a problem with gnome and not with the test module
Updated by jorauch almost 6 years ago
- Priority changed from Urgent to High
After looking further, I am pretty sure the problem is gnome taking too long to show up:
https://openqa.opensuse.org/tests/848696#step/updates_packagekit_gpk/3
https://openqa.opensuse.org/tests/849010#step/updates_packagekit_gpk/3
One time the screen is completely black, one time we have at least the bar on the top
I am not totally sure whether this is a test-performance or a product issue
Updated by okurz almost 6 years ago
Yes, I agree with you, but I don't see how we removed the urgency. Or have you?
Updated by okurz almost 6 years ago
- Priority changed from High to Urgent
Setting back to "Urgent" as you haven't actually picked it up.
Updated by szarate almost 6 years ago
Let's see how much it happens, 1x worker... hacked the patch test to die after (quicker results :D) http://phobos.suse.de/tests/overview?version=15.0&build=20190214-2&distri=opensuse
Updated by szarate almost 6 years ago
- Status changed from Workable to Feedback
- Priority changed from Urgent to Normal
It was consistently failing on Feb 7, but it seems not afterwards?
I fail to see the urgency of this ticket at this point. As far as I could check, the tests simply pass on my worker. (They weren't matching the needle for the first ones; the rest are chopping through the work without problems and are not showing the symptoms described in the ticket itself. After a needle update, life finds its way again...)
Anywho, changing priority as last builds on o3 have been passing without problems. Waiting for the rest of jobs in my instance to finish...
Updated by szarate almost 6 years ago
- Assignee changed from szarate to okurz
So, last job is passing: https://openqa.opensuse.org/tests/latest?version=15.0&arch=x86_64&test=gnome&machine=64bit-2G&distri=opensuse&flavor=DVD-Updates
The rest of the jobs on my instance started to fail due to missing repos. In any case, looks like gnome is just slow to show up (Performance problems? also worker seems busy with 4+ seconds)... after looking at: https://openqa.opensuse.org/tests/854596/file/autoinst-log.txt
Suggestions would be:
- Change the assert_screen to increase the timeout to 60 seconds for the second loop, and add a soft failure already there
- Actually wait, since this wait_still_screen call looks like a noop to me, sleeping for 5 seconds.
Regardless of any of the suggestions: log where the counter is. That eases bug investigation.
Updated by okurz almost 6 years ago
- Description updated (diff)
- Status changed from Feedback to Workable
- Assignee deleted (okurz)
- Priority changed from Normal to High
- Target version changed from Milestone 22 to Milestone 23
szarate wrote:
It was consistently failing on Feb 7, but it seems not afterwards?
It still fails sometimes, as we can see now, e.g. https://openqa.opensuse.org/tests/854596#step/updates_packagekit_gpk/3 from 13h ago. So, as in the subject: "sporadic" as in "not reproducibly all the time".
I fail to see the urgency of this ticket at this point
[…]
Anywho, changing priority as last builds on o3 have been passing without problems. Waiting for the rest of jobs in my instance to finish...
Thank you for looking into this. And I agree that you removed the urgency by confirming that the jobs do not fail in 100% of the cases and that we have a valid workaround (retrigger failed tests). I have noted that in the description now.
In any case, looks like gnome is just slow to show up (Performance problems? also worker seems busy with 4+ seconds)
And this is why I bumped the prio up a little bit again to "High" as it could be either a product regression or a recent test regression and bisecting is easier when both product and test code changes are more recent. I have put suggestions in the ticket description to bisect both test and/or product.
Suggestions would be:
- Change the assert_screen to increase the timeout to 60 seconds for the second loop, and add a soft failure already there
- Actually wait, since this wait_still_screen call looks like a noop to me, sleeping for 5 seconds.
Regardless of any of the suggestions: log where the counter is. That eases bug investigation.
Yes, that can help; however, I see other possibilities as well, which I have suggested in the updated description.
So as you asked, updated and back to "Workable" with urgency removed, hence "Urgent"->"High"
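For bisecting a sporadic failure, a single passing run says little about whether a revision is good. A rough way to size the investigation, assuming independent runs that fail with a fixed probability on a bad revision: the chance of n consecutive passes on a bad revision is (1 - p)^n, so n must be large enough to push that below the accepted error. The helper name `runs_needed` and the failure rates below are illustrative assumptions; the ticket does not state a measured failure rate.

```python
import math


def runs_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Consecutive passing runs needed to call a revision good.

    Assumes independent runs that fail with probability `failure_rate`
    on a bad revision. Solves (1 - failure_rate)**n <= 1 - confidence for n.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


# Illustrative rates only: a test failing in half or in a fifth of its runs.
for rate in (0.5, 0.2):
    print(f"failure rate {rate}: {runs_needed(rate)} passing runs needed")
```

The lower the failure rate, the more runs each bisection step needs, which is one reason bisecting is easier while the regression is still recent and the candidate range is small.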
Updated by szarate almost 6 years ago
Added: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6807 as follow up to my comment on #note-10 wrt logging the counter
Updated by okurz almost 6 years ago
- Actually wait, since this wait_still_screen call looks like a noop for me, sleeping for 5 seconds.
It's not a "noop" as the actual sleep is at least 2s but not necessarily much more. On top, it describes in a clearer way the intention than a simple sleep.
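The difference from a plain sleep can be illustrated with a small model. This is a sketch in Python of the behaviour described here, not the os-autoinst implementation: `wait_still`, `FakeClock`, the polling interval and the screen representation are all assumptions. The call returns once the screen has been unchanged for `stilltime` seconds, so on an already idle screen it takes about `stilltime`, and on a changing screen it keeps waiting up to `timeout`.

```python
from typing import Callable, Hashable


class FakeClock:
    """Simulated clock so the model runs instantly and deterministically."""

    def __init__(self) -> None:
        self.t = 0.0

    def now(self) -> float:
        return self.t

    def sleep(self, seconds: float) -> None:
        self.t += seconds


def wait_still(
    grab_screen: Callable[[], Hashable],
    stilltime: float,
    timeout: float,
    clock: FakeClock,
    poll: float = 0.5,  # assumed polling interval
) -> bool:
    """Return True once the screen was unchanged for `stilltime` seconds,
    or False if that did not happen within `timeout` seconds."""
    start = clock.now()
    last = grab_screen()
    last_change = start
    while clock.now() - start < timeout:
        clock.sleep(poll)
        current = grab_screen()
        if current != last:
            # Screen changed: restart the stillness window.
            last = current
            last_change = clock.now()
        elif clock.now() - last_change >= stilltime:
            return True
    return False


# Idle screen: returns after roughly `stilltime`, like a short sleep.
idle = FakeClock()
idle_ok = wait_still(lambda: "desktop", stilltime=2, timeout=10, clock=idle)

# Screen that keeps changing for 3 simulated seconds before settling.
busy = FakeClock()
busy_ok = wait_still(
    lambda: "desktop" if busy.now() >= 3 else f"loading {busy.now()}",
    stilltime=2,
    timeout=10,
    clock=busy,
)
```

In the second case the model waits for the settling period plus the stillness window, which a fixed sleep of the same nominal length would not do.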
Updated by szarate almost 6 years ago
okurz wrote:
- Actually wait, since this wait_still_screen call looks like a noop for me, sleeping for 5 seconds.
It's not a "noop" as the actual sleep is at least 2s but not necessarily much more. On top, it describes in a clearer way the intention than a simple sleep.
Agreed, sounds better than a sleep, so we can bump that to 5 :)
Updated by okurz almost 6 years ago
- Status changed from Workable to Feedback
- Assignee set to szarate
- Priority changed from High to Normal
- Target version changed from Milestone 23 to Milestone 22
I don't think wait_still_screen(5) is a stable approach either. Hence my suggestion "Increase stability of test module or scenario, e.g. by reordering test module schedule"
Updated by okurz almost 6 years ago
- Target version changed from Milestone 22 to Milestone 23
Updated by szarate almost 6 years ago
- Description updated (diff)
- Assignee changed from szarate to okurz
Updated my PR to better mark calls to ensure unlocked desktop. That's all for the time being for me.
Updated the description with suggestions, @okurz set back to workable if agreed.
Updated by okurz almost 6 years ago
- Blocked by action #48110: [functional][u][sporadic] test failed in different modules that switch from textmode terminal to graphical terminal - unable to login into the gnome session again but we should not even need to login when selecting the correct tty added
Updated by okurz almost 6 years ago
- Status changed from Feedback to Blocked
- Target version changed from Milestone 23 to Milestone 25
Unfortunately, I doubt any approach similar to https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6825 would help here, as we start from a graphical session and that is the problem. I have now blocked this ticket by #48110, as I think that one has priority and could involve many more unstable test modules. Only then should we revisit this one.
Updated by pcervinka almost 6 years ago
Looking at recent Leap 42.3 results, updates_packagekit_gpk has sporadic failures too. Do you think it has a similar cause? https://openqa.opensuse.org/tests/873333
Updated by okurz over 5 years ago
- Assignee changed from okurz to mgriessmeier
Move to new QSF-u PO after I moved to the "tools"-team. I mainly checked the subject line so in individual instances you might not agree to take it over completely into QSF-u. Feel free to discuss with me or reassign to me or someone else in this case. Thanks.
Updated by mgriessmeier over 5 years ago
- Target version changed from Milestone 25 to Milestone 26
Updated by mgriessmeier over 5 years ago
- Subject changed from [functional][u][sporadic] openSUSE Leap 15.0 started failing updates_packagekit_gpk test to [functional][u][opensuse][sporadic] openSUSE Leap 15.0 started failing updates_packagekit_gpk test
- Status changed from Blocked to New
- Assignee deleted (mgriessmeier)
- Priority changed from Normal to High
- Target version changed from Milestone 26 to Milestone 27
to be groomed - still happening
Updated by SLindoMansilla over 5 years ago
- Subject changed from [functional][u][opensuse][sporadic] openSUSE Leap 15.0 started failing updates_packagekit_gpk test to [qam][opensuse][sporadic] openSUSE Leap 15.0 started failing updates_packagekit_gpk test
The reported problem doesn't happen anymore. The new failure is related to the serial device: https://openqa.opensuse.org/tests/1014115#step/updates_packagekit_gpk/48
Updated by mgriessmeier about 5 years ago
- Target version changed from Milestone 27 to Milestone 28
Updated by mgriessmeier almost 5 years ago
- Target version changed from Milestone 28 to Milestone 31
Updated by tjyrinki_suse over 4 years ago
- Subject changed from [qam][opensuse][sporadic] openSUSE Leap 15.0 started failing updates_packagekit_gpk test to [functional][u][opensuse][sporadic] openSUSE Leap 15.1 fails updates_packagekit_gpk test
- Start date deleted (2019-02-09)
Leap 15.0 is now EOL, but this still happens sporadically for 15.1 updates_packagekit_gpk, and it looks like the same kind of problem as originally reported, not any different: https://openqa.opensuse.org/tests/1283905
This used to be in QSF-u backlog for a long time and the milestones are still being updated accordingly, should it be still there as the problem remains as originally described?
Updated by SLindoMansilla over 4 years ago
- Status changed from New to Rejected
- Assignee set to SLindoMansilla
- Target version changed from Milestone 31 to Milestone 30
Not reproducible as described in this ticket.
The last occurrence of a failing updates_packagekit_gpk is caused by missing keys (notice the missing matching single quote) and missing clicks (the needle expects xterm to be closed): https://openqa.opensuse.org/tests/1014115#step/updates_packagekit_gpk/51