action #61907
closed: [kernel][ppc64le] Update kdump memory size for ppc64le
Description
The test fails in kdump_and_crash on Leap15.2 ppc64le.
crash reports the following error with the new Leap15.2 ppc64le build 69.1:
crash: invalid kernel virtual address: 9 type: "first vmlist addr"
Errors like the one above typically occur when the kernels and memory source
do not match. These are the files being used:
KERNEL: /boot/vmlinux-5.3.16-lp152.1-default
DEBUGINFO: /usr/lib/debug/boot/vmlinux-5.3.16-lp152.1-default.debug
DUMPFILE: /var/crash/2020-01-07-18:10/vmcore
build 69.1 (kernel: 5.3.16-lp152.1-default crash: 7.2.6)
https://openqa.opensuse.org/tests/1135914#step/kdump_and_crash/71
This is the first test run since build 50.1, which passed:
build 50.1 (kernel: 5.3.13-lp152.1-default crash: 7.2.6)
https://openqa.opensuse.org/tests/1109602#step/kdump_and_crash/71
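The crash hint about mismatched kernels can be ruled out quickly by comparing the release strings in the KERNEL and DEBUGINFO paths. The following is only a minimal sketch of such a check; the helper names `kernel_release` and `same_release` are hypothetical and not part of crash or the openQA test code. In this report both paths already name the same release, so the check only excludes the simplest cause.

```python
import re
from typing import Optional

# Hypothetical helper pattern: the release is whatever follows "vmlinux-",
# with an optional ".debug" suffix as used by the debuginfo file.
_RELEASE = re.compile(r"vmlinux-(?P<release>.+?)(?:\.debug)?$")


def kernel_release(path: str) -> Optional[str]:
    """Return the kernel release encoded in a vmlinux-style path, or None."""
    name = path.rsplit("/", 1)[-1]
    match = _RELEASE.match(name)
    return match.group("release") if match else None


def same_release(kernel: str, debuginfo: str) -> bool:
    """True when both paths carry the same, non-empty kernel release."""
    release = kernel_release(kernel)
    return release is not None and release == kernel_release(debuginfo)


if __name__ == "__main__":
    kernel = "/boot/vmlinux-5.3.16-lp152.1-default"
    debuginfo = "/usr/lib/debug/boot/vmlinux-5.3.16-lp152.1-default.debug"
    print(same_release(kernel, debuginfo))
```

A matching release does not prove that the vmcore itself is usable; it only shows that the file names agree.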
Observation
openQA test in scenario opensuse-15.2-DVD-ppc64le-extra_tests_in_textmode@ppc64le fails in kdump_and_crash
Test suite description
Maintainer: okurz@suse.de
Mainly console extratest.
Reproducible
Fails since (at least) Build 28.1
Expected result
Last good: (unknown) (or more recent)
Further details
Always latest result in this scenario: latest
Updated by michel_mno over 4 years ago
Another test failed with a different signature; for now I assume it is the same issue, seen on
https://openqa.opensuse.org/tests/1141833#step/kdump_and_crash/65 for TW snapshot 20200111
crash: invalid kernel virtual address: 7097ab71e7ab670c type: "list entry"
SCRIPT_FINISHEDuYP6m-1-
at /usr/lib/os-autoinst/testapi.pm line 1091.
Updated by okurz over 4 years ago
- Subject changed from test fails in kdump_and_crash Leap15.2 ppc64le to [qam][ppc64le] test fails in kdump_and_crash Leap15.2 ppc64le
- Assignee set to pcervinka
I don't see where the tests are doing anything wrong; to me this looks more like a product issue to be tracked in bugzilla.opensuse.org.
@pcervinka as test module maintainer, can you comment?
Updated by okurz over 4 years ago
- Subject changed from [qam][ppc64le] test fails in kdump_and_crash Leap15.2 ppc64le to [kernel][qam][ppc64le] test fails in kdump_and_crash Leap15.2 ppc64le
Updated by pcervinka over 4 years ago
- Related to action #61082: [functional][u] test fails in kdump_and_crash - The test needs adaptions added
Updated by pcervinka over 4 years ago
- Status changed from New to In Progress
I believe it is the result of the change which removed the memory increase: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/9269
Updated by SLindoMansilla over 4 years ago
pcervinka wrote:
I believe it is the result of the change which removed the memory increase: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/9269
I think a new bug is happening.
I had to remove the workaround because, apart from the bug being marked as resolved, each time the workaround was applied the test failed on aarch64 and ppc64le: https://openqa.suse.de/tests/3767851#step/kdump_and_crash/57
The check_screen failed sporadically, so the test sometimes did not apply the workaround, and then it worked: https://openqa.suse.de/tests/3620940#step/kdump_and_crash/36
After removing the workaround, the test worked on the last build: https://progress.opensuse.org/issues/61082#note-6
Updated by pcervinka over 4 years ago
No, you should mark your failure with the already created bug https://bugzilla.suse.com/show_bug.cgi?id=1158540 for aarch64, which describes the whole situation exactly.
Anyway, I will fix the current situation for ppc64le.
Updated by pcervinka over 4 years ago
Here is the run for ppc64le with the workaround removal reverted, https://openqa.suse.de/tests/3811574, which was successful (as expected).
The workaround is not needed for aarch64, but ppc64le still needs more memory. Also, the reference in the previous workaround was no longer valid, so I will find a better bug for it or create a new one which better reflects reality.
Updated by SLindoMansilla over 4 years ago
pcervinka wrote:
Here is the run for ppc64le with the workaround removal reverted, https://openqa.suse.de/tests/3811574, which was successful (as expected).
The workaround is not needed for aarch64, but ppc64le still needs more memory. Also, the reference in the previous workaround was no longer valid, so I will find a better bug for it or create a new one which better reflects reality.
Thanks!
This is what I meant. It is a new bug introduced in the new build. The last build worked only without the workaround: https://openqa.suse.de/tests/3776603#step/kdump_and_crash/69
Updated by pcervinka over 4 years ago
Not exactly new for this build: the issue was there all the time, masked (or solved) by the workaround which set kdump memory to 640MB. I can't comment on why your verification job was fine. Kdump in the kernel group started to fail right after the workaround was removed: https://openqa.suse.de/tests/3766514#next_previous. But we can say that the history of this workaround is not clear: the issue for aarch64 is gone and the original bug is closed. I will reintroduce the workaround for ppc64le with a better reference (this will be the funny part).
Updated by pcervinka over 4 years ago
Also, the needle-based detection of "lower memory" in the workaround was not stable at all and created many junk needles (working only for a short time). This must be redone to make the logic flow more transparent.
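One way to make that flow transparent is to decide from the architecture instead of from a screen match. The snippet below is only a sketch of the idea and not the code that was merged: the 640 MB value is the one mentioned above for the old workaround, while the `KDUMP_MEMORY_MB` table and the `kdump_memory_override` function are hypothetical names used for illustration.

```python
from typing import Optional

# Assumption: 640 MB is the value the old workaround set, as stated in this
# ticket. The table and the function name are hypothetical.
KDUMP_MEMORY_MB = {"ppc64le": 640}


def kdump_memory_override(arch: str) -> Optional[int]:
    """Return the kdump memory in MB to force for arch, or None to keep the default."""
    return KDUMP_MEMORY_MB.get(arch)


if __name__ == "__main__":
    for arch in ("ppc64le", "aarch64", "x86_64"):
        override = kdump_memory_override(arch)
        if override is None:
            print(f"{arch}: keep default kdump memory")
        else:
            print(f"{arch}: set kdump memory to {override} MB")
```

With a lookup like this, aarch64 keeps its default and only ppc64le gets the larger reservation, without depending on a needle matching.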
Updated by pcervinka over 4 years ago
Experimental run on spvm with reverted change for ppc64le https://openqa.suse.de/tests/3813095.
Updated by pcervinka over 4 years ago
- Subject changed from [kernel][qam][ppc64le] test fails in kdump_and_crash Leap15.2 ppc64le to [kernel][ppc64le] Update kdump memory size for ppc64le
Updated by pcervinka over 4 years ago
- Status changed from In Progress to Feedback
Let's observe a couple more runs on OSD.
Updated by hjluo about 4 years ago
- Related to action #62267: [sle][Migration][SLE15SP2][Regression] test fails in install_service - kdump lead system crash added
Updated by okurz about 4 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: toolchain_zypper
https://openqa.suse.de/tests/3868582
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by okurz about 4 years ago
- Status changed from Resolved to Feedback
Hi pcervinka, I see the test still failing with this ticket as label: https://openqa.suse.de/tests/3907124#step/kdump_and_crash/69
Updated by pcervinka about 4 years ago
- Status changed from Feedback to Resolved
I would say this is incorrect test auto-labeling; it is a completely different issue (failure at a different step):
[2020-02-21T02:50:39.120 UTC] [debug] output not validating at /var/lib/openqa/cache/openqa.suse.de/tests/sle/lib/kdump_utils.pm line 278.
Please create a new poo and assign it to me. Thank you!
Updated by okurz about 4 years ago
Sorry if my intentions weren't clear. I just cross-checked the existing reports. You could simply delete the automatically carried-over comment from the referenced test scenario to prevent the false match. I did that now and will leave it to the next test reviewer to investigate in detail what the specific issue is.
Updated by pcervinka about 4 years ago
No problem. And looking further at that kdump failure in toolchain_zypper, it was wrongly marked with this poo from the beginning.
Updated by SLindoMansilla about 4 years ago
I created a new ticket for aarch64: #63772
Updated by okurz about 4 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: toolchain_zypper
https://openqa.suse.de/tests/3973817
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by okurz about 4 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: toolchain_zypper
https://openqa.suse.de/tests/4024341
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed