action #69432
test fails with no module details after boot_ltp, broken run-time scheduling?
Description
Observation
openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-ltp_aio_stress_part1@64bit fails after boot_ltp with no details after that. mdoucha suspects broken run-time scheduling:
<okurz> do you think the os-autoinst change "Avoid updating last_good if there is no possible user of it" can explain the ltp failures?
<mdoucha> Possible but unlikely. KLP tests create snapshot as well but work fine. https://openqa.suse.de/tests/4500234
<mdoucha> This looks like broken run-time scheduling. Modules added after VM start are ignored by os-autoinst
<okurz> o3 shows the same problems since 5 days, nobody seems to have realized that: https://openqa.opensuse.org/tests/1340472
<mdoucha> good, that narrows it down to 2 or 3 days, not a full week
Reproducible
Fails since Build 20200721 (current job)
Expected result
Last good: 20200720
Further details
Always latest result in this scenario: latest
Updated by okurz over 4 years ago
Comparing os-autoinst versions I see:
$ git log1 --no-merges dc25ddd8..7963b3d4
ef154996 Avoid updating last_good if there is no possible user of it
98de5809 Simplify runalltests in autotest.pm
e6593f21 Simplify passing test list in tools/invoke-tests
ce0023a1 Fix link to architecture documentation
1eaf6e49 Improve build instructions in README, mainly to cover CMake
990c8f62 CMake: Tweak test execution
d4ffa525 Improve argument parsing and source directory handling in tools/invoke-tests
a32956f9 CMake: Add targets for computing test coverage
54ace987 CMake: Add targets for invoking tests
092821da CMake: Add target for updating dependencies
cf2c737f (okurz/feature/base_os) docker: Bump base OS version to Leap 15.2
I consider these the likely candidates:
ef154996 Avoid updating last_good if there is no possible user of it
98de5809 Simplify runalltests in autotest.pm
e6593f21 Simplify passing test list in tools/invoke-tests
As a test scenario, for example to check a partial revert, we could pick any of the failing LTP test cases that are also fast to run, e.g. https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Server-DVD-Incidents-Kernel&machine=64bit&test=ltp_input&version=12-SP4
First, to reproduce:
openqa-clone-job --within-instance https://openqa.suse.de --skip-chained-deps 4500252 WORKER_CLASS=openqaworker5 TEST=okurz_poo69432_ltp_input _GROUP=0 BUILD=X
Created job #4500617: sle-12-SP4-Server-DVD-Incidents-Kernel-x86_64-Build:15909:kernel-ec2-ltp_input@64bit -> https://openqa.suse.de/t4500617
Created revert https://github.com/os-autoinst/os-autoinst/pull/1490 and applied hotfix on openqaworker5:
curl -s https://raw.githubusercontent.com/os-autoinst/os-autoinst/revert-1483-snapoptim/autotest.pm > /usr/lib/os-autoinst/autotest.pm
Triggered a new test, which passed. Then hotpatched all osd workers with salt on osd:
sudo salt -l error --state-output=changes -C 'G@roles:worker' cmd.run 'curl -s https://raw.githubusercontent.com/os-autoinst/os-autoinst/revert-1483-snapoptim/autotest.pm > /usr/lib/os-autoinst/autotest.pm'
mdoucha will retrigger.
Updated by MDoucha over 4 years ago
The bug is caused specifically by this change in autotest.pm:
- for my $t (@testorder) {
+ for my $testindex (0 .. $#testorder) {
+ my $t = $testorder[$testindex];
If @testorder changes during VM runtime, the index sequence will not be updated and the newly added test modules will be ignored.
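The effect can be illustrated with a small Python analogy. This is only a sketch, not the os-autoinst code: the function names and the module names `ltp_case_a` and `ltp_case_b` are made up, and it assumes that a module such as boot_ltp appends further modules to the list while the loop is already running. The first loop computes its index range once, like `0 .. $#testorder`, and the second one re-reads the list length on every iteration.

```python
# Python analogy of the loop change in autotest.pm (hypothetical names).

def schedule_more(testorder, current):
    """Simulate run-time scheduling: assume boot_ltp appends two modules."""
    if current == "boot_ltp":
        testorder.extend(["ltp_case_a", "ltp_case_b"])


def run_with_fixed_range(testorder):
    """Like `for my $testindex (0 .. $#testorder)`: range is built once."""
    executed = []
    for testindex in range(len(testorder)):  # length read before the loop starts
        t = testorder[testindex]
        executed.append(t)
        schedule_more(testorder, t)  # list grows, but the range does not
    return executed


def run_with_live_bound(testorder):
    """Index-based loop that checks the current list length every iteration."""
    executed = []
    testindex = 0
    while testindex < len(testorder):  # bound re-evaluated each time
        t = testorder[testindex]
        executed.append(t)
        schedule_more(testorder, t)
        testindex += 1
    return executed


fixed_result = run_with_fixed_range(["boot_ltp"])
live_result = run_with_live_bound(["boot_ltp"])
print(fixed_result)  # ['boot_ltp']
print(live_result)   # ['boot_ltp', 'ltp_case_a', 'ltp_case_b']
```

The second variant is just one way to keep an index available while still noticing appended modules. It is not necessarily what the actual fix does.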
I recommend using ltp_math
for debugging, it's the fastest LTP job that uses run-time scheduling.
Updated by okurz over 4 years ago
- Related to action #52673: os-autoinst: Do not save "lastgood" snapshot on last module unless img is preserved with snapshot (e.g. --no-cleanup) added
Updated by okurz over 4 years ago
- Status changed from In Progress to Resolved
@favogt has provided his original PR with the fix in https://github.com/os-autoinst/os-autoinst/pull/1492. I confirmed that it works fine on openqaworker5 with https://openqa.suse.de/t4501133
The current state on osd is ok again, and git master is also fixed by the revert. Merging and verifying the changes by favogt are left for #52673.