action #122683
closed[qe-core][leap15.4 update] fails in journalctl(sporadic issue)
100%
Description
Observation¶
openQA test in scenario opensuse-15.4-DVD-Updates-x86_64-extra_tests_textmode@64bit fails in
journalctl
Test suite description¶
Maintainer: slindomansilla@suse.de.
Mainly post-installation console extra tests.
Reproducible¶
Fails since (at least) Build 20230104-1
Expected result¶
Last good: 20230103-4 (or more recent)
Further details¶
Always latest result in this scenario: latest
Updated by rfan1 almost 2 years ago
The issue is a sporadic one, but can be seen very often!
I am not sure it is a product bug or not, need some experts' investigation.
Updated by dvenkatachala almost 2 years ago
- Status changed from New to In Progress
Updated by ph03nix almost 2 years ago
I can take over. This ticket is for two issues:
Updated by ph03nix almost 2 years ago
Ok, managed to clone the isolated job after some fiddling: https://duck-norris.qe.suse.de/tests/11659
Updated by ph03nix almost 2 years ago
The expected output of journalctl --list-boots
should be something of the following:
-1 1f7a4db8b7324b9da299c39265e74076 Mon 2023-01-09 07:51:02 ESTâÂÂMon 2023-01-09 07:52:28 EST
0 d12abe675e6a4eca9c745cae03929f4f Mon 2023-01-09 07:53:03 ESTâÂÂMon 2023-01-09 07:53:17 EST
Note that the ID from -1
is unique but persistent (id of boot 0 will be id of boot -1 after a reboot).
In the error case, the previous boot log is missing: https://openqa.opensuse.org/tests/3025859#step/journalctl/101
0 1c11c7f913664a28a73b603adcaa5bd7 Mon 2023-01-09 01:39:52 ESTâÂÂMon 2023-01-09 02:17:26 EST
SCRIPT_FINISHEDrRGty-0-
Updated by ph03nix almost 2 years ago
First PR: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/16197
This is an attempt to fix the list-boot
issue. We need to keep monitoring the situation.
Updated by ph03nix almost 2 years ago
What's interesting is that https://openqa.opensuse.org/tests/3025859#step/journalctl/101 suggests, that the previous boot is shown, but the current one not (yet?):
# journalctl --list-boots
0 1c11c7f913664a28a73b603adcaa5bd7 Mon 2023-01-09 01:39:52 ESTâÂÂMon 2023-01-09 02:17:26 EST
The time range from 01:39:52 to 02:17:26 suggests that this is the previous boot. What we are missing is the listing of the current boot. Perhaps a flush
after a fresh reboot might help here?
Updated by ph03nix almost 2 years ago
Ok, I could confuse journald by setting the time to the future:
leap15-4:~ # journalctl --list-boot
-5 d9773467a0b1482c88da5943aa1d27e1 Tue 2023-01-10 11:05:36 CET—Tue 2023-01-10 11:37:27 CET
-4 8117a46f8ea149dd9d5e128cf99f8df8 Tue 2023-01-10 11:37:39 CET—Tue 2023-01-10 11:38:35 CET
-3 35a4266dd6834bffbcc8ad5db85ea2b2 Tue 2023-01-10 11:38:48 CET—Tue 2023-01-10 01:02:11 CET
-2 6aec08ab7e1349199aaf743842151c5d Tue 2023-01-10 11:40:01 CET—Tue 2023-01-10 01:02:18 CET
-1 3e0679efb092442e8b033920344bb17f Tue 2023-01-10 11:48:14 CET—Tue 2023-01-10 01:00:09 CET
0 6b3f8bf2b76f44a8ba43b66f1316e92d Tue 2023-01-10 11:54:04 CET—Tue 2023-01-10 23:26:15 CET
leap15-4:~ # date --set 14:15:16
Tue Jan 10 14:15:16 CET 2023
leap15-4:~ # reboot
Connection to leap15_4 closed by remote host.
Connection to leap15_4 closed.
phoenix@racetrack-7290:~> ssh leap15_4
Last login: Tue Jan 10 11:57:19 2023 from 192.168.122.1
==== openSUSE Leap 15.4 | Test system ==========================================
== ==
== Welcome on the openSUSE Leap 15.4 test system. ==
== This machine is a disposable test machine, so don't commit any changes ==
== 2 cores | 2 GB RAM | 20 GB HDD ==
== 192.168.122.7 ==
== ==
===================================================== Have a lot of fun ... ====
leap15-4:~ # journalctl --list-boot
-5 d9773467a0b1482c88da5943aa1d27e1 Tue 2023-01-10 11:05:36 CET—Tue 2023-01-10 11:37:27 CET
-4 8117a46f8ea149dd9d5e128cf99f8df8 Tue 2023-01-10 11:37:39 CET—Tue 2023-01-10 11:38:35 CET
-3 35a4266dd6834bffbcc8ad5db85ea2b2 Tue 2023-01-10 11:38:48 CET—Tue 2023-01-10 01:02:11 CET
-2 6aec08ab7e1349199aaf743842151c5d Tue 2023-01-10 11:40:01 CET—Tue 2023-01-10 01:02:18 CET
-1 3e0679efb092442e8b033920344bb17f Tue 2023-01-10 11:48:14 CET—Tue 2023-01-10 01:00:09 CET
0 6b3f8bf2b76f44a8ba43b66f1316e92d Tue 2023-01-10 11:54:04 CET—Tue 2023-01-10 23:26:15 CET
leap15-4:~ # journalctl --flush
leap15-4:~ # journalctl --list-boot
-5 d9773467a0b1482c88da5943aa1d27e1 Tue 2023-01-10 11:05:36 CET—Tue 2023-01-10 11:37:27 CET
-4 8117a46f8ea149dd9d5e128cf99f8df8 Tue 2023-01-10 11:37:39 CET—Tue 2023-01-10 11:38:35 CET
-3 35a4266dd6834bffbcc8ad5db85ea2b2 Tue 2023-01-10 11:38:48 CET—Tue 2023-01-10 01:02:11 CET
-2 6aec08ab7e1349199aaf743842151c5d Tue 2023-01-10 11:40:01 CET—Tue 2023-01-10 01:02:18 CET
-1 3e0679efb092442e8b033920344bb17f Tue 2023-01-10 11:48:14 CET—Tue 2023-01-10 01:00:09 CET
0 6b3f8bf2b76f44a8ba43b66f1316e92d Tue 2023-01-10 11:54:04 CET—Tue 2023-01-10 23:26:15 CET
leap15-4:~ # timedatectl
Local time: Tue 2023-01-10 11:59:27 CET
Universal time: Tue 2023-01-10 10:59:27 UTC
RTC time: Tue 2023-01-10 00:05:45
Time zone: Europe/Amsterdam (CET, +0100)
System clock synchronized: no
NTP service: inactive
RTC in local TZ: no
Will investigate further.
Updated by ph03nix over 1 year ago
Ok, so the issue is that the log is not being present, if the time is set to a value that is between the range of a previous boot. e.g. if the previous boot is between 13:26:20 CET
and 16:00:08 CET
(see below) and the system time is set to a value between those, then it will not show up.
See the following reproducer:
leap15-4:~ # journalctl --list-boot
0 fe15c0cee26e47559e3c11b149b440a2 Tue 2023-01-10 13:26:20 CET—Tue 2023-01-10 13:29:06 CET
leap15-4:~ # date --set="20:00:00"
Tue Jan 10 20:00:00 CET 2023
leap15-4:~ # timedatectl
Local time: Tue 2023-01-10 20:00:05 CET
Universal time: Tue 2023-01-10 19:00:05 UTC
RTC time: Tue 2023-01-10 12:32:45
Time zone: Europe/Amsterdam (CET, +0100)
System clock synchronized: no
NTP service: active
RTC in local TZ: no
leap15-4:~ # journalctl --flush
leap15-4:~ # journalctl --list-boot
0 fe15c0cee26e47559e3c11b149b440a2 Tue 2023-01-10 13:26:20 CET—Tue 2023-01-10 20:00:05 CET
leap15-4:~ # date --set="16:00:00"
Tue Jan 10 16:00:00 CET 2023
leap15-4:~ # reboot
leap15-4:~ # journalctl --list-boot
0 fe15c0cee26e47559e3c11b149b440a2 Tue 2023-01-10 13:26:20 CET—Tue 2023-01-10 16:00:08 CET
leap15-4:~ #
We need to ensure the time is set correctly before enabling the persistent journal, otherwise we might run into this issue.
Updated by ph03nix over 1 year ago
We should also move the journalctl
test to be executed after the ntp_client
test: https://openqa.opensuse.org/tests/3011030 to avoid time synchronization issues.
Updated by ph03nix over 1 year ago
Move journalctl after ntp_client
: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/16209
Updated by ph03nix over 1 year ago
Found upstream bug: https://bugzilla.suse.com/show_bug.cgi?id=1207042
Updated by rfan1 over 1 year ago
- Related to action #122974: [qe-core][qem]test fails in journalctl, we may need start "chronyd" service at first added