okurz@openqa:~> sudo salt -C 'G@osarch:ppc64le' cmd.run 'ppc64_cpu --smt'
petrol.qe.nue2.suse.org:
SMT is off
grenache-1.oqa.prg2.suse.org:
SMT=8
mania.qe.nue2.suse.org:
SMT is off
and on diesel I also see SMT=8,
so clearly the smt_off service did not do its job. I did an explicit systemctl stop smt_off && systemctl start smt_off
but that had no effect either. Also ppc64_cpu --smt=off
is not effective, and neither is a reboot.
Trying alternative ways to disable SMT already at boot time. According to https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html there is a parameter "nosmt" which is also available for PPC, but we run kernel 5.3 where that flag does not yet seem to be supported according to https://www.kernel.org/doc/html/v5.3/admin-guide/kernel-parameters.html. The patch adding it for PPC is described in https://lwn.net/Articles/936829/
Also
$ cat /sys/devices/system/cpu/smt/control
notimplemented
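A minimal sketch of how that sysfs check could be wrapped for use on several hosts, assuming only the file path shown above. The function name and the optional path argument are hypothetical and exist only so the check can be tried against a test file:

```shell
#!/bin/sh
# Sketch: print the kernel's SMT control state, or "unknown" if the sysfs
# file does not exist (e.g. on kernels without this interface).
# The optional path argument is a hypothetical convenience for testing.
smt_control_state() {
    file="${1:-/sys/devices/system/cpu/smt/control}"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unknown"
    fi
}
```

Calling smt_control_state without arguments on the affected machines would simply print "notimplemented" as above, meaning this kernel interface cannot be used to switch SMT there.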
Trying a complete poweroff and on again. No effect.
Maybe an upgrade happened and diesel rebooted afterwards, on request or by accident, while petrol+mania haven't rebooted yet? On petrol in journalctl auto-update
I see that, as expected, kernel-default and multiple related patches are locked, but at least one other patch, for "sed", was installed. That will certainly not trigger a request for reboot, but petrol and mania show that they have been up for 58 days, so other updates that accumulated since then might be at fault here. systemctl status
says that petrol has been up since Tue 2024-01-30 15:54:02 CET; 1 month 28 days ago
so we should check in /var/log/zypp/history what was installed since then. Or try an earlier snapshot.
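To narrow down what was installed since the last boot, something like the following could be used. This is only a sketch: the function name is made up, and it assumes the usual '|'-separated history layout of timestamp|action|name|version|...:

```shell
#!/bin/sh
# Sketch: list packages installed on or after a given date from a zypp history file.
# Usage: zypp_installs_since YYYY-MM-DD [history-file]
# Assumes '|'-separated fields: timestamp|action|name|version|...
zypp_installs_since() {
    since="$1"
    file="${2:-/var/log/zypp/history}"
    # ISO dates sort lexically, so a string comparison on the date part is enough.
    awk -F'|' -v since="$since" '
        $2 == "install" && substr($1, 1, 10) >= since { print $1 "|" $3 "|" $4 }
    ' "$file"
}
```

For petrol that would be zypp_installs_since 2024-01-30, run with enough permissions to read the history file.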
On diesel snapper list
says
# snapper list
# | Type | Pre # | Date | User | Used Space | Cleanup | Description | Userdata
------+--------+-------+--------------------------+------+------------+---------+------------------------+--------------
0 | single | | | root | | | current |
2184* | single | | Thu Jul 28 12:57:20 2022 | root | 16.50 MiB | | writable copy of #2181 |
3166 | pre | | Tue Dec 12 06:34:11 2023 | root | 339.06 MiB | number | zypp(zypper) | important=yes
3167 | post | 3166 | Tue Dec 12 06:35:00 2023 | root | 247.69 MiB | number | | important=yes
3292 | pre | | Thu Jan 25 02:03:10 2024 | root | 313.69 MiB | number | zypp(zypper) | important=yes
3293 | post | 3292 | Thu Jan 25 02:08:18 2024 | root | 241.19 MiB | number | | important=yes
3440 | pre | | Thu Mar 14 02:03:08 2024 | root | 292.81 MiB | number | zypp(zypper) | important=yes
3441 | post | 3440 | Thu Mar 14 02:05:31 2024 | root | 176.69 MiB | number | | important=yes
3448 | pre | | Sat Mar 16 02:02:51 2024 | root | 174.44 MiB | number | zypp(zypper) | important=yes
3449 | post | 3448 | Sat Mar 16 02:03:12 2024 | root | 176.62 MiB | number | | important=yes
3472 | pre | | Tue Mar 26 02:02:57 2024 | root | 160.81 MiB | number | zypp(zypper) | important=yes
3473 | post | 3472 | Tue Mar 26 02:03:08 2024 | root | 158.19 MiB | number | | important=yes
3482 | pre | | Thu Mar 28 02:03:05 2024 | root | 182.06 MiB | number | zypp(zypper) | important=no
3483 | post | 3482 | Thu Mar 28 02:03:39 2024 | root | 5.50 MiB | number | | important=no
3484 | pre | | Thu Mar 28 06:14:51 2024 | root | 960.00 KiB | number | zypp(zypper) | important=no
3485 | post | 3484 | Thu Mar 28 06:15:14 2024 | root | 1.75 MiB | number | | important=no
3486 | pre | | Fri Mar 29 02:03:14 2024 | root | 7.06 MiB | number | zypp(zypper) | important=no
3487 | post | 3486 | Fri Mar 29 02:03:32 2024 | root | 576.00 KiB | number | | important=no
3488 | pre | | Fri Mar 29 02:04:08 2024 | root | 512.00 KiB | number | zypp(zypper) | important=no
3489 | post | 3488 | Fri Mar 29 02:04:15 2024 | root | 1.12 MiB | number | | important=no
3490 | pre | | Fri Mar 29 06:14:53 2024 | root | 1.69 MiB | number | zypp(zypper) | important=no
3491 | post | 3490 | Fri Mar 29 06:15:16 2024 | root | 1.56 MiB | number | | important=no
The snapshots on date 2024-01-25 correspond to the following section from /var/log/zypp/history:
2024-01-25 02:04:25|command|root@diesel|'zypper' '-n' '--no-refresh' '--non-interactive-include-reboot-patches' 'patch' '--replacefiles' '--auto-agree-with-licenses' '--download-in-advance'|
2024-01-25 02:04:30|install|ghostscript|9.52-150000.180.1|ppc64le||repo-sle-update|f54ef118624a5ab3f6f7dd71a916042771f3cc23d3de2147b2ba4dc17f768ba0|
2024-01-25 02:04:30|install|ghostscript-x11|9.52-150000.180.1|ppc64le||repo-sle-update|698afe39e0b153c3121d412fb893bcadd4e7c70447c1fbfea936313450d4c26f|
2024-01-25 02:04:30|install|libsystemd0|249.17-150400.8.40.1|ppc64le||repo-sle-update|38c0b17187c869d87b05ada1fb52650f3b6da409cc6af08f65301b8ac1401f57|
2024-01-25 02:04:31|install|libudev1|249.17-150400.8.40.1|ppc64le||repo-sle-update|738ffa24b6b2dbe745dc60eeefaa838b1bae2ccfec2ad0b9cce5d2e24546fe19|
2024-01-25 02:04:48|install|systemd|249.17-150400.8.40.1|ppc64le||repo-sle-update|73c2c0fe61ff700f2a1ed43bbea574bb4a5177a15c833e2a2653a4eb6a644378|
2024-01-25 02:04:57|install|systemd-container|249.17-150400.8.40.1|ppc64le||repo-sle-update|05866df31f992d77cf1d421ea9cbeb077b096d3abef7952e0d590ed0b1571bf7|
2024-01-25 02:04:57|install|yast2-pkg-bindings|4.5.3-150500.3.3.1|ppc64le||repo-sle-update|01afd2e363728a7bb41773c4659a1d9b95af418e6ee9d14cd7c09a23e2ed5e05|
2024-01-25 02:05:09|install|systemd-network|249.17-150400.8.40.1|ppc64le||repo-sle-update|0af13c615c3348241cad1a3b52a6cf16a8c88f73acf6c65eb84904da49f348a7|
2024-01-25 02:05:18|install|udev|249.17-150400.8.40.1|ppc64le||repo-sle-update|f50c81b9ade932af51457528f9d017d33b46d2866d99c3fcdd97d9dd87769f4e|
2024-01-25 02:05:18|install|systemd-sysvinit|249.17-150400.8.40.1|ppc64le||repo-sle-update|7fbd22b5b6e6abbfe9a24813a71ee7db687cceb040edadb67452f4deb06c42a0|
2024-01-25 02:05:18|install|systemd-lang|249.17-150400.8.40.1|noarch||repo-sle-update|97c2f4cfcadf70bd920ef694516249ad2b8d1e2f33b98814b85f482aaeba4947|
2024-01-25 02:05:19|install|systemd-coredump|249.17-150400.8.40.1|ppc64le||repo-sle-update|7db32c9702d820f2abbc2e0c4f07a37d0e0847dd0ce25ada65067e2141387bcf|
so I could try snapshot 3293, which was taken before the last boot of petrol+mania, and check if SMT is still disabled in that one. After booting into it I could confirm that SMT is indeed off. zypper dup
shows a longer list of pending updates. Let's first crosscheck without installing anything more: mv /etc/systemd/system/auto-update.timer{,.disabled-poo158266}
and reboot. After reboot SMT is still off. Now trying to install only some packages from the pending zypper dup
and check the effect:
sudo zypper in perl-Mojolicious-Plugin-AssetPack perl-Mojo-IOLoop-ReadWriteProcess python3-nftables yast2-theme yast2-packager yast2-network yast2-logs yast2-http-server yast2 xorg-x11-server-Xvfb xorg-x11-server wicked-service wicked wget-lang wget webkit2gtk-4_0-injected-bundles timezone stress-ng-bash-completion stress-ng sed-lang sed WebKitGTK-4.0-lang cmake cmake-full os-autoinst openQA-worker
After that I should try
sudo zypper -n in aaa_base aaa_base-extras audit bind-utils cmake-man containerd
coreutils coreutils-doc coreutils-lang cpio cpio-lang cpio-mt cpp7 dhcp docker docker-bash-completion docker-rootless-extras docker-zsh-completion gcc7 gcc7-c++ gdb ghostscript ghostscript-x11 git-core
git-gui gitk glibc glibc-devel glibc-extra glibc-i18ndata glibc-lang glibc-locale glibc-locale-base
gnutls grub2 grub2-powerpc-ieee1275 grub2-powerpc-ieee1275-extras grub2-snapper-plugin
grub2-systemd-sleep-plugin hwdata inst-source-utils kdump kpartx krb5 libOpenCL1 libaom3 libasan4
libaudit1 libauparse0 libavahi-client3 libavahi-common3 libavahi-glib1 libavcodec57 libavformat57
libavif13 libavresample3 libavutil55 libbluetooth3 libdns_sd libduktape206 libfreebl3 libgfortran4
libgif7 libgnutls30 libjasper4 libjavascriptcoregtk-4_0-18 libmaxminddb0 libmetalink3 libmpath0
libnetpbm11 libnftables1 libopenblas_openmp0 libopenblas_pthreads0 libopenssl-1_1-devel libopenssl1_0_0
libopenssl1_1 libopenssl3 libopenvswitch-2_14-0 libpq5 libpython2_7-1_0 libpython3_6m1_0 libqrencode4
libsoftokn3 libsource-highlight4 libssh-config libssh2-1 libssh4 libstdc++6-devel-gcc7 libswresample2
libswscale4 libtesseract5 libtiff5 libubsan0 libuv1 libwebkit2gtk-4_0-37 libxkbcommon-x11-0
libxkbcommon0 libxml2-2 libxml2-tools login_defs mozilla-nss mozilla-nss-certs multipath-tools netcfg
netpbm nftables nscd openssh openssh-askpass-gnome openssh-clients openssh-common openssh-helpers
openssh-server openssl-1_1 openvswitch pam-config perl-Bootloader perl-Git powerpc-utils ppc64-diag
python python-base python-curses python-xml python3 python3-M2Crypto python3-attrs python3-base
python3-bind python3-curses python3-dbm python3-nftables python3-pycryptodome python3-pyserial
python3-rpm python3-tk rpm runc sed sed-lang shadow stress-ng stress-ng-bash-completion sudo
sudo-plugin-python supportutils suse-module-tools system-group-audit systemd-presets-common-SUSE
systemd-rpm-macros tesseract-ocr timezone webkit2gtk-4_0-injected-bundles wget wget-lang wicked
wicked-service xorg-x11-server xorg-x11-server-Xvfb yast2 yast2-http-server yast2-logs yast2-network
yast2-packager yast2-theme
I see "powerpc-utils" in the list, which is the package providing "ppc64_cpu". I did zypper al -m "poo#158266: ppc64_cpu --smt=off becomes ineffective" powerpc-utils
and will first try to upgrade all other packages and see if that is also problematic.
Installed, rebooted, ppc64_cpu --smt
shows off, so all good. So the upgrade that makes the difference is powerpc-utils 1.3.11-150500.3.6.1 -> 1.3.11-150500.3.14.3,
let's see if we can bisect.
One version in between, powerpc-utils-1.3.11-150500.3.9.1.ppc64le, is not affected.
Crosschecking again with 1.3.11-150500.3.9.1 -> 1.3.11-150500.3.14.3:
ppc64_cpu --smt && ppc64_cpu --smt=on && ppc64_cpu --smt && ppc64_cpu --smt=off && ppc64_cpu --smt
shows, as expected for the affected version, that SMT stays on:
SMT=8
SMT=8
SMT=8
Let's check whether we need a reboot after the downgrade:
zypper in --oldpackage powerpc-utils-1.3.11-150500.3.9.1
ppc64_cpu --smt && ppc64_cpu --smt=on && ppc64_cpu --smt && ppc64_cpu --smt=off && ppc64_cpu --smt
SMT is off
One or more cpus could not be on/offlined
So it's off now but I can't switch it on again. Upgrading back to the affected version:
zypper -n in powerpc-utils-1.3.11-150500.3.14.3.ppc64le
ppc64_cpu --smt && ppc64_cpu --smt=on && ppc64_cpu --smt && ppc64_cpu --smt=off && ppc64_cpu --smt
SMT is off
SMT is off
SMT is off
So apparently the new version can't switch SMT on when it is off and, as we already saw after the reboot, it can't switch SMT off when it is on.
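The manual on/off sequence used in these checks could be wrapped into a small helper to repeat it per package version. This is a rough sketch only: the function name and the PPC64_CPU override variable are hypothetical, and it relies only on the two output forms seen in this log ("SMT=<n>" and "SMT is off"):

```shell
#!/bin/sh
# Sketch: check whether 'ppc64_cpu --smt=<mode>' actually changes the SMT state.
# PPC64_CPU is a hypothetical override so the check can be tried with a stub.
PPC64_CPU="${PPC64_CPU:-ppc64_cpu}"

smt_toggle_check() {
    for mode in on off; do
        # Ignore the exit code of the switch itself; only the resulting state matters.
        "$PPC64_CPU" --smt="$mode" >/dev/null 2>&1
        state=$("$PPC64_CPU" --smt 2>&1)
        case "$mode:$state" in
            on:SMT=*|off:"SMT is off")
                echo "$mode: ok ($state)" ;;
            *)
                echo "$mode: INEFFECTIVE ($state)"
                return 1 ;;
        esac
    done
}
```

Running smt_toggle_check as root after each install of a candidate powerpc-utils version would print one line per mode and return non-zero as soon as a switch has no effect. Note that it leaves SMT off if both switches work.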
I will create a proper lock on all our o3&osd ppc machines and then create a bugzilla entry.
https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/763 for OSD. Merged right now to have the package lock effective before petrol+mania are affected. I downgraded on kerosene (o3) and petrol+mania.