action #76951

closed

Check if new firmware for kerosene (aka. power8.o.o) exists and remove os-autoinst workarounds again when according machine settings are applied when necessary size:M

Added by okurz over 3 years ago. Updated 22 days ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 2020-10-25
Due date: 2024-04-17
% Done: 0%
Estimated time:

Description

Motivation

New firmware might help to prevent qemu failing to run. If we find new firmware we could remove the parameters in os-autoinst again, see clone source ticket

Suggestions

  • Read about context of the needed workaround #75259
  • Currently https://kerosene-sp.qe.nue2.suse.org lists FW840.00. Compare to other machines like diesel+petrol to see if there is a newer ASM version?
  • Look for new firmware for the machine, just search for new firmware on IBM web pages
  • Check if the new firmware means we do not need https://github.com/os-autoinst/os-autoinst/pull/1554 anymore. If yes, remove the workaround again. If no, still remove it from os-autoinst but add the corresponding settings to the machine settings in openQA. This is also what "adamw" did:
[04/11/2020 17:41:52] <adamw> okurz: i don't really know what the consequences of it are, but i tend to the idea that qemu wouldn't be trying to make it the default without reason :) i can ask some virt guys if you like
[04/11/2020 17:42:09] <adamw> okurz: but on the whole, yes, it seems to be it'd be more appropriate to put it in your templates rather than hardwire it into os-autoinst.
[04/11/2020 17:42:29] <adamw> that's what i was doing when we had the problem (i was setting an older machine type in our ppc64le Machine vars)
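
The idea of keeping such overrides in machine settings rather than hardwiring them could look roughly like the following sketch. It only shows how a QEMU "-machine" argument is composed from a machine type plus capability overrides. The helper name machine_arg is hypothetical and not part of os-autoinst or openQA, and "pseries" is assumed as the ppc64 machine type.

```shell
#!/bin/sh
# Sketch: build a QEMU "-machine" argument from a machine type plus
# optional capability overrides such as cap-cfpc=broken.
# machine_arg is a hypothetical helper, not an os-autoinst or openQA API.
machine_arg() {
    mtype=$1
    shift
    arg=$mtype
    # Append every remaining parameter as a comma-separated property.
    for opt in "$@"; do
        arg="$arg,$opt"
    done
    printf '%s\n' "-machine $arg"
}
```

Calling machine_arg pseries cap-cfpc=broken yields the argument with the override attached, so the override can live in per-machine configuration and be dropped once the firmware no longer needs it.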

Related issues 3 (0 open, 3 closed)

  • Related to openQA Infrastructure - action #63142: Upgrade firmware of ppc9 machine redcurrant (Rejected, nicksinger, 2020-02-05)
  • Copied from openQA Infrastructure - action #75259: 100% of powerpc tests incomplete auto_review:"(?s)Running on power8.*qemu-system-ppc64: Requested safe cache capability level not supported by kvm":retry (Resolved, okurz, 2020-10-25)
  • Copied to openQA Infrastructure - action #158526: Apply the latest firmware+BIOS upgrade for diesel as well size:S (Resolved, mkittler)
Actions #1

Updated by okurz over 3 years ago

  • Copied from action #75259: 100% of powerpc tests incomplete auto_review:"(?s)Running on power8.*qemu-system-ppc64: Requested safe cache capability level not supported by kvm":retry added
Actions #2

Updated by okurz over 3 years ago

  • Related to action #63142: Upgrade firmware of ppc9 machine redcurrant added
Actions #3

Updated by okurz over 3 years ago

  • Assignee set to nicksinger

As you want to involve aeisner, could you please also ask him about power8.o.o?

Actions #4

Updated by okurz over 3 years ago

  • Subject changed from Check if new firmware for power8.o.o exists to Check if new firmware for power8.o.o exists and remove os-autoinst workarounds again when according machine settings are applied when necessary
  • Description updated (diff)
Actions #5

Updated by okurz over 3 years ago

  • Target version changed from future to Ready

@nicksinger as discussed you will create a ticket to infra and CC osd-admins@suse.de and aeisner

Actions #6

Updated by nicksinger over 3 years ago

  • Target version changed from Ready to future

I've created an infra ticket which you should already see on the osd-admins ML (Subject: [openQA][ppc] Please upgrade the firmware(s) for power8.opensuse.org). RT might reveal our ticket id in some minutes too…

Actions #7

Updated by nicksinger almost 2 years ago

  • Assignee deleted (nicksinger)

I think we cannot expect anybody other than us to take this task. As I'm not planning to work on this task any time soon, I am unassigning myself. It could be worth doing some kind of mob/pair session on this topic.

Actions #8

Updated by okurz about 1 year ago

  • Tags set to infra
Actions #9

Updated by okurz 3 months ago

  • Target version changed from future to Tools - Next
Actions #10

Updated by okurz 3 months ago

  • Target version changed from Tools - Next to Ready
Actions #11

Updated by okurz 2 months ago

  • Subject changed from Check if new firmware for power8.o.o exists and remove os-autoinst workarounds again when according machine settings are applied when necessary to Check if new firmware for kerosene (aka. power8.o.o) exists and remove os-autoinst workarounds again when according machine settings are applied when necessary
  • Description updated (diff)
  • Status changed from New to Workable
Actions #12

Updated by okurz 2 months ago

  • Subject changed from Check if new firmware for kerosene (aka. power8.o.o) exists and remove os-autoinst workarounds again when according machine settings are applied when necessary to Check if new firmware for kerosene (aka. power8.o.o) exists and remove os-autoinst workarounds again when according machine settings are applied when necessary size:M
Actions #13

Updated by okurz 29 days ago

  • Priority changed from Low to Normal
Actions #14

Updated by okurz 24 days ago

  • Status changed from Workable to In Progress
  • Assignee set to okurz
Actions #15

Updated by nicksinger 24 days ago

  • Assignee changed from okurz to nicksinger
Actions #16

Updated by nicksinger 24 days ago

Used https://www.ibm.com/support/fixcentral/main/selectFixes: first entered the machine type "8247-22L", then followed the wizard to get a download link for "POWER8 System Firmware SV860_245 (FW860.B3)" and grabbed the tar.gz file.
I then followed https://www.ibm.com/support/pages/node/6985523 "Installing the Firmware" to install it, which immediately rebooted the system. It took ~5 minutes before anything came back online, but we now see "FW860.B3" in the ASM web interface.
I haven't tested yet whether the workaround for qemu is still needed or whether we need a different firmware (e.g. for PCIe cards) for that. Will do that now.

Actions #17

Updated by nicksinger 24 days ago

https://github.com/os-autoinst/os-autoinst/pull/1554 mentions the previous error we had:

QEMU: qemu-system-ppc64: Requested safe cache capability level not supported by kvm, try appending -machine cap-cfpc=broken

now we have:

kerosene-8:~ # /usr/bin/qemu-system-ppc64
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-cfpc=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-sbbc=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-ibs=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-ccf-assist=on

"workaround" sounds better than "broken" so maybe we already improved? I'm trying to research a little more about this to understand the impact and possible fixes (more firmwares?)

Actions #18

Updated by nicksinger 24 days ago

Never mind, with "-enable-kvm" everything works just fine. I hot-patched the worker to see if the mentioned options are still needed. So far it looks quite successful.
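
The TCG warnings appeared because qemu was started without KVM acceleration. A minimal sketch of guarding against that when testing by hand: only pass "-enable-kvm" when the KVM device node is usable. The helper name accel_args is hypothetical, and the /dev/kvm default path is an assumption about the worker host.

```shell
#!/bin/sh
# Sketch: print "-enable-kvm" only when the KVM device node is usable.
# accel_args is a hypothetical helper; the path defaults to /dev/kvm.
accel_args() {
    dev=${1:-/dev/kvm}
    # KVM needs read and write access to the device node.
    if [ -r "$dev" ] && [ -w "$dev" ]; then
        printf '%s\n' '-enable-kvm'
    fi
}
```

The output can be spliced into a manual qemu-system-ppc64 call. If nothing is printed, qemu falls back to TCG and the capability warnings are expected.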

Actions #19

Updated by nicksinger 23 days ago

  • Status changed from In Progress to Feedback
Actions #20

Updated by okurz 23 days ago

  • Due date set to 2024-04-17

https://github.com/os-autoinst/os-autoinst/pull/2480 merged. osd-deployment was stuck, currently running https://gitlab.suse.de/openqa/osd-deployment/-/pipelines/1066124. Please monitor the impact on o3+osd and resolve the ticket if no related job failures show up.

Actions #21

Updated by nicksinger 23 days ago

  • Status changed from Feedback to In Progress

Got feedback about mania: https://suse.slack.com/archives/C02CANHLANP/p1712233554981339 which fails https://openqa.suse.de/tests/13943972 because of:

[2024-04-04T12:22:56.372110Z] [warn] [pid:11706] !!! : qemu-system-ppc64: Requested count cache flush assist capability level not supported by KVM
[2024-04-04T12:22:56.372192Z] [debug] [pid:11706] QEMU: Try appending -machine cap-ccf-assist=off

Machine was still running FW860.42 (from 2018), I just conducted the upgrade to FW860.B3 (from 2023).
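
To spot other workers with the same problem, the override that qemu suggests can be pulled out of worker logs. The following is a sketch only: the helper name suggested_caps is hypothetical, and it assumes the log lines keep the "appending -machine cap-...=..." wording quoted in this ticket.

```shell
#!/bin/sh
# Sketch: read a log on stdin and print each capability override that
# QEMU suggests via "appending -machine cap-...=...", de-duplicated.
# suggested_caps is a hypothetical helper for illustration.
suggested_caps() {
    grep -oE 'appending -machine cap-[a-z-]+=[a-z]+' \
        | sed 's/^.*-machine //' \
        | sort -u
}
```

Any output means the host still rejects a default capability level, which in this ticket pointed to outdated firmware.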

Actions #22

Updated by nicksinger 22 days ago

New firmware committed onto mania after validating it works correctly again. Diesel is fine as well. Grenache hasn't executed tests for a long time, but given it is hmc-controlled the change should have no impact on that one unless we use nested virt somewhere.

This leaves only petrol as PPC-worker in OSD and it indeed shows the same problem. I'm going to upgrade the firmware there as well.

Actions #23

Updated by nicksinger 22 days ago

petrol is our first different PowerPC platform. I looked up the product ID in http://petrol-sp.qe.nue2.suse.org -> "FRU Information" -> "FRU Device ID: 3" which is "8335-GCA" and used it to download "OP820" for it with version "OP8_v1.12_2.98". The included readme (https://ak-delivery04-mul.dhe.ibm.com/sar/CMA/SFA/08ct2/0/S822LC-8335-GCA-GTA-OpenPowerReadme_op820.30.xhtml) mentioned the necessary ipmitool commands to flash it:

ipmitool -H <BMC_IP> -U ADMIN  -I lanplus -P admin -z 30000 hpm upgrade <xxxxx.hpm> component 0 force
ipmitool -H <BMC_IP> -U ADMIN  -I lanplus  -P admin -z 30000 hpm upgrade <xxxxx.hpm> component 1 force
# Wait for BMC to reboot  (It takes about 2-5 minutes for BMC to reach ready state. The 5 minute wait is recommended)..
ipmitool -H <BMC_IP> -I lan -U ADMIN -P admin raw 0x3a 0x0a
# If it returns 0x00 then the BMC is at ready state, otherwise it is not yet ready to continue with the next step
ipmitool -H <BMC_IP>  -U ADMIN -I lanplus  -P admin -z 30000 hpm upgrade <xxxxx.hpm> component 2 force

I used mania to execute these commands as it is in the same network and it reduces the risk of a failed flash. After a reboot of the machine the qemu issues are gone:
https://openqa.suse.de/tests/13949433
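
Instead of waiting a fixed five minutes between the flash steps, the readiness query from the readme can be polled. This is a sketch under assumptions, not the documented procedure: the function names, the BMC_IP, IPMI_USER and IPMI_PASS variables, the overridable IPMITOOL variable, and the retry defaults are all made up for illustration, and the reply is assumed to be a single byte printed as "00" or "0x00".

```shell
#!/bin/sh
# Sketch: poll the BMC until the vendor-specific readiness query
# ("raw 0x3a 0x0a" from the readme) reports 0x00.
# IPMITOOL, BMC_IP, IPMI_USER, IPMI_PASS and the retry defaults are
# assumptions for illustration.
IPMITOOL=${IPMITOOL:-ipmitool}

bmc_ready() {
    out=$("$IPMITOOL" -H "$BMC_IP" -I lan -U "$IPMI_USER" -P "$IPMI_PASS" \
        raw 0x3a 0x0a 2>/dev/null) || return 1
    # Normalize the reply: drop whitespace and an optional "0x" prefix.
    out=$(printf '%s' "$out" | tr -d '[:space:]')
    [ "${out#0x}" = "00" ]
}

wait_for_bmc() {
    tries=${1:-30}   # number of polls before giving up
    delay=${2:-10}   # seconds between polls
    while [ "$tries" -gt 0 ]; do
        bmc_ready && return 0
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}
```

Running wait_for_bmc after flashing components 0 and 1 and continuing with component 2 only when it succeeds keeps the sequence from the readme while avoiding a blind wait.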

Actions #25

Updated by okurz 22 days ago

  • Copied to action #158526: Apply the latest firmware+BIOS upgrade for diesel as well size:S added