action #157432

closed

parted /dev/sda disk got error at powerVM worker

Added by tinawang123 about 1 month ago. Updated about 12 hours ago.

Status:
Rejected
Priority:
Low
Assignee:
Category:
Support
Target version:
Start date:
2024-03-18
Due date:
% Done:

0%

Estimated time:

Description

Failed job:
https://openqa.suse.de/tests/13768555#step/bootloader_start/39
Reproduce steps:

  1. wipefs -af /dev/sda (the disk is erased successfully)
  2. sync
  3. parted -s /dev/sda mklabel gpt fails with:
     Error: Partition(s) 2, 3 on /dev/sda have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use. As a result, the old partition(s) will remain in use. You should reboot now before making future changes.
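
The parted error above says the old partitions are probably still in use. A minimal sketch of how one might check what is holding them before calling parted is given here. The helper names `run` and `show_disk_users` and the `DRY_RUN` variable are hypothetical and not part of the test code; only the standard `lsblk`, `swapon` and `dmsetup` tools are assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper: print each command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Hypothetical helper: list things that may keep partitions of a disk busy.
show_disk_users() {
    disk=$1
    run lsblk -o NAME,TYPE,MOUNTPOINT "$disk"   # mounted filesystems and stacked devices
    run swapon --show                            # active swap areas
    run dmsetup ls                               # device-mapper mappings, e.g. LVM volumes
}
```

Calling `show_disk_users /dev/sda` as root right before step 3 would show whether a mount, swap area or device-mapper volume from a previous installation is still active on partitions 2 and 3. This is only a diagnostic aid and does not by itself fix the failure.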
Actions #1

Updated by okurz about 1 month ago

  • Category set to Support
  • Status changed from New to In Progress
  • Assignee set to okurz
  • Target version set to Ready
Actions #2

Updated by okurz about 1 month ago

@tinawang123 I don't think we can do anything about that from the openQA side. This is quite usual behaviour that I have seen in Linux regardless of the architecture, and it is definitely not related to a specific machine. You need to handle that behaviour in the test code accordingly.

Actions #3

Updated by JERiveraMoya about 1 month ago

  • Subject changed from [tools] parted /dev/sda disk got error at powerVM worker to parted /dev/sda disk got error at powerVM worker

okurz wrote in #note-2:

@tinawang123 I don't think we can do anything about that from the openQA side. This is quite usual behaviour that I have seen in Linux regardless of the architecture, and it is definitely not related to a specific machine. You need to handle that behaviour in the test code accordingly.

This wiping and partitioning was introduced to make the tests more stable, because the disk was reused with whatever the previous job left behind, which made results unpredictable and caused sporadic failures. It has been working for ages on powervm, and on s390x we do something similar, for example: https://openqa.suse.de/tests/13783114#step/bootloader_start/49.
You can compare with what we expect for pvm in an old passing job: https://openqa.suse.de/tests/11163215#step/bootloader_start/35.

Is there any other way to have a fresh lpar there? To reboot at that point and handle it in the test would be overkill; this new setup for some reason doesn't allow that operation to be performed. Googling, I found some results pointing to potential kernel issues (but honestly I have no idea).

Actions #4

Updated by okurz about 1 month ago

JERiveraMoya wrote in #note-3:

Is there any other way to have a fresh lpar there?

Potentially by wiping the LPAR-assigned storage from novalink at the beginning of the test execution.
As an alternative, one could try to force a refresh of storage devices from the Linux system.
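
A minimal sketch of what forcing such a refresh from the Linux system could look like is shown here. The helper names `run` and `refresh_partition_table` and the `DRY_RUN` variable are hypothetical; the commands themselves (`udevadm settle`, `blockdev --rereadpt`, `partprobe`) are standard tools, but whether they succeed on this worker is an assumption, since the kernel may still refuse while the partitions are really in use.

```shell
#!/bin/sh
# Hypothetical helper: print each command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Hypothetical helper: ask the kernel to re-read the partition table of a disk.
refresh_partition_table() {
    disk=$1
    run udevadm settle                          # wait for pending udev events to finish
    # Try the ioctl-based re-read first, fall back to partprobe if it fails.
    run blockdev --rereadpt "$disk" || run partprobe "$disk"
    run udevadm settle                          # let udev process the resulting events
}
```

One might call `refresh_partition_table /dev/sda` between `wipefs` and `parted`. If the re-read still fails as busy, something is actively using the old partitions and needs to be released first.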

To reboot at that point and handle it in the test would be overkill; this new setup for some reason doesn't allow that operation to be performed.

What do you mean by "new setup"? What we have now in PRG2 are the very same machines that were already in use before.

Actions #5

Updated by openqa_review about 1 month ago

  • Due date set to 2024-04-02

Setting due date based on mean cycle time of SUSE QE Tools

Actions #6

Updated by okurz about 1 month ago

  • Status changed from In Progress to Feedback
Actions #7

Updated by okurz about 1 month ago

  • Due date changed from 2024-04-02 to 2024-04-30
  • Priority changed from Normal to Low
  • Target version changed from Ready to Tools - Next

no response. Following up with lower prio

Actions #8

Updated by okurz about 1 month ago

  • Due date deleted (2024-04-30)
  • Status changed from Feedback to Rejected
  • Target version changed from Tools - Next to Ready

rejecting due to no response

Actions #9

Updated by JERiveraMoya about 1 month ago · Edited

Unfortunately, the latest build doesn't get past the bootloader, so I can't give you any feedback here (sorry for the delay in any case).
Once it does, we might consider your advice (although technically I don't know how it can be done).
The issue will most likely persist, but if you prefer to reject it for now we can reopen it later; it's up to you how to handle it.
The other point is that I now know these are the same machines that have had issues for years, thanks for that info.

Actions #10

Updated by JERiveraMoya about 12 hours ago

Here is the sporadic issue: https://openqa.suse.de/tests/latest?arch=ppc64le&distri=sle&flavor=Online&machine=ppc64le-spvm&test=create_hdd_textmode_yast&version=15-SP6#next_previous

Which command did you suggest running before parted to refresh the storage?
For the other suggestion, we don't have the expertise to remove LPAR-assigned storage from novalink.

Could this be connected with https://progress.opensuse.org/issues/157447?
If you need a new ticket instead of reopening this one, please let us know.
