action #44468
[qe-core] Proper handling of assets for svirt workers (open)
0%
Description
Motivation / Observation
While trying to verify an SR that fixes the systemd-testsuite, I was blocked because the OSD workers and the QSF shared workers were malfunctioning.
The problem with QSF shared workers is that the needed qcow2 image was missing:
- sle-15-SP1-s390x-*-textmode@s390x-kvm-sle12.qcow2
This happened because
- openqa.suse.de:/var/lib/openqa/share/factory
was NFS-mounted into
- s390p8.suse.de:/var/lib/openqa/share/factory
and the image had already been cleaned up from openqa.suse.de:/var/lib/openqa/share/factory.
QSF shared-workers (shared-workers.qa.suse.de) have CACHEDIRECTORY configured, but this directory is ignored by the svirt backend.
Taking CACHEDIRECTORY into account requires only a few changes in the test code, but it also needs some configuration at the infrastructure level:
- shared-workers.qa.suse.de:/var/lib/openqa/cache
needs to be made available from
- s390p8.suse.de:/var/lib/openqa/cache
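To illustrate the lookup this implies, here is a minimal Python sketch. It is not the actual test code (which lives in os-autoinst-distri-opensuse); it only assumes that both directories named in this ticket are visible on the SUT's host. The helper name `find_asset` and the search order (cache first, then the factory share) are assumptions made for illustration.

```python
import fnmatch
from pathlib import Path
from typing import Iterable, Optional


def find_asset(pattern: str, search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first file whose name matches `pattern`.

    Hypothetical helper: directories are searched in the given order and
    recursively; directories that do not exist are skipped.
    """
    for directory in search_dirs:
        if not directory.is_dir():
            continue
        for candidate in sorted(directory.rglob("*")):
            if candidate.is_file() and fnmatch.fnmatch(candidate.name, pattern):
                return candidate
    return None


# Directories and image pattern taken from this ticket; the order is an assumption.
SEARCH_DIRS = [
    Path("/var/lib/openqa/cache"),
    Path("/var/lib/openqa/share/factory"),
]
ASSET_PATTERN = "sle-15-SP1-s390x-*-textmode@s390x-kvm-sle12.qcow2"
```

Calling `find_asset(ASSET_PATTERN, SEARCH_DIRS)` on the SUT's host would return `None` in the situation described above, where the image is gone from the factory share and the cache directory is not available there.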
Acceptance criteria
- AC: SUT host machines on svirt backends (e.g. s390p8.suse.de) that have a jump host (e.g. shared-workers.qa.suse.de) with a configured CACHEDIRECTORY have the assets from that CACHEDIRECTORY available.
Suggestions
- Look at what has already been done in that area, e.g. #44468#note-30.
Files
Updated by okurz about 6 years ago
- Related to action #32281: Can't locate images in Xen jobs added
Updated by SLindoMansilla about 6 years ago
- Related to action #36754: [qe-core][functional][systemd][medium] test fails in systemd_testsuite - needs further investigation added
Updated by SLindoMansilla about 6 years ago
- Related to deleted (action #36754: [qe-core][functional][systemd][medium] test fails in systemd_testsuite - needs further investigation)
Updated by SLindoMansilla about 6 years ago
- Blocks action #36754: [qe-core][functional][systemd][medium] test fails in systemd_testsuite - needs further investigation added
Updated by SLindoMansilla about 6 years ago
The attached draft shows the current status (black, blue, green) and the expected status (red).
Updated by SLindoMansilla about 6 years ago
- Status changed from New to In Progress
- Assignee set to SLindoMansilla
As others are also affected (http://openqa-apac1.suse.de/tests/2417#step/bootloader_zkvm/6), this ticket is gaining priority.
Updated by SLindoMansilla about 6 years ago
Made a separate PR for the mandatory fix: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6320
Merged.
Waiting for mgriessmeier's and nsinger's feedback.
Updated by SLindoMansilla about 6 years ago
- Status changed from In Progress to New
What was done
The already merged fix makes the test look for assets on the right machine. Previously, the test looked for assets on the worker machine. On OSD this worked because the assets were NFS-mounted from the web UI host on both the worker and the SUT's host.
On shared-workers.qa.suse.de we don't have the NFS share mounted, so the test could not find the assets.
This is fixed now in this PR: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6191
What is still missing?
The worker cache is not propagated to the svirt remote machine, so the assets are missing there.
We need to discuss how to resolve this problem.
My proposal is to NFS-mount the cache directory from the worker into the SUT's host.
After that, some changes in the test code are needed: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6253
Updated by okurz almost 6 years ago
- Target version changed from Milestone 22 to Milestone 24
Updated by okurz almost 6 years ago
- Assignee deleted (SLindoMansilla)
- Target version changed from Milestone 24 to Milestone 25
Let's delay the work on this one a bit further.
Updated by SLindoMansilla almost 6 years ago
- Status changed from New to Workable
- Assignee set to mgriessmeier
As discussed in the grooming meeting, we want to NFS-mount the directory
from shared-workers.qa.suse.de:/var/lib/openqa/cache
into [zVM|zKVM|sKVM|XEN]:/var/lib/openqa/cache
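As a rough sketch of what such a mount could look like on one of the target hosts, the following Python snippet builds an `/etc/fstab` line and the equivalent `mount` command. The class name `NfsMount` and the read-only option `ro` are assumptions for illustration, not something decided in this ticket; the server would additionally have to export the directory to the target hosts.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NfsMount:
    """Hypothetical description of one NFS mount on a SUT host."""

    server: str
    export: str
    mountpoint: str
    options: str = "ro"  # assumption: the SUT host only needs to read assets

    def source(self) -> str:
        return f"{self.server}:{self.export}"

    def fstab_line(self) -> str:
        # fstab fields: device, mount point, fs type, options, dump, pass
        return f"{self.source()} {self.mountpoint} nfs {self.options} 0 0"

    def mount_command(self) -> List[str]:
        return ["mount", "-t", "nfs", "-o", self.options,
                self.source(), self.mountpoint]


cache_mount = NfsMount(
    server="shared-workers.qa.suse.de",
    export="/var/lib/openqa/cache",
    mountpoint="/var/lib/openqa/cache",
)
print(cache_mount.fstab_line())
```

Using the same path on both sides keeps asset paths identical on the worker and on the SUT's host, which is what the proposal above aims for.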
Updated by mgriessmeier over 5 years ago
- Status changed from Workable to In Progress
Updated by mgriessmeier over 5 years ago
- Target version changed from Milestone 25 to Milestone 26
Updated by mgriessmeier over 5 years ago
- Status changed from In Progress to Feedback
- Target version changed from Milestone 26 to Milestone 27
SLindoMansilla wrote:
As discussed in the grooming meeting, we want to NFS-mount the directory
from shared-workers.qa.suse.de:/var/lib/openqa/cache
into [zVM|zKVM|sKVM|XEN]:/var/lib/openqa/cache
Mounted on s390p8. Please check whether it works as expected; then I will do the rest.
Updated by SLindoMansilla over 5 years ago
Who should check?
I think we forgot about this ticket.
I closed the PR and opened a new draft PR: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/8378
Updated by mgriessmeier over 5 years ago
- Target version changed from Milestone 27 to Milestone 28
let's rediscuss in grooming
Updated by mgriessmeier about 5 years ago
- Target version changed from Milestone 28 to Milestone 30
needs to be discussed offline
Updated by SLindoMansilla almost 5 years ago
- Status changed from Feedback to Workable
- Assignee changed from mgriessmeier to SLindoMansilla
Tasks
- Look at what has been done over the past 8 months regarding NFS on shared-workers
- Refine the ticket in a grooming meeting if more work has to be done
Updated by szarate over 4 years ago
Sergio, do you really plan to work on this ticket?
Also: Xen and Hyper-V are also part of the svirt thing. I guess all this could be unified, as the directories where things are copied are the same.
Updated by SLindoMansilla over 4 years ago
szarate wrote:
Sergio, do you really plan to work on this ticket?
Also: Xen and Hyper-V are also part of the svirt thing. I guess all this could be unified, as the directories where things are copied are the same.
Yes, I "plan" (have the intention) to look into all tickets I have assigned, but I cannot promise any date...
So, if you feel like doing it, please take over. If not, be sure that "some day" I will continue working on this.
Updated by szarate over 4 years ago
- Assignee deleted (SLindoMansilla)
- Priority changed from High to Normal
Updated by szarate over 4 years ago
- Subject changed from [functional][u][labs] Proper handling of assets for svirt workers to [functional][u][tools] Proper handling of assets for svirt workers
Updated by tjyrinki_suse over 4 years ago
- Subject changed from [functional][u][tools] Proper handling of assets for svirt workers to [qe-core][functional][tools] Proper handling of assets for svirt workers
Updated by SLindoMansilla almost 4 years ago
- Blocks deleted (action #36754: [qe-core][functional][systemd][medium] test fails in systemd_testsuite - needs further investigation)
Updated by SLindoMansilla almost 4 years ago
- Related to action #36754: [qe-core][functional][systemd][medium] test fails in systemd_testsuite - needs further investigation added
Updated by okurz about 3 years ago
I came to this ticket due to periodically reviewing tickets as described on https://progress.opensuse.org/projects/openqatests/wiki#How-we-work-on-tickets
This ticket was set to "Normal" priority but was not updated within the SLO period for "Normal" tickets (365 days) as described on https://progress.opensuse.org/projects/openqatests/wiki/Wiki#SLOs-service-level-objectives
First reminder: Please consider picking up this ticket within the next 365 days or just set the ticket to the next lower priority of "Low" (no SLO related time period).
Updated by slo-gin about 2 years ago
This ticket was set to Normal priority but was not updated within the SLO period. Please consider picking up this ticket or just set the ticket to the next lower priority.
Updated by mkittler about 1 year ago
- Description updated (diff)
I tried to implement this via https://github.com/os-autoinst/os-autoinst/pull/2381. So you could enable use of the CACHEDIRECTORY by setting SVIRT_WORKER_CACHE=1. However, be aware of the caveat documented in https://github.com/os-autoinst/os-autoinst/pull/2391/files and that this can lead to other problems (e.g. #138746). That's also why this setting has been disabled by default again via https://github.com/os-autoinst/os-autoinst/pull/2401. Also note that this change doesn't cover VMware yet (but it seems like this ticket is focusing on s390x, where it generally works apart from the mentioned caveats).
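To make the opt-in behaviour concrete, here is a simplified Python sketch of how a flag like SVIRT_WORKER_CACHE could select the asset directory. It does not reproduce the logic of the linked os-autoinst PRs; the function name `asset_base_dir`, the fallback directory, and the rule that only the value "1" enables the cache are assumptions for illustration.

```python
from typing import Mapping, Optional

# Fallback directory taken from this ticket's description.
SHARE_DIR = "/var/lib/openqa/share/factory"


def asset_base_dir(settings: Mapping[str, str],
                   cache_directory: Optional[str]) -> str:
    """Pick the directory to read assets from (hypothetical helper).

    The worker cache is used only if it is configured AND the job opts in
    via SVIRT_WORKER_CACHE=1; otherwise the shared directory is used,
    mirroring the "disabled by default" behaviour described above.
    """
    opted_in = settings.get("SVIRT_WORKER_CACHE") == "1"
    if opted_in and cache_directory:
        return cache_directory
    return SHARE_DIR


print(asset_base_dir({"SVIRT_WORKER_CACHE": "1"}, "/var/lib/openqa/cache"))
print(asset_base_dir({}, "/var/lib/openqa/cache"))
```

Keeping the shared directory as the default means that jobs which do not set the variable are unaffected by the caveats mentioned above.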
Updated by mgrifalconi 9 months ago
- Subject changed from [qe-core][functional][tools] Proper handling of assets for svirt workers to [tools] Proper handling of assets for svirt workers
I see the tools team looked into it, so I am removing qe-core. Please contact qe-core if you need assistance. It makes no sense to me to have multiple squads on the same ticket. I expect one squad to drive the topic and ask for help when needed (maybe with a subtask for the other squad).
Updated by okurz 9 months ago
- Subject changed from [tools] Proper handling of assets for svirt workers to [qe-core] Proper handling of assets for svirt workers
Then please drive this within qe-core, as the relevant code for handling assets for svirt workers (where qcow files are copied and such) is, AFAIK, in os-autoinst-distri-opensuse.