action #180956

open

coordination #169654: [epic] Create test scenarios for Agama

Add validation to check that installed repos are the ones expected and enabled

Added by JERiveraMoya 14 days ago. Updated about 3 hours ago.

Status: In Progress
Priority: Normal
Assignee:
Target version: -
Start date: 2025-04-14
Due date:
% Done: 0%
Estimated time:

Description

Motivation

It has been reported sporadically that a repo might not be enabled. We don't know for sure; perhaps some other job in development overwrote the published image name. Just in case, let's add this check for both media, Online and Full.

It is time to start improving our validations. For now, in QE Yam we are going quite light in that regard, focusing our attention on feature validation and the automation of the installation itself, but a very important part of testing the system is validating expectations at the end.

Note that validation now lives under some of our squad folders, yam/*, because these are simple validations for the installer, not complex ones; they are more like safety checks than anything else.

Acceptance criteria

  • AC1: Add to the default installations (interactive and unattended), including the SAP and HA ones, a validation that checks that the expected repo(s) are present and enabled.
  • AC2: Run N times (50 or more) with a single scenario (in this case, unattended installation for sles, as this is the one where this issue was potentially detected when creating images for other squads) to see if we can catch this situation.
  • AC3: Report bugs related to this topic or any other inconsistencies.
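
To make AC1 more concrete, here is a minimal sketch of the kind of check it describes, written in Python for illustration only. It is not the validation module linked later in this ticket. It assumes the repository list is available as the pipe-separated table printed by `zypper lr`, with `Alias` and `Enabled` columns; the function names, the sample table and the repo aliases are all hypothetical.

```python
# Hypothetical sample of a `zypper lr` table; aliases are made up for illustration.
SAMPLE_ZYPPER_LR = """\
# | Alias        | Name         | Enabled | GPG Check | Refresh
--+--------------+--------------+---------+-----------+--------
1 | example-repo | Example Repo | Yes     | (r ) Yes  | Yes
2 | other-repo   | Other Repo   | No      | ----      | ----
"""


def parse_repos(zypper_lr_output):
    """Return {alias: enabled} parsed from a pipe-separated repo table."""
    header = None
    repos = {}
    for line in zypper_lr_output.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) < 2:
            continue  # separator line, blank line or informational message
        if cells[0] == "#":
            header = cells  # remember column names so we do not rely on positions
            continue
        if header is None or not cells[0].isdigit():
            continue  # not a repository row
        row = dict(zip(header, cells))
        repos[row["Alias"]] = row["Enabled"].startswith("Yes")
    return repos


def repo_problems(zypper_lr_output, expected_aliases):
    """List every mismatch between the installed repos and the expected ones."""
    repos = parse_repos(zypper_lr_output)
    problems = []
    for alias in expected_aliases:
        if alias not in repos:
            problems.append(f"missing repo: {alias}")
        elif not repos[alias]:
            problems.append(f"repo not enabled: {alias}")
    for alias in sorted(set(repos) - set(expected_aliases)):
        problems.append(f"unexpected repo: {alias}")
    return problems


if __name__ == "__main__":
    for problem in repo_problems(SAMPLE_ZYPPER_LR, ["example-repo", "other-repo"]):
        print(problem)
```

Collecting all mismatches before failing, instead of stopping at the first one, should make a sporadic failure easier to diagnose from a single job.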
Actions #1

Updated by JERiveraMoya 14 days ago

  • Description updated (diff)
Actions #3

Updated by JERiveraMoya 5 days ago

  • Status changed from Workable to In Progress
  • Assignee set to jfernandez
Actions #4

Updated by JERiveraMoya 5 days ago

This would be an example of the failure in the child job, but in the non-UEFI one: https://openqa.suse.de/tests/17407499#step/fips_setup/3
where we cannot find any repo.
But we should try with the UEFI one to narrow the scope and avoid any noise.

Actions #5

Updated by jfernandez 3 days ago

We have run 40 tests of the parent job create_hdd_textmode_qesec, adding our own repository validation code -> https://github.com/okynos/os-autoinst-distri-opensuse/blob/180956-add-repo-validation-sle16/tests/yam/validate/validate_repositories.pm
We added a sleep to the code to be able to debug in case the error shows up in a test.
First launch 10 jobs:

Second launch 10 jobs:

Third launch 10 jobs:

Fourth launch 10 jobs:

Bonus job (included console/validate_repos check): https://openqa.suse.de/tests/17454082
No error found.

After a talk with @JERiveraMoya we guessed that the error could be in the way the qcow image is loaded; in some scenarios the job takes a long time to start after the image generation. He suggested using the START_DIRECTLY_AFTER_TEST parameter in the openQA job definition so that the second job, fips_ker_mode_textmode_core, does not take another image or hit a timing issue related to the qcow image.
We started this round of testing with green results:
First test: https://openqa.suse.de/tests/17460858 + https://openqa.suse.de/tests/17460859 (it didn't fail at the fips_setup step)
Second test: https://openqa.suse.de/tests/17462471 + https://openqa.suse.de/tests/17462472

More tests to come...

I will prepare a PR with the repository validation when all archs are ready.

Actions #7

Updated by JERiveraMoya about 3 hours ago · Edited

Thanks for trying to reproduce it. We would follow these steps, if you agree:
(1) Launch 40; 10 is not sufficient. The error is so strange that it usually takes more runs than that to show up.
(2) Merge the PR adding the validation module, at least to x86_64 UEFI unattended.
(3) Comment here in the ticket on how to enable this module and what test data is needed to run it.
(4) Notify QE Security, pointing to (3).

After that we can continue adding the same validation everywhere with lower priority.
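
As a rough illustration of why step (1) asks for more runs, the sketch below estimates the chance of seeing a sporadic failure at least once in a batch of jobs. It assumes every run fails independently with the same probability, which may not hold if the cause is a timing issue, and the 5% failure rate is a made-up value, not a measurement from this ticket.

```python
import math


def prob_at_least_one_failure(failure_rate, runs):
    """Chance that at least one of `runs` independent jobs fails."""
    return 1 - (1 - failure_rate) ** runs


def runs_needed(failure_rate, confidence):
    """Smallest number of runs whose chance of catching a failure reaches `confidence`."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


if __name__ == "__main__":
    assumed_rate = 0.05  # hypothetical failure rate, for illustration only
    for runs in (10, 40, 50):
        chance = prob_at_least_one_failure(assumed_rate, runs)
        print(f"{runs} runs: {chance:.0%} chance of seeing the failure")
    print("runs for 95% confidence:", runs_needed(assumed_rate, 0.95))
```

Under those assumptions, a batch of 10 green jobs says little about a rare failure, while a clean batch of 40 or more is much stronger evidence.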
