action #129703
open
[security] test fails in evolution_prepare_servers on ipmi
0%
Description
openQA test in scenario sle-15-SP5-Online-x86_64-fips_env_mode_tests_crypt_tool_intel_ipmi@64bit-ipmi fails in evolution_prepare_servers
Last good: 100.1 (or more recent)
When this ipmi test eventually gets past boot_to_desktop etc., it fails here. Something seems to have changed, and the resulting error cannot be worked around by retrying:
-bash: cd: /usr/share/doc/packages/dovecot: No such file or directory
bash: mkcert.sh: No such file or directory
However, dovecot is installed earlier as part of the same module: https://openqa.suse.de/tests/11181584#step/evolution_prepare_servers/6 - so how is the problem possible? The doc directory could be missing, but the same step does work in the non-ipmi case: https://openqa.suse.de/tests/11175697#step/evolution_prepare_servers/32
Usually this ipmi test fails earlier in boot_to_desktop or console_setup, but there was also one earlier case of the same dovecot problem: https://openqa.suse.de/tests/11178402#step/evolution_prepare_servers/33
Updated by amanzini over 1 year ago
- Assignee deleted (amanzini)
The test sometimes fails in the FIPS_SETUP step; it seems unable to install some packages:
Loading repository data...
Reading installed packages...
Package 'libgnutls30-hmac' not found.
'libgcrypt20-hmac' is already installed.
No update candidate for 'libgcrypt20-hmac-1.6.1-16.83.1.x86_64'. The highest available version is already installed.
'libcryptsetup12-hmac' is already installed.
No update candidate for 'libcryptsetup12-hmac-2.0.6-3.3.1.x86_64'. The highest available version is already installed.
'libfreebl3-hmac' is already installed.
No update candidate for 'libfreebl3-hmac-3.79.4-58.97.1.x86_64'. The highest available version is already installed.
'libsoftokn3-hmac' is already installed.
No update candidate for 'libsoftokn3-hmac-3.79.4-58.97.1.x86_64'. The highest available version is already installed.
'libopenssl1_1-hmac' is already installed.
No update candidate for 'libopenssl1_1-hmac-1.1.1d-2.81.1.x86_64'. The highest available version is already installed.
Dtgik-104-
excerpt from fips_setup.pm:
if (is_sle('>=15-sp4')) {
    my $pkg_list = {
        'libcryptsetup12-hmac' => '2.4.3',
        'libsoftokn3-hmac' => '3.68.3',
        'libgnutls30-hmac' => '3.7.3',
        'libfreebl3-hmac' => '3.68.3',
        'libopenssl1_1-hmac' => '1.1.1l',
        'libgcrypt20-hmac' => '1.9.4'
    };
    zypper_call("in " . join(' ', keys %$pkg_list));
    package_upgrade_check($pkg_list);
}
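To make the log above easier to compare against the excerpt, here is a minimal standalone Python sketch (not part of the openQA test code) that scans zypper output for packages reported as not found and for installed versions below the minimums listed in fips_setup.pm. The helper names are hypothetical, the version comparison is a simplified stand-in for rpm's real algorithm, and it does not reproduce whatever package_upgrade_check does internally, which is not shown in this ticket.

```python
import re

# Minimum versions copied from the fips_setup.pm excerpt in this ticket.
EXPECTED_MIN = {
    "libcryptsetup12-hmac": "2.4.3",
    "libsoftokn3-hmac": "3.68.3",
    "libgnutls30-hmac": "3.7.3",
    "libfreebl3-hmac": "3.68.3",
    "libopenssl1_1-hmac": "1.1.1l",
    "libgcrypt20-hmac": "1.9.4",
}

NOT_FOUND = re.compile(r"^Package '([^']+)' not found\.")
INSTALLED = re.compile(r"^No update candidate for '([^']+)'")

# A few lines copied from the log quoted in this ticket.
SAMPLE_LOG = """\
Package 'libgnutls30-hmac' not found.
No update candidate for 'libgcrypt20-hmac-1.6.1-16.83.1.x86_64'. The highest available version is already installed.
No update candidate for 'libfreebl3-hmac-3.79.4-58.97.1.x86_64'. The highest available version is already installed.
No update candidate for 'libopenssl1_1-hmac-1.1.1d-2.81.1.x86_64'. The highest available version is already installed.
"""


def split_nvra(nvra):
    """Split 'name-version-release.arch' into (name, version)."""
    name, version, _release_arch = nvra.rsplit("-", 2)
    return name, version


def version_key(version):
    """Simplified sort key (NOT the full rpm comparison):
    digit runs compare as integers, letter runs as strings,
    and a digit run sorts above a letter run."""
    key = []
    for part in re.findall(r"\d+|[A-Za-z]+", version):
        key.append((1, int(part), "") if part.isdigit() else (0, 0, part))
    return key


def check_zypper_output(output, expected=EXPECTED_MIN):
    """Return (missing packages, {name: (installed, wanted)} for too-old ones)."""
    missing, too_old = [], {}
    for line in output.splitlines():
        line = line.strip()
        match = NOT_FOUND.match(line)
        if match:
            missing.append(match.group(1))
            continue
        match = INSTALLED.match(line)
        if match:
            name, version = split_nvra(match.group(1))
            wanted = expected.get(name)
            if wanted and version_key(version) < version_key(wanted):
                too_old[name] = (version, wanted)
    return missing, too_old


if __name__ == "__main__":
    missing, too_old = check_zypper_output(SAMPLE_LOG)
    print("not found:", missing)
    print("below minimum:", too_old)
```

On the sample lines this reports libgnutls30-hmac as not found and libgcrypt20-hmac and libopenssl1_1-hmac as below the listed minimums, while libfreebl3-hmac passes. That pattern may hint that the machine was not using the repositories the `>=15-sp4` branch expects, but that is only a guess from the log.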
Updated by openqa_review over 1 year ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: fips_env_mode_tests_crypt_tool_intel_ipmi@64bit-ipmi
https://openqa.suse.de/tests/11181584#step/evolution_prepare_servers/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by JERiveraMoya about 1 year ago
This scenario looks like a waste of time from a test maintenance point of view: reproducing it by running the job multiple times, I hit all of those strange issues. It fails sporadically in too many places, and no other squad uses a similar setup, not even QE virtualization! Dropping it is probably the best option.
Slack conversation: https://suse.slack.com/archives/C02CANHLANP/p1695192194470199
wdyt?
Updated by tjyrinki_suse about 1 year ago
We should not let go of the wishful thinking that IPMI backend usage would be reliable for us to use in the future, as our stakeholders request it.
In practice, however, we are quite used to expecting IPMI failures all over. It is a valid question, though, whether we should do something similar, i.e. have separate IPMI tests, if for example https://openqa.suse.de/tests/12180565 (sle-15-SP6-Online-x86_64-Build20.1-sev-es-gi-guest_developing-on-host_developing-kvm) is somehow more reliable.
But files disappearing right after installation: I'm not sure that kind of problem can even be related to the way the test is run. Could we try the same MACHINE=64bit-ipmi-amd-zen3 that they are using in that test? Would that be any more reliable?
There is at least one difference: on qemu more packages are installed https://openqa.suse.de/tests/11175697#step/evolution_prepare_servers/6 than on our IPMI: https://openqa.suse.de/tests/11181584#step/evolution_prepare_servers/6 - maybe installation of recommended packages is somehow disabled in our create_hdd_textmode_intel_ipmi? Not that it would affect the missing mkcert.sh, as I double-checked that it's in the "dovecot23" package.
Anyway, we might want to add automatic soft-fails for IPMI cases to make the results more readable.
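The check that mkcert.sh belongs to the "dovecot23" package can be turned into a small diagnostic that separates "the package does not list the file" from "the package lists it but it is missing on disk". This is a minimal standalone Python sketch with hypothetical helper names; it relies only on `rpm -ql`, and it is not part of the openQA test code.

```python
import os
import subprocess


def package_files(package):
    """List the files rpm has recorded for an installed package (`rpm -ql`)."""
    result = subprocess.run(
        ["rpm", "-ql", package], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def locate(file_list, basename, exists=os.path.exists):
    """Return (path, present_on_disk) for each listed file with that basename.

    `exists` is injectable so the function can be exercised without rpm.
    """
    return [
        (path, exists(path))
        for path in file_list
        if os.path.basename(path) == basename
    ]
```

On the system under test, `locate(package_files("dovecot23"), "mkcert.sh")` would show where the package expects the script and whether it is actually there. An empty list would mean the installed package does not list the file at all, while a `False` flag would mean it is listed but absent from disk.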
Updated by JERiveraMoya about 1 year ago
I understand the scope better now: it is bigger than the test suite to fix in the description. I also understand now that ipmi is important for our scope; thanks for the clarification.
I did some testing to explore the options. The first thing I noticed is that we are not testing in the same way as with qemu. Because of the START_DIRECTLY_AFTER_TEST setting in the child test suites, we are reusing the same VM: first the parent is executed, and then the children are executed one after the other in alphabetical order, whereas with qemu each child gets a fresh VM after the parent.
This is what made me suspicious about this recurrent error: https://openqa.suse.de/tests/11781313#step/consoletest_setup/33
It most likely happens because we set up everything for each child, repeating even the console setup. I then tried removing that setup from the 2nd and 3rd children executed: the first run was a success, but the second was not, and there are other sporadic errors in different places. Besides, I realized that dropping the setup is not a safe option, because if the 1st child fails, the 2nd and 3rd might succeed without the FIPS setup having been done, leading to a false positive.
The next option I considered was to duplicate the parent, basically mimicking what we have in qemu. It would be really, really slow (one parent for each child). That might still be worth it, but it seems not, because we also hit this https://openqa.suse.de/tests/12365064#step/fips_setup/65 very often in the first child we run.
What helped was using that worker, as you mentioned, to at least run the parent: https://gitlab.suse.de/qe-security/osd-sle15-security/-/merge_requests/181
But if this area is owned by virtualization, we should try to provide them with simple failures that they can fix one by one; otherwise it is really messy.
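The false-positive risk described above (a later child passing although the FIPS setup never ran) could be reduced by a cheap precondition check at the start of each child. The following is a minimal standalone Python sketch of that idea, not the openQA implementation. `/proc/sys/crypto/fips_enabled` is the kernel's FIPS flag; the environment variable name is an assumption for env-mode setups and should be replaced by whatever fips_setup actually exports.

```python
import os
from pathlib import Path

KERNEL_FIPS_FLAG = "/proc/sys/crypto/fips_enabled"
# Assumption: a variable an env-mode FIPS setup might export; adjust as needed.
ENV_MODE_VARS = ("OPENSSL_FORCE_FIPS_MODE",)


def kernel_fips_enabled(flag_path=KERNEL_FIPS_FLAG):
    """True only if the kernel flag file exists and contains '1'."""
    try:
        return Path(flag_path).read_text().strip() == "1"
    except OSError:
        return False


def env_fips_enabled(env=None, names=ENV_MODE_VARS):
    """True if any of the given variables is set to '1' in the environment."""
    env = os.environ if env is None else env
    return any(env.get(name) == "1" for name in names)


def require_fips(flag_path=KERNEL_FIPS_FLAG, env=None):
    """Fail loudly instead of letting a dependent test pass without FIPS."""
    if not (kernel_fips_enabled(flag_path) or env_fips_enabled(env)):
        raise RuntimeError("FIPS setup not detected; refusing to run dependent tests")
```

With such a guard, the expensive setup could be skipped in the 2nd and 3rd children while still turning a missing setup into an explicit failure rather than a silent pass.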
Updated by JERiveraMoya almost 1 year ago
In QU we also have issues with ipmi, e.g.: https://openqa.suse.de/tests/12426812
In product it doesn't even boot: https://openqa.suse.de/tests/12430609#step/boot_from_pxe/13
Updated by openqa_review 12 months ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: create_hdd_textmode_intel_ipmi@64bit-ipmi
https://openqa.suse.de/tests/12505996#step/boot_from_pxe/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by openqa_review 11 months ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: create_hdd_textmode_intel_ipmi@64bit-ipmi
https://openqa.suse.de/tests/12426812#step/boot_from_pxe/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 56 days if nothing changes in this ticket.
Updated by tjyrinki_suse 11 months ago
- Status changed from Workable to Blocked
Blocked since currently no 15-SP6 FIPS works.
Updated by tjyrinki_suse 6 months ago
We haven't had ipmi results for a while now. FIPS itself would be testable now.
Updated by tjyrinki_suse 26 days ago
The recent ipmi reworks may have fixed this, as nowadays it's passing:
https://openqa.suse.de/tests/14920501#step/evolution_prepare_servers/32