action #117205: Some boot_from_pxe failed from assigned worker: grenache-1:17 (kermit) and also openqaworker2:17 (quinn) size:M - openQA Infrastructure (public) - openSUSE Project Management Tool

action #117205

closed

Some boot_from_pxe failed from assigned worker: grenache-1:17 (kermit) and also openqaworker2:17 (quinn) size:M

Added by xguo over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: High
Assignee: xguo
Category:
Target version: openQA Project (public) - Ready
Start date: 2022-09-26
Due date:
% Done:
Estimated time:

Description

The latest OSD test results for build 24.1 (https://openqa.suse.de/tests/overview?distri=sle&version=15-SP5&build=24.1&groupid=263) show some new failures in the boot_from_pxe test module.
All of these boot_from_pxe failures ran on the assigned worker grenache-1:17.
Please help us check this worker.
For now, this boot_from_pxe failure blocks our OSD tests.
Please refer to https://openqa.suse.de/admin/workers/1248 for more details.

Acceptance criteria

AC1: The affected worker slots are back in production with working pxe boot

Rollback steps

Re-enable workers for production, see https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/445/diffs

Files

boot_from_pxe_fail_01.png (62.2 KB), xguo, 2022-09-26 07:58
boot_from_pxe_fail_02.png (31.7 KB), xguo, 2022-09-26 07:58
activate_license_01.png (375 KB), xguo, 2022-11-07 16:15
activate_license_02.png (245 KB), xguo, 2022-11-07 16:15
Selection_109.png (131 KB), waynechen55, 2022-11-09 07:16

Updated by xguo over 2 years ago

File boot_from_pxe_fail_01.png added
File boot_from_pxe_fail_02.png added

Short update, test failure URL:
https://openqa.suse.de/tests/9586828#step/boot_from_pxe/11

Refer to the attachments for more details.

Updated by jbaier_cz over 2 years ago

Important info from the job:

# Test died: Error connecting to <root@10.162.2.87>: No route to host at /usr/lib/os-autoinst/testapi.pm line 1739.
    testapi::select_console("root-ssh") called at sle/lib/Utils/Backends.pm line 98
    Utils::Backends::use_ssh_serial_console() called at sle/lib/susedistribution.pm line 741
    susedistribution::activate_console(Distribution::Sle::15_current=HASH(0x10031e11708), "install-shell") called at /usr/lib/os-autoinst/testapi.pm line 1747
    testapi::select_console("install-shell") called at sle/lib/bootloader_setup.pm line 1503
    bootloader_setup::sync_time() called at sle/tests/boot/boot_from_pxe.pm line 202
    boot_from_pxe::run(boot_from_pxe=HASH(0x1003719ad18)) called at /usr/lib/os-autoinst/basetest.pm line 328

Updated by xguo over 2 years ago

Quick update: the assigned worker openqaworker2:17 has the same problem.

Test failure url:
https://openqa.suse.de/tests/9564617#step/boot_from_pxe/10

Also, refer to https://openqa.suse.de/admin/workers/603 for more details.

Updated by mkittler over 2 years ago

Yes, and it also fails with a similar error:

[2022-09-22T19:42:31.224242+02:00] [debug] Could not connect to root@kermit.qa.suse.de, Retrying after some seconds...
[2022-09-22T19:42:41.225551+02:00] [debug] Could not connect to root@kermit.qa.suse.de, Retrying after some seconds...
[2022-09-22T19:42:51.332197+02:00] [info] ::: basetest::runtest: # Test died: Error connecting to <root@kermit.qa.suse.de>: No route to host at /usr/lib/os-autoinst/testapi.pm line 1739.
    testapi::select_console("root-ssh") called at sle/lib/Utils/Backends.pm line 98
    Utils::Backends::use_ssh_serial_console() called at sle/lib/susedistribution.pm line 741
    susedistribution::activate_console(Distribution::Sle::15_current=HASH(0x56531c77dde0), "install-shell") called at /usr/lib/os-autoinst/testapi.pm line 1747
    testapi::select_console("install-shell") called at sle/lib/bootloader_setup.pm line 1503
    bootloader_setup::sync_time() called at sle/tests/boot/boot_from_pxe.pm line 202
    boot_from_pxe::run(boot_from_pxe=HASH(0x56531f320b50)) called at /usr/lib/os-autoinst/basetest.pm line 328
    eval {...} called at /usr/lib/os-autoinst/basetest.pm line 322
    basetest::runtest(boot_from_pxe=HASH(0x56531f320b50)) called at /usr/lib/os-autoinst/autotest.pm line 360
    eval {...} called at /usr/lib/os-autoinst/autotest.pm line 360
    autotest::runalltests() called at /usr/lib/os-autoinst/autotest.pm line 243
    eval {...} called at /usr/lib/os-autoinst/autotest.pm line 243
    autotest::run_all() called at /usr/lib/os-autoinst/autotest.pm line 294
    autotest::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0x5653211b2d30)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
    eval {...} called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
    Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0x5653211b2d30), CODE(0x56532050abc8)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 488
    Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0x5653211b2d30)) called at /usr/lib/os-autoinst/autotest.pm line 296
    autotest::start_process() called at /usr/bin/isotovideo line 273

So this is about the IPMI backend and the involved baremetal SUTs are so far 10.162.2.87 (quinn.qa.suse.de) and kermit.qa.suse.de. Likely the connection error is just because the iPXE boot didn't work in the first place. Judging by the title of the iPXE menu that menu is provided within qanet. I would guess @nicksinger is the one who set it up then. Maybe he can have a look.

Updated by nicksinger over 2 years ago

Status changed from New to In Progress
Assignee set to nicksinger

Updated by nicksinger over 2 years ago

@mgriessmeier and I tried to debug what is going wrong here. We found the following entries in the atftpd log:

Sep 26 13:54:06 qanet atftpd[18168.140301027731200]: Creating new socket: 10.162.0.1:44486
Sep 26 13:54:06 qanet atftpd[18168.140301027731200]: Serving /mnt/openqa/repo/SLE-15-SP5-Online-x86_64-Build21.1-Media1/boot/x86_64/loader/linux to 10.162.2.87:49159
Sep 26 13:54:06 qanet atftpd[18168.140301027731200]: tsize option -> 11394848
Sep 26 13:54:06 qanet atftpd[18168.140301027731200]: blksize option -> 1408
Sep 26 13:54:11 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:54:17 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:54:23 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:54:28 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:54:33 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:54:38 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:54:43 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:54:48 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:54:53 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:54:58 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:55:03 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:55:08 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:55:13 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:55:18 qanet atftpd[18168.140301027731200]: timeout: retrying...
Sep 26 13:55:23 qanet atftpd[18168.140301027731200]: client (10.162.2.87) not responding
Sep 26 13:55:23 qanet atftpd[18168.140301027731200]: End of transfer
Sep 26 13:55:23 qanet atftpd[18168.140301027731200]: Server thread exiting

We tried setting the "--no-timeout" option, which according to the docs (https://linux.die.net/man/8/atftpd) should disable the client-side timeouts while loading files. It still fails, which indicates that the client is not timing out while loading resources (kernel and initrd, to be precise) but rather that the server itself fails to deliver blocks to the client.
However, monitoring with Wireshark shows that the file transfer seems to be fine and that the client acknowledges every block up to and including the last one.

We have to look further into what is causing grub to fail (sporadically) while booting the supplied files.
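
The retry pattern in the atftpd log above can be summarised mechanically when comparing several failed boots. The following is only a minimal Python sketch, assuming the syslog line layout shown in the excerpt; the function name `summarize_atftpd` and the shape of its return value are hypothetical and not part of atftpd.

```python
import re
from collections import Counter

# Message part after "atftpd[pid.thread]: " in the syslog layout shown above.
LINE_RE = re.compile(r"atftpd\[(?P<thread>[\d.]+)\]: (?P<msg>.*)$")
SERVING_RE = re.compile(r"Serving (?P<path>\S+) to (?P<client>[\d.]+):\d+")
GAVE_UP_RE = re.compile(r"client \((?P<client>[\d.]+)\) not responding")


def summarize_atftpd(lines):
    """Return (client, file, retry count) for every transfer the server gave up on."""
    retries = Counter()  # server thread id -> number of "timeout: retrying" lines
    serving = {}         # server thread id -> (file path, client ip)
    gave_up = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not an atftpd line
        thread, msg = m["thread"], m["msg"]
        s = SERVING_RE.search(msg)
        g = GAVE_UP_RE.search(msg)
        if s:
            serving[thread] = (s["path"], s["client"])
        elif msg.startswith("timeout: retrying"):
            retries[thread] += 1
        elif g:
            path, _client = serving.get(thread, (None, None))
            gave_up.append((g["client"], path, retries[thread]))
    return gave_up
```

Feeding it the lines of a log file, e.g. `summarize_atftpd(open(path))` with a path of your choosing, lists each client the server stopped waiting for together with the file being served and the number of retries logged before that.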

Updated by okurz over 2 years ago

Priority changed from Normal to High
Target version set to Ready

Updated by okurz over 2 years ago

Project changed from QA (public) to openQA Infrastructure (public)

#10

Updated by openqa_review over 2 years ago

Due date set to 2022-10-12

Setting due date based on mean cycle time of SUSE QE Tools

#11

Updated by okurz over 2 years ago

Subject changed from Some boot_from_pxe failed from assigned worker: grenache-1:17 to Some boot_from_pxe failed from assigned worker: grenache-1:17 and also openqaworker2:17

Discussed in weekly unblock 2022-09-28: Jobs on other "neighboring" machines work ok, e.g. https://openqa.suse.de/tests/9621432#step/boot_from_pxe/7 on scooter. Other product versions also fail on the assigned worker https://openqa.suse.de/admin/workers/1248 linked to the physical machine quinn, e.g. https://openqa.suse.de/tests/9617954 on SLE Micro 5.3, https://openqa.suse.de/tests/9583700 on SLE15-SP3. Also kermit directly on top of quinn in the same rack shows problems, assigned worker instance openqaworker2:17, e.g. https://openqa.suse.de/tests/9620065#step/boot_from_pxe/8

I suggest booting an OS on one of the machines and running mtr and/or iperf to check the connection to qanet for any instabilities. Relevant for both cases: qanet is in SUSE Nbg SRV2 and the physical hosts are in NUE Lab 2.2.14 (TAM), but we also have other machines in this configuration which seem to work fine, e.g. fozzie, gonzo, scooter.

#12

Updated by nicksinger over 2 years ago

Subject changed from Some boot_from_pxe failed from assigned worker: grenache-1:17 and also openqaworker2:17 to Some boot_from_pxe failed from assigned worker: grenache-1:17 (kermit) and also openqaworker2:17 (quinn)
Status changed from In Progress to Workable
Assignee changed from nicksinger to xguo

I unfortunately failed to boot any other Linux over PXE. I also gave booting over samba a try but failed to mount a rescue medium with an anonymous share. Most likely it works with authentication but I didn't set it up.

@xguo: Other machines work fine, so we can exclude an infrastructure problem. I'd kindly ask you to take over here and check why these machines do not behave properly.
If you have questions regarding our setup, I'm happy to explain it to you.

Another thing you could try is to reset the bios/flash - we had a similar issue just recently with a power machine.

#13

Updated by okurz over 2 years ago

Description updated (diff)

I merged https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/445 so the two workers are disabled from production jobs for now.

#14

Updated by xguo over 2 years ago

@nicksinger

Great, got it. I will confirm it. If there is any new problem, I will let you know.

Thanks,
Leon

#15

Updated by xguo over 2 years ago

@okurz

Super. This merge is very helpful for us.

Thanks so much for your great help.

#16

Updated by tinita over 2 years ago

Subject changed from Some boot_from_pxe failed from assigned worker: grenache-1:17 (kermit) and also openqaworker2:17 (quinn) to Some boot_from_pxe failed from assigned worker: grenache-1:17 (kermit) and also openqaworker2:17 (quinn) size:M
Description updated (diff)

#17

Updated by livdywan over 2 years ago

Due date deleted (~~2022-10-12~~)
Assignee deleted (~~xguo~~)

Apparently @xguo misunderstood Nick's request for help, only confirmed that the machines were no longer being used, and suggests asking infra to investigate network connectivity.

#18

Updated by xguo over 2 years ago

To be honest, I don't have enough permissions or experience to configure the assigned workers grenache-1:17 (kermit) and openqaworker2:17 (quinn). In the meantime, I am doing my best to check the state of the assigned workers.

Quick update: I have confirmed that the assigned workers grenache-1:17 (kermit) and openqaworker2:17 (quinn) now work well with the boot_from_pxe test module.

grenache-1:17 (kermit)
https://openqa.suse.de/tests/9752953

openqaworker2:17 (quinn)
https://openqa.suse.de/tests/9752361

Next, I will do more validation runs on both assigned workers together in our OSD environment. If the problem does not occur again, I plan to create a new MR to re-activate them.

#19

Updated by xguo over 2 years ago

@nicksinger would you mind giving me some details about resetting the BIOS/flash for grenache-1:17 (kermit) and openqaworker2:17 (quinn), the Supermicro test machines, as you mentioned?

Meanwhile, a quick update: for now, I can reproduce the problem in only about 1 out of 10 runs on both workers grenache-1:17 (kermit) and openqaworker2:17 (quinn).

One reproduction of the problem (1 in 10 runs):
https://openqa.suse.de/tests/9768087

Please refer to the following URLs for more details:
https://openqa.suse.de/admin/workers/603
https://openqa.suse.de/admin/workers/1248

Next, I created a new MR to enable the workers grenache-1:17 (kermit) and openqaworker2:17 (quinn):
https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/448

#20

Updated by waynechen55 over 2 years ago

@xguo @nicksinger @okurz Are there any follow-up actions on these two machines/workers? Although I said "we can have a try" in this MR, I agree with Oliver that 10% is not a small value (although this might be the result of the limited number of runs). Ideally, the root cause of this issue should be identified and fixed. But at the same time, virtualization has lost two machines to run tests. It would not be a big problem if the clock were not ticking. So could we come up with a plan or schedule to get this fixed in any event?
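
How much a result of 1 failure in 10 runs says about the real failure rate can be estimated with a confidence interval. The following is a minimal Python sketch using the Wilson score interval; it assumes that the runs are independent and fail with the same probability, which nothing in this ticket establishes, and the function name `wilson_interval` is only illustrative.

```python
import math


def wilson_interval(failures, runs, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = failures / runs
    denom = 1 + z**2 / runs
    centre = (p + z**2 / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z**2 / (4 * runs**2)) / denom
    # Clamp to the valid probability range.
    return max(0.0, centre - half), min(1.0, centre + half)


low, high = wilson_interval(1, 10)
print(f"observed 10%, plausible range {low:.1%} to {high:.1%}")
```

Under these assumptions, 1 failure in 10 runs is compatible with a real failure rate anywhere from roughly 2% to 40%, so more runs would be needed before treating 10% as a reliable figure.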

#21

Updated by xguo over 2 years ago

To be honest, the problem with these two machines/workers is still not resolved. We are still not sure whether the problem will reproduce after these two machines/workers are enabled on OSD. At the moment we also do not know how to reset the BIOS/flash for grenache-1:17 (kermit) and openqaworker2:17 (quinn) as Nick mentioned. Meanwhile, these two machines/workers are old Supermicro test machines, which means that resetting the BIOS/flash for grenache-1:17 (kermit) and openqaworker2:17 (quinn) carries some risk. @nicksinger @okurz would you mind giving us some ideas and comments for this ticket? Thanks.

#22

Updated by Julie_CAO over 2 years ago

My recent tests running on these machines look good. I suggest adding them back to the OSD worker pool.
https://openqa.suse.de/admin/workers/1248

#23

Updated by okurz over 2 years ago

I agree with all of your points. This is why this ticket is on the backlog of the SUSE QE Tools team. The ticket has priority "High" meaning that at least an update should happen once every 30 days. As many members of the SUSE QE Tools team are more experts in software development this can be a reason why there is no quick follow-up. Of course you are welcome to try the mentioned steps yourself as updating/resetting firmware config and configuring and testing the network can all be done remotely anyway.

#24

Updated by xguo over 2 years ago

Great, got it. I will try to update the BIOS of these two machines/workers remotely.

@okurz, quick update: I tried to update to the latest BIOS (v3.2) on grenache-1:17 (kermit) and openqaworker2:17 (quinn), both Supermicro X10DRW-i.
I found that a license needs to be activated first before the BIOS version can be updated on grenache-1:17 (kermit) and openqaworker2:17 (quinn).

Supermicro Update Manager (SUM)

$ ./sum -i sp.kermit.qa.suse.de -U ADMIN -p XX -c UpdateBios --file /home/xguo/Downloads/X10DRW9_B22/UEFI/X10DRW9.B22  -v --journal_level 6 
Supermicro Update Manager (for UEFI BIOS) 2.9.0 (2022/08/04) (x86_64)
Copyright(C) 2013-2022 Super Micro Computer, Inc. All rights reserved.
Journal created:
    /home/xguo/journal/supermicro/sum/journal.txt
    /home/xguo/journal/supermicro/sum/sum_journal_2022-11-07_15-53-19.core

Check node product key activation of the managed system


********************************<<<<<ERROR>>>>>*********************************

ExitCode                = 80
Description             = Node product key is not activated.
Program Error Code      = 622.13
Error message:
        One of the node product key (SFT-OOB-LIC or SFT-DCMS-SINGLE) should be
    activated to execute this task.
        SFT-DCMS-SINGLE node product key is not activated.
        SFT-OOB-LIC node product key is not activated.

********************************************************************************

Would you mind helping me find out who can provide the license activation for grenache-1:17 (kermit) and openqaworker2:17 (quinn)? Thanks.
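
When the same SUM command is run against several BMCs, the error banner above can be checked by a script instead of by eye. The following is a minimal Python sketch that relies only on the `key = value` lines visible in the output pasted in this comment; whether SUM guarantees this layout is an assumption, and the helper names `parse_sum_error` and `needs_license` are hypothetical.

```python
import re

# Field names are taken from the error banner quoted above.
FIELD_RE = re.compile(r"^(ExitCode|Description|Program Error Code)\s*=\s*(.+?)\s*$")


def parse_sum_error(output):
    """Collect the 'key = value' fields from captured SUM output."""
    fields = {}
    for line in output.splitlines():
        m = FIELD_RE.match(line.strip())
        if m:
            fields[m.group(1)] = m.group(2)
    return fields


def needs_license(fields):
    """Heuristic: the task was refused because no node product key is active."""
    return "product key is not activated" in fields.get("Description", "")
```

The captured text could come from redirecting the command's output to a file; how SUM splits its messages between stdout and stderr is not something this sketch assumes.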

#25

Updated by xguo over 2 years ago

File activate_license_01.png added
File activate_license_02.png added

A license activation is required to update the BIOS from the grenache-1:17 (kermit) BMC web interface.

Please refer to the attachments for more details.

#26

Updated by mkittler over 2 years ago

I'm not sure whether anybody in the tools team will be able to help with the licensing problem.

#27

Updated by waynechen55 over 2 years ago

File Selection_109.png added

@xguo It seems the license can be ordered online at the Supermicro eStore: https://store.supermicro.com/software.html

#28

Updated by xguo over 2 years ago

Awesome, got it. This is very helpful to us.

#29

Updated by xguo over 2 years ago

Quick update: I have now successfully updated both grenache-1:17 (kermit) and openqaworker2:17 (quinn) to the latest Firmware Revision 03.89 and BIOS Version 3.2.
Let's see what happens next.

#30

Updated by xlai over 2 years ago

@xguo Hi Leon, would you please give an update on this issue? Does it still reproduce after upgrading the firmware and BIOS?

#31

Updated by xguo over 2 years ago

@xlai short update: after successfully updating both grenache-1:17 (kermit) and openqaworker2:17 (quinn) to the latest firmware and BIOS versions, I confirmed that the problem no longer reproduces.

@okurz, would you mind helping us set the ticket to RESOLVED? Thanks.

FYI: if the same problem reproduces on grenache-1:17 (kermit) and openqaworker2:17 (quinn) again, I will reopen this ticket.

#32

Updated by okurz over 2 years ago

Status changed from Workable to Resolved
Assignee set to xguo

xguo wrote:

@xlai short update, after updated the latest Firmware and BIOS Version both grenache-1:17 (kermit) and openqaworker2:17 (quinn) successfully, confirmed that the same problem do not reproduce again any more.

this is great news. Thanks a lot.

@okurz, would you mind helping us update ticket as RESOLVED. thanks

done
