action #21838

[functional][u][saga] PowerVM backend

Added by RBrownSUSE over 2 years ago. Updated over 1 year ago.

Status: Resolved
Start date: 08/08/2017
Priority: High
Due date: 31/07/2018
Assignee: nicksinger
% Done: 0%
Category: Feature requests
Target version: SUSE QA tests - Milestone 18
Difficulty:
Duration: 256

Description

Motivation

Test, improve, and run the PowerVM backend once Jan has NovaLink running on SLES. This will allow openQA to run on the NovaLink host and therefore use the pvm backend as currently available in os-autoinst.

Acceptance criteria

  • AC1: We have at least one standard openQA scenario running on a continuous basis for at least one current product in development

Further details

This ticket does not (necessarily) include putting worker(s) into the production osd environment; that should be covered in #25594


Related issues

Related to openQA Project - action #32437: [tools][functional][8h][u][easy] Play around with the Pow... Resolved 08/08/2017 27/03/2018
Related to openQA Tests - action #33340: [tools][functional][u][medium][pvm] Enable graphical inst... Resolved 15/03/2018
Blocked by openQA Project - action #33337: [tools][functional][medium][pvm][u] Implement proper hand... Resolved 01/04/2018 31/07/2018
Blocks openQA Tests - action #33388: [functional][u][easy][pvm] Implement proper split from ot... Resolved 17/02/2019
Blocked by openQA Tests - action #38513: [functional][u][pvm] Have any test scenario showing up wi... Resolved 15/03/2018 31/07/2018
Precedes openQA Tests - action #33697: [functional][u][hard][pvm] Enable the powerVM backend to ... New 01/08/2018

History

#1 Updated by sebchlad over 2 years ago

  • Assignee set to sebchlad

As discussed with Marita, this seems to be an important openQA feature for SLES15 testing.
I understand from Marita that this is currently handled by Jan L. from Stefan F.'s team, so I assume Jan is delivering this feature.

We should make sure we have the same understanding of responsibility here. We should also clarify whether we really need this for SLES15 testing and, if so, by when this feature needs to be enabled in openQA.

#2 Updated by sebchlad over 2 years ago

  • Assignee changed from sebchlad to szarate

After a sync talk on 27.09.2017 with Marita and Santiago we agreed that Santiago will talk to Coolo about this backlog item and will clarify the done criteria for this.

AP Marita: create an epic in the SUSE QA tests project with a detailed plan description for the overall ppc arch.

#5 Updated by maritawerner over 2 years ago

  • Parent task set to #25592

#7 Updated by szarate over 2 years ago

  • Parent task deleted (#25592)

On the openQA side of things, there's work already done on this, but there is no hardware available for PowerVM inside the openQA infrastructure. (See https://progress.opensuse.org/issues/25594#note-4)

As we have no hardware available and Ihno is looking for it, we are not planning to invest time in this task.

#9 Updated by szarate over 2 years ago

  • Assignee deleted (szarate)

Coolo mentioned today that he was going to experiment with grenache.

#10 Updated by coolo about 2 years ago

I have a prototype, but I'm blocked by grenache being in the arch network and the LPARs being constantly broken without a way to debug them :(

I have a ticket with Infra to move it into arch network.

#11 Updated by nicksinger about 2 years ago

FSP DHCP entries, LPAR DHCP entries and DNS entries have been created. The machine was moved to the lab and should be available. FSP1 is reachable over the web UI. I'm not 100% certain about the network for the LPARs since this machine has quite a few NICs. Let me know if anything is not working as expected.

#12 Updated by coolo about 2 years ago

I can't reach it :(

coolo@f84#~>ping grenache.qa.suse.de
PING grenache.qa.suse.de (10.162.6.235) 56(84) bytes of data.
From 10.160.255.253 (10.160.255.253) icmp_seq=1 Destination Host Unreachable
From 10.160.255.253 (10.160.255.253) icmp_seq=2 Destination Host Unreachable

#13 Updated by nicksinger about 2 years ago

Seems like the VIOS/NovaLink did not survive the move. After several reboots I got the system back into the HMC and can access the two LPARs (VIOS and Nova) via mkvterm. The VIOS LPAR just stalls after:

-------------------------------------------------------------------------------
                       Welcome to the Virtual I/O Server.
                   boot image timestamp: 11:47:48 01/06/2016
                 The current time and date: 15:08:41 01/19/2018
        processor count: 2;  memory size: 2048MB;  kernel size: 29476015
    boot device: /pci@800000020000011/pci1014,034A@0/sas/disk@a068818600:2
-------------------------------------------------------------------------------

without any proper login prompt. This most likely results in NovaLink being unable to find its boot device:

 Open in progress  

 Open Completed. 

 No OS image was detected by firmware.
 At least one disk in the bootlist was not found yet.
 Firmware is now retrying the entries in the bootlist.
 Press ctrl-C to stop retrying.

 No OS image was detected by firmware.
 At least one disk in the bootlist was not found yet.
 Firmware is now retrying the entries in the bootlist.
 Press ctrl-C to stop retrying.

#14 Updated by nicksinger about 2 years ago

VIOS continued but failed with the following message:

Saving Base Customize Data to boot disk
Starting the sync daemon
Starting the error daemon
System initialization completed.
TE=OFF
CHKEXEC=OFF
CHKSHLIB=OFF
CHKSCRIPT=OFF
CHKKERNEXT=OFF
STOP_UNTRUSTD=OFF
STOP_ON_CHKFAIL=OFF
LOCK_KERN_POLICIES=OFF
TSD_FILES_LOCK=OFF
TSD_LOCK=OFF
TEP=OFF
TLP=OFF
Successfully updated the Kernel Authorization Table.
Successfully updated the Kernel Role Table.
Successfully updated the Kernel Command Table.
Successfully updated the Kernel Device Table.
Successfully updated the Kernel Object Domain Table.
Successfully updated the Kernel Domains Table.
OPERATIONAL MODE Security Flags
ROOT                      :    ENABLED
TRACEAUTH                 :   DISABLED
System runtime mode is now OPERATIONAL MODE.
Setting tunable parameters...complete
Starting Multi-user Initialization
 Performing auto-varyon of Volume Groups 
 Activating all paging spaces 
0517-075 swapon: Paging device /dev/paging00 is already active.
swapon: Paging device /dev/hd6 activated.

The current volume is: /dev/hd1
Primary superblock is valid.

The current volume is: /dev/hd10opt
Primary superblock is valid.
 Performing all automatic mounts 
Replaying log for /dev/VMLibrary.
Multi-user initialization completed
Checking for srcmstr active...complete
Starting tcpip daemons:
0513-059 The syslogd Subsystem has been started. Subsystem PID is 5439528.
0513-059 The portmap Subsystem has been started. Subsystem PID is 7995554.
0513-059 The inetd Subsystem has been started. Subsystem PID is 6160604.
Finished starting tcpip daemons.
Starting NFS services:
0513-059 The biod Subsystem has been started. Subsystem PID is 6488268.
[STOP] Virtual I/O server detected, ASO will exit.

No clue where the other VIOS should come from. I'll have to continue "debugging" on Monday and may even need JLoeser to get further.

#15 Updated by coolo about 2 years ago

https://github.com/os-autoinst/os-autoinst/pull/909 has a backend and https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/4213 a bootloader to start NovaLink LPARs.

grenache was moved to QA network and reinstalled from scratch.

grenache has 10 LPARs configured (and, from my impression, could easily take more) with shared processors. grenache.qa.suse.de is the NovaLink LPAR and grenache-1.qa.suse.de runs the workers - 7 so far. Each of these workers is configured to install one LPAR (grenache-2.qa.suse.de .. grenache-8.qa.suse.de)
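The worker layout described above could be sketched in openQA's workers.ini along these lines. This is an illustration only: the instance numbers and the WORKER_CLASS value are assumptions, not taken from the actual grenache-1 configuration.

```ini
# Hypothetical sketch of /etc/openqa/workers.ini on grenache-1
# (values illustrative, not the real configuration)
[global]
HOST = http://localhost          # web UI runs on grenache-1 itself for now

[1]
WORKER_CLASS = pvm               # assumed class name to route pvm jobs here
# this instance would drive the LPAR grenache-2.qa.suse.de

[2]
WORKER_CLASS = pvm
# ... one section per worker instance, up to grenache-8.qa.suse.de
```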

I also put a web UI on grenache-1.qa.suse.de for the time being for easier development - there you can see that the serial handling / reconnect of the tests isn't working yet, but I lack the time to finish that sort of detail

#16 Updated by sebchlad about 2 years ago

  • Subject changed from [tools] PowerVM backend to [tools][functional] PowerVM backend
  • Assignee set to nicksinger

Nick, as we discussed: one can start using this setup for some testing and we could also improve the backend along the way.

#17 Updated by nicksinger about 2 years ago

  • Copied to action #32437: [tools][functional][8h][u][easy] Play around with the PowerVM backend added

#18 Updated by nicksinger about 2 years ago

  • Copied to deleted (action #32437: [tools][functional][8h][u][easy] Play around with the PowerVM backend)

#19 Updated by nicksinger about 2 years ago

  • Related to action #32437: [tools][functional][8h][u][easy] Play around with the PowerVM backend added

#21 Updated by okurz about 2 years ago

  • Subject changed from [tools][functional] PowerVM backend to [tools][functional][saga] PowerVM backend
  • Category set to 132
  • Target version set to Milestone 18

#22 Updated by ldevulder about 2 years ago

Would it be possible to add multi-machine support and test this for ppc64le HA? :) It could save us a lot of time during build validation.

#23 Updated by coolo about 2 years ago

I don't think there is an inherent problem - but you would need to undo a lot of qemu-specific hardcoding. You can create VLANs for the LPARs and then run 2 LPARs with PARALLEL_WITH.
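The PARALLEL_WITH pairing mentioned above could look roughly like the following test suite settings. This is a sketch under assumptions: the suite names "ha_node" and "ha_support_server" and the WORKER_CLASS value are made up for illustration.

```ini
# Hypothetical openQA test suite settings for an HA pair on pvm.
# Settings of the child suite "ha_node" (names are illustrative):
PARALLEL_WITH = ha_support_server   # schedule together with the server job
WORKER_CLASS = pvm                  # assumed class pinning both jobs to the LPAR workers
```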

#24 Updated by nicksinger about 2 years ago

  • Copied to action #33337: [tools][functional][medium][pvm][u] Implement proper handling of reconnects added

#25 Updated by nicksinger about 2 years ago

  • Copied to deleted (action #33337: [tools][functional][medium][pvm][u] Implement proper handling of reconnects)

#26 Updated by nicksinger about 2 years ago

  • Blocked by action #33337: [tools][functional][medium][pvm][u] Implement proper handling of reconnects added

#27 Updated by nicksinger about 2 years ago

  • Blocked by action #33340: [tools][functional][u][medium][pvm] Enable graphical installation for the powerVM backend added

#28 Updated by nicksinger about 2 years ago

  • Blocks action #33388: [functional][u][easy][pvm] Implement proper split from other backends added

#29 Updated by nicksinger about 2 years ago

  • Precedes action #33697: [functional][u][hard][pvm] Enable the powerVM backend to conduct multimachine tests added

#31 Updated by okurz almost 2 years ago

  • Subject changed from [tools][functional][saga] PowerVM backend to [functional][saga] PowerVM backend

I see where some of the confusion came from now. I assume that the [tools] team is not really pushing this but QSF is expected to do it, so I am assigning [functional] only. I am happy if the [tools] team becomes the primary driver and reverts my decision though.

#32 Updated by okurz almost 2 years ago

  • Due date set to 28/08/2018

#33 Updated by okurz almost 2 years ago

  • Due date changed from 28/08/2018 to 31/07/2018
  • Priority changed from Normal to High

Let's see if we can bring this closer. Based on the related tasks I deem 2018-07 feasible.

#34 Updated by okurz almost 2 years ago

  • Target version changed from Milestone 18 to Milestone 18

#35 Updated by okurz almost 2 years ago

  • Description updated (diff)

Added one acceptance criterion. IMHO #25594 better describes "putting workers into the production osd environment" than this one.

#36 Updated by okurz over 1 year ago

  • Subject changed from [functional][saga] PowerVM backend to [functional][u][saga] PowerVM backend

#37 Updated by okurz over 1 year ago

  • Blocked by action #38513: [functional][u][pvm] Have any test scenario showing up with useful test results conducted on pvm backend triggered as part of SLE12SP4 validation tests added

#38 Updated by okurz over 1 year ago

  • Blocked by deleted (action #33340: [tools][functional][u][medium][pvm] Enable graphical installation for the powerVM backend)

#39 Updated by okurz over 1 year ago

  • Related to action #33340: [tools][functional][u][medium][pvm] Enable graphical installation for the powerVM backend added

#40 Updated by okurz over 1 year ago

  • Status changed from New to Resolved

Only a formality was missing. #33340 was not really "blocking" this issue and is now just marked as related.
