
action #21838

[functional][u][saga] PowerVM backend

Added by RBrownSUSE almost 3 years ago. Updated almost 2 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Feature requests
Target version:
SUSE QA tests - Milestone 18
Start date:
2017-08-08
Due date:
2018-07-31
% Done:

0%

Estimated time:
Difficulty:
Duration: 256

Description

Motivation

Test, improve, and run the PowerVM backend once Jan has NovaLink running on SLES. This will allow openQA to run on the NovaLink host and therefore drive the pvm backend as currently available in os-autoinst.

Acceptance criteria

  • AC1: We have at least one standard openQA scenario running on a continuous base for at least one current product in development

Further details

This ticket does not (necessarily) include putting worker(s) into the production osd environment; that should be covered in #25594.


Related issues

Related to openQA Project - action #32437: [tools][functional][8h][u][easy] Play around with the PowerVM backend (Resolved, 2017-08-08 to 2018-03-27)

Related to openQA Tests - action #33340: [tools][functional][u][medium][pvm] Enable graphical installation for the powerVM backend (Resolved, 2018-03-15)

Blocked by openQA Project - action #33337: [tools][functional][medium][pvm][u] Implement proper handling of reconnects (Resolved, 2018-04-01 to 2018-07-31)

Blocks openQA Tests - action #33388: [functional][u][easy][pvm] Implement proper split from other backends (Resolved, 2019-02-17)

Blocked by openQA Tests - action #38513: [functional][u][pvm] Have any test scenario showing up with useful test results conducted on pvm backend triggered as part of SLE12SP4 validation tests (Resolved, 2018-03-15 to 2018-07-31)

Precedes openQA Project - action #33697: [tools][hard][pvm] Enable the powerVM backend to conduct multimachine tests (New, 2018-08-01)

History

#1 Updated by sebchlad almost 3 years ago

  • Assignee set to sebchlad

As discussed with Marita, this seems to be an important openQA feature for SLES15 testing.
I understand from Marita that this is currently handled by Jan L. from Stefan F.'s team, so I assume Jan is delivering this feature.

We should make sure we have the same understanding of responsibility here. We should also clarify whether we really need this for SLES15 testing and, if so, by when this feature should be enabled in openQA.

#2 Updated by sebchlad almost 3 years ago

  • Assignee changed from sebchlad to szarate

After a sync talk on 27.09.2017 with Marita and Santiago we agreed that Santiago will talk to Coolo about this backlog item and will clarify the done criteria for this.

AP Marita: create an epic in the SUSE QA tests project with a detailed plan description for the overall ppc arch.

#5 Updated by maritawerner almost 3 years ago

  • Parent task set to #25592

#7 Updated by szarate over 2 years ago

  • Parent task deleted (#25592)

On the openQA side there is already work done on this, but there is no hardware available for PowerVM inside the openQA infrastructure. (See https://progress.opensuse.org/issues/25594#note-4)

As no hardware is available and Ihno is looking for some, we are not planning to invest time in this task.

#9 Updated by szarate over 2 years ago

  • Assignee deleted (szarate)

Coolo mentioned today that he was going to experiment with grenache.

#10 Updated by coolo over 2 years ago

I have some prototype, but I'm blocked by grenache being in the arch network and the lpars constantly broken without a way to debug it :(

I have a ticket with Infra to move it into arch network.

#11 Updated by nicksinger over 2 years ago

FSP DHCP entries, LPAR DHCP entries and DNS entries have been created. The machine was moved to the lab and should be available; FSP1 is reachable over the webui. I'm not 100% certain about the network for the LPARs since this machine has quite a number of NICs. Let me know if anything is not working as expected.

#12 Updated by coolo over 2 years ago

I can't reach it :(

coolo@f84#~>ping grenache.qa.suse.de
PING grenache.qa.suse.de (10.162.6.235) 56(84) bytes of data.
From 10.160.255.253 (10.160.255.253) icmp_seq=1 Destination Host Unreachable
From 10.160.255.253 (10.160.255.253) icmp_seq=2 Destination Host Unreachable

#13 Updated by nicksinger over 2 years ago

It seems the VIOS/NovaLink did not survive the move. After several reboots I got the system back into the HMC and can access the two LPARs (VIOS and Nova) via mkvterm. The VIOS LPAR just stalls after:

-------------------------------------------------------------------------------
                       Welcome to the Virtual I/O Server.
                   boot image timestamp: 11:47:48 01/06/2016
                 The current time and date: 15:08:41 01/19/2018
        processor count: 2;  memory size: 2048MB;  kernel size: 29476015
    boot device: /pci@800000020000011/pci1014,034A@0/sas/disk@a068818600:2
-------------------------------------------------------------------------------

without any proper login prompt. This most likely results in NovaLink being unable to find its boot-device:

 Open in progress  

 Open Completed. 

 No OS image was detected by firmware.
 At least one disk in the bootlist was not found yet.
 Firmware is now retrying the entries in the bootlist.
 Press ctrl-C to stop retrying.

 No OS image was detected by firmware.
 At least one disk in the bootlist was not found yet.
 Firmware is now retrying the entries in the bootlist.
 Press ctrl-C to stop retrying.

#14 Updated by nicksinger over 2 years ago

VIOS continued but failed with the following message:

Saving Base Customize Data to boot disk
Starting the sync daemon
Starting the error daemon
System initialization completed.
TE=OFF
CHKEXEC=OFF
CHKSHLIB=OFF
CHKSCRIPT=OFF
CHKKERNEXT=OFF
STOP_UNTRUSTD=OFF
STOP_ON_CHKFAIL=OFF
LOCK_KERN_POLICIES=OFF
TSD_FILES_LOCK=OFF
TSD_LOCK=OFF
TEP=OFF
TLP=OFF
Successfully updated the Kernel Authorization Table.
Successfully updated the Kernel Role Table.
Successfully updated the Kernel Command Table.
Successfully updated the Kernel Device Table.
Successfully updated the Kernel Object Domain Table.
Successfully updated the Kernel Domains Table.
OPERATIONAL MODE Security Flags
ROOT                      :    ENABLED
TRACEAUTH                 :   DISABLED
System runtime mode is now OPERATIONAL MODE.
Setting tunable parameters...complete
Starting Multi-user Initialization
 Performing auto-varyon of Volume Groups 
 Activating all paging spaces 
0517-075 swapon: Paging device /dev/paging00 is already active.
swapon: Paging device /dev/hd6 activated.

The current volume is: /dev/hd1
Primary superblock is valid.

The current volume is: /dev/hd10opt
Primary superblock is valid.
 Performing all automatic mounts 
Replaying log for /dev/VMLibrary.
Multi-user initialization completed
Checking for srcmstr active...complete
Starting tcpip daemons:
0513-059 The syslogd Subsystem has been started. Subsystem PID is 5439528.
0513-059 The portmap Subsystem has been started. Subsystem PID is 7995554.
0513-059 The inetd Subsystem has been started. Subsystem PID is 6160604.
Finished starting tcpip daemons.
Starting NFS services:
0513-059 The biod Subsystem has been started. Subsystem PID is 6488268.
[STOP] Virtual I/O server detected, ASO will exit.

No clue where the other VIOS should come from. I have to continue "debugging" on Monday and may even need JLoeser to take it further.

#15 Updated by coolo over 2 years ago

https://github.com/os-autoinst/os-autoinst/pull/909 has a backend and https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/4213 a bootloader to start NovaLink LPARs.

grenache was moved to QA network and reinstalled from scratch.

grenache has 10 LPARs configured (and from my impression could easily take more) with shared processors. grenache.qa.suse.de is the NovaLink LPAR and grenache-1.qa.suse.de runs the workers, 7 so far. Each of these workers is configured to install one LPAR (grenache-2.qa.suse.de .. grenache-8.qa.suse.de).

I also put a webui on grenache-1.qa.suse.de for the time being for easier development. There you can see that the serial handling / reconnect of the tests isn't working yet, but I lack the time to finish that sort of detail.
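The one-worker-per-LPAR layout described above could be sketched as an openQA workers.ini fragment. This is purely illustrative: [global] HOST and WORKER_CLASS are standard openQA worker configuration, but the pvm-specific variable names (NOVALINK_HOSTNAME, PVM_LPAR) are assumptions, not verified against the backend in the linked pull request.

```ini
# Hypothetical workers.ini on grenache-1.qa.suse.de (illustrative only).
[global]
# Web UI the workers report to (the local instance mentioned above):
HOST = http://grenache-1.qa.suse.de

# One worker slot per target LPAR; the pvm variable names are assumptions:
[1]
WORKER_CLASS = pvm
NOVALINK_HOSTNAME = grenache.qa.suse.de   ; NovaLink LPAR driving the installs
PVM_LPAR = grenache-2.qa.suse.de          ; LPAR this slot installs

[2]
WORKER_CLASS = pvm
NOVALINK_HOSTNAME = grenache.qa.suse.de
PVM_LPAR = grenache-3.qa.suse.de
```

With seven such sections, each worker slot maps to one of grenache-2 through grenache-8.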

#16 Updated by sebchlad over 2 years ago

  • Subject changed from [tools] PowerVM backend to [tools][functional] PowerVM backend
  • Assignee set to nicksinger

Nick, as we discussed: one can start using this set-up for some testing, and we could also improve the backend along the way.

#17 Updated by nicksinger over 2 years ago

  • Copied to action #32437: [tools][functional][8h][u][easy] Play around with the PowerVM backend added

#18 Updated by nicksinger over 2 years ago

  • Copied to deleted (action #32437: [tools][functional][8h][u][easy] Play around with the PowerVM backend)

#19 Updated by nicksinger over 2 years ago

  • Related to action #32437: [tools][functional][8h][u][easy] Play around with the PowerVM backend added

#21 Updated by okurz over 2 years ago

  • Subject changed from [tools][functional] PowerVM backend to [tools][functional][saga] PowerVM backend
  • Category set to 132
  • Target version set to Milestone 18

#22 Updated by ldevulder over 2 years ago

Would it be possible to add multi-machine support and test this for ppc64le HA? :) It could save us a lot of time during build validation.

#23 Updated by coolo over 2 years ago

I don't think there is an inherent problem, but you would need to undo a lot of qemu-specific hardcoding. You can create VLANs for the LPARs and then have PARALLEL_WITH across 2 LPARs.
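The PARALLEL_WITH mechanism coolo refers to is standard openQA multi-machine scheduling: the child test suite names its parent in its settings, and the scheduler runs both jobs together. A sketch with invented suite names (the HA scenario itself is hypothetical here):

```ini
; Hypothetical openQA test suite settings (suite names are illustrative).
;
; Suite "ha_node" (child) - scheduled together with its parent:
PARALLEL_WITH = ha_support_server
WORKER_CLASS = pvm

; Suite "ha_support_server" (parent) - carries no PARALLEL_WITH itself:
WORKER_CLASS = pvm
```

For this to work on PowerVM, the LPARs would additionally need a shared VLAN, as coolo notes, since the usual qemu tap/ovs networking does not apply.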

#24 Updated by nicksinger over 2 years ago

  • Copied to action #33337: [tools][functional][medium][pvm][u] Implement proper handling of reconnects added

#25 Updated by nicksinger over 2 years ago

  • Copied to deleted (action #33337: [tools][functional][medium][pvm][u] Implement proper handling of reconnects)

#26 Updated by nicksinger over 2 years ago

  • Blocked by action #33337: [tools][functional][medium][pvm][u] Implement proper handling of reconnects added

#27 Updated by nicksinger over 2 years ago

  • Blocked by action #33340: [tools][functional][u][medium][pvm] Enable graphical installation for the powerVM backend added

#28 Updated by nicksinger over 2 years ago

  • Blocks action #33388: [functional][u][easy][pvm] Implement proper split from other backends added

#29 Updated by nicksinger over 2 years ago

  • Precedes action #33697: [tools][hard][pvm] Enable the powerVM backend to conduct multimachine tests added

#31 Updated by okurz about 2 years ago

  • Subject changed from [tools][functional][saga] PowerVM backend to [functional][saga] PowerVM backend

I see where some of the confusion came from now. I assume that the [tools] team is not really pushing this but QSF is expected to do it, so I am assigning [functional] only. I am happy if the [tools] team becomes the primary driver and reverts my decision, though.

#32 Updated by okurz about 2 years ago

  • Due date set to 2018-08-28

#33 Updated by okurz about 2 years ago

  • Due date changed from 2018-08-28 to 2018-07-31
  • Priority changed from Normal to High

Let's see if we can bring this closer. Based on the related tasks I deem 2018-07 feasible.

#34 Updated by okurz about 2 years ago

  • Target version changed from Milestone 18 to Milestone 18

#35 Updated by okurz about 2 years ago

  • Description updated (diff)

Added one acceptance criterion. IMHO #25594 better describes "putting workers into the production osd environment" than this one.

#36 Updated by okurz almost 2 years ago

  • Subject changed from [functional][saga] PowerVM backend to [functional][u][saga] PowerVM backend

#37 Updated by okurz almost 2 years ago

  • Blocked by action #38513: [functional][u][pvm] Have any test scenario showing up with useful test results conducted on pvm backend triggered as part of SLE12SP4 validation tests added

#38 Updated by okurz almost 2 years ago

  • Blocked by deleted (action #33340: [tools][functional][u][medium][pvm] Enable graphical installation for the powerVM backend)

#39 Updated by okurz almost 2 years ago

  • Related to action #33340: [tools][functional][u][medium][pvm] Enable graphical installation for the powerVM backend added

#40 Updated by okurz almost 2 years ago

  • Status changed from New to Resolved

Only a formality was missing. #33340 was not really "blocking" this issue but is now just related.
