action #132593

closed

coordination #121855: [epic] Agama web interactive installation

Run manual testing for Agama on powerpc64le kvm

Added by JERiveraMoya 10 months ago. Updated 8 months ago.

Status: Resolved
Priority: High
Assignee:
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Motivation

There are several Agama images for ppc64le:
Agama on ALP: https://download.suse.de/ibs/home:/gmoro:/alp/SUSE_ALP_Products_Micro_1.0_images/iso/
Agama (devel branch) on openSUSE: https://download.opensuse.org/repositories/systemsmanagement:/Agama:/Devel/images/
There are also many more staging (experimental) repositories.

Acceptance criteria

AC1: Conduct a manual testing session with the ppc64le images provided in OBS to check that they are bootable.
AC2: For Agama on ALP, contact @gmoro about the results in #proj-alp-wg-bootable-images (low priority for now).
AC3: Additionally, test the installation itself and file bugs for any problems found.

Additional information

An openQA worker could be used, or simply run the command: qemu-system-ppc64 -m 2048 -vga none -nographic -cdrom image.iso
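
The boot check from AC1 could be wrapped around the command above. This is only a minimal sketch, not the procedure used in this ticket: the function name boot_check_cmd, the 300-second default and the use of coreutils timeout are assumptions, and only the qemu options come from the command above.

```shell
#!/bin/sh
# Hypothetical helper: prints the qemu command line for a boot check.
# Only the qemu options come from the ticket; the function name, the
# default time limit and the `timeout` wrapper are illustrative assumptions.
boot_check_cmd() {
    iso=${1:?usage: boot_check_cmd IMAGE.iso [SECONDS]}
    limit=${2:-300}
    # `timeout` stops the VM after $limit seconds so a hanging image does
    # not block the session. `--foreground` lets qemu keep reading from the
    # terminal, which -nographic needs for the serial console.
    printf 'timeout --foreground %s qemu-system-ppc64 -m 2048 -vga none -nographic -cdrom %s\n' \
        "$limit" "$iso"
}

# Print the command for review instead of running it.
boot_check_cmd image.iso
```

Printing the command first makes it easy to review or adapt (for example, adding a disk) before pasting it into a shell on the test machine.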


Files

y2log-UPNIh1.tar.xz (165 KB): save_y2logs AFTER creating gpt label (gparted -> mklabel gpt) then install and reboot. JRivrain, 2023-07-28 14:49
y2log-zh5dhh.tar.xz (176 KB): logs of "install" (agama install command). JRivrain, 2023-07-28 15:03
y2log-f5ciZb.tar.xz (180 KB): patterns need to be selected to install: alp_base. JRivrain, 2023-08-02 18:22
y2log-Vh2x7X.tar.xz (152 KB): "cannot connect to d-bus" even after multiple attempts. JRivrain, 2023-08-03 20:10

Related issues 1 (0 open, 1 closed)

Related to qe-yam - action #134027: Run manual testing for Agama on aarch64 (Rejected, 2023-08-09)

Actions #1

Updated by JERiveraMoya 10 months ago

  • Description updated (diff)
Actions #2

Updated by leli 10 months ago

  • Status changed from Workable to In Progress
  • Assignee set to leli
Actions #3

Updated by leli 10 months ago

  • Status changed from In Progress to Workable
  • Assignee deleted (leli)

I can't download the image, so I'm releasing the ticket for others to pick up.

Actions #4

Updated by JERiveraMoya 10 months ago

  • Priority changed from Normal to Urgent
Actions #5

Updated by JRivrain 9 months ago

  • Assignee set to JRivrain
Actions #6

Updated by JRivrain 9 months ago

  • Status changed from Workable to In Progress
Actions #7

Updated by JRivrain 9 months ago

Both default and ALP images boot, I will do further testing tomorrow.

Actions #8

Updated by JRivrain 9 months ago

I'm currently testing in virt-manager on my own laptop, as I'm not confident about hacking on a production server. It is a simple setup: qemu-system-ppc64le (POWER9 emulated on x86_64) with VGA, 9G of RAM, and SCSI storage on qcow2.
There is no running X server (I'm not sure whether that is expected), and the storage detection hangs indefinitely. This needs more research, as I'm not sure my test environment is correct.

Actions #9

Updated by JRivrain 9 months ago

I wondered whether the problem came from using an emulated ppc on x86, so I created a VM on one of the QA servers as follows:

qemu-system-ppc64 -m 4096 -enable-kvm -vga none -nographic \
  -cdrom /var/lib/openqa/share/factory/iso/agama-live.ppc64le-2.1.0-ALP-Build4.6.iso \
  -device virtio-scsi-pci,id=scsi0 \
  -blockdev driver=file,node-name=hd0-file,filename=img.qcow2,cache.no-flush=on \
  -blockdev driver=qcow2,node-name=hd0,file=hd0-file,cache.no-flush=on,discard=unmap \
  -device virtio-blk,id=hd0-device,drive=hd0,serial=hd0 \
  -net nic, -net user,hostfwd=tcp::9091-:9090 \
  -smp 1 \
  -machine usb=off,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken,cap-ccf-assist=off \
  -cpu host

But I'm getting the same result: the storage stays undetected. I wonder whether it has something to do with qemu rather than the ISO itself. So with two different setups (a SCSI device in my local VM, virtio on the server), the storage detection never finishes.
It looks like parted returns the error "unrecognized disk label". I also saw this on an openQA worker, so it is not the fault of my setup.
So I tried creating a partition label, then ran the following:
/usr/bin/agama config set software.product=ALP-Micro
/usr/bin/agama config set user.userName=bernhard user.password=nots3cr3t
/usr/bin/agama config set root.password=nots3cr3t
/usr/bin/sleep 30
/usr/bin/agama install
But it looks like the disk got partitioned, and then nothing else happened.
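
For anyone reproducing this, the manual steps above could be collected into one script. This is a sketch under assumptions, not the exact session: DRY_RUN, DISK, the /dev/vda default and the function names are hypothetical, and the label step uses standard parted usage (parted -s DEVICE mklabel gpt). The agama commands are the ones quoted above. The product ID is an argument because it may differ between images.

```shell
#!/bin/sh
# Sketch of the manual steps as one script. DRY_RUN, DISK and the function
# names are hypothetical; the agama commands are the ones from the ticket.
DRY_RUN=${DRY_RUN:-1}      # default: only print the commands
DISK=${DISK:-/dev/vda}     # assumed device name, check with lsblk first

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

agama_cli_install() {
    product=${1:-ALP-Micro}
    # WARNING: mklabel wipes the existing partition table on $DISK.
    run parted -s "$DISK" mklabel gpt &&
    run /usr/bin/agama config set "software.product=$product" &&
    run /usr/bin/agama config set user.userName=bernhard user.password=nots3cr3t &&
    run /usr/bin/agama config set root.password=nots3cr3t &&
    run /usr/bin/sleep 30 &&
    run /usr/bin/agama install
}

# With the default DRY_RUN=1 this only lists the six steps.
agama_cli_install
```

The steps are chained with && so that a real run (DRY_RUN=0) stops at the first failing command, which makes it easier to see which step to look up in the logs.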

This problem does not occur with x86_64 images.

I left the VM on with access to the webui for the devs to take a look.
See attached logs.

Actions #12

Updated by IGonzalezSosa 9 months ago

Hi all,

In the logs, I see that it was not possible to select the patterns to install: "Failed to select default product patterns:
alp-micro-base, alp-micro-cockpit, alp-micro-container_runtime, alp-micro-hardware, alp-micro-selinux".

We have made quite a few changes to the product definitions in Agama's configuration, so using the latest version (from our Staging repo) should work. Beware that the shipped product is now "ALP-Dolomite" rather than "ALP-Micro".

Some references:

Actions #13

Updated by JRivrain 9 months ago

So I tried one of the new images; there is good and bad news.

Good: on my local VM, which emulates a POWER9 CPU, the disk is now detected with agama-live.ppc64le-3.0.0-ALP-Build1.3.iso, and the X server starts.
Bad:

  • Kernel panic on POWER8 machines, also affecting all Tumbleweed tests in openQA, reported at https://bugzilla.suse.com/show_bug.cgi?id=1213915.
  • On a POWER9 CPU, with the image agama-live.ppc64le-3.0.0-Opensuse-Build1.3.iso, we are stuck at "cannot connect to d-bus" even after multiple refreshes. I'll add more info on this tomorrow.
  • The X server starts but not the web UI; we need to connect to it from outside the VM.
  • On the ALP image we cannot press Install because of "These patterns need to be selected to install: alp_base". Logs attached.
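
Since the web UI has to be reached from outside the VM, a quick reachability check from the host can help tell "UI not served" apart from "UI served but failing". This is a hedged sketch: the function names are hypothetical, the https scheme is an assumption, and the default port 9091 assumes a qemu user-network forward like hostfwd=tcp::9091-:9090.

```shell
#!/bin/sh
# Hypothetical helpers for checking the web UI from outside the VM.
# Assumptions: the UI answers over https on guest port 9090, and the host
# forwards port 9091 to it (qemu option hostfwd=tcp::9091-:9090).
webui_url() {
    host=${1:-localhost}
    port=${2:-9091}
    printf 'https://%s:%s/\n' "$host" "$port"
}

webui_status() {
    # -k accepts a self-signed certificate, -s silences progress output,
    # -o /dev/null discards the body, -w prints only the HTTP status code.
    # curl prints 000 when no connection could be made.
    curl -k -s -o /dev/null --max-time 10 -w '%{http_code}\n' "$(webui_url "$@")"
}

# Show the URL to open in a browser on the host.
webui_url localhost 9091
```

Calling webui_status with the same arguments prints the HTTP status code, so 000 points at the forward or the service, while any real status code means something is listening.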

Actions #15

Updated by JRivrain 9 months ago

Adding the log from agama-live.ppc64le-3.0.0-openSUSE-Build2.3.iso, which shows the issue "cannot connect to d-bus" even after multiple attempts.

Actions #16

Updated by ancorgs 9 months ago

The problem with ALP is caused by incomplete repositories. The ALP crew (basically Jiri Srain) is working to fix the repositories.

The problem with openSUSE is caused by an error in the definition of the Agama configuration for Leap16 for PPC. Imobach is working on a fix.

Actions #17

Updated by IGonzalezSosa 9 months ago

The problem with PPC is fixed in systemsmanagement:Agama:Devel, which is supposed to be the stable image. The image in the Staging repo is broken, but because of a different problem.

Actions #18

Updated by leli 9 months ago

  • Related to action #134027: Run manual testing for Agama on aarch64 added
Actions #19

Updated by JRivrain 9 months ago

Testing of agama-live.ppc64le-3.0.0-openSUSE-Build2.12.iso:

Good: It installs!
Bad:

  • No X
  • Had to hit "reload" three times before getting d-bus (on an emulated CPU, though; this might not happen on an actual ppc machine).
  • It says "Installing packages (44 remains)". There should be no "s" if it means "44 packages remain".
  • The command "agama info" panics:

localhost:~ # RUST_BACKTRACE=full agama info 2>&1 |tee log
thread 'main' panicked at 'not implemented', agama-cli/src/main.rs:134:14
stack backtrace:
0: 0x131e2a5f8 - ::fmt::h27986f626f3d266d
1: 0x131e7fc08 - core::fmt::write::h04b3adcc86b42705
2: 0x131e3aae8 - std::io::Write::write_fmt::h381f783c180f3aee
3: 0x131e2a360 - std::sys_common::backtrace::print::hc447b4a25b999336
4: 0x131e35f3c - std::panicking::default_hook::{{closure}}::h0e42885f268339bf
5: 0x131e35abc - std::panicking::default_hook::h94e6d0826a7c5e4a
6: 0x131e36714 - std::panicking::rust_panic_with_hook::haf71e88ac2c4b719
7: 0x131e2aa04 - std::panicking::begin_panic_handler::{{closure}}::h2f3a545a69506f4b
8: 0x131e2a788 - std::sys_common::backtrace::rust_end_short_backtrace::h5db65786be6a5fe7
9: 0x131e362dc - rust_begin_unwind
10: 0x1317427f4 - core::panicking::panic_fmt::hb15c6763dc815b03
11: 0x1317428d0 - core::panicking::panic::h3c8f2c3c89f8922d
12: 0x131851478 - std::thread::local::LocalKey::with::h0f28b2fd2b3e9638
13: 0x131801ea4 - as core::future::future::Future>::poll::h0ef437adb6f6941d
14: 0x13180a578 - async_io::driver::block_on::hbe3eb62c8b348b7a
15: 0x131756ec4 - async_global_executor::executor::block_on::h8691131bcea2bb85
16: 0x131856048 - std::thread::local::LocalKey::with::h7cdea6092b3b117d
17: 0x131856410 - std::thread::local::LocalKey::with::h8646b83cbffb9d12
18: 0x131755c84 - async_std::task::builder::Builder::blocking::h391cfc366c82763b
19: 0x13175f93c - agama::main::hc7a6c33c490dec78
20: 0x1318130c4 - std::sys_common::backtrace::rust_begin_short_backtrace::hdfc7e20342f649eb
21: 0x1318bf70c - std::rt::lang_start::{{closure}}::hb43d386c47f7ddb6
22: 0x131e3616c - std::panicking::try::h143417f27bf52a68
23: 0x131e52088 - std::rt::lang_start_internal::h40c0157c729a6ae5
24: 0x13175f9a4 - main
25: 0x7fff8e848cac - __libc_start_call_main
26: 0x7fff8e848eec - __libc_start_main@GLIBC_2.17
27: 0x0 -

Actions #20

Updated by JRivrain 9 months ago

  • Description updated (diff)
  • Priority changed from Urgent to High
Actions #21

Updated by JERiveraMoya 8 months ago

  • Status changed from In Progress to Resolved

The goal of this ticket was to support the bootable images work group, and it seems we went the extra mile.
In the future we might need extra manual testing for exotic architectures, which would be a good moment to introduce automation for them (most likely when ALP is in IBS and we have proper openQA workers for it).
Resolving it. Thanks!
