action #126083
open[WSL] Deploy WSL image on Windows for Arm machine
70%
Description
Hello team
Guillaume_G was asking in https://etherpad.opensuse.org/p/ReleaseEngineering-20230315#L93 to upload our x86_64 WSL-DistroLauncher binaries.
The .appx can be found at https://build.opensuse.org/package/binaries/Virtualization:WSL/kiwi-images-wsl/openSUSE_Factory_ARM_images and at https://download.opensuse.org/ports/aarch64/tumbleweed/appliances/
We're generally okay with doing so, provided we test this scenario in openQA.
Lubos
Updated by ggardet_arm over 1 year ago
APPX image is available at https://download.opensuse.org/ports/aarch64/tumbleweed/appliances/
Updated by maritawerner over 1 year ago
- Tags set to qac
I think "cloud-qa" does not exist, but WSL sounds like a topic for the qac team.
Updated by jlausuch over 1 year ago
- Tags changed from qac to qac, new_test
- Project changed from openQA Tests to 199
- Subject changed from [cloud-qa] please add a test to deploy WSL image on Windows for Arm machine to Deploy WSL image on Windows for Arm machine
- Status changed from New to Workable
Updated by ggardet_arm over 1 year ago
Fabian investigated running the tests within QEMU, as is done for x86_64, but ran into some problems.
Another solution would be to run the tests on bare metal (such as a Windows Dev Kit 2023) with the generalhw backend.
Updated by ggardet_arm over 1 year ago
The notes from Fabian: https://gist.github.com/Vogtinator/293c4f90c5e92838f7e72610725905fd
Updated by favogt over 1 year ago
The main blocker is that Win11 on arm64 does not support serial ports, so the code to run commands and get their output or even wait until they're finished does not work.
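To illustrate why a working serial port matters, here is a minimal sketch of the general marker technique: run a command, write its output plus an end marker carrying the exit code to the serial line, and have the test side wait for that marker. This is not the actual openQA code. The file behind SERIAL_LOG merely stands in for the serial port, and the function names run_with_marker and wait_for_marker are made up for this example.

```shell
#!/bin/sh
# Sketch only: SERIAL_LOG stands in for the guest's serial port.
# run_with_marker and wait_for_marker are hypothetical names, not openQA API.
SERIAL_LOG="${SERIAL_LOG:-$(mktemp)}"

# Run a command, then append its output and an end marker of the form
# <marker>-<exit code>- to the "serial port".
run_with_marker() {
    marker="$1"; shift
    status=0
    output=$("$@" 2>&1) || status=$?
    printf '%s\n%s-%s-\n' "$output" "$marker" "$status" >> "$SERIAL_LOG"
}

# Poll the serial log until the marker shows up or the timeout (in seconds)
# expires. On success, print the exit code recorded in the marker line.
wait_for_marker() {
    marker="$1"; timeout="${2:-10}"
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        line=$(grep -E "^${marker}-[0-9]+-\$" "$SERIAL_LOG" 2>/dev/null | tail -n 1) || true
        if [ -n "$line" ]; then
            code=${line#"${marker}-"}
            printf '%s\n' "${code%-}"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

Without a serial port there is nowhere for the marker to arrive, so the test side can neither read the output nor tell when the command has finished.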
Updated by ph03nix about 1 year ago
- Subject changed from Deploy WSL image on Windows for Arm machine to [difficult] Deploy WSL image on Windows for Arm machine
- Priority changed from Normal to Low
Lowering priority as this task appears very difficult and we have more important tasks to do now.
Updated by favogt about 1 year ago
- Status changed from Workable to In Progress
- Assignee set to favogt
With the latest insider build I managed to get the serial port working.
I updated the gist at https://gist.github.com/Vogtinator/293c4f90c5e92838f7e72610725905fd accordingly.
The only missing piece is that, for some reason, SMB doesn't work; it only shows weird errors.
I added a workaround to use HTTP instead: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/17701
With that, the WSL1 test passes; I have already added it to the dev group.
WSL2 needs Hyper-V to work in the guest, which requires either nested virtualization (which we don't have) or software emulation (slow). I'm giving it a try with the latter, maybe it "just works" (tm).
Updated by ph03nix about 1 year ago
- Subject changed from [difficult] Deploy WSL image on Windows for Arm machine to Deploy WSL image on Windows for Arm machine
favogt wrote in #note-11:
With the latest insider build I managed to get the serial port working.
I updated the gist at https://gist.github.com/Vogtinator/293c4f90c5e92838f7e72610725905fd accordingly.
This is fantastic news! Thanks Fabian, that makes this task feasible now.
Updated by favogt about 1 year ago
For WSL1 tests, only https://github.com/os-autoinst/openqa-trigger-from-obs/pull/231 is missing to have it in the Dev group. It can be moved to the ARM main group if it proves reliable enough. I'm not sure whether this image has an expiration date built in; it's possible that after some time it warns about using an outdated preview image build.
I also experimented with WSL2 a bit, but didn't get it to work. With Hyper-V platform enabled (only possible on software emulation) it does not boot. I managed to enable hypervisordebug and attach WinDbg and it told me that EL3 is needed (i.e. -M virt,secure=on + ATF firmware), but then it fails even earlier due to an endless loop. With -cpu max it also tries to write to the HACR_EL2 register but that's not supported by current ATF, resulting in an "Undefined Instruction" exception. With -cpu neoverse-n1 this doesn't happen.
FTR, the setup to connect WinDbg:
To enable debugging in the VM with Hyper-V, run bcdedit. With Hyper-V and EL2 enabled, it no longer boots, so this needs to be done before enabling Hyper-V or in recovery mode:
# Create a boot entry without Hyper-V and debugging first
bcdedit /copy {Current} /d NoHyperV
bcdedit /set {uuid of ^} hypervisorlaunchtype off
# hvaa64.exe debugging
bcdedit /hypervisorsettings serial DEBUGPORT:1 BAUDRATE:115200
bcdedit /set {default} hypervisordebug on
# ntoskrnl.exe debugging
bcdedit /set {default} dbgtransport kdhvcom.dll
bcdedit /dbgsettings serial DEBUGPORT:1 BAUDRATE:115200
bcdedit /debug {default} on
The Hyper-V VM needs a -serial unix:/.../hyperv-serial,server,nodelay option, the VM with WinDbg a -serial unix:/.../windbg-serial,server,nowait,nodelay option.
WinDbg can't connect to this directly though, it has to be demuxed first (not documented anywhere, grrr!) using a tool from the Windows SDK Debug tools:
vmdemux.exe -src com:port=com1,baud=115200
Start the Hyper-V VM, then run socat unix-connect:/.../hyperv-serial unix-connect:/.../windbg-serial to connect the two.
At some point WinDbg can be connected to \\.\pipe\Vm0 for the Hypervisor and \\.\pipe\Vm1 for the kernel in the root partition.
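The socat step above can be wrapped so that it only starts once both QEMU serial sockets exist. This is a rough sketch, not part of the original setup. The helper names wait_for_socket and bridge_serial_sockets are hypothetical, and the socket paths are left to the caller because they are elided above.

```shell
#!/bin/sh
# Hypothetical helper: wait until a UNIX socket exists at the given path.
# Returns 1 if it has not appeared after <timeout> seconds.
wait_for_socket() {
    path="$1"; timeout="${2:-30}"
    elapsed=0
    while [ ! -S "$path" ]; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Hypothetical helper: connect the Hyper-V VM's serial socket to the
# WinDbg VM's serial socket, using the same socat call as in the note.
# Arguments: <hyperv socket path> <windbg socket path> [timeout in seconds]
bridge_serial_sockets() {
    hyperv_sock="$1"; windbg_sock="$2"; timeout="${3:-30}"
    wait_for_socket "$hyperv_sock" "$timeout" || return 1
    wait_for_socket "$windbg_sock" "$timeout" || return 1
    socat "unix-connect:$hyperv_sock" "unix-connect:$windbg_sock"
}
```

It would be called with the two socket paths passed to the -serial options of the respective VMs.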
Updated by favogt about 1 year ago
- Assignee changed from favogt to ggardet_arm
- % Done changed from 0 to 70
I also experimented with WSL2 a bit, but didn't get it to work. With Hyper-V platform enabled (only possible on software emulation) it does not boot. I managed to enable hypervisordebug and attach WinDbg and it told me that EL3 is needed (i.e. -M virt,secure=on + ATF firmware), but then it fails even earlier due to an endless loop. With -cpu max it also tries to write to the HACR_EL2 register but that's not supported by current ATF, resulting in an "Undefined Instruction" exception. With -cpu neoverse-n1 this doesn't happen.
After some more debugging and experimenting I was able to get it to work: https://openqa.opensuse.org/tests/3560224#step/wsl_cmd_check/32
As Windows for some reason fails to boot if ATF is used as EL3, I tried an awful hack: boot in EL2, but have QEMU tell the guest that EL3 is present:
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 0bb0585441..1cf23abc88 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2024,9 +2024,9 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
* feature registers as well.
*/
cpu->isar.id_pfr1 = FIELD_DP32(cpu->isar.id_pfr1, ID_PFR1, SECURITY, 0);
- cpu->isar.id_dfr0 = FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, COPSDBG, 0);
+ /*cpu->isar.id_dfr0 = FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, COPSDBG, 0);
cpu->isar.id_aa64pfr0 = FIELD_DP64(cpu->isar.id_aa64pfr0,
- ID_AA64PFR0, EL3, 0);
+ ID_AA64PFR0, EL3, 0);*/
/* Disable the realm management extension, which requires EL3. */
cpu->isar.id_aa64pfr0 = FIELD_DP64(cpu->isar.id_aa64pfr0,
With this patch, the -M virt,virtualization=on,gic-version=3,secure=off -cpu max (or neoverse-n1 instead of max to avoid the harmless undefined instruction exception) combination results in a booting system with Hyper-V enabled! As a PoC I put this on openqaworker22:/usr/bin/qemu-system-aarch64-poo126083 and pointed the win11_uefi_aarch64_wsl2 machine type to it.
To avoid the hack, it needs to be debugged why Windows fails to boot with ATF present. It might be enough to just get the latest ATF built with proper QEMU+OVMF+GICv3 support, such that using -M virt,virtualization=on,gic-version=3,secure=on -cpu max -bios atf.bin -kernel /usr/share/qemu/qemu-uefi-aarch64.bin works. @ggardet_arm, could you have a look there?
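For comparison, the two option sets from this note can be put side by side in a small helper. This is only a sketch. The function name qemu_wsl2_args and the variant names patched and atf are made up for this example, while the options themselves are copied from the text above.

```shell
#!/bin/sh
# Hypothetical helper: print the QEMU machine/CPU options for one variant.
#   patched : QEMU with the cpu.c hack, booting in EL2 (secure=off)
#   atf     : unpatched QEMU with ATF as EL3 firmware (secure=on); per the
#             note above, Windows did not boot with this combination yet
# The optional second argument selects the CPU model (default: max).
qemu_wsl2_args() {
    variant="$1"; cpu="${2:-max}"
    case "$variant" in
        patched)
            printf '%s\n' "-M virt,virtualization=on,gic-version=3,secure=off -cpu $cpu" ;;
        atf)
            printf '%s\n' "-M virt,virtualization=on,gic-version=3,secure=on -cpu $cpu -bios atf.bin -kernel /usr/share/qemu/qemu-uefi-aarch64.bin" ;;
        *)
            echo "unknown variant: $variant" >&2
            return 1 ;;
    esac
}
```

For example, qemu_wsl2_args patched neoverse-n1 prints the secure=off option set with the CPU model that avoids the undefined instruction exception.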
Updated by favogt about 1 year ago
favogt wrote in #note-14:
To avoid the hack it needs to be debugged why Windows fails to boot with ATF present. It might be enough to just get the latest ATF built with proper QEMU+OVMF+GICv3 support, such that using
-M virt,virtualization=on,gic-version=3,secure=on -cpu max -bios atf.bin -kernel /usr/share/qemu/qemu-uefi-aarch64.bin
works. @ggardet_arm, could you have a look there?
WinDbg shows HAL_INITIALIZATION_FAILED in HalpInitializeInterrupts. I suspect something that ATF does with the GIC isn't handled properly.
I found a way to boot it on vanilla QEMU 8.1, documented on https://gist.github.com/Vogtinator/293c4f90c5e92838f7e72610725905fd#file-wsl2-md. It needs one workaround for a QEMU bug which I fixed locally, I'll try to send that upstream.
The openQA workers run an older QEMU though, which is missing fixes for using -M virt,secure=on with -kernel, so that doesn't quite work yet.
Updated by ggardet_arm about 1 year ago
FTR, openQA tested WSL successfully: https://openqa.opensuse.org/tests/overview?distri=opensuse&version=Tumbleweed&build=20230917&groupid=38&flavor=WSL