action #107980
openqa-aarch64 - kernel traces in dmesg "watchdog: BUG: soft lockup - CPU#.* stuck for .*s! [qemu-system-aar:.*]"
Description
The openqa-aarch64 machine shows kernel traces in dmesg. This may explain the performance issues / mistyping issues: https://progress.opensuse.org/issues/54914
[ 8545.240606] ------------[ cut here ]------------ [ 8545.240613] NETDEV WATCHDOG: eth0 (hns-nic): transmit queue 5 timed out [ 8545.240639] WARNING: CPU: 16 PID: 0 at ../net/sched/sch_generic.c:468 dev_watchdog+0x314/0x320 [ 8545.240640] Modules linked in: af_packet nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nft_reject nft_ct tun nfnetlink_cttimeout iscsi_ibft iscsi_boot_sysfs rfkill nft_chain_nat openvswitch nsh nf_conncount nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables ipmi_ssif marvell ext4 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce joydev ses gf128mul enclosure sha1_ce ipmi_si efi_pstore(N) sbsa_gwdt ipmi_devintf ipmi_msghandler button hns_dsaf hns_enet_drv hns_mdio hnae fuse btrfs libcrc32c xor xor_neon zlib_deflate raid6_pq sd_mod t10_pi hid_generic usbhid hibmc_drm drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec [ 8545.240705] hisi_sas_v2_hw ehci_platform rc_core hisi_sas_main drm_ttm_helper libsas ehci_hcd ttm scsi_transport_sas drm usbcore libata sha2_ce sha256_arm64 i2c_designware_platform i2c_designware_core overlay sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod efivarfs [ 8545.240724] Supported: No, Unsupported modules are loaded [ 8545.240728] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G N 5.3.18-150300.59.49-default #1 SLE15-SP3 [ 8545.240729] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [ 8545.240731] pstate: 40000005 (nZcv daif -PAN -UAO) [ 8545.240733] pc : dev_watchdog+0x314/0x320 [ 8545.240735] lr : dev_watchdog+0x314/0x320 [ 8545.240736] sp : ffff80001231bd80 [ 8545.240737] x29: ffff80001231bd80 x28: 0000000000000002 [ 
8545.240739] x27: ffff800011335000 x26: 0000000000000140 [ 8545.240741] x25: 00000000ffffffff x24: 0000000000000000 [ 8545.240743] x23: ffff800011335000 x22: ffff009fa9f33480 [ 8545.240745] x21: ffff800011797000 x20: ffff009fa9f33000 [ 8545.240746] x19: 0000000000000005 x18: ffffffffffffffff [ 8545.240748] x17: 0000000000000000 x16: 0000000000000007 [ 8545.240750] x15: ffff800011799908 x14: ffff8000119bcce0 [ 8545.240752] x13: ffff8000119bc933 x12: ffff8000117c2000 [ 8545.240754] x11: 0000000000000000 x10: ffff8000119bb000 [ 8545.240756] x9 : 0000000000000000 x8 : 0000000000000001 [ 8545.240758] x7 : 00000000000005af x6 : 0000003571b8a92a [ 8545.240759] x5 : 0000000000000001 x4 : 0000000000000000 [ 8545.240761] x3 : ffff80001231bb10 x2 : ffff001fbba02208 [ 8545.240763] x1 : dcdbc4d1d4d45200 x0 : 0000000000000000 [ 8545.240765] Call trace: [ 8545.240768] dev_watchdog+0x314/0x320 [ 8545.240773] call_timer_fn+0x3c/0x180 [ 8545.240774] expire_timers+0x9c/0x168 [ 8545.240777] run_timer_softirq+0x218/0x2a8 [ 8545.240780] __do_softirq+0x11c/0x320 [ 8545.240784] irq_exit+0x108/0x120 [ 8545.240787] __handle_domain_irq+0x6c/0xc0 [ 8545.240788] gic_handle_irq+0xf4/0x2a0 [ 8545.240790] el1_irq+0xcc/0x180 [ 8545.240794] arch_cpu_idle+0x34/0x1c8 [ 8545.240800] default_idle_call+0x24/0x48 [ 8545.240803] do_idle+0x1dc/0x2e0 [ 8545.240804] cpu_startup_entry+0x2c/0x30 [ 8545.240807] secondary_start_kernel+0x1b4/0x258 [ 8545.240809] ---[ end trace 55439782d98aa8d8 ]--- [ 8545.240815] hns-nic HISI00C2:00 eth0: watchdog_timo changed to 1000. [14000.639084] watchdog: BUG: soft lockup - CPU#10 stuck for 26s! 
[qemu-system-aar:29364] [14000.646991] Modules linked in: af_packet nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nft_reject nft_ct tun nfnetlink_cttimeout iscsi_ibft iscsi_boot_sysfs rfkill nft_chain_nat openvswitch nsh nf_conncount nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables ipmi_ssif marvell ext4 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce joydev ses gf128mul enclosure sha1_ce ipmi_si efi_pstore(N) sbsa_gwdt ipmi_devintf ipmi_msghandler button hns_dsaf hns_enet_drv hns_mdio hnae fuse btrfs libcrc32c xor xor_neon zlib_deflate raid6_pq sd_mod t10_pi hid_generic usbhid hibmc_drm drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec [14000.647059] hisi_sas_v2_hw ehci_platform rc_core hisi_sas_main drm_ttm_helper libsas ehci_hcd ttm scsi_transport_sas drm usbcore libata sha2_ce sha256_arm64 i2c_designware_platform i2c_designware_core overlay sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod efivarfs [14000.647079] Supported: No, Unsupported modules are loaded [14000.647084] CPU: 10 PID: 29364 Comm: qemu-system-aar Tainted: G W N 5.3.18-150300.59.49-default #1 SLE15-SP3 [14000.647086] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [14000.647088] pstate: 80000005 (Nzcv daif -PAN -UAO) [14000.647097] pc : queued_spin_lock_slowpath+0x208/0x2d8 [14000.647102] lr : kvm_mmu_notifier_invalidate_range_start+0xdc/0xe0 [14000.647103] sp : ffff80001d6a3940 [14000.647105] x29: ffff80001d6a3940 x28: ffff800010bcaba0 [14000.647107] x27: ffff800010fa5000 x26: ffff800010fc9a28 [14000.647109] x25: ffff800010fa5b50 x24: 0000000000000000 [14000.647111] x23: 
ffff80001d6a3a90 x22: ffff009552781328 [14000.647113] x21: 0000000000000000 x20: 0000000000740101 [14000.647115] x19: ffff009552780000 x18: 0000000000000000 [14000.647118] x17: 0000000000740101 x16: 0000000000740101 [14000.647120] x15: 0000000000000000 x14: 0000000000000000 [14000.647121] x13: 0000000000000000 x12: 0000000000000000 [14000.647124] x11: 0000000000000000 x10: 0000000000000000 [14000.647125] x9 : 0000000000000007 x8 : ffff009552780002 [14000.647128] x7 : ffff80001134b740 x6 : 0000000000000000 [14000.647130] x5 : ffff001fbb956740 x4 : 00000000002c0000 [14000.647132] x3 : ffff001fbb956740 x2 : 0000000000000000 [14000.647133] x1 : 0000000000000000 x0 : ffff001fbb956748 [14000.647137] Call trace: [14000.647141] queued_spin_lock_slowpath+0x208/0x2d8 [14000.647143] kvm_mmu_notifier_invalidate_range_start+0xdc/0xe0 [14000.647148] __mmu_notifier_invalidate_range_start+0x9c/0x208 [14000.647151] try_to_unmap_one+0x8b4/0xb60 [14000.647153] rmap_walk_anon+0xe4/0x230 [14000.647156] rmap_walk+0x78/0xa0 [14000.647157] try_to_unmap+0xf4/0x130 [14000.647160] migrate_pages+0x9d4/0xc00 [14000.647162] migrate_misplaced_page+0x168/0x248 [14000.647165] __handle_mm_fault+0xf48/0x1148 [14000.647168] handle_mm_fault+0xe0/0x1b0 [14000.647172] do_page_fault+0x200/0x4d0 [14000.647175] do_translation_fault+0xb0/0xc0 [14000.647177] do_mem_abort+0x50/0xb0 [14000.647178] el0_da+0x24/0x28 [14000.699085] watchdog: BUG: soft lockup - CPU#28 stuck for 26s! 
[qemu-system-aar:29380] [14000.706993] Modules linked in: af_packet nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nft_reject nft_ct tun nfnetlink_cttimeout iscsi_ibft iscsi_boot_sysfs rfkill nft_chain_nat openvswitch nsh nf_conncount nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables ipmi_ssif marvell ext4 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce joydev ses gf128mul enclosure sha1_ce ipmi_si efi_pstore(N) sbsa_gwdt ipmi_devintf ipmi_msghandler button hns_dsaf hns_enet_drv hns_mdio hnae fuse btrfs libcrc32c xor xor_neon zlib_deflate raid6_pq sd_mod t10_pi hid_generic usbhid hibmc_drm drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec [14000.707043] hisi_sas_v2_hw ehci_platform rc_core hisi_sas_main drm_ttm_helper libsas ehci_hcd ttm scsi_transport_sas drm usbcore libata sha2_ce sha256_arm64 i2c_designware_platform i2c_designware_core overlay sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod efivarfs [14000.707057] Supported: No, Unsupported modules are loaded [14000.707062] CPU: 28 PID: 29380 Comm: qemu-system-aar Tainted: G W L N 5.3.18-150300.59.49-default #1 SLE15-SP3 [14000.707064] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [14000.707065] pstate: 80000005 (Nzcv daif -PAN -UAO) [14000.707072] pc : queued_spin_lock_slowpath+0x250/0x2d8 [14000.707075] lr : kvm_handle_guest_abort+0xaac/0xe88 [14000.707076] sp : ffff80001d60b9f0 [14000.707077] x29: ffff80001d60b9f0 x28: 0000000009b00000 [14000.707080] x27: 0000000000000000 x26: ffff00952c7146e0 [14000.707082] x25: 0000fffedb5d2000 x24: ffff009552780000 [14000.707084] x23: 0000000000000000 x22: 
0000000000080000 [14000.707086] x21: 000000009b5d2000 x20: 0000000000bc0101 [14000.707088] x19: ffff009552780000 x18: 0000000000000000 [14000.707090] x17: 0000000000bc0101 x16: 0000000000bc0101 [14000.707092] x15: 0000000000000000 x14: 0000000100000018 [14000.707094] x13: 0000000100000015 x12: 0000000200000014 [14000.707096] x11: 0000000100000013 x10: 0000ffff00000000 [14000.707098] x9 : ffff8000100b3e70 x8 : ffff009552780002 [14000.707100] x7 : ffff001fbb956740 x6 : 0000000000000000 [14000.707102] x5 : ffff001fbbb96740 x4 : 0000000000740000 [14000.707104] x3 : ffff001fbbb96740 x2 : 0000000000000001 [14000.707105] x1 : 0000000000440001 x0 : 0000000000000000 [14000.707108] Call trace: [14000.707111] queued_spin_lock_slowpath+0x250/0x2d8 [14000.707113] kvm_handle_guest_abort+0xaac/0xe88 [14000.707115] handle_exit+0x14c/0x1c8 [14000.707119] kvm_arch_vcpu_ioctl_run+0x29c/0x8a8 [14000.707122] kvm_vcpu_ioctl+0x490/0x8b0 [14000.707126] ksys_ioctl+0xb4/0xd0 [14000.707129] __arm64_sys_ioctl+0x28/0x38 [14000.707132] el0_svc_common.constprop.0+0x84/0x218 [14000.707134] el0_svc_handler+0x34/0x90 [14000.707137] el0_svc+0x10/0x14 [14000.759108] watchdog: BUG: soft lockup - CPU#46 stuck for 26s! 
[qemu-system-aar:29382] [14000.767019] Modules linked in: af_packet nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nft_reject nft_ct tun nfnetlink_cttimeout iscsi_ibft iscsi_boot_sysfs rfkill nft_chain_nat openvswitch nsh nf_conncount nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables ipmi_ssif marvell ext4 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce joydev ses gf128mul enclosure sha1_ce ipmi_si efi_pstore(N) sbsa_gwdt ipmi_devintf ipmi_msghandler button hns_dsaf hns_enet_drv hns_mdio hnae fuse btrfs libcrc32c xor xor_neon zlib_deflate raid6_pq sd_mod t10_pi hid_generic usbhid hibmc_drm drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec [14000.767095] hisi_sas_v2_hw ehci_platform rc_core hisi_sas_main drm_ttm_helper libsas ehci_hcd ttm scsi_transport_sas drm usbcore libata sha2_ce sha256_arm64 i2c_designware_platform i2c_designware_core overlay sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod efivarfs [14000.767116] Supported: No, Unsupported modules are loaded [14000.767122] CPU: 46 PID: 29382 Comm: qemu-system-aar Tainted: G W L N 5.3.18-150300.59.49-default #1 SLE15-SP3 [14000.767123] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [14000.767125] pstate: 80000005 (Nzcv daif -PAN -UAO) [14000.767136] pc : invalidate_icache_range+0x28/0x50 [14000.767140] lr : kvm_handle_guest_abort+0xc98/0xe88 [14000.767141] sp : ffff80001daf3a10 [14000.767142] x29: ffff80001daf3a10 x28: 0000000009b00000 [14000.767144] x27: 0000000000000000 x26: ffff0096d76ec6e0 [14000.767146] x25: 0000fffedb5d1000 x24: ffff009552780000 [14000.767148] x23: 0000000000000000 x22: 
0000000000080000 [14000.767149] x21: 000000009b5d1000 x20: 000000009b5d1000 [14000.767151] x19: ffff800011799000 x18: 0000000000000000 [14000.767153] x17: 06ffff800001000c x16: 0000000000000000 [14000.767154] x15: 0000000000000000 x14: 0000000000000000 [14000.767156] x13: 0000000000000000 x12: 0000000000000000 [14000.767157] x11: 0000000000000000 x10: 0000ffff00000000 [14000.767159] x9 : 0000000000000000 x8 : ffff009552780002 [14000.767161] x7 : 0000000ffc000000 x6 : 0000000000000018 [14000.767162] x5 : ffff800011a669f8 x4 : ffff800011a668d8 [14000.767164] x3 : ffff009b2b7fbb00 x2 : 0000000000000040 [14000.767165] x1 : ffff009b40000000 x0 : ffff009b00000000 [14000.767168] Call trace: [14000.767170] invalidate_icache_range+0x28/0x50 [14000.767175] handle_exit+0x14c/0x1c8 [14000.767180] kvm_arch_vcpu_ioctl_run+0x29c/0x8a8 [14000.767184] kvm_vcpu_ioctl+0x490/0x8b0 [14000.767189] ksys_ioctl+0xb4/0xd0 [14000.767191] __arm64_sys_ioctl+0x28/0x38 [14000.767196] el0_svc_common.constprop.0+0x84/0x218 [14000.767198] el0_svc_handler+0x34/0x90 [14000.767200] el0_svc+0x10/0x14 [14004.769068] watchdog: BUG: soft lockup - CPU#48 stuck for 22s! 
[migration/48:252] [14004.776549] Modules linked in: af_packet nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nft_reject nft_ct tun nfnetlink_cttimeout iscsi_ibft iscsi_boot_sysfs rfkill nft_chain_nat openvswitch nsh nf_conncount nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables ipmi_ssif marvell ext4 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce joydev ses gf128mul enclosure sha1_ce ipmi_si efi_pstore(N) sbsa_gwdt ipmi_devintf ipmi_msghandler button hns_dsaf hns_enet_drv hns_mdio hnae fuse btrfs libcrc32c xor xor_neon zlib_deflate raid6_pq sd_mod t10_pi hid_generic usbhid hibmc_drm drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec [14004.776626] hisi_sas_v2_hw ehci_platform rc_core hisi_sas_main drm_ttm_helper libsas ehci_hcd ttm scsi_transport_sas drm usbcore libata sha2_ce sha256_arm64 i2c_designware_platform i2c_designware_core overlay sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod efivarfs [14004.776648] Supported: No, Unsupported modules are loaded [14004.776656] CPU: 48 PID: 252 Comm: migration/48 Tainted: G W L N 5.3.18-150300.59.49-default #1 SLE15-SP3 [14004.776658] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [14004.776660] pstate: 60000005 (nZCv daif -PAN -UAO) [14004.776676] pc : multi_cpu_stop+0xf8/0x190 [14004.776678] lr : multi_cpu_stop+0xec/0x190 [14004.776679] sp : ffff800013f13d50 [14004.776680] x29: ffff800013f13d50 x28: 0000000000000000 [14004.776682] x27: 0000000000000000 x26: 0000000000000060 [14004.776684] x25: 0000000000000000 x24: 0000000000000000 [14004.776686] x23: ffff800010bd8130 x22: 0000000000000001 [14004.776687] x21: 
ffff800022e73674 x20: ffff800022e73650 [14004.776689] x19: 0000000000000001 x18: 0000000000000000 [14004.776690] x17: 0000000000000001 x16: 0000000000000000 [14004.776692] x15: 0000000000000000 x14: 0000000000000000 [14004.776693] x13: 0000000000000000 x12: 0000000000000000 [14004.776695] x11: 0000000000000000 x10: 0000000000001a50 [14004.776696] x9 : ffff800013f13d30 x8 : ffff00903c8bf630 [14004.776698] x7 : 0000000000000030 x6 : 0000000000000000 [14004.776699] x5 : 0000000000000000 x4 : 0000000000001ac0 [14004.776701] x3 : ffff009faba08b40 x2 : ffff00903c8bdb80 [14004.776703] x1 : ffff800022e73674 x0 : ffff8000101f4e8c [14004.776706] Call trace: [14004.776709] multi_cpu_stop+0xf8/0x190 [14004.776711] cpu_stopper_thread+0xd8/0x170 [14004.776715] smpboot_thread_fn+0x184/0x1b8 [14004.776718] kthread+0x130/0x138 [14004.776722] ret_from_fork+0x10/0x18 [14008.649039] watchdog: BUG: soft lockup - CPU#13 stuck for 22s! [qemu-system-aar:29381] [14008.656947] Modules linked in: af_packet nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nft_reject nft_ct tun nfnetlink_cttimeout iscsi_ibft iscsi_boot_sysfs rfkill nft_chain_nat openvswitch nsh nf_conncount nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables ipmi_ssif marvell ext4 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce joydev ses gf128mul enclosure sha1_ce ipmi_si efi_pstore(N) sbsa_gwdt ipmi_devintf ipmi_msghandler button hns_dsaf hns_enet_drv hns_mdio hnae fuse btrfs libcrc32c xor xor_neon zlib_deflate raid6_pq sd_mod t10_pi hid_generic usbhid hibmc_drm drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec [14008.657014] hisi_sas_v2_hw ehci_platform 
rc_core hisi_sas_main drm_ttm_helper libsas ehci_hcd ttm scsi_transport_sas drm usbcore libata sha2_ce sha256_arm64 i2c_designware_platform i2c_designware_core overlay sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod efivarfs [14008.657034] Supported: No, Unsupported modules are loaded [14008.657038] CPU: 13 PID: 29381 Comm: qemu-system-aar Tainted: G W L N 5.3.18-150300.59.49-default #1 SLE15-SP3 [14008.657040] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [14008.657042] pstate: 80000005 (Nzcv daif -PAN -UAO) [14008.657051] pc : queued_spin_lock_slowpath+0x208/0x2d8 [14008.657054] lr : kvm_handle_guest_abort+0xaac/0xe88 [14008.657055] sp : ffff80001daeb9f0 [14008.657056] x29: ffff80001daeb9f0 x28: 0000000009e40000 [14008.657058] x27: 0000000000000000 x26: ffff00952c712370 [14008.657060] x25: 0000ffff014e4000 x24: ffff009552780000 [14008.657062] x23: 0000000000000000 x22: 00000000000c0000 [14008.657064] x21: 00000000c14e4000 x20: 00000000002c0001 [14008.657066] x19: ffff009552780000 x18: 0000000000000001 [14008.657067] x17: 00000000002c0001 x16: 00000000002c0001 [14008.657069] x15: 0000000000000000 x14: ffffffffffffffe8 [14008.657070] x13: 0000ffff00000000 x12: 000000037f52c2da [14008.657072] x11: 00002b8130000000 x10: 0000ffff40000000 [14008.657075] x9 : 0000000000000018 x8 : ffff009552780002 [14008.657077] x7 : ffff80001134b740 x6 : 0000000000000000 [14008.657078] x5 : ffff001fbb9b6740 x4 : 0000000000380000 [14008.657080] x3 : ffff001fbb9b6740 x2 : 0000000000000000 [14008.657082] x1 : 0000000000000000 x0 : ffff001fbb9b6748 [14008.657085] Call trace: [14008.657089] queued_spin_lock_slowpath+0x208/0x2d8 [14008.657091] kvm_handle_guest_abort+0xaac/0xe88 [14008.657094] handle_exit+0x14c/0x1c8 [14008.657097] kvm_arch_vcpu_ioctl_run+0x29c/0x8a8 [14008.657100] kvm_vcpu_ioctl+0x490/0x8b0 [14008.657104] ksys_ioctl+0xb4/0xd0 [14008.657106] __arm64_sys_ioctl+0x28/0x38 [14008.657110] el0_svc_common.constprop.0+0x84/0x218 
[14008.657113] el0_svc_handler+0x34/0x90 [14008.657116] el0_svc+0x10/0x14 [14008.749042] watchdog: BUG: soft lockup - CPU#42 stuck for 22s! [migration/42:222] [14008.756520] Modules linked in: af_packet nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nft_reject nft_ct tun nfnetlink_cttimeout iscsi_ibft iscsi_boot_sysfs rfkill nft_chain_nat openvswitch nsh nf_conncount nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables ipmi_ssif marvell ext4 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce joydev ses gf128mul enclosure sha1_ce ipmi_si efi_pstore(N) sbsa_gwdt ipmi_devintf ipmi_msghandler button hns_dsaf hns_enet_drv hns_mdio hnae fuse btrfs libcrc32c xor xor_neon zlib_deflate raid6_pq sd_mod t10_pi hid_generic usbhid hibmc_drm drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec [14008.756601] hisi_sas_v2_hw ehci_platform rc_core hisi_sas_main drm_ttm_helper libsas ehci_hcd ttm scsi_transport_sas drm usbcore libata sha2_ce sha256_arm64 i2c_designware_platform i2c_designware_core overlay sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod efivarfs [14008.756622] Supported: No, Unsupported modules are loaded [14008.756629] CPU: 42 PID: 222 Comm: migration/42 Tainted: G W L N 5.3.18-150300.59.49-default #1 SLE15-SP3 [14008.756631] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [14008.756634] pstate: 60000005 (nZCv daif -PAN -UAO) [14008.756650] pc : stop_machine_yield+0x14/0x20 [14008.756652] lr : multi_cpu_stop+0xec/0x190 [14008.756653] sp : ffff800013cabd40 [14008.756655] x29: ffff800013cabd40 x28: 0000000000000000 [14008.756657] x27: 0000000000000000 x26: 
0000000000000060 [14008.756660] x25: 0000000000000000 x24: 0000000000000000 [14008.756661] x23: ffff800010bd7d70 x22: 0000000000000001 [14008.756664] x21: ffff8000359e37b4 x20: ffff8000359e3790 [14008.756665] x19: 0000000000000001 x18: 0000000000000000 [14008.756667] x17: 0000000000000001 x16: 0000000000000000 [14008.756669] x15: 0000000000000000 x14: 00000000000026b3 [14008.756671] x13: ffff001fbbbf5b80 x12: 0000000000000020 [14008.756673] x11: ffff0016e3034480 x10: 0000000000001a50 [14008.756675] x9 : ffff800013cabd30 x8 : ffff00903cfff630 [14008.756677] x7 : 0000000000000001 x6 : 000000000000002a [14008.756679] x5 : 0000000000000000 x4 : 0000000000001ac0 [14008.756680] x3 : ffff009fab948b40 x2 : ffff00903cffdb80 [14008.756683] x1 : ffff8000359e37b4 x0 : ffff8000101f4e8c [14008.756685] Call trace: [14008.756688] stop_machine_yield+0x14/0x20 [14008.756690] multi_cpu_stop+0xec/0x190 [14008.756693] cpu_stopper_thread+0xd8/0x170 [14008.756697] smpboot_thread_fn+0x184/0x1b8 [14008.756701] kthread+0x130/0x138 [14008.756705] ret_from_fork+0x10/0x18 [14008.789035] watchdog: BUG: soft lockup - CPU#54 stuck for 26s! 
[migration/54:282] [14008.796506] Modules linked in: af_packet nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nft_reject nft_ct tun nfnetlink_cttimeout iscsi_ibft iscsi_boot_sysfs rfkill nft_chain_nat openvswitch nsh nf_conncount nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables ipmi_ssif marvell ext4 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce joydev ses gf128mul enclosure sha1_ce ipmi_si efi_pstore(N) sbsa_gwdt ipmi_devintf ipmi_msghandler button hns_dsaf hns_enet_drv hns_mdio hnae fuse btrfs libcrc32c xor xor_neon zlib_deflate raid6_pq sd_mod t10_pi hid_generic usbhid hibmc_drm drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec [14008.796556] hisi_sas_v2_hw ehci_platform rc_core hisi_sas_main drm_ttm_helper libsas ehci_hcd ttm scsi_transport_sas drm usbcore libata sha2_ce sha256_arm64 i2c_designware_platform i2c_designware_core overlay sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod efivarfs [14008.796571] Supported: No, Unsupported modules are loaded [14008.796574] CPU: 54 PID: 282 Comm: migration/54 Tainted: G W L N 5.3.18-150300.59.49-default #1 SLE15-SP3 [14008.796575] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [14008.796577] pstate: 60000005 (nZCv daif -PAN -UAO) [14008.796583] pc : multi_cpu_stop+0xec/0x190 [14008.796585] lr : multi_cpu_stop+0xec/0x190 [14008.796586] sp : ffff800014133d50 [14008.796587] x29: ffff800014133d50 x28: 0000000000000000 [14008.796589] x27: 0000000000000000 x26: 0000000000000060 [14008.796591] x25: 0000000000000000 x24: 0000000000000000 [14008.796592] x23: ffff800010bd8010 x22: 0000000000000001 [14008.796594] x21: 
ffff800035283a84 x20: ffff800035283a60 [14008.796596] x19: 0000000000000001 x18: 0000000000000000 [14008.796597] x17: 0000000000000001 x16: 0000000000000000 [14008.796598] x15: 0000000000000000 x14: 0000000000000000 [14008.796600] x13: 0000000000000000 x12: 0000000000000000 [14008.796601] x11: 0000000000000000 x10: 0000000000001a50 [14008.796603] x9 : ffff800014133d30 x8 : ffff00903c9bb930 [14008.796605] x7 : 0000000000000001 x6 : 0000000000000036 [14008.796607] x5 : 0000000000000000 x4 : 0000000000001ac0 [14008.796608] x3 : ffff009fabac8b40 x2 : ffff00903c9b9e80 [14008.796610] x1 : ffff800035283a84 x0 : ffff8000101f4e8c [14008.796612] Call trace: [14008.796614] multi_cpu_stop+0xec/0x190 [14008.796616] cpu_stopper_thread+0xd8/0x170 [14008.796618] smpboot_thread_fn+0x184/0x1b8 [14008.796621] kthread+0x130/0x138 [14008.796623] ret_from_fork+0x10/0x18 [14012.619015] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [qemu-system-aar:32507] [14012.626836] Modules linked in: af_packet nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nft_reject nft_ct tun nfnetlink_cttimeout iscsi_ibft iscsi_boot_sysfs rfkill nft_chain_nat openvswitch nsh nf_conncount nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables ipmi_ssif marvell ext4 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce joydev ses gf128mul enclosure sha1_ce ipmi_si efi_pstore(N) sbsa_gwdt ipmi_devintf ipmi_msghandler button hns_dsaf hns_enet_drv hns_mdio hnae fuse btrfs libcrc32c xor xor_neon zlib_deflate raid6_pq sd_mod t10_pi hid_generic usbhid hibmc_drm drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec [14012.626900] hisi_sas_v2_hw ehci_platform rc_core 
hisi_sas_main drm_ttm_helper libsas ehci_hcd ttm scsi_transport_sas drm usbcore libata sha2_ce sha256_arm64 i2c_designware_platform i2c_designware_core overlay sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod efivarfs [14012.626919] Supported: No, Unsupported modules are loaded [14012.626924] CPU: 3 PID: 32507 Comm: qemu-system-aar Tainted: G W L N 5.3.18-150300.59.49-default #1 SLE15-SP3 [14012.626925] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [14012.626927] pstate: 80000005 (Nzcv daif -PAN -UAO) [14012.626935] pc : queued_spin_lock_slowpath+0x208/0x2d8 [14012.626939] lr : kvm_mmu_notifier_invalidate_range_start+0xdc/0xe0 [14012.626940] sp : ffff800024ab3bd0 [14012.626941] x29: ffff800024ab3bd0 x28: ffff800010bcaba0 [14012.626943] x27: ffff800010fa5000 x26: ffff800010fc9a28 [14012.626945] x25: ffff800010fa5b50 x24: 0000000000000000 [14012.626947] x23: ffff800024ab3ce8 x22: ffff009552781328 [14012.626948] x21: 0000000000000000 x20: 0000000000380001 [14012.626950] x19: ffff009552780000 x18: 0000000000000000 [14012.626952] x17: 0000000000380001 x16: 0000000000380001 [14012.626953] x15: 0000000000000000 x14: 0000000000000000 [14012.626955] x13: 0000000000000000 x12: 0000000000000000 [14012.626956] x11: 0000000000000000 x10: 0000fffe6bf10000 [14012.626958] x9 : 0000000000000000 x8 : ffff009552780002 [14012.626960] x7 : ffff80001134b740 x6 : 0000000000000000 [14012.626962] x5 : ffff001fbb876740 x4 : 0000000000100000 [14012.626964] x3 : ffff001fbb876740 x2 : 0000000000000000 [14012.626965] x1 : 0000000000000000 x0 : ffff001fbb876748 [14012.626968] Call trace: [14012.626971] queued_spin_lock_slowpath+0x208/0x2d8 [14012.626973] kvm_mmu_notifier_invalidate_range_start+0xdc/0xe0 [14012.626977] __mmu_notifier_invalidate_range_start+0x9c/0x208 [14012.626980] zap_page_range+0x154/0x170 [14012.626983] __arm64_sys_madvise+0x56c/0x9e0 [14012.626986] el0_svc_common.constprop.0+0x84/0x218 [14012.626988] el0_svc_handler+0x34/0x90 
[14012.626990] el0_svc+0x10/0x14 [14012.729016] watchdog: BUG: soft lockup - CPU#38 stuck for 26s! [migration/38:202] [14012.736493] Modules linked in: af_packet nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nft_reject nft_ct tun nfnetlink_cttimeout iscsi_ibft iscsi_boot_sysfs rfkill nft_chain_nat openvswitch nsh nf_conncount nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables ipmi_ssif marvell ext4 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce joydev ses gf128mul enclosure sha1_ce ipmi_si efi_pstore(N) sbsa_gwdt ipmi_devintf ipmi_msghandler button hns_dsaf hns_enet_drv hns_mdio hnae fuse btrfs libcrc32c xor xor_neon zlib_deflate raid6_pq sd_mod t10_pi hid_generic usbhid hibmc_drm drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec [14012.736572] hisi_sas_v2_hw ehci_platform rc_core hisi_sas_main drm_ttm_helper libsas ehci_hcd ttm scsi_transport_sas drm usbcore libata sha2_ce sha256_arm64 i2c_designware_platform i2c_designware_core overlay sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod efivarfs [14012.736593] Supported: No, Unsupported modules are loaded [14012.736599] CPU: 38 PID: 202 Comm: migration/38 Tainted: G W L N 5.3.18-150300.59.49-default #1 SLE15-SP3 [14012.736601] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [14012.736603] pstate: 60000005 (nZCv daif -PAN -UAO) [14012.736619] pc : stop_machine_yield+0x0/0x20 [14012.736621] lr : multi_cpu_stop+0xec/0x190 [14012.736623] sp : ffff800013b23d50 [14012.736624] x29: ffff800013b23d50 x28: 0000000000000000 [14012.736626] x27: 0000000000000000 x26: 0000000000000060 [14012.736628] x25: 0000000000000000 
x24: 0000000000000000 [14012.736630] x23: ffff800010bd86d0 x22: 0000000000000001 [14012.736631] x21: ffff800032ec3a84 x20: ffff800032ec3a60 [14012.736633] x19: 0000000000000001 x18: 0000000000000000 [14012.736634] x17: 0000000000000001 x16: 0000000000000000 [14012.736637] x15: 0000aaaaed676290 x14: 0000aaaaed676290 [14012.736638] x13: 0000aaaaef858b30 x12: 0000aaaaed676020 [14012.736640] x11: 0000000000000000 x10: 0000000000001a50 [14012.736642] x9 : ffff800013b23d30 x8 : ffff00903cf5b930 [14012.736644] x7 : 0000000000000026 x6 : 00000000000000be [14012.736646] x5 : 0000000000000000 x4 : 0000000000001ac0 [14012.736648] x3 : ffff009fab8c8b40 x2 : ffff00903cf59e80 [14012.736649] x1 : ffff800032ec3a84 x0 : ffff800010bd86d0 [14012.736652] Call trace: [14012.736656] stop_machine_yield+0x0/0x20 [14012.736659] cpu_stopper_thread+0xd8/0x170 [14012.736663] smpboot_thread_fn+0x184/0x1b8 [14012.736666] kthread+0x130/0x138 [14012.736671] ret_from_fork+0x10/0x18
Related issues
History
#1
Updated by ggardet_arm over 1 year ago
- Related to action #54914: [Leap-15.4][qe-core][functional] Mistyping in some modules: vlc, x_vt (formerly xorg_vt), etc. added
#3
Updated by okurz over 1 year ago
- Subject changed from openqa-aarch64 - kernel traces in dmesg to openqa-aarch64 - kernel traces in dmesg "watchdog: BUG: soft lockup - CPU#.* stuck for .*s! [qemu-system-aar:.*]"
- Priority changed from Normal to Low
- Target version set to future
Yes, could be. But we should keep in mind that we have had such messages for a long time. journalctl | grep 'BUG: soft lockup' shows, for example, an entry from Sep 30 18:39:26, and journalctl | grep 'BUG: soft lockup' | wc -l yields 312.
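To see which tasks the lockups are attributed to, rather than only the total, the same grep can be extended. This is a minimal sketch, assuming the journal lines have the same format as the traces in the description; the function name summarize_soft_lockups is made up for illustration.

```shell
# Hypothetical helper: reads kernel log text on stdin and prints
# "<task> <count>" per task name, most frequent first.
# A trailing "/<cpu>" (as in "migration/48") is stripped so that
# per-CPU kernel threads are counted together.
summarize_soft_lockups() {
    grep -oE 'BUG: soft lockup - CPU#[0-9]+ stuck for [0-9]+s! \[[^]]+:[0-9]+\]' \
        | sed -E 's/.*\[(.+):[0-9]+\]$/\1/; s|/[0-9]+$||' \
        | sort | uniq -c | sort -k1,1nr -k2 \
        | awk '{ print $2, $1 }'
}

# Usage on a live machine (not executed here):
#   journalctl -k | summarize_soft_lockups
```

If nearly all entries name qemu-system-aar or migration threads, as in the description, that would point at the VM load rather than at an unrelated kernel subsystem.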
ggardet_arm, I wonder, can you check whether you see the same pattern on other machines as well?
I suggest doing some web research on what can be done about such problems. The "soft lockup" itself is not a bug that can be fixed but rather an alert coming from the kernel about a deeper problem. Also, you yourself reported the same problem, e.g. in https://bugzilla.suse.com/show_bug.cgi?id=1177624
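For context on when this alert fires: the kernel reports a soft lockup once a CPU has not let the watchdog run for more than twice kernel.watchdog_thresh seconds (default 10, so 20 s), which is consistent with the "stuck for 22s" and "stuck for 26s" messages above. A small sketch to read the effective threshold; the function name and its optional file argument are illustrative assumptions, not part of any tool.

```shell
# Print the soft-lockup threshold in seconds: twice kernel.watchdog_thresh.
# The optional argument (a file holding the value) exists only so the
# function can be tried without reading /proc; it is an illustration aid.
soft_lockup_threshold() {
    thresh_file="${1:-/proc/sys/kernel/watchdog_thresh}"
    thresh="$(cat "$thresh_file")" || return 1
    echo $(( 2 * thresh ))
}

# Usage on a live machine (not executed here):
#   soft_lockup_threshold
```

Raising the threshold would only make the messages rarer; it would not address whatever keeps the CPUs busy for that long.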
#4
Updated by ggardet_arm about 1 year ago
okurz wrote:
Yes, could be. But we should keep in mind that we have had such messages for a long time. journalctl | grep 'BUG: soft lockup' shows, for example, an entry from Sep 30 18:39:26, and journalctl | grep 'BUG: soft lockup' | wc -l yields 312.
ggardet_arm, I wonder, can you check whether you see the same pattern on other machines as well?
I cannot see that on other aarch64 workers.
I suggest doing some web research on what can be done about such problems. The "soft lockup" itself is not a bug that can be fixed but rather an alert coming from the kernel about a deeper problem. Also, you yourself reported the same problem, e.g. in https://bugzilla.suse.com/show_bug.cgi?id=1177624
I am trying to reduce the number of workers on openqa-aarch64 from 16 to 14 to check if there is any improvement.
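For reference, a dry-run sketch of what such a reduction could look like, assuming the workers run as systemd instances named openqa-worker@<N> and numbered from 1. That naming, and the helper workers_to_stop, are assumptions for illustration; on this host the worker count may instead be managed through configuration management.

```shell
# Print the worker instance units beyond the new count, assuming instances
# are numbered 1..N as openqa-worker@<N> (an assumption about this host).
workers_to_stop() {
    old="$1"
    new="$2"
    i=$(( new + 1 ))
    while [ "$i" -le "$old" ]; do
        echo "openqa-worker@$i"
        i=$(( i + 1 ))
    done
}

# Dry run: only prints the commands, nothing is stopped.
for unit in $(workers_to_stop 16 14); do
    echo "systemctl disable --now $unit"
done
```

Comparing the soft-lockup count in the journal before and after the change would then show whether the lower load helps.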