action #174580

open

[FIRING:1] Failed systemd services alert

Added by jbaier_cz about 6 hours ago. Updated 43 minutes ago.

Status: In Progress
Priority: High
Assignee:
Category: Regressions/Crashes
Start date: 2024-11-19
Due date:
% Done: 0%
Estimated time:

Description

Observation

https://monitor.qa.suse.de/d/KToPYLEWz/failed-systemd-services

Last update          Host            Failing units
2024-12-19 05:21:10  openqa          openqa-gru
2024-12-18 11:49:00  openqaworker16  openqa-worker-cacheservice-minion
2024-12-18 12:02:00  openqaworker17  auto-update
2024-12-19 09:20:10  unreal6         var-lib-openqa-share.mount

The failures on openqaworker16 and openqaworker17 might be caused by #157975.


Related issues 1 (1 open, 0 closed)

Copied to openQA Project (public) - action #174601: openqa-gru.service journal filled with openqa-trigger-bisect-jobs stack traces (New)

Actions #1

Updated by jbaier_cz about 6 hours ago

  • Subject changed from [FIRING:1] (Failed systemd services alert (except openqa.suse.de) Salt Uk02cifVkz) to [FIRING:1] Failed systemd services alert
Actions #2

Updated by nicksinger about 2 hours ago

  • Status changed from New to In Progress
  • Assignee set to nicksinger
Actions #3

Updated by nicksinger about 2 hours ago

  • Copied to action #174601: openqa-gru.service journal filled with openqa-trigger-bisect-jobs stack traces added
Actions #4

Updated by nicksinger 43 minutes ago

Last update          Host            Failing units                      Solution
2024-12-19 05:21:10  openqa          openqa-gru                         https://progress.opensuse.org/issues/174601
2024-12-18 11:49:00  openqaworker16  openqa-worker-cacheservice-minion  Worker up for 23h, cache service working for 23h; I assume this was due to #157975
2024-12-18 12:02:00  openqaworker17  auto-update                        Last auto-update logs from 11h ago look fine, uptime is 24h, so I think this is #157975 again

I am not so sure about unreal6: the NFS mount (var-lib-openqa-share.mount) just continues to fail. I enabled debugging with rpcdebug -m nfs -s all and found many occurrences of:

[  674.514205] nfs_create_rpc_client: cannot create RPC client. Error = -111
[  674.514209] NFS4: Couldn't follow remote path
[  674.514210] <-- nfs4_try_get_tree() = -111 [error]
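
For reference, the negative numbers in these kernel messages are Linux errno values. A minimal sketch for decoding them follows. It assumes the standard Linux errno numbering and hardcodes the values so the result does not depend on the platform running the script. `decode_kernel_error` and `LINUX_ERRNO` are hypothetical names introduced here for illustration, not part of any tool used in this ticket.

```python
import re

# Linux errno values that appear in this ticket's kernel log.
# Hardcoded on purpose: Python's errno module follows the local platform,
# which may number these differently on a non-Linux machine.
LINUX_ERRNO = {
    22: ("EINVAL", "Invalid argument"),
    111: ("ECONNREFUSED", "Connection refused"),
    115: ("EINPROGRESS", "Operation now in progress"),
}

# Matches the places where the NFS/RPC debug output prints a status number.
ERROR_RE = re.compile(r"(?:Error = |error=|\(\) = |connect status )(-?\d+)")


def decode_kernel_error(line):
    """Return (number, name, description) for a log line, or None if it has no status."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    number = abs(int(match.group(1)))
    name, text = LINUX_ERRNO.get(number, ("UNKNOWN", "not in LINUX_ERRNO"))
    return number, name, text
```

With this table, the -111 above decodes to ECONNREFUSED: the TCP connection attempt was actively rejected rather than timing out. The -22 (EINVAL) and the connect status 115 (EINPROGRESS, the normal state of a non-blocking connect) show up in the longer trace further down.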

Research suggested that this might be caused by newer kernels defaulting to NFSv4. I tried forcing NFSv3, but that did not work either; it only reduced the log output to basically zero.
Further experimenting showed:

[  674.406756] --> nfs4_try_get_tree()
[  674.411108] RPC:       set up xprt to 2a07:de40:b203:12:0:ff:fe4f:7c2b (port 2049) via tcp
[  674.420352] RPC:       Couldn't create auth handle (flavor 390004)
[  674.427489] nfs_create_rpc_client: cannot create RPC client. Error = -22
[  674.427510] RPC:       set up xprt to 2a07:de40:b203:12:0:ff:fe4f:7c2b (port 2049) via tcp
[  674.443965] RPC:        destroy backchannel transport
[  674.444087] RPC:       xs_connect scheduled xprt 00000000927f8b5e
[  674.449738] RPC:        backchannel list empty= true
[  674.456586] RPC:       xs_bind 0000:0000:0000:0000:0000:0000:0000:0000:739: ok (0)
[  674.456590] RPC:       worker connecting xprt 00000000927f8b5e via tcp to 2a07:de40:b203:12:0:ff:fe4f:7c2b (port 2049)
[  674.462304] RPC:       xs_destroy xprt 00000000b8e7f740
[  674.462306] RPC:       xs_close xprt 00000000b8e7f740
[  674.493785] RPC:       00000000927f8b5e connect status 115 connected 0 sock state 2
[  674.498381] RPC:       xs_tcp_state_change client 00000000927f8b5e...
[  674.509340] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[  674.509343] RPC:       xs_error_report client 00000000927f8b5e, error=111...
[  674.509399] RPC:       xs_close xprt 00000000927f8b5e
[  674.509492] nfs_create_rpc_client: cannot create RPC client. Error = -111
[  674.509498] NFS4: Couldn't follow remote path
[  674.509499] <-- nfs4_try_get_tree() = -111 [error]
[  674.509579] --> nfs4_try_get_tree()
[  674.509586] RPC:       set up xprt to 10.145.10.207 (port 2049) via tcp
[  674.509625] RPC:       Couldn't create auth handle (flavor 390004)
[  674.509638] nfs_create_rpc_client: cannot create RPC client. Error = -22
[  674.509641] RPC:       set up xprt to 10.145.10.207 (port 2049) via tcp
[  674.509692] RPC:       xs_connect scheduled xprt 000000006b8fe44b
[  674.509707] RPC:       xs_bind 0.0.0.0:910: ok (0)
[  674.509711] RPC:       worker connecting xprt 000000006b8fe44b via tcp to 10.145.10.207 (port 2049)
[  674.509734] RPC:        destroy backchannel transport
[  674.509735] RPC:        backchannel list empty= true
[  674.509736] RPC:       xs_destroy xprt 000000001574dd41
[  674.509737] RPC:       xs_close xprt 000000001574dd41
[  674.509743] RPC:       000000006b8fe44b connect status 115 connected 0 sock state 2
[  674.514148] RPC:       xs_tcp_state_change client 000000006b8fe44b...
[  674.514150] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[  674.514153] RPC:       xs_error_report client 000000006b8fe44b, error=111...
[  674.514166] RPC:       xs_close xprt 000000006b8fe44b
[  674.514168] RPC:       xs_tcp_state_change client 000000006b8fe44b...
[  674.514171] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[  674.514205] nfs_create_rpc_client: cannot create RPC client. Error = -111
[  674.514209] NFS4: Couldn't follow remote path
[  674.514210] <-- nfs4_try_get_tree() = -111 [error]
[  674.514230] RPC:        destroy backchannel transport
[  674.514231] RPC:        backchannel list empty= true
[  674.514231] RPC:       xs_destroy xprt 000000006b8fe44b
[  674.514232] RPC:       xs_close xprt 000000006b8fe44b
[  674.717807] RPC:       xs_tcp_state_change client 00000000927f8b5e...
[  674.717808] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[  674.717887] RPC:        destroy backchannel transport
[  674.737780] RPC:        backchannel list empty= true
[  674.737781] RPC:       xs_destroy xprt 00000000927f8b5e
[  674.737782] RPC:       xs_close xprt 00000000927f8b5e
[  681.533874] --> nfs4_try_get_tree()
[  681.538229] RPC:       set up xprt to 2a07:de40:b203:12:0:ff:fe4f:7c2b (port 2049) via tcp
[  681.547455] RPC:       Couldn't create auth handle (flavor 390004)
[  681.554438] nfs_create_rpc_client: cannot create RPC client. Error = -22
[  681.554442] RPC:       set up xprt to 2a07:de40:b203:12:0:ff:fe4f:7c2b (port 2049) via tcp
[  681.570873] RPC:        destroy backchannel transport
[  681.570874] RPC:        backchannel list empty= true
[  681.570875] RPC:       xs_destroy xprt 00000000fc277b31
[  681.570876] RPC:       xs_close xprt 00000000fc277b31
[  681.570925] RPC:       xs_connect scheduled xprt 00000000f1d115d3
[  681.600956] RPC:       xs_bind 0000:0000:0000:0000:0000:0000:0000:0000:719: ok (0)
[  681.600961] RPC:       worker connecting xprt 00000000f1d115d3 via tcp to 2a07:de40:b203:12:0:ff:fe4f:7c2b (port 2049)
[  681.600988] RPC:       00000000f1d115d3 connect status 115 connected 0 sock state 2
[  681.609285] RPC:       xs_tcp_state_change client 00000000f1d115d3...
[  681.609287] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[  681.609291] RPC:       xs_error_report client 00000000f1d115d3, error=111...
[  681.609402] nfs_create_rpc_client: cannot create RPC client. Error = -111
[  681.620816] RPC:       xs_close xprt 00000000f1d115d3
[  681.629138] NFS4: Couldn't follow remote path
[  681.629140] <-- nfs4_try_get_tree() = -111 [error]
[  681.629247] --> nfs4_try_get_tree()
[  681.636307] RPC:       xs_tcp_state_change client 00000000f1d115d3...
[  681.643295] RPC:       set up xprt to 10.145.10.207 (port 2049) via tcp
[  681.651056] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[  681.658622] RPC:       Couldn't create auth handle (flavor 390004)
[  681.664435] RPC:        destroy backchannel transport
[  681.669468] nfs_create_rpc_client: cannot create RPC client. Error = -22
[  681.674951] RPC:        backchannel list empty= true
[  681.674953] RPC:       xs_destroy xprt 00000000f1d115d3
[  681.679179] RPC:       set up xprt to 10.145.10.207 (port 2049) via tcp
[  681.686329] RPC:       xs_close xprt 00000000f1d115d3
[  681.745581] RPC:        destroy backchannel transport
[  681.745600] RPC:       xs_connect scheduled xprt 00000000bbccf499
[  681.751356] RPC:        backchannel list empty= true
[  681.758213] RPC:       xs_bind 0.0.0.0:763: ok (0)
[  681.763867] RPC:       xs_destroy xprt 00000000558eb3a1
[  681.769391] RPC:       worker connecting xprt 00000000bbccf499 via tcp to 10.145.10.207 (port 2049)
[  681.775334] RPC:       xs_close xprt 00000000558eb3a1
[  681.790927] RPC:       00000000bbccf499 connect status 115 connected 0 sock state 2
[  681.799315] RPC:       xs_tcp_state_change client 00000000bbccf499...
[  681.799317] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[  681.799320] RPC:       xs_error_report client 00000000bbccf499, error=111...
[  681.799382] RPC:       xs_close xprt 00000000bbccf499
[  681.799431] nfs_create_rpc_client: cannot create RPC client. Error = -111
[  681.799438] NFS4: Couldn't follow remote path
[  681.799439] <-- nfs4_try_get_tree() = -111 [error]
[  681.845140] RPC:       xs_tcp_state_change client 00000000bbccf499...
[  681.845141] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[  681.845213] RPC:        destroy backchannel transport
[  681.865096] RPC:        backchannel list empty= true
[  681.865097] RPC:       xs_destroy xprt 00000000bbccf499
[  681.865098] RPC:       xs_close xprt 00000000bbccf499
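
Because the trace interleaves several transports, it can help to group the reported errors by target address. The sketch below does this by following the xprt pointer from the "worker connecting" line to the matching "xs_error_report" line. It assumes the log format stays exactly as shown above, and `summarize_transport_errors` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# "worker connecting xprt <ptr> via <proto> to <addr> (port <n>)"
CONNECT_RE = re.compile(r"worker connecting xprt (\w+) via (\w+) to (\S+) \(port (\d+)\)")
# "xs_error_report client <ptr>, error=<n>..."
REPORT_RE = re.compile(r"xs_error_report client (\w+), error=(\d+)")


def summarize_transport_errors(lines):
    """Map (address, port, protocol) to the sorted list of error numbers reported for it."""
    targets = {}               # xprt pointer -> (address, port, protocol)
    errors = defaultdict(set)  # target -> set of error numbers
    for line in lines:
        connect = CONNECT_RE.search(line)
        if connect:
            xprt, proto, addr, port = connect.groups()
            targets[xprt] = (addr, int(port), proto)
            continue
        report = REPORT_RE.search(line)
        if report and report.group(1) in targets:
            errors[targets[report.group(1)]].add(int(report.group(2)))
    return {target: sorted(codes) for target, codes in errors.items()}
```

Fed with the trace above (for example, one line per list element read from a saved dmesg output), this should report error 111 for both 2a07:de40:b203:12:0:ff:fe4f:7c2b and 10.145.10.207 on port 2049, which makes it easy to see that IPv6 and IPv4 fail the same way.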

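Both the IPv6 and the IPv4 connection attempts to port 2049 end with error=111 (connection refused), so this may be related to some kind of firewall. I have to check further.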
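
One way to narrow down the firewall suspicion is to try a plain TCP connection to the NFS port from unreal6 and look at how it fails. The following is a minimal sketch, not a definitive diagnosis: `probe_tcp` and `classify_connect_result` are hypothetical helper names, and the 3 second timeout is an arbitrary assumption.

```python
import socket


def classify_connect_result(exc):
    """Turn the outcome of a TCP connect attempt into a short label."""
    if exc is None:
        return "open"
    if isinstance(exc, ConnectionRefusedError):
        # The peer (or a rejecting firewall) answered with a TCP reset.
        return "refused"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        # No answer at all, which is typical for silently dropped packets.
        return "timeout"
    return "error: %s" % exc


def probe_tcp(host, port=2049, timeout=3.0):
    """Attempt a TCP connection to host:port and classify what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return classify_connect_result(None)
    except OSError as exc:
        return classify_connect_result(exc)
```

For example, calling probe_tcp("10.145.10.207") and probe_tcp("2a07:de40:b203:12:0:ff:fe4f:7c2b") on unreal6 would be expected to return "refused" if the situation matches the kernel log. A "refused" result means either that nothing is listening on port 2049 on the server or that a firewall rejects the connection with a reset, so the same probe from a host that mounts the share successfully would help tell the two apart.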
