action #179032

open

On machine netboot.qe.prg2.suse.org the "srv-tftpboot-mnt-openqa.mount" unit can randomly fail

Added by nicksinger 2 months ago. Updated about 1 month ago.

Status: New
Priority: Low
Assignee: -
Category: Regressions/Crashes
Start date: 2025-03-17
Due date:
% Done: 0%
Estimated time:

Description

Observation

After the VMs on qamaster were recovered, we received an alert about failing services on netboot: https://monitor.qa.suse.de/d/KToPYLEWz/failed-systemd-services?viewPanel=panel-6&orgId=1&from=2025-03-17T10:31:21.141Z&to=2025-03-17T12:46:04.738Z&timezone=UTC
The alert was about srv-tftpboot-mnt-openqa.mount, which had been in a failed state since 2025-03-16 03:33 UTC:

netboot:~ # systemctl --failed
  UNIT                          LOAD   ACTIVE SUB    DESCRIPTION
● srv-tftpboot-mnt-openqa.mount loaded failed failed /srv/tftpboot/mnt/openqa

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
1 loaded units listed.
netboot:~ # systemctl status srv-tftpboot-mnt-openqa.mount
× srv-tftpboot-mnt-openqa.mount - /srv/tftpboot/mnt/openqa
     Loaded: loaded (/etc/fstab; generated)
     Active: failed (Result: timeout) since Sun 2025-03-16 03:33:26 UTC; 1 day 8h ago
      Where: /srv/tftpboot/mnt/openqa
       What: openqa.suse.de:/factory
       Docs: man:fstab(5)
             man:systemd-fstab-generator(8)
        CPU: 16ms

Mar 16 03:31:56 netboot systemd[1]: Mounting /srv/tftpboot/mnt/openqa...
Mar 16 03:33:26 netboot systemd[1]: srv-tftpboot-mnt-openqa.mount: Mounting timed out. Terminating.
Mar 16 03:33:26 netboot systemd[1]: srv-tftpboot-mnt-openqa.mount: Mount process exited, code=killed, status=15/TERM
Mar 16 03:33:26 netboot systemd[1]: srv-tftpboot-mnt-openqa.mount: Failed with result 'timeout'.
Mar 16 03:33:26 netboot systemd[1]: Failed to mount /srv/tftpboot/mnt/openqa.
netboot:~ # uptime
 12:31:27  up 1 day  8:59,  1 user,  load average: 0.00, 0.00, 0.00
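
The journal timestamps above show how long systemd waited before it killed the mount process. As a small worked check, the sketch below computes the gap between the "Mounting" and "Mounting timed out" lines. `elapsed_seconds` is a hypothetical helper name, and it assumes both timestamps fall on the same day. The result of 90 seconds matches systemd's default start timeout (`DefaultTimeoutStartSec=90s`), assuming netboot does not override it, which would suggest the NFS server was simply not reachable within the default window.

```shell
# elapsed_seconds START END
# Print the number of seconds between two HH:MM:SS timestamps.
# Assumption: both timestamps belong to the same calendar day.
elapsed_seconds() {
    printf '%s %s\n' "$1" "$2" | awk '{
        split($1, a, ":")
        split($2, b, ":")
        print (b[1] * 3600 + b[2] * 60 + b[3]) - (a[1] * 3600 + a[2] * 60 + a[3])
    }'
}

# Timestamps taken from the journal excerpt in this ticket.
elapsed_seconds 03:31:56 03:33:26   # prints 90
```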

Acceptance criteria

  • AC1: netboot.qe.prg2.suse.org can reboot without any services failing afterwards
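
One possible way to check AC1 after a reboot is sketched below. It is not an existing tool: `list_failed_units` and `no_failed_units` are hypothetical helper names. Both read `systemctl --failed --plain --no-legend` style output on stdin, where the first column is the unit name, so they can also be run against saved output.

```shell
# list_failed_units
# Print the unit name (first column) of every non-empty input line.
# Expects output in the format of: systemctl --failed --plain --no-legend
list_failed_units() {
    awk 'NF > 0 { print $1 }'
}

# no_failed_units
# Return 0 if stdin lists no failed units; otherwise print them to stderr
# and return 1.
no_failed_units() {
    failed=$(list_failed_units)
    if [ -n "$failed" ]; then
        printf 'failed units:\n%s\n' "$failed" >&2
        return 1
    fi
    return 0
}

# Usage on the host after a reboot:
#   systemctl --failed --plain --no-legend | no_failed_units && echo "AC1 ok"
```

Reading from stdin rather than calling `systemctl` inside the functions keeps the check usable on hosts without systemd, for example when inspecting output collected by monitoring.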

Suggestions


Related issues 1 (0 open, 1 closed)

Related to openQA Infrastructure (public) - action #178972: [s390x][s390zl13][tools] nfs mount to openqa.suse.de is missing size:S (Resolved, mkittler, 2025-03-17)

Actions #1

Updated by nicksinger 2 months ago

  • Tags set to infra, salt, reactive work
  • Description updated (diff)
  • Category set to Regressions/Crashes
Actions #2

Updated by okurz 2 months ago

  • Target version set to Tools - Next
  • Parent task set to #162350
Actions #3

Updated by nicksinger 2 months ago

  • Related to action #178972: [s390x][s390zl13][tools] nfs mount to openqa.suse.de is missing size:S added