action #89551

closed

NFS mount fails after boot (reproducible on some OSD workers)

Added by mkittler almost 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Start date: 2021-03-05
Due date: 2021-03-31
% Done: 0%
Estimated time:
Description

problem

  • The NFS mount for /var/lib/openqa/share fails during boot. This has been reproduced on openqaworker13 (see #88900) and openqaworker2 (see #89551#note-5), and possibly on openqaworker-arm-2 (see #75016).
  • Technically, all workers might be affected because the problem is quite generic: the systemd unit for the NFS mount does not wait until the Ethernet connection is established.
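
A quick way to check the ordering on a worker is to look at the `After=` property of the mount unit. The following is only a minimal sketch: the helper name `orders_after_network_online` is hypothetical, and it merely parses the text printed by `systemctl show -p After var-lib-openqa-share.mount`. Note that being ordered after `network-online.target` is only an indicator; whether that target really waits for the link depends on the network setup of the host.

```python
def orders_after_network_online(show_output: str) -> bool:
    """Return True if the `After=` line lists network-online.target.

    `show_output` is assumed to be the text printed by
    `systemctl show -p After var-lib-openqa-share.mount`,
    i.e. a line of the form `After=unit1 unit2 ...`.
    """
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "After":
            # Unit names in the property value are separated by spaces.
            return "network-online.target" in value.split()
    # No `After=` line found at all.
    return False
```

If this returns False for the mount unit on an affected worker, the missing ordering would be consistent with the generic problem described above.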

impact

The mount is retried automatically after a few minutes, so the failure itself should not be a big deal. However, because the systemd unit for the mount stays in the failed state for those few minutes, false alerts are triggered, which should be prevented.

acceptance criteria

  • AC1: No false alerts are triggered if the NFS mount fails for only a few minutes after booting.
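
One possible way to reason about AC1 is a boot grace period in the alert condition. The sketch below is an assumption, not the rule actually used by the OSD alerting setup: the names `should_alert` and `BOOT_GRACE_PERIOD` are hypothetical, and the 10 minute value is only a placeholder for "a few minutes".

```python
from datetime import timedelta

# Hypothetical grace period; the ticket only says "a few minutes".
BOOT_GRACE_PERIOD = timedelta(minutes=10)


def should_alert(unit_failed: bool, uptime: timedelta,
                 grace: timedelta = BOOT_GRACE_PERIOD) -> bool:
    """Alert on a failed unit only once the host has been up longer than `grace`.

    A failure seen shortly after boot is ignored because the mount
    is retried automatically and is expected to recover on its own.
    """
    return unit_failed and uptime >= grace
```

With this rule, a failed mount unit three minutes after boot would not raise an alert, while the same unit still failed after half an hour would.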

notes


Related issues 5 (0 open, 5 closed)

Related to openQA Infrastructure (public) - action #88191: openqaworker2 boot ends in emergency shell (Resolved, mkittler, 2021-01-25)
Related to openQA Infrastructure (public) - action #68053: powerqaworker-qam-1 fails to come up on reboot (repeatedly) (Resolved, okurz, 2020-06-14)
Related to openQA Infrastructure (public) - action #75016: [osd-admins][alert] Failed systemd services alert (workers): os-autoinst-openvswitch.service (and var-lib-openqa-share.mount) on openqaworker-arm-2 and others (Resolved, mkittler, 2020-10-21)
Related to openQA Infrastructure (public) - action #88900: openqaworker13 was unreachable (Resolved, mkittler, 2021-02-22)
Copied to openQA Infrastructure (public) - action #92302: NFS mount var-lib-openqa-share.mount often fails after boot of some workers (Resolved, okurz, 2021-06-11)
