action #92302

closed

NFS mount var-lib-openqa-share.mount often fails after boot of some workers

Added by okurz almost 3 years ago. Updated almost 3 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
Start date:
Due date: 2021-06-11
% Done: 0%
Estimated time:
Description

Observation

The systemd unit for the NFS mount point (var-lib-openqa-share.mount) recently failed on one of our ARM workers and then recovered on its own. The problem likely occurs when an ARM machine is rebooted multiple times, so that we eventually hit an alert window.

Acceptance criteria

  • AC1: No alert about failed systemd services related to NFS mount failing on ARM workers
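
A minimal sketch of how AC1 could be spot-checked by hand on a worker, assuming the standard `systemctl is-failed` command is available. The function names `unit_failed` and `check_mount` and the `SYSTEMCTL` override variable are hypothetical helpers for illustration and are not part of the openQA or salt setup:

```shell
#!/bin/sh
# Hypothetical helper, not part of the openQA infrastructure repositories.
# SYSTEMCTL may be overridden (e.g. for dry runs); defaults to the real systemctl.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# "systemctl is-failed --quiet UNIT" exits 0 when the unit is in the failed state.
unit_failed() {
    "$SYSTEMCTL" is-failed --quiet "$1"
}

# Print a one-line verdict for the given unit (default: the NFS share mount)
# and return non-zero if the unit is failed.
check_mount() {
    unit="${1:-var-lib-openqa-share.mount}"
    if unit_failed "$unit"; then
        echo "FAILED: $unit"
        return 1
    fi
    echo "OK: $unit"
}
```

After sourcing the snippet on a worker, running `check_mount` shortly after a reboot reports whether the mount unit is currently in the failed state. It does not replace the existing alerting on failed systemd services.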

Suggestions

Rollback

  • DONE: ssh openqaworker-arm-3 "sudo systemctl enable --now salt-minion"
  • DONE: ssh osd "salt-key -y -a openqaworker-arm-3.suse.de && sudo salt 'openqaworker-arm-3*' state.apply"
  • DONE: Unpause alerts for openqaworker-arm-3

Related issues 3 (0 open, 3 closed)

Related to openQA Infrastructure - action #64941: after every reboot openqaworker7 is missing var-lib-openqa-share.mount, check dependencies of service with openqaworker1 | Resolved | okurz | 2020-03-27 | 2021-06-11

Copied from openQA Infrastructure - action #89551: NFS mount fails after boot (reproducible on some OSD workers) | Resolved | mkittler | 2021-03-05 | 2021-03-31

Copied to openQA Infrastructure - action #92969: Failing service os-autoinst-openvswitch after boot of some workers | Resolved | okurz | 2021-05-23
