action #128822
closed
processes on qanet slow to execute despite low load, e.g. htop - do we have outdated addresses pointing to wotan where we should use different hosts?
Added by okurz over 1 year ago.
Updated over 1 year ago.
Description
Observation
nicksinger and I logged into qanet and ran "htop" after okurz observed a slow ssh login when trying to work on #128654. The system time is also off by 5 minutes, but first things first. Load, CPU usage, memory usage and I/O were all low. We then found that the slowness is related to NFS and identified ypbind as the culprit: after stopping the ypbind service the system was snappy again. However, "grep -R 10.160. /etc/" revealed some mentions of 10.160.0.1. Do we need to use a newer address in more places?
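The search above can be made a little stricter. In a regular expression the dots in 10.160. match any character, so a fixed-string search avoids false hits. The following is only a sketch, assuming a grep that supports -R (GNU or BSD); the function name find_legacy_addr is hypothetical and not part of the ticket.

```shell
#!/bin/sh
# Hypothetical helper: list files below a directory that mention an address prefix.
#   -R  recurse into the directory
#   -F  treat the pattern as a literal string, so dots are not wildcards
#   -l  print only the names of matching files
#   -s  stay quiet about unreadable or missing files
find_legacy_addr() {
    prefix=$1
    dir=$2
    grep -RFls -- "$prefix" "$dir" | sort
}
```

Called as "find_legacy_addr 10.160. /etc/" it would print one file name per line, which is easier to review than the full list of matching lines.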
- Status changed from New to In Progress
- Assignee set to okurz
- Due date set to 2023-05-19
- Status changed from In Progress to Feedback
Reviewed our salt-controlled machines to check whether any of them have an outdated ypserver entry:
okurz@openqa:~> sudo salt --no-color \* cmd.run 'grep 10.160 /etc/yp.conf'
worker9.oqa.suse.de:
domain suse.de server 10.160.0.1
worker2.oqa.suse.de:
domain suse.de server 10.160.0.1
powerqaworker-qam-1.qa.suse.de:
domain suse.de server 10.160.0.1
malbec.arch.suse.de:
worker5.oqa.suse.de:
domain suse.de server 10.160.0.1
openqaw5-xen.qa.suse.de:
domain suse.de server 10.160.0.1
QA-Power8-4-kvm.qa.suse.de:
domain suse.de server 10.160.0.1
storage.oqa.suse.de:
domain suse.de server 10.160.0.1
baremetal-support:
domain suse.de server 10.160.0.1
openqaworker1.qe.nue2.suse.org:
domain suse.de server 10.160.0.1
qamasternue.qa.suse.de:
worker11.oqa.suse.de:
domain suse.de server 10.160.0.1
worker6.oqa.suse.de:
domain suse.de server 10.160.0.1
worker10.oqa.suse.de:
domain suse.de server 10.160.0.1
openqaworker18.qa.suse.cz:
openqaworker16.qa.suse.cz:
worker13.oqa.suse.de:
domain suse.de server 10.160.0.1
QA-Power8-5-kvm.qa.suse.de:
domain suse.de server 10.160.0.1
worker12.oqa.suse.de:
domain suse.de server 10.160.0.1
worker3.oqa.suse.de:
domain suse.de server 10.160.0.1
grenache-1.qa.suse.de:
domain suse.de server 10.160.0.1
worker8.oqa.suse.de:
domain suse.de server 10.160.0.1
openqaworker14.qa.suse.cz:
openqa-monitor.qa.suse.de:
domain suse.de server 10.160.0.1
openqaworker17.qa.suse.cz:
backup.qa.suse.de:
domain suse.de server 10.160.0.150
domain suse.de server 10.160.0.1
schort-server:
domain suse.de server 10.160.0.1
tumblesle:
domain suse.de server 10.160.0.1
jenkins.qa.suse.de:
domain suse.de server 10.160.0.150
domain suse.de server 10.160.0.1
openqa-piworker.qa.suse.de:
domain suse.de server 10.160.0.1
openqaworker-arm-2.suse.de:
domain suse.de server 10.160.0.1
openqaworker-arm-1.suse.de:
domain suse.de server 10.160.0.1
openqa.suse.de:
openqaworker-arm-3.suse.de:
domain suse.de server 10.160.0.1
ERROR: Minions returned with non-zero exit code
okurz@openqa:~> sudo salt --no-color \* cmd.run 'grep ypserver /etc/yp.conf'
storage.oqa.suse.de:
worker3.oqa.suse.de:
openqaworker17.qa.suse.cz:
openqaworker18.qa.suse.cz:
openqaworker1.qe.nue2.suse.org:
openqaworker16.qa.suse.cz:
worker5.oqa.suse.de:
worker2.oqa.suse.de:
openqaw5-xen.qa.suse.de:
openqaworker14.qa.suse.cz:
worker6.oqa.suse.de:
powerqaworker-qam-1.qa.suse.de:
QA-Power8-5-kvm.qa.suse.de:
qamasternue.qa.suse.de:
jenkins.qa.suse.de:
baremetal-support:
openqa-monitor.qa.suse.de:
QA-Power8-4-kvm.qa.suse.de:
tumblesle:
schort-server:
worker11.oqa.suse.de:
malbec.arch.suse.de:
worker12.oqa.suse.de:
worker10.oqa.suse.de:
backup.qa.suse.de:
worker13.oqa.suse.de:
openqa.suse.de:
grenache-1.qa.suse.de:
openqa-piworker.qa.suse.de:
openqaworker-arm-1.suse.de:
worker9.oqa.suse.de:
openqaworker-arm-2.suse.de:
worker8.oqa.suse.de:
openqaworker-arm-3.suse.de:
ERROR: Minions returned with non-zero exit code
Looks good, so nothing to change there. In the dhcpd config we found some IPv4 addresses specified and compared them to what we have on walter1.qe.nue2.suse.org, where the setting is "option nis-servers wotan.suse.de,amor.suse.de;". We used "option nis-servers wotan.suse.de,amor.suse.de,midgard2.suse.de;" and committed, pushed, etc.
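To compare the nis-servers setting between two dhcpd configs, as done above with walter1.qe.nue2.suse.org, it can help to normalize the option into one server per line first. This is a minimal sketch, assuming the option is written on a single line as quoted in this comment; the function name nis_servers is hypothetical and the path of the config file is left to the caller.

```shell
#!/bin/sh
# Hypothetical helper: print the NIS servers named in a dhcpd config, one per line.
nis_servers() {
    # Keep only lines of the form "option nis-servers a,b,c;", drop the keyword
    # and the trailing semicolon, split on commas, trim blanks, sort uniquely.
    sed -n 's/^[[:space:]]*option[[:space:]]\{1,\}nis-servers[[:space:]]\{1,\}\(.*\);.*$/\1/p' "$1" |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        sort -u
}
```

Running it against a copy of each config and comparing the two outputs with diff shows which servers are missing on one side, independent of the order in which they are listed.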
- Due date deleted (2023-05-19)
- Status changed from Feedback to Resolved
No problems regarding ypserv reported, qanet is still snappy, we are good.