action #160598
Updated by livdywan 7 months ago
## Observation

Could be related to #158170? Did we allow too many instances?

```
Summary
System Load too high for a longer time, see https://progress.opensuse.org/issues/150983

Description
System Load is considered too high for a longer time. Machine possibly overloaded. Especially when there are too many openQA worker instances configured openQA tests would become flaky and showing lost characters or repeated characters in VNC typing. Take a look which processes make the machine busy and look for corresponding openQA tests failing due to this situation and handle accordingly, e.g. retrigger the openQA tests after mitigating the root cause. See https://progress.opensuse.org/issues/150983 for details.

Values
B=79.57516129032257 C=1

Labels
alertname s390zl12: CPU load alert
grafana_folder openQA
host s390zl12
hostname s390zl12
origin salt
rule_uid cpu_load_alert_s390zl12
type worker
```

https://stats.openqa-monitor.qa.suse.de/d/WDs390zl12/worker-dashboard-s390zl12?orgId=1&from=1715990201255&to=1716033674370

The issue is resolved at this moment, so no rollback steps are needed and normal priority is fine for now.

## Suggestions

* Consider reducing the worker slots
* Check that the alert threshold is good, or adjust it
* Take a look at the logs from the timeframe of the alert firing
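
To help judge whether the alert threshold is reasonable, a rough sketch for comparing the load average against the number of CPUs on the worker host could look like the following. This is only an illustration: the helper names and the `threshold_per_cpu` default of 1.5 are hypothetical and are not taken from the actual alert rule, and the ticket does not confirm what the alert value `B` measures.

```python
import os


def load_per_cpu(load_avg: float, cpu_count: int) -> float:
    """Return the load average normalized by the number of CPUs."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_avg / cpu_count


def is_overloaded(load_avg: float, cpu_count: int,
                  threshold_per_cpu: float = 1.5) -> bool:
    """Report whether the normalized load exceeds the given threshold.

    threshold_per_cpu is a hypothetical value chosen for illustration,
    not the threshold used by the real alert rule.
    """
    return load_per_cpu(load_avg, cpu_count) > threshold_per_cpu


if __name__ == "__main__":
    # os.getloadavg() is only available on Unix-like systems.
    one_min, five_min, fifteen_min = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"15 min load: {fifteen_min:.2f} on {cpus} CPUs "
          f"({load_per_cpu(fifteen_min, cpus):.2f} per CPU)")
    print("overloaded:", is_overloaded(fifteen_min, cpus))
```

Running this on the affected host during a busy period gives a per-CPU figure that can be compared with the configured threshold before deciding whether to reduce worker slots or adjust the alert.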