coordination #96447
Updated by livdywan over 3 years ago
## Observation
- Alerts for `Disk I/O time for /dev/vdd (/results)`
- ~~Alerts for `Job age (scheduled)`~~
- Alerts for `Failed systemd services`
## Suggestion
- Bump our alert thresholds
- Investigate whether our average load has increased significantly, e.g. due to new test groups being scheduled
- Look at the systemd journal while the alert is active (short of having #96551)
- Check if we have data on reduced heat/power in server room 2
- ~~`Job age (scheduled) (median)` is likely due to issues with the `WORKER_CLASS` of https://openqa.suse.de/tests/6513484~~
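
For the suggestion to look at the systemd journal while `Failed systemd services` is alerting, a minimal sketch of how this could be done by hand on the affected host. The function names and the one-hour default window are assumptions made for illustration, not something from our monitoring setup; only the standard `systemctl` and `journalctl` options are used.

```python
import subprocess


def failed_units_cmd():
    # List units currently in the "failed" state, one per line,
    # without the header/footer and without the leading status marker.
    return ["systemctl", "list-units", "--state=failed", "--plain", "--no-legend"]


def journal_cmd(unit, since="1 hour ago"):
    # Journal entries for one unit; the "1 hour ago" window is an assumed default.
    return ["journalctl", "--unit", unit, "--since", since, "--no-pager"]


def parse_failed_units(output):
    # The first whitespace-separated column of each non-empty line is the unit name.
    return [line.split()[0] for line in output.splitlines() if line.strip()]


def show_failed_journals(since="1 hour ago"):
    # Hypothetical helper: print the recent journal of every failed unit.
    listing = subprocess.run(
        failed_units_cmd(), capture_output=True, text=True, check=False
    ).stdout
    for unit in parse_failed_units(listing):
        print(f"== {unit} ==")
        subprocess.run(journal_cmd(unit, since), check=False)
```

Calling `show_failed_journals()` on the host while the alert is active prints the recent journal for each failed unit; widen `since` if the failure happened earlier.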