action #107257
Updated by okurz almost 3 years ago
## Observation

From grafana:

[Alerting] Apache Response Time alert

The apache response time exceeded the alert threshold.

* Check the load of the web UI host
* Consider restarting the openQA web UI service and/or apache

Also see https://progress.opensuse.org/issues/73633

| Metric name | Value |
| --- | --- |
| Min | 2565671.000 |

view alert rule: http://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?tab=alert&viewPanel=84&orgId=1

## Reproducible

Multiple alerts since at least 2022-02-22, likely even in the past days.

## Suggestions

* okurz already restarted the apache service because it had been running since before the lab was moved. Since then, however, we have had multiple further alerts
* Likely the problem is not apache itself; either the network or our openQA service is problematic
* It seems we are smoothing over a rather short time window, so we may not have enough data because of the data outages. So we should look into #107437 first
* Re-check how the data looks after #107437 is resolved
* Optional: Reconsider how we alert on response times when we actually do not have that many responses

## Rollback steps

* okurz paused the alert for https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&editPanel=84&tab=alert and it should be unpaused once everything is good again
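
The optional suggestion about alerting on response times when there are few responses could be sketched roughly as follows. This is only an illustration and not the actual grafana alert rule: the function name `should_alert`, the `min_samples` default and the threshold used in the usage lines are all hypothetical, and the values are assumed to be in the same unit as the metric reported above.

```python
from statistics import median


def should_alert(response_times, threshold, min_samples=10):
    """Decide whether a response time alert should fire.

    Hypothetical helper: it only fires when the evaluation window holds
    at least `min_samples` data points, so a single slow request during
    a data outage does not trigger an alert on its own.
    """
    if len(response_times) < min_samples:
        # Too little data to judge, e.g. because of a data outage.
        return False
    # The median is less sensitive to a single outlier than min or max.
    return median(response_times) > threshold


# Usage with made-up windows and a made-up threshold:
sparse_window = [2565671.0]
dense_window = [2565671.0] * 10
print(should_alert(sparse_window, threshold=500000.0))
print(should_alert(dense_window, threshold=500000.0))
```

Requiring a minimum number of samples is one possible design choice; treating "no data" as its own alert state would be an alternative.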