action #167257

Updated by okurz 3 months ago

## Observation 

 While trying to open [Grafana](https://stats.openqa-monitor.qa.suse.de/d/KToPYLEWz?viewPanel=6) to check alerts, I found it unavailable. It shows a white page with an error code: 

 ``` 
 502 Bad Gateway 
 nginx/1.21.5 
 ``` 

 ## Acceptance criteria 
 * **AC1:** During expected deployments of grafana a proper user-facing status is shown instead of "bad gateway" 
 * **AC2:** We still ensure that grafana-related config updates are applied 
 * **AC3:** We are alerted if grafana refuses to start up at all (e.g. failing systemd service triggering alert) 

 ## Suggestions 
 * Check ssh access 
 * Run `systemctl status grafana-server` and restart grafana as needed 
 * Look into custom bad gateway pages for nginx, e.g. https://stackoverflow.com/questions/7796237/custom-bad-gateway-page-with-nginx or https://serverfault.com/questions/185637/custom-page-on-502-bad-gateway-error/194301#194301 
 * Consider notifying nginx about pending grafana restarts, e.g. preexec call in custom systemd service override 
 * Can we just trigger a reload of grafana instead of restarting? 
 * Inspect pipelines for alert conflicts, see also #166979
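
The custom bad gateway page suggestion could look roughly like the following fragment, placed inside the existing nginx `server` block that proxies to grafana. This is only a sketch: the upstream address (grafana's default HTTP port 3000 on localhost), the page name `grafana-unavailable.html` and the directory `/srv/www/errors` are assumptions and not taken from the actual host configuration.

```nginx
# Sketch only: upstream address, page name and directory are assumptions.
location / {
    # Assumed grafana backend; 3000 is grafana's default HTTP port.
    proxy_pass http://127.0.0.1:3000;
    proxy_set_header Host $host;
}

# Serve a static status page when the backend is unreachable,
# e.g. while grafana-server is restarting during a deployment.
error_page 502 503 504 /grafana-unavailable.html;

location = /grafana-unavailable.html {
    # Hypothetical directory holding the static page.
    root /srv/www/errors;
    # Only reachable through error_page, not by direct request.
    internal;
}
```

The static page itself could state that grafana is restarting and should be back shortly. A page like this only covers AC1; a grafana service that never comes back up would still need the alert from AC3.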
