action #162533
coordination #110833: [saga][epic] Scale up: openQA can handle a schedule of 100k jobs with 1k worker instances
coordination #108209: [epic] Reduce load on OSD
[alert] OSD nginx yields 502 responses rather than being more resilient to e.g. openqa-webui restarts size:S
Description
Motivation
Similar to #160877, but concerning the server rather than the tooling.
Did we manage to DoS the server? Do we need to tweak nginx even more?
Acceptance criteria
- AC1: No significant number of 502 responses is recorded in https://monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&viewPanel=80, even during openqa-webui service restarts
Suggestions
- Wait for #162611 "Easy local development setup for comparing apache2+nginx as openQA web proxy"
- Check and unsilence the "web UI: Too many 5xx HTTP responses" alert after making nginx more resilient
- Look into older tickets where ideas and limitations have been mentioned already
- If it's not possible with nginx, switch back to apache, or to traefik or whatever else fits
- Look into https://nginx.org/en/docs/http/ngx_http_upstream_module.html
- Check whether the caddy webserver can do what we want.
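As a starting point for the ngx_http_upstream_module suggestion, here is a minimal sketch of retry settings that might reduce 502 responses during short openqa-webui restarts. It is not OSD's actual configuration: the upstream name `webui`, the address `127.0.0.1:9526`, and all retry counts and timeouts are assumptions for illustration. Listing the same backend twice is a workaround to give nginx a "next" server to retry, since a single-server upstream has no alternative to fall back to.

```nginx
# Hypothetical sketch; names, address and timings are assumptions.
upstream webui {
    # Same backend listed twice so that proxy_next_upstream has a
    # "next" server to try. max_fails=0 disables failure accounting,
    # so neither entry is marked unavailable after a failed attempt.
    server 127.0.0.1:9526 max_fails=0;
    server 127.0.0.1:9526 max_fails=0;
}

server {
    listen 80;

    location / {
        proxy_pass http://webui;

        # Pass the request to the next upstream entry on connection
        # errors, timeouts, or a 502 returned by the backend.
        proxy_next_upstream error timeout http_502;
        # Upper bounds on the number of attempts and the total retry time.
        proxy_next_upstream_tries 3;
        proxy_next_upstream_timeout 30s;
        # Fail fast on a backend that does not accept connections.
        proxy_connect_timeout 5s;
    }
}
```

Note that nginx retries immediately, without a delay between attempts, so a refused connection during a restart lasting several seconds would likely still end in a 502. By default, non-idempotent requests such as POST are also not retried once they have been sent to the backend. Whether this is sufficient would need to be verified, for example in the local setup from #162611.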
Rollback steps
- Unsilence rule_uid=http_response_5xx_alert silence on https://monitor.qa.suse.de/alerting/silences