action #130636

open

coordination #110833: [saga][epic] Scale up: openQA can handle a schedule of 100k jobs with 1k worker instances

coordination #108209: [epic] Reduce load on OSD

high response times on osd - Try nginx on OSD size:S

Added by livdywan 11 months ago. Updated 4 days ago.

Status: Workable
Priority: Normal
Assignee: -
Category: Feature requests
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Motivation

Apache in prefork mode uses a lot of resources to provide mediocre performance.

Acceptance criteria

  • AC1: Nginx has been deployed successfully on OSD
  • AC2: No alerts regarding "oh no, apache is down" ;)

Suggestions

  • Make sure there is an easy way to switch back to Apache in case something goes wrong
  • See #129490 for results from O3
  • Adapt OSD nginx config for HTTP + HTTPS (O3 only requires HTTP)
  • We can prepare the nginx deployment in parallel to Apache, keep it deployed, and decide at any time when to switch simply by disabling/enabling the respective services. The deployment needs to consider dehydrated+nginx as well. Switching OSD to nginx lets us gather real-time data before we suggest nginx as the default in our openQA documentation and CI infrastructure.
  • Add changes to salt-states-openqa excluding monitoring
  • Ensure that we have no alerts regarding "oh no, apache is down" ;)
  • If there are any bigger issues observed then just revert and note down in follow-up tickets what needs to be solved first (to limit the ticket to size:S)
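
The "switch by disabling/enabling services" idea, including the easy way back to Apache, could look roughly like the following minimal sketch. It assumes both servers are installed side by side and that the systemd units are named apache2.service and nginx.service; the helper names switch_commands and switch are hypothetical and only illustrate the idea. On OSD the actual change would presumably go through salt-states-openqa rather than a standalone script.

```python
"""Sketch: switch the web server in front of openQA between Apache and nginx."""
import subprocess

# Assumed systemd unit names; adjust to what is actually deployed on the host.
UNITS = {"apache": "apache2.service", "nginx": "nginx.service"}


def switch_commands(target):
    """Return the systemctl commands that make `target` the active web server."""
    if target not in UNITS:
        raise ValueError(f"unknown web server: {target!r}")
    others = [unit for name, unit in UNITS.items() if name != target]
    # Stop and disable the other server first so the listening ports are free,
    # then enable and start the target.
    commands = [["systemctl", "disable", "--now", unit] for unit in others]
    commands.append(["systemctl", "enable", "--now", UNITS[target]])
    return commands


def switch(target, dry_run=True):
    """Print the commands; only execute them when dry_run is False."""
    for command in switch_commands(target):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    switch("nginx")  # dry run: only prints what would be executed
```

Calling switch("apache") produces the reverse sequence, so reverting is the same operation with the other target. The dry-run default is a deliberate choice so that nothing on the host changes by accident.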

Out of scope

  • Determining whether nginx rate limiting features work for our use cases
  • Full monitoring integration

Related issues 4 (1 open, 3 closed)

Related to openQA Infrastructure - action #157081: OSD unresponsive or significantly slow for some minutes 2024-03-12 08:30Z (Resolved, okurz, 2024-03-12)

Related to openQA Infrastructure - action #158059: OSD unresponsive or significantly slow for some minutes 2024-03-26 13:34Z (Resolved, okurz)

Copied from openQA Project - action #129490: high response times on osd - Try nginx on o3 with enabled load limiting or load balancing features (Resolved, kraih)

Copied to openQA Project - action #159651: high response times on osd - nginx with enabled rate limiting features size:S (Workable, 2024-04-26)
