action #96713

closed

Slow grep in openqa-label-known-issues leads to high CPU usage

Added by tinita over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date:
Due date: 2021-09-10
% Done: 0%
Estimated time:

Description

Observation

On 2021-08-10 we experienced problems on osd, such as high load, and one of the causes seemed to be slow grep commands called by openqa-label-known-issues.

For example, this regex seemed to be problematic:

grep -qPzo '(?s)2021-.*T.*Error connecting to <root@s390p.*.suse.de>: No route to host' /tmp/tmp.gy3iDKQDrV

It sometimes ran for over 4 minutes. (Removing the (?s) made it run in only a few milliseconds on a file of about 500 kB.)
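A likely explanation is that with -z the whole file is read as one record, and with (?s) each ".*" can span every line in it, so a failed match has to backtrack across the entire file. The following is a minimal sketch of the line-local variant (the same pattern without (?s)) run against a synthetic log. The log content and the hostname s390p01 are made up for illustration, and GNU grep with PCRE support (-P) is assumed.

```shell
#!/bin/bash
# Pattern from the report with (?s) dropped: "." no longer matches a newline,
# so the timestamp and the error text must appear on the same line.
pattern='2021-.*T.*Error connecting to <root@s390p.*.suse.de>: No route to host'

# Synthetic sample log (hypothetical content, not taken from a real job).
log=$(mktemp)
printf '%s\n' \
    '2021-08-10T10:00:00 worker started' \
    '2021-08-10T10:00:05 Error connecting to <root@s390p01.suse.de>: No route to host' \
    > "$log"

# Same grep options as in the report; -q only sets the exit status.
result=no-match
grep -qPzo "$pattern" "$log" && result=matched
echo "$result"

rm -f "$log"
```

This variant only finds the error when it shares a line with the timestamp, which would need to be checked against real logs before changing the regex.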

We are now lowering the timeout for the post fail hook:
https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/538

However, this means that the whole process can time out because of a single bad regex.

Suggestions

  • Study current grep logic in openqa-label-known-issues
  • The mentioned regex comes from this issue: #93119
  • Maybe the regex can be improved
  • Add a timeout to the grep call
  • Make sure we are informed in some way of timed-out grep calls so we can improve the regexes
  • Reduce limit=200 (maximum number of jobs to check)
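
The timeout and notification suggestions could be combined roughly as follows. This is only a sketch: the function name grep_with_timeout and the message format are hypothetical, and the actual grep logic in openqa-label-known-issues is not shown in this ticket. It relies on the coreutils timeout command, which exits with status 124 when the time limit is hit.

```shell
#!/bin/bash
# Hypothetical helper: run one grep with a time limit (in seconds) and
# report on stderr when the limit is exceeded.
grep_with_timeout() {
    local limit=$1 pattern=$2 file=$3 rc=0
    timeout "$limit" grep -qPzo "$pattern" "$file" || rc=$?
    if [ "$rc" -eq 124 ]; then
        # 124 is the exit status of timeout(1) when the command timed out.
        echo "grep timed out after ${limit}s for pattern: $pattern" >&2
    fi
    # 0 = match, 1 = no match, 124 = timed out, other = grep error
    return "$rc"
}
```

A caller could then treat status 124 as "unknown" for that single regex and continue with the next one, so one slow pattern does not consume the whole post fail hook timeout.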

Related issues 3 (0 open, 3 closed)

Related to openQA Infrastructure - action #96807: Web UI is slow and Apache Response Time alert got triggered | Resolved | okurz | 2021-08-12 | 2021-10-01

Copied from openQA Infrastructure - action #96710: Error `Can't call method "write" on an undefined value` shows up in worker log leading to incompletes | Resolved | mkittler | 2021-08-10 | 2021-08-31

Copied to openQA Infrastructure - action #97943: Increase number of CPU cores on OSD VM due to high usage size:S | Resolved | mkittler