action #97109

open

openqa-review: Cache fetched urls

Added by tinita over 2 years ago. Updated over 2 years ago.

Status: New
Priority: High
Assignee: -
Category: Feature requests
Target version:
Start date: 2021-08-18
Due date:
% Done: 0%
Estimated time:

Description

Observation

Generating the openqa_review pages fetches many identical URLs repeatedly. The more reports we generate, the bigger the problem will become, and as we know, our webUI is not very fast at the moment.

These statistics are from today. According to the access_log, generating all OSD pages took almost 4 hours (from 2:00 to 5:52), with no significant gaps between the individual pipeline jobs. About 5200 requests were made, but to only 1605 different URLs:

grep '"openqa-review ' access_log | perl -MData::Dumper -nwE'$Data::Dumper::Sortkeys = 1;if (m/"GET (\S+) HTTP\S+ (\d+)/) { $urls{$1}++ } END { say Dumper \%urls }' | wc -l
1605

This shows how many URLs were requested n times (format: `n: number of URLs`):

1: 19
2: 8
3: 1278
4: 152
5: 141
6: 3
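For reference, here is a rough Python sketch of how such a "count of counts" could be derived from an access log. It mirrors the regex of the perl one-liner above; the sample log lines, the function names and the assumption of an Apache combined-style log format are only illustrative, not taken from the real OSD log.

```python
import re
from collections import Counter
from typing import Iterable

# Same pattern as in the perl one-liner: request path and HTTP status code.
REQUEST_RE = re.compile(r'"GET (\S+) HTTP\S+ (\d+)')


def count_urls(lines: Iterable[str], agent: str = '"openqa-review ') -> Counter:
    """Count GET requests per URL for lines matching the given user agent."""
    urls: Counter = Counter()
    for line in lines:
        if agent not in line:  # equivalent of the grep in the one-liner
            continue
        match = REQUEST_RE.search(line)
        if match:
            urls[match.group(1)] += 1
    return urls


def requests_histogram(urls: Counter) -> Counter:
    """Map n -> number of URLs that were requested exactly n times."""
    return Counter(urls.values())


# Hypothetical sample lines, only for demonstration.
SAMPLE_LOG = [
    '10.0.0.1 - - [18/Aug/2021:02:00:01 +0200] "GET /tests/1 HTTP/1.1" 200 512 "-" "openqa-review 1.0"',
    '10.0.0.1 - - [18/Aug/2021:02:00:05 +0200] "GET /tests/1 HTTP/1.1" 200 512 "-" "openqa-review 1.0"',
    '10.0.0.1 - - [18/Aug/2021:02:00:09 +0200] "GET /tests/2 HTTP/1.1" 200 256 "-" "openqa-review 1.0"',
    '10.0.0.2 - - [18/Aug/2021:02:00:10 +0200] "GET /tests/3 HTTP/1.1" 200 128 "-" "Mozilla/5.0"',
]

histogram = requests_histogram(count_urls(SAMPLE_LOG))
for n, number_of_urls in sorted(histogram.items()):
    print(f"{n}: {number_of_urls}")
```

Summing up the real histogram above (n times the number of URLs) gives 5200 requests, which matches the request count mentioned earlier.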

Time in seconds (according to access_log) spent fetching these URLs, grouped by the number of requests per URL:

1: 47s
2: 33s
3: 8165s
4: 1348s
5: 1237s
6: 160s

Suggestions
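To illustrate the idea from the ticket title, here is a minimal sketch of an in-memory cache keyed by URL. This is not how openqa-review is implemented; the class name `CachingFetcher`, the `fetch_url` helper and the counters are hypothetical names chosen for this example, and the real tool may fetch URLs quite differently.

```python
import urllib.request
from typing import Callable, Dict


def fetch_url(url: str) -> bytes:
    """Default fetcher: plain HTTP GET using only the standard library."""
    with urllib.request.urlopen(url) as response:
        return response.read()


class CachingFetcher:
    """Return the stored response body for URLs that were already fetched."""

    def __init__(self, fetch: Callable[[str], bytes] = fetch_url) -> None:
        self._fetch = fetch
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._cache:
            self.hits += 1
            return self._cache[url]
        # First request for this URL: fetch it and remember the body.
        self.misses += 1
        body = self._fetch(url)
        self._cache[url] = body
        return body


if __name__ == "__main__":
    # Demo with a fake fetcher, so no network access is needed.
    fetcher = CachingFetcher(fetch=lambda url: f"body of {url}".encode())
    for url in ["/tests/1", "/tests/1", "/tests/2", "/tests/1"]:
        fetcher.get(url)
    print(f"hits={fetcher.hits} misses={fetcher.misses}")
```

An in-memory cache like this would only help within a single process. Since the pages are generated by several pipeline jobs, sharing results between them would probably need a persistent store (for example files on disk) and some rule for when cached entries expire.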

Actions #1

Updated by tinita over 2 years ago

  • Description updated
Actions #2

Updated by tinita over 2 years ago

  • Description updated
Actions #3

Updated by tinita over 2 years ago

  • Priority changed from Normal to High

IMHO the priority should be High, because the generation still times out from time to time, and we want to add new pages in the near future, which increases the chances of timeouts.

Actions #4

Updated by okurz over 2 years ago

The existing "skip passed" and todo are maybe alternatives and we might want to have only one of them preserved. I could think of reports for limited job groups or some search parameters but they should have quite limited scope. Also I doubt we can ever replace in-time reports of openQA so if it takes hours to process a daily report so be it. So really "High"?

Actions #5

Updated by tinita over 2 years ago

okurz wrote:

Also I doubt we can ever replace in-time reports of openQA so if it takes hours to process a daily report so be it.

I don't understand that sentence in context of this ticket.

Actions #6

Updated by okurz over 2 years ago

tinita wrote:

Also I doubt we can ever replace in-time reports of openQA so if it takes hours to process a daily report so be it.

I don't understand that sentence in context of this ticket.

What I want to say is that I am ok with openqa-review taking long. If anyone wants live data, they should use the openQA webUI directly. So from my point of view, I don't consider the priority of this ticket "High".

Actions #7

Updated by tinita over 2 years ago

okurz wrote:

What I want to say is that I am ok with openqa-review taking long.

It's not only (but also) about "taking long".
It's also about using CPU time needlessly.

For the example day above, the total response time for all OSD Apache requests was 3 hours and 3 minutes.
If the responses had been cached, the total response time would have been 56 minutes.
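
The two figures can be roughly reproduced from the numbers in the description. The sketch below assumes that with a cache every URL is fetched exactly once, and that each of the n requests for a URL took about the same time, so the cached time per group is the group's total time divided by n. That is a simplification, not an exact measurement.

```python
# n -> (number of URLs requested n times, total seconds spent on those URLs),
# copied from the two lists in the ticket description.
stats = {
    1: (19, 47),
    2: (8, 33),
    3: (1278, 8165),
    4: (152, 1348),
    5: (141, 1237),
    6: (3, 160),
}

# Total number of requests: each of the URLs in group n was requested n times.
total_requests = sum(n * urls for n, (urls, _) in stats.items())

# Time actually spent, without any caching.
total_seconds = sum(seconds for _, seconds in stats.values())

# Estimated time if every URL had been fetched only once.
cached_seconds = sum(seconds / n for n, (_, seconds) in stats.items())

print(f"requests:       {total_requests}")
print(f"uncached total: {total_seconds} s ({total_seconds / 3600:.2f} h)")
print(f"cached total:   {cached_seconds:.0f} s ({cached_seconds / 60:.1f} min)")
```

Under these assumptions the uncached total is 10990 s (about 3 hours and 3 minutes) and the cached estimate is about 3396 s (a bit over 56 minutes).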

We have already increased the timeout more than once, and the pipeline still fails from time to time.

If we add more pages in the future, it will become more likely that it fails.

We already have a high load on OSD often, and while the openqa-review pipeline might not be a major factor for that, it adds to it.
