action #93925
closedcoordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes
coordination #92854: [epic] limit overload of openQA webUI by heavy requests
Optimize SQL on /tests/overview
Description
For /tests/overview
pages with a lot of jobs, retrieving data from our database is quite inefficient.
For every job several SQL statements are fired.
Some of them can be combined into one statement.
Also in several statements, too much columns are selected, where we need only a few. That includes costly DateTime columns.
Identify statements which can be combined, combine them and measure performance.
Updated by okurz over 3 years ago
- Priority changed from Normal to Low
Let's see that as something that can help and is certainly related but can't prevent an "overload" so reducing prio to "Low". But besides that no further impact, keep on the good work :)
Updated by kraih over 3 years ago
Might also be possible to replace DateTime
with a less heavy module like DateTime::Tiny
in some cases.
Updated by mkittler over 3 years ago
Also take note of #92854#note-13 regarding the DateTime
overhead.
Updated by tinita over 3 years ago
I made the following config changes on o3:
# /etc/apache2/global.conf
CustomLog /space/logs/apache2/access_log combined_with_time
# /etc/apache2/vhosts.d/openqa.conf
CustomLog /space/logs/apache2/openqa.access_log combined_with_time
# /etc/apache2/mod_log_config.conf
LogFormat "%h %l %u %t \"%r\" %>s %b \
\"%{Referer}i\" \"%{User-Agent}i\" %{ms}T" combined_with_time
Added the files to /etc/logrotate.d/apache2
This way we can compare request times before and after a deployment.
Updated by tinita over 3 years ago
I did the same changes on osd (except access_log was already enabled), though I had to touch different files.
I replaced combined
with combined_with_time
in /etc/apache2/global.conf
and added it to /etc/apache2/mod_log_config.conf
I added combined_with_time
to /etc/sysconfig/apache2
and reloaded apache, which generated /etc/apache2/sysconfig.d/global.conf
.
Now we can see the time in /var/log/apache2/access_log
Updated by tinita over 3 years ago
- Status changed from In Progress to Feedback
Updated by tinita over 3 years ago
- Related to action #94111: Optimize /api/v1/jobs added
Updated by tinita over 3 years ago
- Status changed from Feedback to In Progress
I will gather some statistics from the logs in the next days and report them here.
So far on o3 the average Tumbleweed build page took 46% of the time before the deployment.
Updated by tinita over 3 years ago
- Status changed from In Progress to Resolved
Unfortunately I didn't look to closely on the load average when doing my test queries before the rollout.
I did the test queries today when the load was not too high (between 10-15) on osd.
# Yast
% perl -nwE'if (m{ /tests/overview\?distri=sle&version=15-SP3&build=187.1&groupid=129 } and m{(\d+)$}) { $count++; $sum += $1 } END { say sprintf "count: %d | sum: %d ms | avg: %.2f ms", $count, $sum, $sum/$count }' access_log-20210617
count: 17 | sum: 127184 ms | avg: 7481.41 ms
% perl -nwE'if (m{ /tests/overview\?distri=sle&version=15-SP3&build=187.1&groupid=129 } and m{(\d+)$}) { $count++; $sum += $1 } END { say sprintf "count: %d | sum: %d ms | avg: %.2f ms", $count, $sum, $sum/$count }' access_log-today
count: 46 | sum: 142907 ms | avg: 3106.67 ms
% perl -wE'say 3106.67/7481.41'
0.415251937803168
# HA
% perl -nwE'if (m{ /tests/overview\?distri=sle&version=15-SP3&build=178.1&groupid=143 } and m{(\d+)$}) { $count++; $sum += $1 } END { say sprintf "count: %d | sum: %d ms | avg: %.2f ms", $count, $sum, $sum/$count }' access_log-20210617 count: 20 | sum: 147886 ms | avg: 7394.30 ms
% perl -nwE'if (m{ /tests/overview\?distri=sle&version=15-SP3&build=178.1&groupid=143 } and m{(\d+)$}) { $count++; $sum += $1 } END { say sprintf "count: %d | sum: %d ms | avg: %.2f ms", $count, $sum, $sum/$count }' access_log-today
count: 40 | sum: 153192 ms | avg: 3829.80 ms
% perl -wE'say 3829.80/7394.30'
0.517939493934517
So for those relatively big groups the average load time decreased to 42% (Yast) / 52% (HA).
Updated by okurz over 3 years ago
A significant and important improvement. Thanks!
Updated by okurz over 3 years ago
- Related to action #94354: Optimize /dashboard_build_results and /group_overview/* pages added
Updated by okurz over 3 years ago
- Related to deleted (action #94354: Optimize /dashboard_build_results and /group_overview/* pages)