action #94667
closedcoordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes
coordination #92854: [epic] limit overload of openQA webUI by heavy requests
Optimize products/machines/test_suites API calls size:M
Description
Motivation¶
The SQL to fetch the data is inefficient. For each item in the result set another SELECT FROM job_settings
is executed. For https://openqa.suse.de/admin/test_suites that means about 2500 SELECTs currently.
Also the costly DateTime columns are fetched, although they are not needed.
Acceptance criteria¶
- AC1: Significant performance improvements
Suggestions¶
- The number of individual SQL queries doesn't go up with the number of items in the table anymore
- Less columns are fetched
- All three
/api/v1/products
,/api/v1/machines
,/api/v1/test_suites
are handled by the same code inlib/OpenQA/WebAPI/Controller/API/V1/Table.pm
- Enable
DBIC_TRACE=1
environment variable to see all SQL - https://github.com/os-autoinst/openQA/pull/3969 has a similar optimizations
- Look at the existing patch from #94667#note-7
Updated by tinita over 3 years ago
- Status changed from Workable to New
Set to New because not estimated yet
Updated by okurz about 3 years ago
- Subject changed from Optimize products/machines/test_suites API calls to [easy][beginner] Optimize products/machines/test_suites API calls
Updated by tinita about 3 years ago
- Subject changed from [easy][beginner] Optimize products/machines/test_suites API calls to Optimize products/machines/test_suites API calls
I had a look into this on the weekend, since the actual change is simple. But then a test failed unexpectedly:
t/40-script_load_dump_templates.t
So I don't consider it a beginner task anymore.
The order of the products/machines seems to be different than before, and this is also visible in the WebUI.
So the join with the *_settings table results in a different order (which means so far we simply relied on the default postgresql order).
Since I will be on vacation I'll post the actual patch for prefetching the settings here:
diff --git a/lib/OpenQA/WebAPI/Controller/API/V1/Table.pm b/lib/OpenQA/WebAPI/Controller/API/V1/Table.pm
index a8591f721..f86e9116c 100644
--- a/lib/OpenQA/WebAPI/Controller/API/V1/Table.pm
+++ b/lib/OpenQA/WebAPI/Controller/API/V1/Table.pm
@@ -108,15 +108,30 @@ sub list {
}
if ($have) {
for my $par (@$key) {
- $search{$par} = $self->param($par);
+ $search{"me.$par"} = $self->param($par);
}
}
}
my @result;
+ my $sql_field = {
+ Machines => 'machine',
+ TestSuites => 'test_suite',
+ Products => 'product',
+ }->{$table};
eval {
my $rs = $self->schema->resultset($table);
- @result = %search ? $rs->search(\%search) : $rs->all;
+ @result = $rs->search(
+ \%search,
+ {
+ # 'prefetch' would select too much unneeded columns
+ # https://metacpan.org/pod/DBIx::Class::ResultSet#collapse
+ join => 'settings',
+ collapse => 1,
+ '+select' => [
+ qw(settings.id settings.key settings.value),
+ "settings.${sql_field}_id"
+ ]});
};
my $error = $@;
if ($error) {
Updated by livdywan over 2 years ago
- Subject changed from Optimize products/machines/test_suites API calls to Optimize products/machines/test_suites API calls size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by kraih about 2 years ago
Yes, not a beginner task. Tina's approach was pretty close to optimal already though. I did some benchmarking with different approaches on the O3 database and was a little surprised how little impact prefetching the settings timestamps actually has.
# current version
@result = %search ? $rs->search(\%search) : $rs->all;
# prefetch version
@result = $rs->search(keys %search ? \%search : undef, {prefetch => 'settings', order_by => 'me.id'});
# join/collapse version
@result = $rs->search(keys %search ? \%search : undef,
{
join => 'settings',
'+select' => [qw(settings.id settings.key settings.value), "settings.${sql_field}_id"],
collapse => 1,
order_by => 'me.id'
});
All three of these work with all tests passing. The baseline for performance is 5.03
requests per second with the current version. With the optimized versions it is 14.85
requests per second for the prefetch version. And 15.58
requests per second with the join/collapse version. Out of these the prefetch version probably has the best balance between performance and maintainability. But given that we are all becoming quite proficient with DBIx::Class optimizations, i'll open a PR for the join/collapse version first.