action #34531: Improvements to the job_module_needles table - openQA Project (public) - openSUSE Project Management Tool

Actions

Copy link

action #34531

closed

Improvements to the job_module_needles table

Added by szarate almost 7 years ago. Updated almost 7 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

coolo

Category:

Feature requests

Target version:

Done

Start date:

2018-04-09

Due date:

% Done:

Estimated time:

Description

Currently the table job_module_needles has 59M rows and the table is about 3GB (Plus the indexes that can be also quite scary but a Vacumm FULL might help there).

The indexes itself are not the problem, the need is to actually figure a better way to calculate the last time a needle was used without keeping that much information.

Also a point here: Use this task to evaluate the feasibility to start mixing Mojo::Pg with DBIx (Since we don't really rely on SQLite anymore)

openqa=# SELECT nspname || '.' || relname AS "relation",
openqa-#     pg_size_pretty(pg_relation_size(C.oid)) AS "size"
openqa-#   FROM pg_class C
openqa-#   LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
openqa-#   WHERE nspname NOT IN ('pg_catalog', 'information_schema')
openqa-#   ORDER BY pg_relation_size(C.oid) DESC
openqa-#   LIMIT 20;
relation                      |  size   
---------------------------------------------------+---------
public.job_module_needles_needle_id_job_module_id | 4716 MB
public.job_module_needles_idx_needle_id           | 3878 MB
public.job_module_needles                         | 3021 MB
public.job_module_needles_idx_job_module_id       | 2501 MB

Actions

Copy link

Updated by szarate almost 7 years ago

Description updated (diff)

Actions

Copy link

Updated by mkittler almost 7 years ago

When I don't miss something, we use this entire table only to update the columns last_matched_module_id, first_seen_module_id and last_seen_module_id of the needles table itself.

This could likely also be done directly at the same place where we currently update the job_module_needles table (in JobModules::store_needle_infos). The (matched) needle IDs (%needles hash) and the job module ID ($self->id) are available at this point, so the following could be done:

Update last_matched_module_id of needles where needle_id is in @matched_needle_ids and last_matched_module_id < $job_module_id.
Update first_seen_module_id of needles where needle_id is in @needle_ids and first_seen_module_id > $self->id $job_module_id.
Update last_seen_module_id of needles where needle_id is in @needle_ids and last_seen_module_id < $job_module_id.