coordination #109846

[epic] Ensure all our database tables accommodate enough data, e.g. bigint for IDs

Added by okurz 3 months ago. Updated 3 days ago.

Status:
Blocked
Priority:
Normal
Assignee:
Category:
Feature requests
Target version:
Start date:
2022-04-12
Due date:
2022-07-07
% Done:

33%

Estimated time:
(Total: 0.00 h)
Difficulty:

Description

Motivation

See #109836, where we found on the morning of 2022-04-12 that all jobs on OSD were incomplete with the reason 'Reason: api failure: 400 response: OpenQA::Schema::Result::Jobs::insert_module(): DBI Exception: DBD::Pg::st execute failed: ERROR: integer out of range [for Statement " INSERT INTO job_modules ( job_id, name, category, script, milestone, important, fatal, always_rollback, t_created, t_u…' because the integer limit of the job module id had been reached. This was mitigated for now on OSD with manual database alterations, but it needs to be accommodated in our table creation code, including migrations of existing instances.

Acceptance criteria

  • AC1: New instances of openQA can handle multi-billion job module entries
  • AC2: Existing instances of openQA are migrated/upgraded according to AC1
  • AC3: No severe performance regression
  • AC4: The migration of existing instances finishes in a reasonable time (a downtime of a few minutes is ok)

Ideas

What okurz did manually in #109836 is effectively the following SQL:

ALTER TABLE job_modules ALTER COLUMN id TYPE bigint;
ALTER TABLE needles ALTER COLUMN last_seen_module_id TYPE bigint;
ALTER TABLE needles ALTER COLUMN last_matched_module_id TYPE bigint;

and then restart services:

systemctl restart openqa-webui openqa-scheduler
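To verify that the manual change took effect, one could check the column types via information_schema; a minimal sketch covering exactly the columns altered above:

-- Confirm the manually altered columns are now bigint
SELECT table_name, column_name, data_type
  FROM information_schema.columns
 WHERE (table_name, column_name) IN
       (('job_modules', 'id'),
        ('needles', 'last_seen_module_id'),
        ('needles', 'last_matched_module_id'));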

coolo: "We should do the same in the package and then update the schema version manually on osd database. After that we can check what other tables deserve the same treatment and create one more schema version"

  • From #109864: Add alerts for any IDs nearing a limit: research industry-standard best practices for PostgreSQL admins -> #110136 (see the query sketch below)
  • From #109864: We should research how much additional space and performance cost bigint would really incur; it may well be negligible -> #110136
  • From #109864: Recovery took multiple hours because the team isn't currently spread across multiple timezones: this was communicated when the team composition changed and we should remind people of it -> Done with a chat notice in https://suse.slack.com/archives/C02CANHLANP/p1650450978321009
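One possible starting point for such an alert, purely as a sketch: assuming PostgreSQL 10+ (where the pg_sequences view exists) and that the ID columns are backed by sequences, the following shows how much of the signed 32-bit range (max 2147483647) each sequence has already consumed:

-- Sketch: percentage of the 32-bit ID space consumed per sequence
SELECT schemaname, sequencename, last_value,
       round(100.0 * last_value / 2147483647, 2) AS pct_of_int_max
  FROM pg_sequences
 WHERE last_value IS NOT NULL
 ORDER BY pct_of_int_max DESC;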

Some references that might help:


Subtasks

action #109849: Migrate job_modules table to bigint (Resolved, mkittler)

action #109864: Conduct Five Whys for "All Jobs on OSD are incomplete since 2022-04-12" size:M (Resolved, cdywan)

action #110136: [research][timeboxed:10h] Add alerts for any database IDs nearing a limit: Research the industry standard for postgreSQL software development and admins best practices size:S (Resolved, kraih)

action #112265: Just use bigint in all our database tables (for auto-incremented ID-columns) (Feedback, mkittler)

openQA Infrastructure - coordination #112718: [alert][osd] openqa.suse.de is not reachable anymore, response times > 30s, multiple alerts over the weekend (Resolved, okurz)

action #112859: Conduct Five Whys for "[alert][osd] openqa.suse.de is not reachable anymore, response times > 30s, multiple alerts over the weekend" (Resolved, okurz)

openQA Infrastructure - action #112733: Webui Summary dashboard is missing IO panels (New)

action #112736: Better alert based on 2022-06-18 incident (New)

openQA Infrastructure - action #112769: errors from telegraf on osd webUI "[inputs.procstat] Error in plugin: open /sys/fs/cgroup/systemd/system.slice/salt-master.service/cgroup.procs: no such file or directory", same for openqa-scheduler.service (New)

openQA Infrastructure - action #112787: Use different users for our database accesses so that we can debug where load comes from (New)

openQA Infrastructure - action #112850: Make pgstatstatements persistent on OSD (in salt) (New)

openQA Infrastructure - action #112916: postgresql.conf is invalid after recent salt changes size:M (Feedback, okurz)

coordination #112961: [epic] Followup to "openqa.suse.de is not reachable anymore, response times > 30s, multiple alerts over the weekend" (New)

action #112874: Jobs stuck in assigned, worker reports to be "currently stopping" for > 21h (New)

openQA Infrastructure - action #112943: Consider running explicitly triggered `vacuum ANALYZE` after every database migration (New)

action #112946: Extend openQA documentation with best practices what to do after migration, e.g. look at pg_stats (New)

action #112949: Exercise to switch off non-critical features and conduct emergency drills (New)


Related issues

Copied from openQA Project - action #109836: All Jobs on OSD are incomplete since 2022-04-12 (Resolved, 2022-04-12)

History

#1 Updated by okurz 3 months ago

  • Copied from action #109836: All Jobs on OSD are incomplete since 2022-04-12 added

#2 Updated by okurz 3 months ago

  • Description updated (diff)

#3 Updated by okurz 3 months ago

  • Description updated (diff)

#4 Updated by okurz 3 months ago

  • Status changed from New to Blocked

#5 Updated by okurz 2 months ago

  • Description updated (diff)

Added ideas from five-whys

#6 Updated by okurz 2 months ago

  • Description updated (diff)

#7 Updated by okurz 2 months ago

  • Description updated (diff)
