Project

General

Profile

action #112265

coordination #109846: [epic] Ensure all our database tables accomodate enough data, e.g. bigint for ids

Just use bigint in all our database tables (for auto-incremented ID-columns) size:M

Added by okurz 2 months ago. Updated 14 days ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Feature requests
Target version:
Start date:
2022-06-10
Due date:
% Done:

0%

Estimated time:
Difficulty:

Description

Motivation

See #110136#note-12

Acceptance criteria

  • AC1: All our database tables in openQA use bigint (or are reviewed if it makes sense)

Suggestions

  • Prepare pull requests for individual tables and merge one by one and monitor production
  • Fix production fallouts
  • Conduct lesson learned meetings for everything that has gone wrong :)

Related issues

Related to openQA Infrastructure - coordination #112718: [alert][osd] openqa.suse.de is not reachable anymore, response times > 30s, multiple alerts over the weekendResolved2022-06-22

History

#1 Updated by mkittler 2 months ago

  • Assignee set to mkittler

#2 Updated by kraih 2 months ago

Btw. i don't think you have to care about the sequences at all, they seem to already be 64 bit. So just the int fields themselves need to be upgraded to bigint.

#3 Updated by mkittler 2 months ago

  • Subject changed from Just use bigint in all our database tables to Just use bigint in all our database tables (for auto-incremented ID-columns)

kraih I suppose. For the one table we've already converted there was no extra effort required in that regard.

#4 Updated by mkittler 2 months ago

I've just checked myself (out of curiosity) and openqa=> SELECT * FROM information_schema.sequences ORDER BY sequence_name; shows indeed that all sequences are bigint with a max. value of 9223372036854775807. That's at least true for OSD and o3. However, my local database has actually still integer sequences - also after running the migration that changes the corresponding IDs:

openqa-local-bigint=> SELECT * FROM information_schema.sequences ORDER BY data_type LIMIT 6;
  sequence_catalog   | sequence_schema |     sequence_name     | data_type | numeric_precision | numeric_precision_radix | numeric_scale | start_value | minimum_value |    maximum_value    | increment | cycle_option 
---------------------+-----------------+-----------------------+-----------+-------------------+-------------------------+---------------+-------------+---------------+---------------------+-----------+--------------
 openqa-local-bigint | public          | minion_workers_id_seq | bigint    |                64 |                       2 |             0 | 1           | 1             | 9223372036854775807 | 1         | NO
 openqa-local-bigint | public          | minion_locks_id_seq   | bigint    |                64 |                       2 |             0 | 1           | 1             | 9223372036854775807 | 1         | NO
 openqa-local-bigint | public          | minion_jobs_id_seq    | bigint    |                64 |                       2 |             0 | 1           | 1             | 9223372036854775807 | 1         | NO
 openqa-local-bigint | public          | assets_id_seq         | integer   |                32 |                       2 |             0 | 1           | 1             | 2147483647          | 1         | NO
 openqa-local-bigint | public          | audit_events_id_seq   | integer   |                32 |                       2 |             0 | 1           | 1             | 2147483647          | 1         | NO
 openqa-local-bigint | public          | gru_tasks_id_seq      | integer   |                32 |                       2 |             0 | 1           | 1             | 2147483647          | 1         | NO

Only the sequences from Minion are bigint here. I suppose to cover "legacy"¹ setups the migration should handle sequences after all.

¹Not sure what makes my setup special.

#5 Updated by mkittler 2 months ago

  • Status changed from New to In Progress

#6 Updated by mkittler 2 months ago

  • Status changed from In Progress to Feedback

#7 Updated by kraih 2 months ago

mkittler wrote:

However, my local database has actually still integer sequences - also after running the migration that changes the corresponding IDs:

Same here actually. My local O3 clone has 8 byte sequences, while the empty dev database mostly still has 4 byte sequences.

#8 Updated by okurz 2 months ago

https://github.com/os-autoinst/openQA/pull/4701 merged. The deployment on o3 was triggered soon after and completed after a mere 5 minutes, see https://github.com/os-autoinst/openQA/pull/4701#issuecomment-1159404851, triggering a deployment on OSD explicitly so that I can monitor while I am available: https://gitlab.suse.de/openqa/osd-deployment/-/pipelines/416342. Otherwise the next deployment would be triggered tomorrow morning, likely also fine.

https://openqa.opensuse.org/tests/2421090#step/start_test/7 just hit the short time window where o3 wasn't available. I labeled and retriggered the job so that the OSD deployment will find all green openQA-in-openQA tests.

On OSD sudo journalctl --since="2022-06-18 11:39" -u openqa-webui shows

Jun 18 12:39:27 openqa openqa[4691]: [warn] [pid:4691] Job IDs will be converted to bigint. That may take multiple hours on big databases.  It is safe to stop the service (and start from >
Jun 18 12:57:31 openqa openqa[4691]: [info] Loading external plugin AMQP

so 18m for migration. Some alerts about minion jobs have been triggered but shortly went back to ok again. I guess such alerts are ok for such exceptional migrations.

#9 Updated by okurz about 2 months ago

  • Related to coordination #112718: [alert][osd] openqa.suse.de is not reachable anymore, response times > 30s, multiple alerts over the weekend added

#11 Updated by okurz about 1 month ago

  • Due date set to 2022-07-15

#12 Updated by okurz about 1 month ago

  • Subject changed from Just use bigint in all our database tables (for auto-incremented ID-columns) to Just use bigint in all our database tables (for auto-incremented ID-columns) size:M
  • Description updated (diff)

#13 Updated by okurz about 1 month ago

merged yesterday. o3 was auto-deployed and migrated and it looked fine. Today OSD was auto-deployed and also looks fine. https://monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&from=1657431899773&to=1657433289012&viewPanel=78 shows the migration for http response time. Database rows in https://monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&from=1657429493800&to=1657436565913&viewPanel=89 also show the immediate effect of migration but nothing significant after the migration finished. https://monitor.qa.suse.de/alerting/list?state=not_ok shows no alerts.

EDIT: Reviewed some QA SLE maintenance update tests with retriggering and such and everything works as expected

#14 Updated by mkittler about 1 month ago

Yes, seems like everything works as expected. Just to be sure, I've ran VACUUM ANALYZE on all affected tables. That shouldn't harm and previously ensured the query planner uses indexes correctly.

#15 Updated by mkittler about 1 month ago

  • Status changed from Feedback to Resolved

It looks still good.

#16 Updated by okurz 14 days ago

  • Due date deleted (2022-07-15)

Also available in: Atom PDF