Project

General

Profile

action #121054

bigint conversion fails due to idx_job_id_value_settings index being too wide size:S

Added by AdamWill 2 months ago. Updated about 2 months ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
Concrete Bugs
Target version:
Start date:
2022-11-28
Due date:
% Done:

0%

Estimated time:
Difficulty:

Description

When I update Fedora's staging openQA instance to a newer version of openQA with the bigint database conversion - https://progress.opensuse.org/issues/112265 - the conversion fails:

Nov 28 18:32:54 openqa-lab01.iad2.fedoraproject.org openqa-gru[1042531]: failed to run SQL in /usr/share/openqa/script/../dbicdh/PostgreSQL/upgrade/94-95/001-auto.sql: DBIx::Class::DeploymentHandler::DeployMethod::SQL::Translator::try {...} (): DBI Exception: DBD::Pg::db do failed: ERROR:  index row size 2712 exceeds btree version 4 maximum 2704 for index "idx_job_id_value_settings"
Nov 28 18:32:54 openqa-lab01.iad2.fedoraproject.org openqa-gru[1042531]: DETAIL:  Index row references tuple (149183,23) in relation "job_settings".
Nov 28 18:32:54 openqa-lab01.iad2.fedoraproject.org openqa-gru[1042531]: HINT:  Values larger than 1/3 of a buffer page cannot be indexed.
Nov 28 18:32:54 openqa-lab01.iad2.fedoraproject.org openqa-gru[1042531]: Consider a function index of an MD5 hash of the value, or use full text indexing. at inline delegation in DBIx::Class::DeploymentHandler for deploy_method->upgrade_single_step (attribute declared in /usr/share/perl5/vendor_perl/DBIx/Class/DeploymentHandler/WithApplicatorDumple.pm at line 51) line 18
Nov 28 18:32:54 openqa-lab01.iad2.fedoraproject.org openqa-gru[1042531]:  (running line 'ALTER TABLE job_settings ALTER COLUMN job_id TYPE bigint') at /usr/share/perl5/vendor_perl/DBIx/Class/DeploymentHandler/DeployMethod/SQL/Translator.pm line 263.
Nov 28 18:32:54 openqa-lab01.iad2.fedoraproject.org openqa-gru[1042531]: DBIx::Class::Storage::TxnScopeGuard::DESTROY(): A DBIx::Class::Storage::TxnScopeGuard went out of scope without explicit commit or error. Rolling back. at /usr/share/openqa/script/openqa line 0
Nov 28 18:32:54 openqa-lab01.iad2.fedoraproject.org openqa-gru[1042531]: DBIx::Class::Storage::TxnScopeGuard::DESTROY(): A DBIx::Class::Storage::TxnScopeGuard went out of scope without explicit commit or error. Rolling back. at /usr/share/openqa/script/openqa line 0

I'm no database expert, but after a morning of googling, I think basically we're running into a true hard limit in postgres here. As it says, "Values larger than 1/3 of a buffer page cannot be indexed" (in a btree index - this is a limitation of btree indexes specifically).

I did check I can recreate the same index with current postgresql and the job_id still as 'integer' type, and it works:

openqa-stg=> \d job_settings
                                        Table "public.job_settings"
  Column   |            Type             | Collation | Nullable |                 Default                  
-----------+-----------------------------+-----------+----------+------------------------------------------
 id        | integer                     |           | not null | nextval('job_settings_id_seq'::regclass)
 key       | text                        |           | not null | 
 value     | text                        |           | not null | 
 job_id    | integer                     |           | not null | 
 t_created | timestamp without time zone |           | not null | 
 t_updated | timestamp without time zone |           | not null | 

openqa-stg=> CREATE INDEX idx_job_id_value_settings_test on job_settings (job_id, key, value);
CREATE INDEX
openqa-stg=> 

I guess changing the integer type requires the index to be recreated and makes the index entries larger, and now at least one of them is just too big, or something.

I suspect probably only a few rows are problematic here, and it's likely ones for tests of this update: https://bodhi.fedoraproject.org/updates/FEDORA-2022-23a0a34ea5 . As you can see at e.g. https://openqa.stg.fedoraproject.org/tests/2283227#settings , one of the settings for our update tests (ADVISORY_NVRS) is a list of all the packages in the update, and for an update with a lot of packages in it, that's pretty long. It seems likely it's the rows with that specific setting in it that are triggering this problem.

The indexes were added in 43bac479f37b64437e24e57a65a2ed88cd4f4dfb , which was a part of https://github.com/os-autoinst/openQA/pull/521 , but it seems it was "really" supposed to be part of https://github.com/os-autoinst/openQA/pull/534 (521 just happened to be touching the db schema at the same time so the index additions got rolled into it as well). That suggests the idea was to "Improve query_jobs performance", and coolo said the indexes were "needed" for that.

From that PR, it looks like the query we were trying to speed up is the one that is now query_for_settings in lib/Openqa/Schema/ResultSet/JobSettings.pm - or rather, queries which use that subquery, of which there are currently three, all of which do something similar to this:

    my $subquery = $schema->resultset('JobSettings')->query_for_settings(\%precond);
    $cond{'me.id'} = {-in => $subquery->get_column('job_id')->as_query};

so they basically wind up involving the job_id, key and value, just as the index expects.

It looks to me like that query is usually a strict equality one, except when the query string has colons in it, when it's made into a "like" query (see d83fc0ec16c4868b92915eda1b8af0ee2145ea0f ). If I'm understanding all of this correctly, when that's a strict equality query, the index will help; when it's a "like" query, it probably won't (I think only a full-text index would help there). We don't seem to be doing any queries that use the range capabilities of a btree index, AFAICT, so in theory we could use a hash index, except that hash indexes can't be multicolumn, only B-tree, GiST, GIN, and BRIN indexes.

I did find this discussion of a similar problem: https://www.spinics.net/lists/pgsql/msg212499.html
which suggests creating a btree index in such a way that a hash of the concatenation of the items in each row is used. This would almost certainly solve the size issue, but I don't know if such an index would actually be any use for speeding up the query - would postgresql know it could use such an index to handle our queries? I don't know, but it seems somehow unlikely.

Poking through the postgresql docs, another option I guess we could use is a partial index: https://www.postgresql.org/docs/current/indexes-partial.html . We could make the index only contain settings actually likely to be used in these queries. There is one path where, technically, any arbitrary setting at all might be in the query; that's cancelling a job by settings via the API. API/V1/Job.pm cancel(), if the query doesn't contain a job id, takes all parameters from the query and treats them as settings for a call to cancel_by_settings (which uses query_for_settings) - so you can cancel all jobs with any arbitrary setting(s) you specify. But it feels to me like this probably isn't used a lot. The other two paths that use query_for_settings both specifically limit the settings they will actually query for. _prepare_complex_query_search_args (in ResultSet/Jobs.pm) will only query for qw(ISO HDD_1 WORKER_CLASS). destroy (in API/V1/Iso.pm) will only query for ISO. So we could actually use a partial index that only indexes entries whose "key" is ISO, HDD_1 or WORKER_CLASS, and that should cover most queries. I think. We could add other "standard" settings to the list for the purposes of cancel_by_settings, I guess.

Going the opposite way we could use a partial index that indexes all entries except those whose "key" is ADVISORY_NVRS , I guess. Which would be an extremely specific exemption to carry upstream, but I guess would do the trick...

History

#1 Updated by AdamWill 2 months ago

It seems to me like the suggestion in https://www.spinics.net/lists/pgsql/msg212499.html really creates a "functional index" - https://www.postgresql.org/docs/7.3/indexes-functional.html - which basically indexes the result of the function. So it's only useful when a query does that same function. The example in the postgres docs is a functional index for the function lower(col1), which is only useful when a query also does lower(col1).

so if we did a functional index like the post suggests:

CREATE INDEX idx_job_id_value_settings ON job_settings (sha256(job_id::bytea || key::bytea || value::bytea));

or something like that, we'd also have to make the query use the same hash function...right?

#2 Updated by okurz 2 months ago

  • Target version set to Ready

#3 Updated by AdamWill 2 months ago

Oh, I forgot to write that there are some other paths to cancel_by_settings which do use specific settings, so we'd want to make sure to index the settings listed in those as well. There's API/V1/Iso.pm cancel, which again only does ISO. And there's Schema/Result/ScheduledProducts.pm _schedule_iso, which uses qw(DISTRI VERSION FLAVOR ARCH) and sometimes BUILD. So we'd want to make sure at least those were indexed.

#4 Updated by AdamWill 2 months ago

Aha, yeah, so I can reproduce this in a test db. I created a test db with a job_settings table exactly like openQA's:

CREATE TABLE job_settings (
  id bigserial NOT NULL,
  key text NOT NULL,
  value text NOT NULL,
  job_id bigint NOT NULL,
  t_created timestamp NOT NULL,
  t_updated timestamp NOT NULL,
  PRIMARY KEY (id)
);

then I added a fairly normal test line and the index:

insert into job_settings(id,key,value,job_id,t_created,t_updated) VALUES(41776574,'HDD_1','disk_f36_server_3_x86_64.qcow2',1630484,now(),now());
CREATE INDEX idx_job_id_value_settings_test on job_settings (job_id, key, value);

and that worked fine. Then I tried to add a new row copied from a row for the big KDE update, and boom:

insert into job_settings(id,key,value,job_id,t_created,t_updated) VALUES(62195067,'ADVISORY_NVRS','analitza-22.08.2-1.fc37 artikulate-22.08.2-1.fc37 baloo-widgets-22.08.2-1.fc37 blinken-22.08.2-1.fc37 bluedevil-5.26.2-1.fc37 bovo-22.08.2-1.fc37 breeze-gtk-5.26.2-1.fc37 breeze-icon-theme-5.99.0-1.fc37 cantor-22.08.2-1.fc37 cervisia-22.08.2-1.fc37 dolphin-22.08.2-1.fc37 dolphin-plugins-22.08.2-1.fc37 dragon-22.08.2-1.fc37 elisa-player-22.08.2-1.fc37 extra-cmake-modules-5.99.0-1.fc37 filelight-22.08.2-1.fc37 granatier-22.08.2-1.fc37 grub2-breeze-theme-5.26.2-1.fc37 gwenview-22.08.2-1.fc37 juk-22.08.2-1.fc37 k3b-22.08.2-1.fc37 kaccounts-integration-22.08.2-1.fc37 kaccounts-providers-22.08.2-1.fc37 kactivitymanagerd-5.26.2-1.fc37 kajongg-22.08.2-1.fc37 kalgebra-22.08.2-1.fc37 kalzium-22.08.2-1.fc37 kamera-22.08.2-1.fc37 kamoso-22.08.2-1.fc37 kanagram-22.08.2-1.fc37 kapman-22.08.2-1.fc37 kapptemplate-22.08.2-1.fc37 kate-22.08.2-1.fc37 katomic-22.08.2-1.fc37 kbackup-22.08.2-1.fc37 kblocks-22.08.2-1.fc37 kbruch-22.08.2-1.fc37 kcachegrind-22.08.2-1.fc37 kcalc-22.08.2-1.fc37 kcharselect-22.08.2-1.fc37 kcolorchooser-22.08.2-1.fc37 kcron-22.08.2-1.fc37 kdebugsettings-22.08.2-1.fc37 kde-cli-tools-5.26.2-1.fc37 kde-connect-22.08.2-1.fc37 kdecoration-5.26.2-1.fc37 kde-dev-scripts-22.08.2-1.fc37 kde-dev-utils-22.08.2-1.fc37 kdeedu-data-22.08.2-1.fc37 kdegraphics-mobipocket-22.08.2-1.fc37 kdegraphics-thumbnailers-22.08.2-1.fc37 kde-gtk-config-5.26.2-1.fc37 kdenetwork-filesharing-22.08.2-1.fc37 kde-partitionmanager-22.08.2-1.fc37 kdeplasma-addons-5.26.2-1.fc37 kde-print-manager-22.08.2-1.fc37 kdesdk-thumbnailers-22.08.2-1.fc37 kdf-22.08.2-1.fc37 kdialog-22.08.2-1.fc37 keditbookmarks-22.08.2-1.fc37 kf5-5.99.0-1.fc37 kf5-akonadi-mime-22.08.2-1.fc37 kf5-akonadi-notes-22.08.2-1.fc37 kf5-akonadi-search-22.08.2-1.fc37 kf5-akonadi-server-22.08.2-1.fc37 kf5-attica-5.99.0-1.fc37 kf5-audiocd-kio-22.08.2-1.fc37 kf5-baloo-5.99.0-1.fc37 kf5-bluez-qt-5.99.0-1.fc37 kf5-frameworkintegration-5.99.0-1.fc37 kf5-kactivities-5.99.0-1.fc37 kf5-kactivities-stats-5.99.0-1.fc37 kf5-kapidox-5.99.0-1.fc37 kf5-karchive-5.99.0-1.fc37 kf5-kauth-5.99.0-1.fc37 kf5-kbookmarks-5.99.0-1.fc37 kf5-kcalendarcore-5.99.0-1.fc37 kf5-kcalendarutils-22.08.2-1.fc37 kf5-kcmutils-5.99.0-1.fc37 kf5-kcodecs-5.99.0-1.fc37 kf5-kcompletion-5.99.0-1.fc37 kf5-kconfig-5.99.0-1.fc37 kf5-kconfigwidgets-5.99.0-1.fc37 kf5-kcontacts-5.99.0-1.fc37 kf5-kcoreaddons-5.99.0-1.fc37 kf5-kcrash-5.99.0-1.fc37 kf5-kdav-5.99.0-1.fc37 kf5-kdbusaddons-5.99.0-1.fc37 kf5-kdeclarative-5.99.0-1.fc37 kf5-kded-5.99.0-1.fc37 kf5-kdelibs4support-5.99.0-1.fc37 kf5-kdesignerplugin-5.99.0-1.fc37 kf5-kdesu-5.99.0-1.fc37 kf5-kdewebkit-5.99.0-1.fc37 kf5-kdnssd-5.99.0-1.fc37 kf5-kdoctools-5.99.0-1.fc37 kf5-kemoticons-5.99.0-1.fc37 kf5-kfilemetadata-5.99.0-1.fc37 kf5-kglobalaccel-5.99.0-1.fc37 kf5-kguiaddons-5.99.0-1.fc37 kf5-kholidays-5.99.0-1.fc37 kf5-khtml-5.99.0-1.fc37 kf5-ki18n-5.99.0-1.fc37 kf5-kiconthemes-5.99.0-1.fc37 kf5-kidentitymanagement-22.08.2-1.fc37 kf5-kidletime-5.99.0-1.fc37 kf5-kimageformats-5.99.0-2.fc37 kf5-kimap-22.08.2-1.fc37 kf5-kinit-5.99.0-1.fc37 kf5-kio-5.99.0-1.fc37 kf5-kipi-plugins-22.08.2-1.fc37 kf5-kirigami2-5.99.0-1.fc37 kf5-kitemmodels-5.99.0-1.fc37 kf5-kitemviews-5.99.0-1.fc37 kf5-kitinerary-22.08.2-1.fc37 kf5-kjobwidgets-5.99.0-1.fc37 kf5-kjs-5.99.0-1.fc37 kf5-kjsembed-5.99.0-1.fc37 kf5-kldap-22.08.2-1.fc37 kf5-kmailtransport-22.08.2-1.fc37 kf5-kmbox-22.08.2-1.fc37 kf5-kmediaplayer-5.99.0-1.fc37 kf5-kmime-22.08.2-1.fc37 kf5-knewstuff-5.99.0-1.fc37 kf5-knotifications-5.99.0-1.fc37 kf5-knotifyconfig-5.99.0-1.fc37 kf5-kontactinterface-22.08.2-1.fc37 kf5-kpackage-5.99.0-1.fc37 kf5-kparts-5.99.0-1.fc37 kf5-kpeople-5.99.0-1.fc37 kf5-kpimtextedit-22.08.2-1.fc37 kf5-kpkpass-22.08.2-1.fc37 kf5-kplotting-5.99.0-1.fc37 kf5-kpty-5.99.0-1.fc37 kf5-kquickcharts-5.99.0-1.fc37 kf5-kross-5.99.0-1.fc37 kf5-krunner-5.99.0-1.fc37 kf5-kservice-5.99.0-1.fc37 kf5-ksmtp-22.08.2-1.fc37 kf5-ktexteditor-5.99.0-1.fc37 kf5-ktextwidgets-5.99.0-1.fc37 kf5-ktnef-22.08.2-1.fc37 kf5-kunitconversion-5.99.0-1.fc37 kf5-kwallet-5.99.0-1.fc37 kf5-kwayland-5.99.0-1.fc37 kf5-kwidgetsaddons-5.99.0-1.fc37 kf5-kwindowsystem-5.99.0-1.fc37 kf5-kxmlgui-5.99.0-1.fc37 kf5-kxmlrpcclient-5.99.0-1.fc37 kf5-libkcddb-22.08.2-1.fc37 kf5-libkcompactdisc-22.08.2-1.fc37 kf5-libkdcraw-22.08.2-1.fc37 kf5-libkdepim-22.08.2-1.fc37 kf5-libkexiv2-22.08.2-1.fc37 kf5-libkipi-22.08.2-1.fc37 kf5-libkleo-22.08.2-1.fc37 kf5-libktorrent-22.08.2-1.fc37 kf5-modemmanager-qt-5.99.0-1.fc37 kf5-networkmanager-qt-5.99.0-1.fc37 kf5-plasma-5.99.0-1.fc37 kf5-prison-5.99.0-1.fc37 kf5-purpose-5.99.0-1.fc37 kf5-solid-5.99.0-1.fc37 kf5-sonnet-5.99.0-1.fc37 kf5-syndication-5.99.0-1.fc37 kf5-syntax-highlighting-5.99.0-1.fc37 kf5-threadweaver-5.99.0-1.fc37 kfind-22.08.2-1.fc37 kfloppy-22.08.2-1.fc37 kgamma-5.26.2-1.fc37 kgeography-22.08.2-1.fc37 kget-22.08.2-1.fc37 kgpg-22.08.2-1.fc37 khangman-22.08.2-1.fc37 khelpcenter-22.08.2-1.fc37 khotkeys-5.26.2-1.fc37 kig-22.08.2-1.fc37 kinfocenter-5.26.2-1.fc37 kio-extras-22.08.2-1.fc37 kio-gdrive-22.08.2-1.fc37 kirigami-gallery-22.08.2-1.fc37 kiriki-22.08.2-1.fc37 kiten-22.08.2-1.fc37 kleopatra-22.08.2-1.fc37 klettres-22.08.2-1.fc37 klines-22.08.2-1.fc37 kmag-22.08.2-1.fc37 kmenuedit-5.26.2-1.fc37 kmines-22.08.2-1.fc37 kmix-22.08.2-1.fc37 kmousetool-22.08.2-1.fc37 kmouth-22.08.2-1.fc37 kmplot-22.08.2-1.fc37 knavalbattle-22.08.2-1.fc37 knetwalk-22.08.2-1.fc37 kollision-22.08.2-1.fc37 kolourpaint-22.08.2-1.fc37 kompare-22.08.2-1.fc37 konquest-22.08.2-1.fc37 konsole5-22.08.2-1.fc37 konversation-22.08.2-1.fc37 kopete-22.08.2-1.fc37 kpat-22.08.2-1.fc37 kpipewire-5.26.2-1.fc37 kpmcore-22.08.2-1.fc37 kqtquickcharts-22.08.2-1.fc37 krdc-22.08.2-1.fc37 krfb-22.08.2-1.fc37 kruler-22.08.2-1.fc37 kscreen-5.26.2-1.fc37 kscreenlocker-5.26.2-1.fc37 ksirk-22.08.2-1.fc37 kspaceduel-22.08.2-1.fc37 ksquares-22.08.2-1.fc37 ksshaskpass-5.26.2-1.fc37 ksudoku-22.08.2-1.fc37 ksystemlog-22.08.2-1.fc37 ksystemstats-5.26.2-1.fc37 kteatime-22.08.2-1.fc37 ktimer-22.08.2-1.fc37 ktorrent-22.08.2-1.fc37 ktouch-22.08.2-1.fc37 kturtle-22.08.2-1.fc37 kwalletmanager5-22.08.2-1.fc37 kwave-22.08.2-1.fc37 kwayland-integration-5.26.2-1.fc37 kwin-5.26.2.1-1.fc37 kwordquiz-22.08.2-1.fc37 kwrited-5.26.2-1.fc37 layer-shell-qt-5.26.2-1.fc37 libkdegames-22.08.2-1.fc37 libkeduvocdocument-22.08.2-1.fc37 libkgapi-22.08.2-1.fc37 libkmahjongg-22.08.2-1.fc37 libkomparediff2-22.08.2-1.fc37 libkscreen-qt5-5.26.2-1.fc37 libksysguard-5.26.2-1.fc37 lokalize-22.08.2-1.fc37 lskat-22.08.2-1.fc37 marble-22.08.2-1.fc37 minuet-22.08.2-1.fc37 okular-22.08.2-1.fc37 oxygen-icon-theme-5.99.0-1.fc37 oxygen-sounds-5.26.2-1.fc37 palapeli-22.08.2-1.fc37 pam-kwallet-5.26.2-1.fc37 parley-22.08.2-1.fc37 picmi-22.08.2-1.fc37 plasma-breeze-5.26.2-1.fc37 plasma-browser-integration-5.26.2-1.fc37 plasma-desktop-5.26.2-1.fc37 plasma-discover-5.26.2-3.fc37 plasma-disks-5.26.2-1.fc37 plasma-drkonqi-5.26.2-1.fc37 plasma-firewall-5.26.2-1.fc37 plasma-integration-5.26.2-1.fc37 plasma-milou-5.26.2-1.fc37 plasma-mobile-5.26.2-1.fc37 plasma-nano-5.26.2-1.fc37 plasma-nm-5.26.2-1.fc37 plasma-oxygen-5.26.2-1.fc37 plasma-pa-5.26.2-1.fc37 plasma-sdk-5.26.2-1.fc37 plasma-systemmonitor-5.26.2-1.fc37 plasma-systemsettings-5.26.2-1.fc37 plasma-thunderbolt-5.26.2-1.fc37 plasma-vault-5.26.2-1.fc37 plasma-workspace-5.26.2-1.fc37 plasma-workspace-wallpapers-5.26.2-1.fc37 plymouth-kcm-5.26.2-1.fc37 plymouth-theme-breeze-5.26.2-1.fc37 polkit-kde-5.26.2-1.fc37 powerdevil-5.26.2-1.fc37 poxml-22.08.2-1.fc37 qqc2-breeze-style-5.26.2-1.fc37 qqc2-desktop-style-5.99.0-1.fc37 rocs-22.08.2-1.fc37 sddm-kcm-5.26.2-1.fc37 signon-kwallet-extension-22.08.2-1.fc37 spectacle-22.08.2-1.fc37 step-22.08.2-1.fc37 svgpart-22.08.2-1.fc37 sweeper-22.08.2-1.fc37 umbrello-22.08.2-1.fc37 xdg-desktop-portal-kde-5.26.2-1.fc37 yakuake-22.08.2-1.fc37 zanshin-22.08.2-1.fc37',2283227,now(),now());
ERROR:  index row size 2712 exceeds btree version 4 maximum 2704 for index "idx_job_id_value_settings_test"
DETAIL:  Index row references tuple (0,3) in relation "job_settings".
HINT:  Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.

so yeah, that definitely seems to be the problem. If I drop the index, I can add the row. If I then try to recreate the index, I get the same error:

CREATE INDEX idx_job_id_value_settings_test on job_settings (job_id, key, value);
ERROR:  index row size 2712 exceeds btree version 4 maximum 2704 for index "idx_job_id_value_settings_test"
DETAIL:  Index row references tuple (0,4) in relation "job_settings".
HINT:  Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.

so, let's try a partial index:

CREATE INDEX idx_job_id_value_settings_test on job_settings (job_id, key, value) WHERE key IN ('DISTRI', 'VERSION', 'FLAVOR', 'ARCH', 'BUILD', 'ISO', 'HDD_1', 'WORKER_CLASS');
CREATE INDEX

...hey, that worked. Let's remove our 'bad' row then try a couple things (I've edited out the very long value text for conciseness, but of course I used the full text in the real queries):

DELETE FROM job_settings WHERE key='ADVISORY_NVRS';
DELETE 1
openqa=# insert into job_settings(id,key,value,job_id,t_created,t_updated) VALUES(62195067,'ADVISORY_NVRS','verylongvaluehere',2283227,now(),now());
INSERT 0 1
openqa=# DELETE FROM job_settings WHERE key='ADVISORY_NVRS';
DELETE 1
openqa=# insert into job_settings(id,key,value,job_id,t_created,t_updated) VALUES(62195067,'WORKER_CLASS','verylongvaluehere',2283227,now(),now());
ERROR:  index row size 2712 exceeds btree version 4 maximum 2704 for index "idx_job_id_value_settings_test"
DETAIL:  Index row references tuple (0,6) in relation "job_settings".
HINT:  Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.

so that's as I expected: we can happily add a row with a very long value if it's for a non-indexed key, but if we make it be for one of the indexed keys, we get the error again.

This supports my idea that using a partial index could be a solution here, so I guess I'll try and work up a PR for that.

#5 Updated by AdamWill 2 months ago

Ugggghhh. Of course, SQLTranslator can't actually add a partial index; that would be far too easy. Just doesn't have the feature. DBIx docs say "Therefore [sqlt_deploy_hook] can be used only to manipulate the SQL::Translator::Schema object before it is turned into SQL fed to the database. If you want to execute post-deploy statements which can not be generated by SQL::Translator, the currently suggested method is to overload "deploy" and use dbh_do."

which sounds like just so much fun. I guess I'll come back to this again tomorrow.

edit: well humm, but maybe: https://metacpan.org/pod/SQL::Translator::Schema::Index#options

#6 Updated by AdamWill 2 months ago

  • Status changed from New to In Progress

#7 Updated by cdywan 2 months ago

  • Assignee set to AdamWill

AdamWill wrote:

https://github.com/os-autoinst/openQA/pull/4936

I assume you're working on this

#8 Updated by AdamWill 2 months ago

Um. I'm working on this issue, yes - the PR is intended to resolve it. I'm not "still working on" the PR, though, it is ready for review so far as I'm concerned...

#9 Updated by openqa_review 2 months ago

  • Due date set to 2022-12-15

Setting due date based on mean cycle time of SUSE QE Tools

#10 Updated by robert.richardson 2 months ago

  • Subject changed from bigint conversion fails due to idx_job_id_value_settings index being too wide to bigint conversion fails due to idx_job_id_value_settings index being too wide size:S

#11 Updated by kraih 2 months ago

Given how much we rely on these indexes, changes should be done very carefully:

 relid | indexrelid | schemaname |   relname    |       indexrelname        | idx_scan | idx_tup_read | idx_tup_fetch
-------+------------+------------+--------------+---------------------------+----------+--------------+---------------
 16513 |      17780 | public     | job_settings | idx_value_settings        |    38833 |      6342596 |         57192
 16513 |    2342927 | public     | job_settings | idx_job_id_value_settings | 42807954 |   1528570202 |    1036367663
 relid | indexrelid | schemaname |   relname    |       indexrelname        | idx_scan  | idx_tup_read | idx_tup_fetch
-------+------------+------------+--------------+---------------------------+-----------+--------------+---------------
 16504 |      16846 | public     | job_settings | idx_value_settings        |    621401 |     84112275 |       3568127
 16504 |    1955885 | public     | job_settings | idx_job_id_value_settings | 134659634 |   2792988853 |    1923142741

#12 Updated by AdamWill 2 months ago

Per https://github.com/os-autoinst/openQA/pull/4936#issuecomment-1334606962 , I decided to sort this specific case out downstream by, uh, putting some lies in our database. :D I've also tweaked how we pass these lists of builds into the test settings so it should not be possible for this to happen again (unless, I guess, someone comes up with a really long package name or version string).

I'm thinking about proposing a PR with some kinda max length of the key+value for a test setting. Of course, in a sense, the index width is already that...but if you run into it, you get a rather obscure database error which is a bit hard to translate to "that string's too long", and of course this whole "encountering it on the integer type conversion" thing was a curveball (though one that seems unlikely to happen again).

#13 Updated by okurz about 2 months ago

  • Due date deleted (2022-12-15)
  • Status changed from In Progress to Resolved

As I haven't heard from you again I assume we can resolve. Of course the suggestions you mentioned would be possible but I wouldn't bother now

Also available in: Atom PDF