Actions
action #121777
closedPostgreSQL update ordering deadlock on jobs table
Description
Observation¶
Seen in the O3 PostgreSQL logs:
2022-12-09 06:21:08.416 UTC openqa geekotest [6931]ERROR: deadlock detected
2022-12-09 06:21:08.416 UTC openqa geekotest [6931]DETAIL: Process 6931 waits for ShareLock on transaction 243609936; blocked by process 5779.
Process 5779 waits for ShareLock on transaction 243609932; blocked by process 6931.
Process 6931: UPDATE jobs SET result = $1, t_updated = $2 WHERE id = $3
Process 5779: UPDATE jobs SET result = $1, t_updated = $2 WHERE id = $3
2022-12-09 06:54:32.659 UTC openqa geekotest [28685]ERROR: deadlock detected
2022-12-09 06:54:32.659 UTC openqa geekotest [28685]DETAIL: Process 28685 waits for ShareLock on transaction 243819997; blocked by process 30309.
Process 30309 waits for ShareLock on transaction 243819999; blocked by process 28685.
Process 28685: UPDATE jobs SET result = $1, t_updated = $2 WHERE id = $3
Process 30309: UPDATE jobs SET result = $1, t_updated = $2 WHERE id = $3
2022-12-09 10:01:48.090 UTC openqa geekotest [21755]ERROR: deadlock detected
2022-12-09 10:01:48.090 UTC openqa geekotest [21755]DETAIL: Process 21755 waits for ShareLock on transaction 244660614; blocked by process 22220.
Process 22220 waits for ShareLock on transaction 244660621; blocked by process 21755.
Process 21755: UPDATE jobs SET result = $1, t_updated = $2 WHERE id = $3
Process 22220: UPDATE jobs SET result = $1, t_updated = $2 WHERE id = $3
This is probably a very similar problem to #120891, and i noticed it while checking the logs to verify the fix for that works.
Acceptance criteria¶
- AC1: No more deadlocks on jobs table
Suggestions¶
- Try to get the jobs table updates into a consistent order across all processes (similar to https://progress.opensuse.org/issues/120891#note-21)
Updated by kraih almost 2 years ago
- Related to action #120891: Product not scheduled: DBD::Pg::st execute failed: ERROR: deadlock detected size:M added
Updated by kraih almost 2 years ago
There is a possibility that this might be the cause for the state getting stuck at running
issue from #121768.
Updated by kraih almost 2 years ago
Worth mentioning that of the deadlocks currently in the O3 logs, only the newest 3 are for the jobs table:
ariel:/var/lib/pgsql/data/log # grep deadlock *.log
postgresql-2022-11-24_000000.log:2022-11-24 09:35:05.878 UTC openqa geekotest [32110]ERROR: deadlock detected
postgresql-2022-11-24_000000.log:2022-11-24 09:35:06.881 UTC openqa geekotest [32298]ERROR: deadlock detected
postgresql-2022-11-27_000000.log:2022-11-27 04:31:55.788 UTC openqa geekotest [10404]ERROR: deadlock detected
postgresql-2022-12-01_000000.log:2022-12-01 03:46:12.925 UTC openqa geekotest [7300]ERROR: deadlock detected
postgresql-2022-12-01_000000.log:2022-12-01 22:08:49.641 UTC openqa geekotest [26387]ERROR: deadlock detected
postgresql-2022-12-01_000000.log:2022-12-01 23:57:27.173 UTC openqa geekotest [25245]ERROR: deadlock detected
postgresql-2022-12-06_000000.log:2022-12-06 01:38:27.615 UTC openqa geekotest [10906]ERROR: deadlock detected
postgresql-2022-12-09_000000.log:2022-12-09 06:21:08.416 UTC openqa geekotest [6931]ERROR: deadlock detected
postgresql-2022-12-09_000000.log:2022-12-09 06:54:32.659 UTC openqa geekotest [28685]ERROR: deadlock detected
postgresql-2022-12-09_000000.log:2022-12-09 10:01:48.090 UTC openqa geekotest [21755]ERROR: deadlock detected
Updated by mkittler almost 2 years ago
Possibly https://github.com/os-autoinst/openQA/pull/4955 improves some cases. Also see #121768#note-10.
Updated by kraih almost 2 years ago
Can't see any new cases on O3, maybe Marius did resolve it with the PR.
Updated by okurz almost 2 years ago
- Related to action #121768: openQA jobs have been restarted but are stuck in "running" size:M added
Updated by kraih almost 2 years ago
We have a new one on the assets table after the last round of PRs (#122791):
2022-12-30 10:50:18.466 CET openqa geekotest [16610]DETAIL: Process 16610 waits for ShareLock on transaction 1570088837; blocked by process 16606.
Process 16606 waits for ShareLock on transaction 1570088839; blocked by process 16610.
Process 16610: SELECT me.id, me.type, me.name, me.size, me.checksum, me.last_use_job_id, me.fixed, me.t_created, me.t_updated FROM assets me WHERE ( ( me.name = $1 AND me.type = $2 ) ) FOR UPDATE
Process 16606: SELECT me.id, me.type, me.name, me.size, me.checksum, me.last_use_job_id, me.fixed, me.t_created, me.t_updated FROM assets me WHERE ( ( me.name = $1 AND me.type = $2 ) ) FOR UPDATE
Updated by livdywan almost 2 years ago
- Status changed from New to Blocked
Let's consider this blocked by #120891 and maybe it'll be resolved afterall
Updated by kraih over 1 year ago
- Status changed from Blocked to Resolved
PostgreSQL logs are free of deadlocks.
Actions