Actions
action #64298
closedpostgres error "duplicate key value violates unique constraint "job_modules_job_id_name_category_script" ... INSERT INTO job_modules" filling up postgres server log files quickly
Start date:
2020-03-03
Due date:
% Done:
0%
Estimated time:
Description
Observation¶
Similar as in #60926 , reported originally in the osd infrastructure ticket #64096 .
https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?fullscreen&edit&tab=alert&panelId=74&orgId=1&from=1580815452576&to=1583211256414 shows that there is a rapid increase in space usage since 2020-02-17 until we hit the alert threshold. On 2020-03-02 22:00 suddenly the space usage was gone. The postgres log rotate mitigated this but the underlying problem is an error in postgres logs, like:
2020-03-08 03:12:02.042 CET openqa geekotest [12783]ERROR: duplicate key value violates unique constraint "job_modules_job_id_name_category_script"
2020-03-08 03:12:02.042 CET openqa geekotest [12783]DETAIL: Key (job_id, name, category, script)=(3967446, pthread_barrier_init_3-1, kernel, tests/kernel/run_ltp.pm) already exists.
2020-03-08 03:12:02.042 CET openqa geekotest [12783]STATEMENT: INSERT INTO job_modules ( always_rollback, category, fatal, important, job_id, milestone, name, script, t_created, t_updated) VALUES ( $1, $2, $3, $4, $5, $6, $7, $8, $9, $10 ) RETURNING id
2020-03-08 03:12:02.045 CET openqa geekotest [12783]ERROR: duplicate key value violates unique constraint "job_modules_job_id_name_category_script"
2020-03-08 03:12:02.045 CET openqa geekotest [12783]DETAIL: Key (job_id, name, category, script)=(3967446, pthread_barrier_init_4-1, kernel, tests/kernel/run_ltp.pm) already exists.
2020-03-08 03:12:02.045 CET openqa geekotest [12783]STATEMENT: INSERT INTO job_modules ( always_rollback, category, fatal, important, job_id, milestone, name, script, t_created, t_updated) VALUES ( $1, $2, $3, $4, $5, $6, $7, $8, $9, $10 ) RETURNING id
2020-03-08 03:12:02.053 CET openqa geekotest [12783]ERROR: duplicate key value violates unique constraint "job_modules_job_id_name_category_script"
2020-03-08 03:12:02.053 CET openqa geekotest [12783]DETAIL: Key (job_id, name, category, script)=(3967446, pthread_barrier_wait_1-1, kernel, tests/kernel/run_ltp.pm) already exists.
2020-03-08 03:12:02.053 CET openqa geekotest [12783]STATEMENT: INSERT INTO job_modules ( always_rollback, category, fatal, important, job_id, milestone, name, script, t_created, t_updated) VALUES ( $1, $2, $3, $4, $5, $6, $7, $8, $9, $10 ) RETURNING id
2020-03-08 03:12:02.058 CET openqa geekotest [12783]ERROR: duplicate key value violates unique constraint "job_modules_job_id_name_category_script"
2020-03-08 03:12:02.058 CET openqa geekotest [12783]DETAIL: Key (job_id, name, category, script)=(3967446, pthread_barrier_wait_2-1, kernel, tests/kernel/run_ltp.pm) already exists.
2020-03-08 03:12:02.058 CET openqa geekotest [12783]STATEMENT: INSERT INTO job_modules ( always_rollback, category, fatal, important, job_id, milestone, name, script, t_created, t_updated) VALUES ( $1, $2, $3, $4, $5, $6, $7, $8, $9, $10 ) RETURNING id
which is the second most common entry already mentioned in #60923#note-3
Acceptance criteria¶
- AC1: Above error is not filling up postgres logs rapidly
Suggestions¶
- Look into what https://github.com/os-autoinst/openQA/pull/2617 already tries to do and where it's stuck
- Optimize job module insertion to prevent the problem or report error to the test running scope, e.g. abort test early to inform test writers what they did wrong, or catch and prevent the error in os-autoinst scope
Actions