action #64096

closed

partition /srv was nearly depleted but now fixed (itself?)

Added by okurz about 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Low
Assignee:
Category: -
Start date: 2020-03-03
Due date: 2020-03-20
% Done: 0%
Estimated time:

Description

Observation

Received an alert email notification in http://mailman.suse.de/mailman/private/osd-admins/2020-March/000958.html at Mon Mar 2 06:52:02 UTC 2020.

https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?fullscreen&edit&tab=alert&panelId=74&orgId=1&from=1580815452576&to=1583211256414 shows a rapid increase in space usage from 2020-02-17 until we hit the alert threshold. On 2020-03-02 at 22:00 the space usage suddenly dropped again. Maybe a cleanup job took care of it. Still, this looks worrisome and should be investigated using the logs on osd.


Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure - action #60923: [alert] /srv about to run full, postgres logs very big due to repeated error "duplicate key value violates unique constraint "screenshots_filename", Key (filename)=(8ca/3c9/98a00d8bb2ccba5a2de1d403b5.png) already exists. INSERT INTO screenshots …" (Resolved, okurz, 2019-12-11)

Copied to openQA Infrastructure - action #64298: postgres error "duplicate key value violates unique constraint "job_modules_job_id_name_category_script" ... INSERT INTO job_modules" filling up postgres server log files quickly (Resolved, mkittler, 2020-03-03)

Actions #1

Updated by okurz about 4 years ago

  • Related to action #60923: [alert] /srv about to run full, postgres logs very big due to repeated error "duplicate key value violates unique constraint "screenshots_filename", Key (filename)=(8ca/3c9/98a00d8bb2ccba5a2de1d403b5.png) already exists. INSERT INTO screenshots …" added
Actions #2

Updated by okurz about 4 years ago

This also seems to affect the retention period of the system journal: I could not find a long history in journalctl -u logrotate, which hindered me when trying to debug #62306.

openqa:/srv # du --max-depth=3 -BG | grep -v '^[01]G'
5G  ./log/journal/84f4f4f356b525388b60f0ae547597e0
5G  ./log/journal
6G  ./log
31G ./PSQL10/data/base
24G ./PSQL10/data/log
54G ./PSQL10/data
54G ./PSQL10
61G .

So again postgres logs growing big?
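To put a number on that question, the du listing above can be parsed and the share of the postgres log directory computed. This is only a rough sketch: parse_du_output is a hypothetical helper, the input is the listing quoted above pasted in as a string, and du -BG rounds every entry up to whole gigabytes, so the result is approximate.

```python
def parse_du_output(text):
    """Parse `du -BG` style lines such as '24G ./PSQL10/data/log' into {path: size in G}."""
    sizes = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        # Skip anything that is not "<number>G <path>", e.g. the shell prompt line.
        if len(parts) != 2 or not parts[0].endswith("G") or not parts[0][:-1].isdigit():
            continue
        sizes[parts[1].strip()] = int(parts[0][:-1])
    return sizes


# The listing from this comment, copied verbatim.
du_output = """\
5G  ./log/journal/84f4f4f356b525388b60f0ae547597e0
5G  ./log/journal
6G  ./log
31G ./PSQL10/data/base
24G ./PSQL10/data/log
54G ./PSQL10/data
54G ./PSQL10
61G .
"""

sizes = parse_du_output(du_output)
log_share = sizes["./PSQL10/data/log"] / sizes["."]
print(f"postgres logs: {sizes['./PSQL10/data/log']}G of {sizes['.']}G ({log_share:.0%})")
# prints: postgres logs: 24G of 61G (39%)
```

So at the time of the listing the postgres server logs accounted for roughly two fifths of the used space on /srv, almost as much as the database itself (31G).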

The logs are full of entries like:

2020-03-08 03:12:02.042 CET openqa geekotest [12783]ERROR:  duplicate key value violates unique constraint "job_modules_job_id_name_category_script"
2020-03-08 03:12:02.042 CET openqa geekotest [12783]DETAIL:  Key (job_id, name, category, script)=(3967446, pthread_barrier_init_3-1, kernel, tests/kernel/run_ltp.pm) already exists.
2020-03-08 03:12:02.042 CET openqa geekotest [12783]STATEMENT:  INSERT INTO job_modules ( always_rollback, category, fatal, important, job_id, milestone, name, script, t_created, t_updated) VALUES ( $1, $2, $3, $4, $5, $6, $7, $8, $9, $10 ) RETURNING id
2020-03-08 03:12:02.045 CET openqa geekotest [12783]ERROR:  duplicate key value violates unique constraint "job_modules_job_id_name_category_script"
2020-03-08 03:12:02.045 CET openqa geekotest [12783]DETAIL:  Key (job_id, name, category, script)=(3967446, pthread_barrier_init_4-1, kernel, tests/kernel/run_ltp.pm) already exists.
2020-03-08 03:12:02.045 CET openqa geekotest [12783]STATEMENT:  INSERT INTO job_modules ( always_rollback, category, fatal, important, job_id, milestone, name, script, t_created, t_updated) VALUES ( $1, $2, $3, $4, $5, $6, $7, $8, $9, $10 ) RETURNING id
2020-03-08 03:12:02.053 CET openqa geekotest [12783]ERROR:  duplicate key value violates unique constraint "job_modules_job_id_name_category_script"
2020-03-08 03:12:02.053 CET openqa geekotest [12783]DETAIL:  Key (job_id, name, category, script)=(3967446, pthread_barrier_wait_1-1, kernel, tests/kernel/run_ltp.pm) already exists.
2020-03-08 03:12:02.053 CET openqa geekotest [12783]STATEMENT:  INSERT INTO job_modules ( always_rollback, category, fatal, important, job_id, milestone, name, script, t_created, t_updated) VALUES ( $1, $2, $3, $4, $5, $6, $7, $8, $9, $10 ) RETURNING id
2020-03-08 03:12:02.058 CET openqa geekotest [12783]ERROR:  duplicate key value violates unique constraint "job_modules_job_id_name_category_script"
2020-03-08 03:12:02.058 CET openqa geekotest [12783]DETAIL:  Key (job_id, name, category, script)=(3967446, pthread_barrier_wait_2-1, kernel, tests/kernel/run_ltp.pm) already exists.
2020-03-08 03:12:02.058 CET openqa geekotest [12783]STATEMENT:  INSERT INTO job_modules ( always_rollback, category, fatal, important, job_id, milestone, name, script, t_created, t_updated) VALUES ( $1, $2, $3, $4, $5, $6, $7, $8, $9, $10 ) RETURNING id

which is the second most common entry already mentioned in #60923#note-3
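For ranking which constraint violations dominate a log, something like the following could be used. It is a minimal sketch, not the method used in #60923: count_constraint_violations is a hypothetical helper, and the regular expression only assumes the ERROR line format visible in the excerpt above. On the server one would feed it the lines of a log file instead of the sample list.

```python
import re
from collections import Counter

# Matches ERROR lines like the ones quoted above and captures the constraint name.
CONSTRAINT_RE = re.compile(
    r'ERROR:\s+duplicate key value violates unique constraint "([^"]+)"'
)


def count_constraint_violations(lines):
    """Count duplicate-key errors per constraint name in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = CONSTRAINT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines taken from the excerpt above (DETAIL lines are ignored by the regex).
sample = [
    '2020-03-08 03:12:02.042 CET openqa geekotest [12783]ERROR:  duplicate key value violates unique constraint "job_modules_job_id_name_category_script"',
    '2020-03-08 03:12:02.042 CET openqa geekotest [12783]DETAIL:  Key (job_id, name, category, script)=(3967446, pthread_barrier_init_3-1, kernel, tests/kernel/run_ltp.pm) already exists.',
    '2020-03-08 03:12:02.045 CET openqa geekotest [12783]ERROR:  duplicate key value violates unique constraint "job_modules_job_id_name_category_script"',
]

counts = count_constraint_violations(sample)
for name, number in counts.most_common():
    print(number, name)
```

Counter.most_common() sorts by frequency, so the noisiest constraint ends up at the top of the output.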

Actions #3

Updated by okurz about 4 years ago

  • Copied to action #64298: postgres error "duplicate key value violates unique constraint "job_modules_job_id_name_category_script" ... INSERT INTO job_modules" filling up postgres server log files quickly added
Actions #4

Updated by okurz about 4 years ago

  • Status changed from Workable to Blocked
  • Assignee set to okurz
  • Priority changed from Urgent to Low

Reported the problem in #64298; will see if there are any remaining alerts or if the postgres log rotation prevents /srv depletion.

Actions #5

Updated by okurz about 4 years ago

  • Due date set to 2020-03-20
  • Status changed from Blocked to Feedback

The fix for #64298 is merged and showing good effect on o3. Waiting for deployment on osd tomorrow.
