action #80408
closed: openQA Project - coordination #39719: [saga][epic] Detection of "known failures" for stable tests, easy test results review and easy tracking of known issues
openQA Project - coordination #62420: [epic] Distinguish all types of incompletes
Revert longer timeout override for openQA services as we could not see fewer problems with corrupted worker cache
0%
Description
Motivation
To find out whether the worker cache services corrupt the SQLite database because they are killed on systemd service termination, we temporarily enlarged the timeout of all relevant worker systemd services on o3 and osd in #80106. mkittler reported (confirm!) that this neither helped to get rid of the corrupted cache nor prevented the killing of services. However, the shutdown of systems can now take much longer as we still have #62441.
Acceptance criteria
- AC1: openQA worker hosts shut down in less than 2 minutes again
Suggestions
Revert all actions from #80106
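A minimal sketch of what such a revert could look like on a single worker host, assuming the timeout was raised through a systemd drop-in file. The drop-in file name `override.conf`, the function name and the service names in the usage comment are assumptions for illustration and are not taken from #80106:

```shell
#!/bin/sh
# Sketch only: the drop-in file name and the service list are assumptions,
# not taken from #80106.

# Remove a (hypothetical) timeout override drop-in for each given service.
#   $1:             systemd unit directory, e.g. /etc/systemd/system
#   remaining args: service unit names
remove_timeout_override() {
    unit_dir=$1
    shift
    for service in "$@"; do
        # hypothetical drop-in name; adjust to the file actually deployed
        dropin="$unit_dir/$service.d/override.conf"
        if [ -f "$dropin" ]; then
            rm -f "$dropin"
            # remove the drop-in directory only if it is now empty
            rmdir "$unit_dir/$service.d" 2>/dev/null || true
            echo "removed $dropin"
        else
            echo "no override for $service"
        fi
    done
}

# Example (as root on a worker host, service names assumed):
#   remove_timeout_override /etc/systemd/system \
#       openqa-worker-cacheservice.service \
#       openqa-worker-cacheservice-minion.service
#   systemctl daemon-reload
#   systemctl cat openqa-worker-cacheservice.service
```

After the reload, `systemctl cat` should no longer list the drop-in for the affected services, which would be one way to confirm the revert before checking AC1.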
Updated by okurz almost 4 years ago
- Copied from action #80106: corrupted worker cache sqlite: Enlarge systemd service kill timeout temporarily added
Updated by nicksinger almost 4 years ago
Removed the file from all workers on OSD and reloaded systemd. A quick peek with systemctl cat $service showed success. Now the o3 workers.
Updated by nicksinger almost 4 years ago
- Status changed from Workable to Resolved
Also deleted on all o3 workers