action #78169
coordination #39719: [saga][epic] Detect "known failures" and mark jobs as such to make tests more stable, reviewing test results and tracking known issues easier
coordination #62420: [epic] Distinguish all types of incompletes
after osd-deploy 2020-11-18 incompletes with auto_review:"Cache service (status error from API|.*error 500: Internal Server Error)":retry
Description
Observation
For example, https://openqa.suse.de/tests/5020506 shows
"Reason: setup failure: No workers active in the cache service"
with autoinst-log.txt:
[2020-11-18T06:40:55.0106 CET] [info] [pid:12060] +++ setup notes +++
[2020-11-18T06:40:55.0106 CET] [info] [pid:12060] Running on openqaworker9:17 (Linux 4.12.14-lp151.28.79-default #1 SMP Wed Nov 11 08:17:16 UTC 2020 (472d149) x86_64)
[2020-11-18T06:40:55.0147 CET] [info] [pid:12060] +++ worker notes +++
[2020-11-18T06:40:55.0147 CET] [info] [pid:12060] End time: 2020-11-18 05:40:55
[2020-11-18T06:40:55.0147 CET] [info] [pid:12060] Result: setup failure
[2020-11-18T06:40:55.0152 CET] [info] [pid:13158] Uploading autoinst-log.txt
and worker-log.txt:
[2020-11-18T06:40:55.0106 CET] [debug] [pid:12060] Preparing Mojo::IOLoop::ReadWriteProcess::Session …
[2020-11-18T06:40:55.0111 CET] [error] [pid:12060] Unable to setup job 5020506: No workers active in the cache service
[2020-11-18T06:40:55.0111 CET] [debug] [pid:12060] Stopping job 5020506 from openqa.suse.de: 05020506-sle-12-SP2-Server-DVD-Incidents-Kernel-KOTD-x86_64-Build4.4.121-261.1.g13f6b6d-ltp_syscalls_pre12sp4@64bit - reason: setup failure
[2020-11-18T06:40:55.0112 CET] [debug] [pid:12060] REST-API call: POST http://openqa.suse.de/api/v1/jobs/5020506/status
[2020-11-18T06:40:55.0152 CET] [info] [pid:13158] Uploading autoinst-log.txt
[2020-11-18T06:40:55.0207 CET] [info] [pid:13158] Uploading worker-log.txt
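When going through many such logs, it can help to pull out only the error entries. The following is a minimal sketch, not part of openQA: it assumes one log entry per line in the layout "[timestamp] [level] [pid:N] message" seen in the excerpts above, and the names LOG_LINE and error_messages are made up for illustration.

```python
import re

# Assumed entry layout, inferred from the excerpts in this ticket:
# [timestamp] [level] [pid:N] message
LOG_LINE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\] \[(?P<level>\w+)\] \[pid:(?P<pid>\d+)\] (?P<message>.*)"
)


def error_messages(log_text):
    """Return the message part of every [error] entry in the given log text."""
    messages = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "error":
            messages.append(match.group("message"))
    return messages


# Three entries copied from the worker-log.txt excerpt of job 5020506.
sample = """\
[2020-11-18T06:40:55.0111 CET] [error] [pid:12060] Unable to setup job 5020506: No workers active in the cache service
[2020-11-18T06:40:55.0112 CET] [debug] [pid:12060] REST-API call: POST http://openqa.suse.de/api/v1/jobs/5020506/status
[2020-11-18T06:40:55.0152 CET] [info] [pid:13158] Uploading autoinst-log.txt
"""

for message in error_messages(sample):
    print(message)
```

For the sample, only the "Unable to setup job 5020506" entry is reported; the debug and info entries are skipped.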
Steps to reproduce
Find jobs referencing this ticket with the help of
https://raw.githubusercontent.com/os-autoinst/scripts/master/openqa-query-for-job-label ,
for example, to look up this ticket #78169, call openqa-query-for-job-label poo#78169
Suggestions
- Cross-check which changes the osd deployment on 2020-11-18 brought in that could explain the problems
- Look up in the source code what this message could mean
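To see which auto_review expression would pick up a given failure reason, the expression can be taken out of a ticket subject and tried against the reason text. This is only a sketch under assumptions: it treats the subject layout auto_review:"REGEX":retry as seen in this ticket, uses Python's re module rather than whatever matching the openQA tooling actually performs, and the helper names parse_label and label_matches are hypothetical.

```python
import re

# Assumed subject layout, taken from the subjects in this ticket:
# auto_review:"REGEX" optionally followed by :retry
LABEL = re.compile(r'auto_review:"(?P<regex>.+)"(?P<retry>:retry)?')


def parse_label(subject):
    """Return (regex, retry_flag) from a ticket subject, or None if there is no label."""
    match = LABEL.search(subject)
    if match is None:
        return None
    return match.group("regex"), match.group("retry") is not None


def label_matches(subject, text):
    """Check whether the auto_review regex in the subject is found in the text."""
    parsed = parse_label(subject)
    return parsed is not None and re.search(parsed[0], text) is not None


# Both subjects and the reason are copied from this ticket.
current_subject = (
    'after osd-deploy 2020-11-18 incompletes with auto_review:'
    '"Cache service (status error from API|.*error 500: Internal Server Error)":retry'
)
previous_subject = (
    'after osd-deploy 2020-11-18 incompletes with auto_review:'
    '"setup failure: No workers active in the cache service":retry'
)
reason = "setup failure: No workers active in the cache service"

print(label_matches(current_subject, reason))   # False
print(label_matches(previous_subject, reason))  # True
```

With Python's case-sensitive matching, the reason from the job above is found only by the earlier subject expression, not by the current "Cache service (...)" one.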
History
#1
Updated by okurz 2 months ago
- Copied from action #78165: infrastructure task: After osd deployment 2020-11-18 many jobs incomplete with auto_review:"Cache service (status error from API|.*error 500: Internal Server Error)":retry added
#3
Updated by okurz 2 months ago
- Subject changed from after osd-deploy 2020-11-18 incompletes with auto_review:"setup failure: No workers active in the cache service":retry to after osd-deploy 2020-11-18 incompletes with auto_review:"Cache service (status error from API|.*error 500: Internal Server Error)":retry
#7
Updated by okurz 2 months ago
- Status changed from Workable to Blocked
- Assignee set to mkittler
OK, could you please track this as blocked by #67000 and, after all relevant changes have been deployed, check that all corresponding errors are gone, e.g. by looking into the "summary" steps of the "auto-review" GitLab CI pipelines?
#8
Updated by okurz about 2 months ago
- Copied to action #80202: jobs incomplete with auto_review:"setup failure: No workers active in the cache service":retry added
#9
Updated by okurz about 2 months ago
- Copied to action #80356: incompletes with auto_review:"Cache service.*error: Connection refused":retry added
#10
Updated by okurz about 2 months ago
- Parent task set to #62420
#11
Updated by mkittler about 2 months ago
- Status changed from Blocked to Resolved
Closing because #67000 has been resolved. I don't see any workarounds mentioned in this ticket that need to be reverted.