action #78165

infrastructure task: After osd deployment 2020-11-18 many jobs incomplete with auto_review:"Cache service (status error from API|.*error 500: Internal Server Error)":retry

Added by okurz 8 months ago. Updated 8 months ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Target version:
Start date:
2020-11-18
Due date:
% Done:

0%

Estimated time:

Description

Observation

Many incompletes with "Cache service status error 500: Internal Server Error"

An example:
https://openqa.suse.de/tests/5020263

The worker-log.txt only shows:

[2020-11-18T06:56:31.0997 CET] [debug] [pid:32153] REST-API call: POST http://openqa.suse.de/api/v1/jobs/5019106/status
[2020-11-18T06:56:32.0037 CET] [error] [pid:32153] Unable to setup job 5019106: Cache service status error 500: Internal Server Error
[2020-11-18T06:56:32.0037 CET] [debug] [pid:32153] Stopping job 5019106 from openqa.suse.de: 05019106-sle-15-SP3-Online-x86_64-Build81.1-xfstests_btrfs-generic-001-100@64bit-smp - reason: setup failure

Related issues

Copied from openQA Project - action #78163: After OSD upgrade, many jobs incomplete with "Cache service status error 500: Internal Server Error" (Closed, 2020-11-18)

Copied to openQA Project - action #78169: after osd-deploy 2020-11-18 incompletes with auto_review:"Cache service (status error from API|.*error 500: Internal Server Error)":retry (Resolved, 2020-11-18)

History

#1 Updated by okurz 8 months ago

  • Copied from action #78163: After OSD upgrade, many jobs incomplete with "Cache service status error 500: Internal Server Error" added

#2 Updated by okurz 8 months ago

I suggest stopping all workers, cleaning the cache dir and restarting. On osd that would be:

salt -C 'G@roles:worker' cmd.run 'systemctl stop openqa-worker.target openqa-worker-cacheservice openqa-worker-cacheservice-minion && rm -rf /var/lib/openqa/cache/* && systemctl start openqa-worker.target openqa-worker-cacheservice openqa-worker-cacheservice-minion'

I did that now on osd.

#3 Updated by okurz 8 months ago

  • Copied to action #78169: after osd-deploy 2020-11-18 incompletes with auto_review:"Cache service (status error from API|.*error 500: Internal Server Error)":retry added

#4 Updated by okurz 8 months ago

  • Subject changed from infrastructure task: After osd deployment 2020-11-18 many jobs incomplete with auto_review:"Cache service status error 500: Internal Server Error" to infrastructure task: After osd deployment 2020-11-18 many jobs incomplete with auto_review:"Cache service.*error 500: Internal Server Error":retry

https://stats.openqa-monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?orgId=1&editPanel=17&tab=alert shows the problem, I paused the alert for now.

I needed to update the auto_review regex. I see "Cache service info error 500: Internal Server Error", not "status error"
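To illustrate why the regex had to be widened, here is a minimal sketch that checks the two patterns from the subject change above against the two failure reasons seen in this ticket. It assumes auto_review does an unanchored regex search on the failure reason; that matching behaviour and the helper name `matches` are assumptions for illustration, not taken from the auto_review implementation.

```python
import re

# Pattern text before and after the subject change (copied from this ticket).
OLD = re.compile(r"Cache service status error 500: Internal Server Error")
NEW = re.compile(r"Cache service.*error 500: Internal Server Error")

# The two failure reasons observed in this ticket.
reasons = [
    "setup failure: Cache service status error 500: Internal Server Error",
    "setup failure: Cache service info error 500: Internal Server Error",
]


def matches(pattern, reason):
    """Return True if the pattern is found anywhere in the reason (assumed semantics)."""
    return pattern.search(reason) is not None


for reason in reasons:
    print(f"old={matches(OLD, reason)} new={matches(NEW, reason)} {reason}")
```

Under that assumption the old pattern misses the "info error" variant, while the widened one covers both.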

#5 Updated by okurz 8 months ago

  • Subject changed from infrastructure task: After osd deployment 2020-11-18 many jobs incomplete with auto_review:"Cache service.*error 500: Internal Server Error":retry to infrastructure task: After osd deployment 2020-11-18 many jobs incomplete with auto_review:"Cache service (status error from API|.*error 500: Internal Server Error)":retry

#6 Updated by okurz 8 months ago

  • Status changed from In Progress to Resolved

I have re-enabled the two alerts about "incompletes from last 24h", and auto-review from today also looks fine.

host=osd openqa-query-for-job-label 78165 still shows reports from after my change, but none of them for openqaworker8:

5032329|2020-11-19 05:33:24|done|incomplete|qam-minimal-full|setup failure: Cache service info error 500: Internal Server Error|QA-Power8-5-kvm
5032151|2020-11-19 05:33:18|done|incomplete|offline_sles12sp4_ltss_media_sdk-lp-asmm-contm-lgm-tcm-wsm_all_full|setup failure: Cache service info error 500: Internal Server Error|QA-Power8-5-kvm
5032150|2020-11-19 05:33:05|done|incomplete|offline_sles12sp3_ltss_media_sdk-lp-asmm-contm-lgm-tcm-wsm_all_full|setup failure: Cache service info error 500: Internal Server Error|QA-Power8-5-kvm
5025118|2020-11-18 08:21:22|done|incomplete|home_encrypted|setup failure: Cache service status error 500: Internal Server Error|openqaworker8
5025241|2020-11-18 08:21:19|done|incomplete|migration_zypper_sle15sp1_ha_alpha_node02|setup failure: Cache service status error 500: Internal Server Error|openqaworker8
5025208|2020-11-18 08:21:17|done|incomplete|migration_online_zypper_sles4sap15sp2|setup failure: Cache service status error 500: Internal Server Error|openqaworker8
5025231|2020-11-18 08:21:14|done|incomplete|migration_media+scc_sle12sp5_ha_alpha_node01|setup failure: Cache service status error 500: Internal Server Error|openqaworker8
5025233|2020-11-18 08:21:14|done|incomplete|autoyast_sles4sap_hana|setup failure: Cache service status error 500: Internal Server Error|openqaworker8
5025120|2020-11-18 08:21:13|done|incomplete|migration_online_zypper_sles4sap15|setup failure: Cache service status error 500: Internal Server Error|openqaworker8
5025230|2020-11-18 08:21:12|done|incomplete|ha_textmode_extended|setup failure: Cache service status error 500: Internal Server Error|openqaworker8
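To see at a glance which workers were still affected on which day, output like the rows above can be grouped by worker and date. This is only a sketch: the column names in `FIELDS` are guesses from the content of the rows, not documented names from openqa-query-for-job-label, and `SAMPLE` holds just four of the rows shown above.

```python
from collections import Counter

# Column names are assumptions inferred from the rows in this ticket.
FIELDS = ("job_id", "finished", "state", "result", "test", "reason", "worker")

# Four rows copied from the output above.
SAMPLE = """\
5032329|2020-11-19 05:33:24|done|incomplete|qam-minimal-full|setup failure: Cache service info error 500: Internal Server Error|QA-Power8-5-kvm
5032151|2020-11-19 05:33:18|done|incomplete|offline_sles12sp4_ltss_media_sdk-lp-asmm-contm-lgm-tcm-wsm_all_full|setup failure: Cache service info error 500: Internal Server Error|QA-Power8-5-kvm
5025118|2020-11-18 08:21:22|done|incomplete|home_encrypted|setup failure: Cache service status error 500: Internal Server Error|openqaworker8
5025241|2020-11-18 08:21:19|done|incomplete|migration_zypper_sle15sp1_ha_alpha_node02|setup failure: Cache service status error 500: Internal Server Error|openqaworker8
"""


def parse_rows(text):
    """Split pipe-separated lines into dicts, skipping blank or malformed lines."""
    rows = []
    for line in text.splitlines():
        values = line.split("|")
        if len(values) != len(FIELDS):
            continue
        rows.append(dict(zip(FIELDS, values)))
    return rows


rows = parse_rows(SAMPLE)

# Count incompletes per (worker, day); the day is the first 10 characters of the timestamp.
per_worker = Counter((row["worker"], row["finished"][:10]) for row in rows)

for (worker, day), count in sorted(per_worker.items()):
    print(f"{day} {worker}: {count}")
```

Piping the real command output into `parse_rows` instead of `SAMPLE` would give the same breakdown for the full list.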

Moving the "auto_review" keyword back to #78169.
