action #71185

closed

coordination #39719: [saga][epic] Detection of "known failures" for stable tests, easy test results review and easy tracking of known issues

coordination #62420: [epic] Distinguish all types of incompletes

job incompletes with auto_review:"setup failure: Cache service status error: Premature connection close":retry and does not retry, should we just automatically retry the connection?

Added by okurz over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2020-09-10
Due date:
% Done: 0%
Estimated time:

Description

Observation

https://openqa.suse.de/tests/4663520 is incomplete with the reason "setup failure: Cache service status error: Premature connection close". The worker log at
https://openqa.suse.de/tests/4663520/file/worker-log.txt gives more details:

[2020-09-09T12:48:28.0234 CEST] [debug] [pid:5715] REST-API call: POST http://openqa.suse.de/api/v1/jobs/4663520/status
[2020-09-09T12:48:28.0344 CEST] [debug] [pid:5715] Linked asset "/var/lib/openqa/cache/openqa.suse.de/SLES-15-SP2-aarch64-Installtest.qcow2" to "/var/lib/openqa/pool/3/SLES-15-SP2-aarch64-Installtest.qcow2"
[2020-09-09T12:48:33.0422 CEST] [debug] [pid:5715] Updating status so job 4663520 is not considered dead.
[2020-09-09T12:48:33.0423 CEST] [debug] [pid:5715] REST-API call: POST http://openqa.suse.de/api/v1/jobs/4663520/status
[2020-09-09T12:48:33.0515 CEST] [debug] [pid:5715] Linked asset "/var/lib/openqa/cache/openqa.suse.de/SLE-15-SP2-Installer-DVD-aarch64-GM-DVD1.iso" to "/var/lib/openqa/pool/3/SLE-15-SP2-Installer-DVD-aarch64-GM-DVD1.iso"
[2020-09-09T12:48:38.0603 CEST] [debug] [pid:5715] Updating status so job 4663520 is not considered dead.
[2020-09-09T12:48:38.0604 CEST] [debug] [pid:5715] REST-API call: POST http://openqa.suse.de/api/v1/jobs/4663520/status
[2020-09-09T12:48:38.0716 CEST] [debug] [pid:5715] Linked asset "/var/lib/openqa/cache/openqa.suse.de/SLES-15-SP2-aarch64-Installtest-uefi-vars.qcow2" to "/var/lib/openqa/pool/3/SLES-15-SP2-aarch64-Installtest-uefi-vars.qcow2"
[2020-09-09T12:48:43.0759 CEST] [debug] [pid:5715] Updating status so job 4663520 is not considered dead.
[2020-09-09T12:48:43.0760 CEST] [debug] [pid:5715] REST-API call: POST http://openqa.suse.de/api/v1/jobs/4663520/status
[2020-09-09T12:48:48.0809 CEST] [debug] [pid:5715] Updating status so job 4663520 is not considered dead.
[2020-09-09T12:48:48.0810 CEST] [debug] [pid:5715] REST-API call: POST http://openqa.suse.de/api/v1/jobs/4663520/status
[2020-09-09T12:48:48.0844 CEST] [error] [pid:5715] Unable to setup job 4663520: Cache service status error: Premature connection close
[2020-09-09T12:48:48.0844 CEST] [debug] [pid:5715] Stopping job 4663520 from openqa.suse.de: 04663520-sle-15-SP2-Server-DVD-Incidents-Install-aarch64-Build:15836:openssl-1_1-qam-incidentinstall@aarch64-virtio - reason: setup failure
[2020-09-09T12:48:48.0845 CEST] [debug] [pid:5715] REST-API call: POST http://openqa.suse.de/api/v1/jobs/4663520/status
[2020-09-09T12:48:48.0917 CEST] [info] [pid:14619] Uploading autoinst-log.txt
[2020-09-09T12:48:48.0968 CEST] [info] [pid:14619] Uploading worker-log.txt

The job then ends up incomplete and is not automatically retriggered. It is unclear to the user what should be done.

Acceptance criteria

  • AC1: "Cache service status error: Premature connection close" is prevented or handled with retries (either within job or by retriggering the complete job)

Suggestions

  • Check whether the cache service implementation can retry the connection in this situation. If not, consider marking the job as incomplete with a proper reason and ensuring it is automatically retriggered.
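
As a rough illustration of the suggestion above, the following Python sketch retries a failing status request a few times and, if every attempt fails, returns an explicit reason that the caller could use to mark the job incomplete and retrigger it. This is not the openQA implementation: `TransientCacheError`, `call_with_retries`, `flaky_status_request`, and the `{"status": "processed"}` payload are all hypothetical names chosen for this example, and the real cache service client may need a different approach.

```python
import time


class TransientCacheError(Exception):
    """Hypothetical stand-in for errors such as 'Premature connection close'."""


def call_with_retries(request, attempts=3, delay=1.0, sleep=time.sleep):
    """Call request() until it succeeds or the attempts are exhausted.

    Returns (result, None) on success, or (None, reason) once all attempts
    have failed, so the caller can mark the job incomplete with an explicit
    reason and retrigger it.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return request(), None
        except TransientCacheError as err:
            last_error = err
            if attempt < attempts:
                sleep(delay * attempt)  # linear backoff between attempts
    reason = (
        "setup failure: Cache service status error: "
        f"{last_error} (after {attempts} attempts)"
    )
    return None, reason


# Example: a fake status request that fails twice and then succeeds.
calls = {"count": 0}


def flaky_status_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientCacheError("Premature connection close")
    return {"status": "processed"}  # hypothetical response payload


result, reason = call_with_retries(flaky_status_request, delay=0)
```

Returning the reason instead of raising keeps the decision about retriggering the complete job with the caller, which matches the "either within job or by retriggering the complete job" wording of AC1.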