action #103762

gitlab CI pipeline failed with Error cleaning up pod: Delete ... connect: connection refused Job failed (system failure): prepare environment: waiting for pod running ... i/o timeout. Check ... for more information

Added by cdywan 5 months ago. Updated 5 months ago.

Status: Resolved
Priority: High
Assignee:
Target version:
Start date: 2021-08-13
Due date:
% Done: 0%
Estimated time:

Description

Observation

I saw ... failing for the first time:

ERROR: Error cleaning up pod: Delete "https://caasp-master.suse.de:6443/api/v1/namespaces/gitlab/pods/runner-shadmilv-project-3530-concurrent-0j7ztr": dial tcp 10.160.1.78:6443: connect: connection refused
ERROR: Job failed (system failure): prepare environment: waiting for pod running: Get "https://caasp-master.suse.de:6443/api/v1/namespaces/gitlab/pods/runner-shadmilv-project-3530-concurrent-0j7ztr": dial tcp 10.160.1.78:6443: i/o timeout. Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
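
Both messages point at the Kubernetes API endpoint being unreachable rather than at the job itself. As a rough sketch of how such failures could be told apart from genuine job failures when scanning a job log, something like the following might help. The function name `is_infra_failure` and the pattern list are assumptions for illustration, derived only from the two lines above; they are not an official or exhaustive list of runner error messages.

```python
import re

# Assumed patterns, taken from the two log lines quoted in this ticket.
# This is not an exhaustive or official list of runner error messages.
INFRA_PATTERNS = (
    r"connect: connection refused",
    r"i/o timeout",
    r"Job failed \(system failure\)",
)


def is_infra_failure(log_text: str) -> bool:
    """Return True if any ERROR line matches one of the assumed patterns."""
    for line in log_text.splitlines():
        # Only consider lines that the runner marked as errors.
        if "ERROR:" not in line:
            continue
        if any(re.search(pattern, line) for pattern in INFRA_PATTERNS):
            return True
    return False
```

A job log that only contains something like `ERROR: Job failed: exit code 1` would not match, so it would still be treated as a real failure worth investigating.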

Suggestions

  • Maybe it's just because it's Thursday and the maintenance window is causing machines to be temporarily unavailable on live systems that depend on each other?
  • Try and re-trigger and see if it happens again
  • File an infra ticket

Related issues

Copied from QA - action #96827: gitlab CI pipeline failed with Job failed: pod status is Failed size:S (Resolved, 2021-08-13)

History

#1 Updated by cdywan 5 months ago

  • Copied from action #96827: gitlab CI pipeline failed with Job failed: pod status is Failed size:S added

#2 Updated by cdywan 5 months ago


cdywan wrote:

  • Try and re-trigger and see if it happens again

It doesn't seem like I can retrigger it. I only see "show complete raw" and "erase job log".

#3 Updated by jbaier_cz 5 months ago

  • Status changed from New to Resolved
  • Assignee set to jbaier_cz

cdywan wrote:

cdywan wrote:

  • Try and re-trigger and see if it happens again

It doesn't seem like I can retrigger it. I only see "show complete raw" and "erase job log".

You do not see the retry button because the job was already retriggered 3 hours ago and passed. The quoted error message clearly suggests that there were communication errors between GitLab and the caasp cluster during the maintenance window. There is nothing more to do here.

Also, the job page shows the error message "There has been a runner system failure, please try again".
