action #103762

closed

gitlab CI pipeline failed with Error cleaning up pod: Delete ... connect: connection refused Job failed (system failure): prepare environment: waiting for pod running ... i/o timeout. Check ... for more information

Added by livdywan about 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: High
Assignee:
Start date: 2021-08-13
Due date:
% Done: 0%
Estimated time:

Description

Observation

I saw ... failing for the first time:

ERROR: Error cleaning up pod: Delete "https://caasp-master.suse.de:6443/api/v1/namespaces/gitlab/pods/runner-shadmilv-project-3530-concurrent-0j7ztr": dial tcp 10.160.1.78:6443: connect: connection refused
ERROR: Job failed (system failure): prepare environment: waiting for pod running: Get "https://caasp-master.suse.de:6443/api/v1/namespaces/gitlab/pods/runner-shadmilv-project-3530-concurrent-0j7ztr": dial tcp 10.160.1.78:6443: i/o timeout. Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information

Suggestions

  • Maybe it's just because it's Thursday and the maintenance window is causing machines to be temporarily unavailable on live systems that depend on each other?
  • Try and re-trigger and see if it happens again
  • File an infra ticket

Related issues 1 (0 open, 1 closed)

Copied from QA (public) - action #96827: gitlab CI pipeline failed with Job failed: pod status is Failed size:S (Resolved, livdywan, 2021-08-13)

Actions #1

Updated by livdywan about 3 years ago

  • Copied from action #96827: gitlab CI pipeline failed with Job failed: pod status is Failed size:S added
Actions #2

Updated by livdywan about 3 years ago

cdywan wrote:

  • Try and re-trigger and see if it happens again

It doesn't seem like I can retrigger it. I only see "show complete raw" and "erase job log".

Actions #3

Updated by jbaier_cz about 3 years ago

  • Status changed from New to Resolved
  • Assignee set to jbaier_cz

cdywan wrote:

cdywan wrote:

  • Try and re-trigger and see if it happens again

It doesn't seem like I can retrigger it. I only see "show complete raw" and "erase job log".

You do not see the retry button because the job was already retriggered 3 hours ago and passed. The quoted error message clearly suggests that there were communication errors between GitLab and the CaaSP cluster during the maintenance window. There is nothing more to do here.

Also, the job page shows the error message "There has been a runner system failure, please try again".
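
For anyone triaging similar failures, here is a minimal sketch of the check described above: it flags a job log as a transient runner failure when a "system failure" line also carries a network error. The function name and the list of markers are assumptions for illustration, based only on the log quoted in this ticket, and are not part of GitLab or any existing tooling.

```python
import re

# Markers taken from the log quoted in this ticket.
# Assumption: this list is illustrative, not exhaustive.
SYSTEM_FAILURE = re.compile(r"Job failed \(system failure\)")
NETWORK_ERRORS = ("connection refused", "i/o timeout")


def is_transient_runner_failure(log: str) -> bool:
    """Return True if any log line reports a runner system failure
    together with one of the known network error markers."""
    for line in log.splitlines():
        if SYSTEM_FAILURE.search(line) and any(m in line for m in NETWORK_ERRORS):
            return True
    return False


# The second error line from the Observation section of this ticket.
example_log = (
    "ERROR: Job failed (system failure): prepare environment: "
    'waiting for pod running: Get "https://caasp-master.suse.de:6443/api/v1/'
    'namespaces/gitlab/pods/runner-shadmilv-project-3530-concurrent-0j7ztr": '
    "dial tcp 10.160.1.78:6443: i/o timeout."
)

if __name__ == "__main__":
    print(is_transient_runner_failure(example_log))
```

A match only suggests that a retry is worth trying; it does not show that the cluster has recovered.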
