action #128669

closed

gitlab CI jobs fail when using docker executor with "ERROR: Failed to remove network for build"

Added by okurz over 1 year ago. Updated over 1 year ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Start date: 2023-05-04
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Observation

From https://gitlab.suse.de/openqa/openqa-review/-/jobs/1550012

Running with gitlab-runner 15.11.0 (436955cb)
  on gitlab-worker4:sle15.3 sHAdmiLV, system ID: s_d2d8982b55c6
Preparing the "docker" executor 00:21
ERROR: Failed to remove network for build
ERROR: Preparation failed: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? (docker.go:780:12s)
Will be retried in 3s ...
ERROR: Failed to remove network for build
ERROR: Preparation failed: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? (docker.go:780:0s)
Will be retried in 3s ...
ERROR: Failed to remove network for build
ERROR: Preparation failed: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? (docker.go:780:0s)
Will be retried in 3s ...
ERROR: Job failed (system failure): Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? (docker.go:780:0s)

Steps to reproduce

Possibly runner-specific: reproduced in https://gitlab.suse.de/openqa/openqa-review/-/jobs/1550017 on the same runner ID. So far not reproduced on other runners.

Workaround

Manually retriggering the job will likely work if it is picked up by a different worker.
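
Whether a failed job is a candidate for such a retrigger can be read from its log. The following is a minimal sketch, not part of any existing tooling: the function name `is_runner_side_failure` is hypothetical, and the two markers it looks for are copied from the job log quoted in the Observation section. How the log text is obtained is left out.

```python
import re

# Markers taken verbatim from the job log in the Observation section.
SYSTEM_FAILURE = re.compile(r"^ERROR: Job failed \(system failure\)", re.MULTILINE)
DOCKER_DOWN = "Cannot connect to the Docker daemon"


def is_runner_side_failure(job_log: str) -> bool:
    """Return True if the log reports a system failure with an unreachable Docker daemon.

    Hypothetical helper: it only matches the two messages seen in this ticket,
    so other kinds of runner problems are not detected.
    """
    return bool(SYSTEM_FAILURE.search(job_log)) and DOCKER_DOWN in job_log
```

A job whose log matches failed on the runner side rather than in the pipeline itself, so retriggering it is reasonable; a job that failed with an ordinary script error does not match.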

Expected result

Pipeline owners should not have to deal with runner problems; only runner admins should be notified.

Impact

Scheduling, monitoring and reporting of SLE product validation and SLE maintenance update testing are impacted: failed pipeline jobs need manual handling, which causes misses and delays.

Actions #1

Updated by okurz over 1 year ago

  • Description updated (diff)
  • Status changed from New to Blocked

Actions #3

Updated by okurz over 1 year ago

  • Status changed from Blocked to Resolved

The problem was fixed by restarting the gitlab runner. Following my suggestion, the relevant hosts and the gitlab runner were added to zabbix monitoring by Jiri Novak (Eng-Infra). The SD ticket was resolved.
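
For illustration, the condition behind the error message can be probed directly on a runner host. This is only a sketch of such a check, not the check that was actually configured in zabbix: the function name `docker_socket_reachable` is hypothetical, and the socket path is the one from the error message in the job log. It tests only whether something accepts connections on the socket, not whether the Docker daemon is otherwise healthy.

```python
import socket

# Socket path as shown in the error message of the failed job.
DOCKER_SOCKET = "/var/run/docker.sock"


def docker_socket_reachable(path: str = DOCKER_SOCKET, timeout: float = 2.0) -> bool:
    """Return True if a process accepts connections on the Unix socket at `path`."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return True
    except OSError:
        # Covers a missing socket file, a refused connection and a timeout.
        return False
    finally:
        sock.close()
```

A monitoring agent could call this periodically and alert the runner admins when it returns False, which would cover the expectation that pipeline owners are not the ones who notice the problem first.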
