action #179038

closed

coordination #154777: [saga][epic] Shareable os-autoinst and test distribution plugins

coordination #162131: [epic] future version control related features in openQA

Gracious handling of longer remote git clones outages size:S

Added by robert.richardson 2 months ago. Updated 13 days ago.

Status: Resolved
Priority: High
Assignee:
Category: Feature requests
Target version:
Start date: 2025-03-17
Due date:
% Done: 0%
Estimated time:

Description

Motivation

Currently, git_clone Minion jobs fail when GitLab is temporarily unreachable (see #178492). By introducing a proper error-handling mechanism, we can ensure:

  • Temporary outages do not cause unnecessary job failures or alerts.

User Story

"As a test engineer and openQA operator,
I want openQA to handle short-lived GitLab outages without causing mass Minion job failures,
so that users do not experience unnecessary disruption."

Acceptance Criteria

  • AC1: Temporary remote git outages don't cause failing Minion jobs
  • AC2: Remote git repositories are still updated after short-lived request failures, e.g. in the range of seconds

Suggestions

  • Damage is likely limited. If we can't sync needles, nobody can edit needles.
  • Jobs end up incomplete if there's an ongoing issue with git_clone Minion jobs
  • We could decide to eventually give up and continue anyway and let jobs run
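
The retry-then-give-up idea from the suggestions above can be sketched as follows. This is a minimal illustration in Python, not the openQA implementation: `TransientGitError`, `run_with_retries`, and the parameters `max_attempts` and `base_delay` are hypothetical names, and the exponential backoff is an assumed policy that the ticket does not prescribe.

```python
import time
from typing import Callable


class TransientGitError(Exception):
    """Hypothetical marker for failures that look like a temporary remote outage."""


def run_with_retries(
    operation: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run `operation`, retrying on transient errors with exponential backoff.

    Returns True if the operation eventually succeeded and False if all
    attempts were used up. The caller can then decide to continue anyway
    instead of marking the job as failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            operation()
            return True
        except TransientGitError:
            if attempt == max_attempts:
                break  # give up; no point in sleeping after the last attempt
            # Assumed policy: wait 1s, 2s, 4s, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    return False
```

With a wrapper like this, a caller could log a warning and let jobs run when `run_with_retries` returns `False`, rather than raising an error that counts as a failed job. Only errors classified as transient are retried; any other exception still propagates immediately.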

Related issues 4 (1 open, 3 closed)

Related to openQA Infrastructure (public) - action #178492: [alert] Many failing `git_clone` Minion jobs auto_review:"Error detecting remote default branch name":retry size:S (Resolved, robert.richardson, 2025-03-07)

Related to openQA Infrastructure (public) - action #182021: [alert] web UI: Too many Minion job failures alert (Resolved, tinita, 2025-01-23)

Copied to openQA Project (public) - action #179185: Detection of long-time remote git clone outages size:S (Workable, 2025-03-17)

Copied to openQA Project (public) - action #180863: Conduct lessons learned "Five Why" analysis for "Gracious handling of longer remote git clones outages" size:S (Resolved, livdywan)
