action #179038
Updated by robert.richardson 17 days ago
## Motivation
Currently, git_clone minion jobs fail when GitLab is temporarily unreachable (see #178492, #178564). By introducing a proper error-handling mechanism, we can ensure:

- Temporary outages do not cause unnecessary job failures or alerts.
- Jobs completely ignore short outages or mark themselves as "skipped" instead of failing.
- A simple tracking mechanism detects and alerts on longer GitLab downtimes.

### User Story
```
As a test engineer and openQA operator, I want openQA to handle short-lived GitLab outages without causing mass Minion job failures, so that users do not experience unnecessary disruption, while prolonged outages are still detected and reported effectively.
```

## Acceptance Criteria
* **AC1:** Temporary GitLab outages don't cause failing minion jobs
* **AC2:** A mechanism exists to discover longer-term outages of GitLab

## Suggestions
* Damage is likely limited. If we can't sync needles nobody can edit needles.
* Add a temporary file as part of the git clone OR somewhere in /var/lib/ where we have existing caching mechanisms.
* Check if the file is n seconds old to determine the state.
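
To illustrate the marker-file suggestion, here is a minimal sketch in Python. It is not openQA code: the marker path, the 30-minute threshold, and all function names are hypothetical and only show how the age of a file could distinguish a short outage from a long one. The marker is created on the first failed clone, left untouched on later failures so its mtime records when the outage began, and removed on the next success.

```python
import tempfile
import time
from pathlib import Path

# Hypothetical location and threshold; neither comes from openQA itself.
MARKER = Path(tempfile.gettempdir()) / "gitlab_unreachable_since"
MAX_OUTAGE_SECONDS = 30 * 60


def record_failure(marker=MARKER):
    """Create the marker on the first failure; later failures keep its mtime."""
    if not marker.exists():
        marker.touch()


def record_success(marker=MARKER):
    """Remove the marker once GitLab is reachable again."""
    marker.unlink(missing_ok=True)


def outage_state(marker=MARKER, max_age=MAX_OUTAGE_SECONDS, now=None):
    """Return 'ok', 'short_outage' or 'long_outage' based on the marker's age."""
    if not marker.exists():
        return "ok"
    now = time.time() if now is None else now
    age = now - marker.stat().st_mtime
    return "long_outage" if age > max_age else "short_outage"
```

Under these assumptions, a job that cannot reach GitLab would call `record_failure()` and then skip or retry while `outage_state()` returns `"short_outage"`, and fail or alert once it returns `"long_outage"`. Keeping the state in a single file means no extra database column is needed, at the cost of the state being local to one host.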