action #133457
closed
salt-states-openqa gitlab CI pipeline aborted with error after 2h of execution size:M
Added by livdywan over 1 year ago.
Updated over 1 year ago.
Description
Observation
https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/1714239
Name: /etc/systemd/system/auto-update.service - Function: file.managed - Result: Clean - Started: 21:29:26.689214 - Duration: 359.255 ms
Name: service.systemctl_reload - Function: module.run - Result: Clean - Started: 21:29:27.053802 - Duration: 0.018 ms
Name: auto-upgrade.service - Function: service.dead - Result: Clean - Started: 21:29:27.054218 - Duration: 61.444 ms
Name: auto-upgrade.timer - Function: service.dead - Result: Clean - Started: 21:29:27.116368 - Duration: 82.058 ms
Name: auto-update.timer - Function: service.running - Result: Clean - Started: 21:29:27.203488 - Duration: 255.774 ms
Summary for openqa.suse.de
--------------
Succeeded: 345 (changed=30)
Failed: 0
--------------
Total states run: 345
Total run time: 383.468 s.
++ echo -n .
++ true
++ sleep 1
.++ echo -n .
[...]
++ true
++ sleep 1
.++ echo -n .
++ true
++ sleep 1
ERROR: Job failed: execution took longer than 2h0m0s seconds
Acceptance criteria
- AC1: Jobs commonly do not run into the 2h GitLab CI timeout
- AC2: We can identify the faulty salt minion (very likely one of the minions is stuck)
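Regarding AC2, a hedged sketch of how a stuck minion could be identified from the salt master; these are standard salt CLI calls, the exact invocation on OSD is an assumption:
# list minions that are not responding to the master
sudo salt-run manage.down
# or ping all minions with a short timeout and compare the responders
# against the expected set of minions
sudo salt --timeout 10 --hide-timeout '*' test.ping --out json 2>/dev/null | jq 'keys[]' | sort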
Suggestions
- look up an older ticket and read what we did there about this
- check if there are actually artifacts uploaded or not
- check if machines can be reached over salt
- check usual runtimes of salt state apply
- try if it is reproducible
- research upstream whether there is anything better we can do to prevent running into the seemingly hardcoded GitLab 2h timeout
- wrap the internal salt apply command with a timeout well below 2h, e.g. in https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/master/deploy.yml#L43 just prepend "timeout 1h …" (see the sketch below)
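For the last suggestion, a minimal sketch of what the prepended timeout could look like; the exact salt invocation in deploy.yml is an assumption and may differ:
# abort the salt run well before GitLab's hard 2h job limit so the step
# fails with a clear exit status instead of the runner killing the job
timeout 1h ssh "$TARGET" 'sudo salt \* state.apply' \
  || { echo "salt state.apply did not finish within 1h"; exit 1; }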
- Copied from action #123894: qem-bot+openqa-bot gitlab CI pipeline aborted with error after 1h of execution added
- Related to action #119479: openqABot pipeline failed after runner getting stuck for 1h0m0s size:M added
- Tags set to infra, alert, gitlab CI
- Target version set to Ready
- Priority changed from Normal to Urgent
- Subject changed from qem-bot+openqa-bot gitlab CI pipeline aborted with error after 1h of execution to qem-bot+openqa-bot gitlab CI pipeline aborted with error after 2h of execution
- Subject changed from qem-bot+openqa-bot gitlab CI pipeline aborted with error after 2h of execution to qem-bot+openqa-bot gitlab CI pipeline aborted with error after 2h of execution size:M
- Description updated (diff)
- Status changed from New to Workable
- Subject changed from qem-bot+openqa-bot gitlab CI pipeline aborted with error after 2h of execution size:M to bot-ng - pipelines in GitLab fail to pull qam-ci-leap:latest size:M
- Description updated (diff)
- Subject changed from bot-ng - pipelines in GitLab fail to pull qam-ci-leap:latest size:M to qem-bot+openqa-bot gitlab CI pipeline aborted with error after 2h of execution size:M
- Description updated (diff)
- Related to action #133793: salt-pillars-openqa failing to apply within 2h and it is not clear which minion(s) are missing size:M added
- Subject changed from qem-bot+openqa-bot gitlab CI pipeline aborted with error after 2h of execution size:M to salt-states-openqa gitlab CI pipeline aborted with error after 2h of execution size:M
Renaming to reflect what this is about
- Status changed from Workable to In Progress
- Assignee set to livdywan
- Priority changed from Urgent to High
We're still able to identify issues with most pipeline runs, hence lowering to High. And while we don't know exactly what's going on, I'm going to look into what we can do about the redundant sleep spam: whether we can reduce it, or find another way to highlight which step the pipeline failed at.
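As a hedged sketch, assuming the spam comes from a wait loop running under set -x in the deploy script, the tracing could be silenced just around that loop (deploy_finished is a hypothetical placeholder for the real condition):
# disable xtrace only for the wait loop so the job log is not flooded with
# "++ echo -n ." / "++ sleep 1" lines on every iteration
{ set +x; } 2>/dev/null
while ! deploy_finished; do
  echo -n .
  sleep 1
done
set -x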
- Status changed from In Progress to Feedback
I prepared an MR. I don't know if it's correct, though, so if it turns out not to be we can consider reverting it.
- Due date set to 2023-09-08
https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/1789040#L54 shows
$ ssh $TARGET "salt --log-file=salt_syncupdate.log --log-file-level=debug --state-output=mixed --hide-timeout \* saltutil.sync_grains,saltutil.refresh_grains,saltutil.refresh_pillar,mine.update ,,,"
[ERROR ] Encountered StreamClosedException
[ERROR ]
Salt request timed out. The master is not responding. You may need to run your command with `--async` in order to bypass the congested event bus. With `--async`, the CLI tool will print the job id (jid) and exit immediately without listening for responses. You can then use `salt-run jobs.lookup_jid` to look up the results of the job in the job cache later.
but then the job still succeeds. Please look into that.
It looks like salt is printing errors on stdout, which we then end up trying to parse with jq:
++ tee salt_ping.log
Currently the following minions are down:
jq: error (at <stdin>:1): string ("Salt reque...) has no keys
jq: error (at <stdin>:1): string ("Salt reque...) has no keys
And I can't reproduce it. Just for the record, this is what the output should probably look like:
$ sudo salt --timeout 1 --hide-timeout '*' test.ping --out json 2>/dev/null | jq 'keys[]' | sort
"openqa-piworker.qa.suse.de"
"openqa.suse.de
Ultimately this is due to #132146#note-35, but the fact that the pipeline succeeded regardless is not great, so I'm adding pipefail so that ideally it will fail in this case.
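A minimal sketch of the effect of pipefail, assuming the ping check is piped into jq as above; with pipefail the step's exit status reflects a failing salt or jq call instead of only the last command in the pipe:
set -o pipefail
# if salt exits non-zero (e.g. when it prints "Salt request timed out.") or jq
# fails to extract keys, the whole pipeline now fails instead of being masked
# by sort exiting with 0
salt --timeout 1 --hide-timeout '*' test.ping --out json 2>/dev/null | jq 'keys[]' | sort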
- Status changed from Feedback to Resolved
- Status changed from Resolved to Workable
- Related to action #134810: [tools] GitlabCI deploy on salt-states-openqa took too much time added
- Related to action #134819: Errors in salt minion and master log on osd added
- Status changed from Workable to Feedback
- Assignee changed from livdywan to okurz
- Due date deleted (2023-09-08)
- Status changed from Feedback to Resolved
We have not seen the timeout since then, and we have applied multiple measures to prevent the situation. Ready to be surprised by new failures anyway :)