action #132488

closed

gitlab CI shows showing no logs or are getting stuck (was: qem-bot sync aggregates gitlab CI job times out after 2h) size:M

Added by okurz 10 months ago. Updated 9 months ago.

Status: Resolved
Priority: High
Assignee:
Target version:
Start date: 2023-07-10
Due date:
% Done: 0%
Estimated time:
Description

Observation

https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1680349 times out after 2h:

output of the bot-ng call:

++ ./qem-bot/bot-ng.py -c /etc/openqabot --token [MASKED] aggr-sync-results
++ tee bot_aggr-sync-results_0.log
2023-07-10 00:21:25 INFO     Config /etc/openqabot/bot.yml does not have aggregate
…
2023-07-10 00:22:01 INFO     Ignoring job '11543518' in development group 'Test Security'

and then nothing. A lot of output appears between 00:21 and 00:22. The job overall shows "Duration: 119 minutes 12 seconds, Finished: 3 hours ago. Queued: 1 second, Timeout: 1h (from project)", so the first surprise is that it ran for 2h although the timeout is 1h. The second is that there is apparently no output for more than one hour and the call is just stuck somewhere.
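To illustrate how a silently stuck call like the one above could be made to fail fast, here is a minimal Python sketch of a wrapper that kills a command once it has produced no output for a given time. It is not what was done in this ticket; the function name `run_with_inactivity_timeout` and both timeout parameters are assumptions made up for the example.

```python
import subprocess
import sys
import threading
import time


def run_with_inactivity_timeout(cmd, idle_timeout, total_timeout):
    """Run cmd, echo its output, and kill it if it stays silent too long.

    Returns (exit_code, reason), where reason is "finished", "idle" or "total".
    Hypothetical helper, not part of qem-bot or gitlab CI.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    start = time.monotonic()
    last_output = [start]  # mutable cell shared with the reader thread

    def pump():
        # Forward every line and remember when we last saw output.
        for line in proc.stdout:
            last_output[0] = time.monotonic()
            sys.stdout.write(line)

    reader = threading.Thread(target=pump, daemon=True)
    reader.start()

    reason = "finished"
    while proc.poll() is None:
        now = time.monotonic()
        if now - last_output[0] > idle_timeout:
            reason = "idle"  # no output for too long
        elif now - start > total_timeout:
            reason = "total"  # overall runtime budget exceeded
        if reason != "finished":
            proc.kill()
            break
        time.sleep(0.1)

    proc.wait()
    reader.join(timeout=1)
    return proc.returncode, reason
```

A CI job could wrap the bot call with something like `run_with_inactivity_timeout(cmd, idle_timeout=600, total_timeout=3600)`, where both values are placeholders to be tuned, so that the job log states whether the call went silent or simply ran too long.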

Rollback actions

  • For qa-maintenance/openQABot and qa-maintenance/bot-ng: Visibility, project features, permissions -> Disable email notifications
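The same toggle can probably also be flipped over the GitLab REST API instead of the settings page. The sketch below only builds the request and does not send it. It assumes the project edit endpoint `PUT /api/v4/projects/:id` with the `emails_disabled` attribute, which newer GitLab versions deprecate in favour of `emails_enabled`, so check the API documentation for the installed version first. The helper name `build_email_toggle_request` and the token handling are made up for the example.

```python
import json
import urllib.parse
import urllib.request


def build_email_toggle_request(base_url, project_path, token, disabled):
    """Build (but do not send) a request that toggles project email notifications.

    Hypothetical helper; the attribute name is an assumption, see the note above.
    """
    # GitLab accepts the URL-encoded "namespace/project" path as project id.
    project_id = urllib.parse.quote(project_path, safe="")
    url = f"{base_url}/api/v4/projects/{project_id}"
    body = json.dumps({"emails_disabled": disabled}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
    )


# Rollback for both projects named above: enable notifications again.
for project in ("qa-maintenance/openQABot", "qa-maintenance/bot-ng"):
    req = build_email_toggle_request(
        "https://gitlab.suse.de", project, "TOKEN_PLACEHOLDER", disabled=False
    )
    print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`, using a token with sufficient permissions on the project.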

Related issues 1 (0 open, 1 closed)

Related to openQA Infrastructure - action #132500: NUE1-SRV2, .qa.suse.de, aarch64 workers offline due to heat-related SRV2 shutdown size:M - Resolved - nicksinger - 2023-07-27

Actions #1

Updated by okurz 10 months ago

  • Subject changed from qem-bot sync aggregates gitlab CI job times out after 2h to gitlab CI shows showing no logs or are getting stuck (was: qem-bot sync aggregates gitlab CI job times out after 2h)
  • Status changed from New to Blocked
  • Assignee set to okurz

Other gitlab CI jobs are also affected by stuck or missing logs. Created https://sd.suse.com/servicedesk/customer/portal/1/SD-126410

Actions #2

Updated by okurz 10 months ago

  • Related to action #132500: NUE1-SRV2, .qa.suse.de, aarch64 workers offline due to heat-related SRV2 shutdown size:M added
Actions #3

Updated by jbaier_cz 10 months ago

Until the mentioned SD ticket is handled, I am disabling e-mail notifications for openQABot/bot-ng to prevent fatigue from the failed scheduled pipelines.

Actions #4

Updated by jbaier_cz 10 months ago

  • Description updated (diff)
Actions #5

Updated by okurz 10 months ago

The SD ticket was resolved with "there is new worker on PRG2. there is stricter firewall, so if you see any jobs failing for some access problem, shoot us an SD ticket for the network access".

Most recent jobs in https://gitlab.suse.de/qa-maintenance/bot-ng/ are good, e.g. https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1682979 . Many others are "waiting", so maybe there is a bigger backlog in gitlab CI. I will monitor further. https://gitlab.suse.de/qa-maintenance/bot-ng/-/pipeline_schedules had all schedules disabled, so I enabled all of them again along with email notifications.

@Jan Baier how did you disable email notifications in https://gitlab.suse.de/qa-maintenance/bot-ng/edit ?

I also enabled the schedule again in https://gitlab.suse.de/qa-maintenance/openQABot/-/pipeline_schedules

I observed a problem with openqa-review, based on https://gitlab.suse.de/openqa/openqa-review/-/pipeline_schedules : the new gitlab CI runners are unable to reach relay.suse.de. Reported as https://sd.suse.com/servicedesk/customer/portal/1/SD-126756
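A quick way to confirm this kind of network access problem from inside a CI job is a plain TCP connection test. The following is a minimal sketch, not something used in this ticket. The helper name `can_reach` is made up, and port 25 (SMTP) for relay.suse.de is an assumption, since the ticket does not state which port is blocked.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Port 25 is an assumption, adjust to whatever the job actually needs.
    print("relay.suse.de reachable:", can_reach("relay.suse.de", 25))
```

Running this as an early step of a pipeline would make a firewall problem visible in the job log before the actual tool fails with a less obvious error.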

Actions #7

Updated by livdywan 10 months ago

  • Subject changed from gitlab CI shows showing no logs or are getting stuck (was: qem-bot sync aggregates gitlab CI job times out after 2h) to gitlab CI shows showing no logs or are getting stuck (was: qem-bot sync aggregates gitlab CI job times out after 2h) size:M
Actions #8

Updated by jbaier_cz 10 months ago

okurz wrote:

@Jan Baier how did you disable email notifications in https://gitlab.suse.de/qa-maintenance/bot-ng/edit ?

See the updated description: Under settings -> Visibility, project features, permissions -> Disable email notifications

Notifications are now re-enabled.

Actions #9

Updated by okurz 10 months ago

jbaier_cz wrote:

okurz wrote:

@Jan Baier how did you disable email notifications in https://gitlab.suse.de/qa-maintenance/bot-ng/edit ?

See the updated description: Under settings -> Visibility, project features, permissions -> Disable email notifications

That was a permission problem on my side, fixed by changing my role from Maintainer to Owner, thx

Actions #10

Updated by okurz 10 months ago

  • Due date deleted (2023-07-24)
  • Status changed from Feedback to Blocked
Actions #11

Updated by okurz 9 months ago

  • Status changed from Blocked to Resolved

Both SD tickets are resolved. Email notifications are enabled again in https://gitlab.suse.de/qa-maintenance/bot-ng/edit . https://gitlab.suse.de/qa-maintenance/openQABot/edit was already on.
