action #165033

closed

Alert Worker .* has no heartbeat (900 seconds), restarting (see FAQ for more) on o3 - again

Added by livdywan 4 months ago. Updated 4 months ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Start date: 2023-10-25
Due date:
% Done: 0%
Estimated time:

Description

Observation

The openQA logreport says:

[2024-08-06T14:36:10.374591Z] [error] Worker 6665 has no heartbeat (900 seconds), restarting (see FAQ for more)
[2024-08-06T13:46:59.133918Z] [error] Worker 2888 has no heartbeat (900 seconds), restarting (see FAQ for more)

Acceptance criteria

  • AC1:

Suggestions


Related issues: 1 (0 open, 1 closed)

Copied from openQA Infrastructure (public) - action #138536: Alert Worker .* has no heartbeat (900 seconds), restarting (see FAQ for more) on o3 size:S (Resolved, mkittler, 2023-10-25)

Actions #1

Updated by livdywan 4 months ago

  • Copied from action #138536: Alert Worker .* has no heartbeat (900 seconds), restarting (see FAQ for more) on o3 size:S added
Actions #2

Updated by mkittler 4 months ago

I probably made a mistake in https://github.com/os-autoinst/openqa-logwarn/pull/49/files, but I currently cannot spot it.

Actions #3

Updated by tinita 4 months ago · Edited

One thing I spotted is that currently

prove test_logwarn

fails. In the CI we only run ./test_logwarn and don't check the exit code, which prove would do for us.
However, it fails for a different test.

edit: Actually, exit code 1 should also fail the GitHub action, right?
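
To illustrate the exit-code question, here is a minimal sketch of making a test command's exit status visible and propagating it. The functions `failing_test` and `run_checked` are hypothetical stand-ins and are not part of openqa-logwarn. How the actual CI step handles the status depends on the workflow definition, which is not shown in this ticket.

```shell
#!/bin/sh
# Hypothetical stand-in for a test script that signals failure
# through its exit status (assumption: not the real test_logwarn).
failing_test() {
    echo "not ok 1 - example"
    return 1
}

# Run a command, print its exit status, and pass that status on
# so the caller (for example a CI step) can react to it.
run_checked() {
    "$@"
    status=$?
    echo "exit status: $status"
    return "$status"
}

# The non-zero status is propagated, so the fallback branch runs.
run_checked failing_test || echo "step would be marked as failed"
```

A wrapper like this only helps if nothing after it in the same step masks the status, for instance a later command that succeeds.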

Actions #4

Updated by tinita 4 months ago

  • Status changed from New to In Progress
  • Assignee set to tinita
Actions #5

Updated by tinita 4 months ago

  • Status changed from In Progress to Feedback
Actions #6

Updated by tinita 4 months ago · Edited

  • Status changed from Feedback to Resolved

Merged. I'll resolve it. We will get an email if it's still not working.
