action #19732

closed

[tools][openqa-monitoring] "openqaworker<> wants to grab a new job - killing the old one: …"

Added by okurz almost 7 years ago. Updated about 6 years ago.

Status:
Resolved
Priority:
Low
Assignee:
-
Category:
Regressions/Crashes
Target version:
-
Start date:
2017-03-28
Due date:
% Done:

0%

Estimated time:

Description

observation

[Sun Jun 11 00:34:56 2017] [scheduler:warn] openqaworker5:20 wants to grab a new job - killing the old one: 993795
[Sun Jun 11 00:34:56 2017] [scheduler:warn] openqaworker7:20 wants to grab a new job - killing the old one: 993749
[Sun Jun 11 00:34:58 2017] [scheduler:warn] openqaworker7:10 wants to grab a new job - killing the old one: 993785
[Sun Jun 11 00:34:58 2017] [scheduler:warn] openqaworker6:15 wants to grab a new job - killing the old one: 993807
[and many more…]

suggestion

  • check if it is really acceptable behaviour that we just do not care about the old job and kill it
  • if acceptable, downgrade warn to info/debug

workaround

monitoring silenced with https://github.com/okurz/openqa_monitoring/pull/12


Related issues 1 (0 open, 1 closed)

Copied from openQA Project - action #19730: [tools][openqa-monitoring] "can't remove <needle_path>" (Resolved, mkittler, 2017-03-28)

Actions #1

Updated by okurz almost 7 years ago

  • Copied from action #19730: [tools][openqa-monitoring] "can't remove <needle_path>" added
Actions #2

Updated by okurz almost 7 years ago

  • Description updated (diff)
Actions #3

Updated by coolo over 6 years ago

  • Status changed from New to Resolved

No, this is not acceptable, but when it happens no harm is done because the system cures itself. If it happens in large numbers, though, we do have a problem.

I'm not sure how you want to handle such cases in the monitoring. I can tell you it hasn't happened in the last few days, so I'm closing the issue.

Actions #4

Updated by okurz over 6 years ago

  • Status changed from Resolved to In Progress
  • Assignee set to okurz

OK, I will see what I can do about "forward the message only if it appears more than a few times".
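
A minimal sketch of that thresholding idea, assuming the log format shown in the description. The function name `messages_to_forward` and the `threshold` parameter are hypothetical and not part of openqa_monitoring or openqa-logwarn; the regular expression is inferred from the four sample lines only.

```python
import re
from collections import Counter

# Pattern for the scheduler warning quoted in the description.
# The field layout is an assumption inferred from those sample lines.
LINE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] \[scheduler:warn\] "
    r"(?P<host>[\w.-]+):(?P<instance>\d+) wants to grab a new job"
    r" - killing the old one: (?P<job>\d+)$"
)


def messages_to_forward(lines, threshold=3):
    """Return sorted (host, count) pairs for hosts with at least `threshold` hits.

    `threshold` is a hypothetical tuning knob, not an existing setting.
    Instance and job numbers are ignored so that repeated warnings from
    one worker host are counted together.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("host")] += 1
    return sorted((host, n) for host, n in counts.items() if n >= threshold)


sample = [
    "[Sun Jun 11 00:34:56 2017] [scheduler:warn] openqaworker5:20 wants to grab a new job - killing the old one: 993795",
    "[Sun Jun 11 00:34:56 2017] [scheduler:warn] openqaworker7:20 wants to grab a new job - killing the old one: 993749",
    "[Sun Jun 11 00:34:58 2017] [scheduler:warn] openqaworker7:10 wants to grab a new job - killing the old one: 993785",
    "[Sun Jun 11 00:34:58 2017] [scheduler:warn] openqaworker6:15 wants to grab a new job - killing the old one: 993807",
]

# Only openqaworker7 appears at least twice in the sample.
print(messages_to_forward(sample, threshold=2))
```

Counting per host rather than per exact line is a design choice of this sketch: the instance and job numbers differ on nearly every line, so an exact-match counter would almost never reach the threshold.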

Actions #5

Updated by okurz about 6 years ago

  • Assignee deleted (okurz)

I checked the monitoring log messages: the same message appears twice very seldom or never. At the very least it is always a different instance on the same worker.
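
That kind of check can be sketched as follows, assuming the `host:instance` format from the log lines in the description. `instances_per_host` is a hypothetical helper, not something taken from the monitoring scripts.

```python
import re
from collections import defaultdict

# Matches the "<host>:<instance> wants to grab a new job" part of a line.
# The format is an assumption based on the sample lines in the description.
WORKER_RE = re.compile(
    r"\] (?P<host>[\w.-]+):(?P<instance>\d+) wants to grab a new job"
)


def instances_per_host(lines):
    """Map each worker host to the sorted list of distinct instances seen."""
    seen = defaultdict(set)
    for line in lines:
        match = WORKER_RE.search(line)
        if match:
            seen[match.group("host")].add(int(match.group("instance")))
    return {host: sorted(instances) for host, instances in seen.items()}


log_lines = [
    "[Sun Jun 11 00:34:56 2017] [scheduler:warn] openqaworker5:20 wants to grab a new job - killing the old one: 993795",
    "[Sun Jun 11 00:34:56 2017] [scheduler:warn] openqaworker7:20 wants to grab a new job - killing the old one: 993749",
    "[Sun Jun 11 00:34:58 2017] [scheduler:warn] openqaworker7:10 wants to grab a new job - killing the old one: 993785",
    "[Sun Jun 11 00:34:58 2017] [scheduler:warn] openqaworker6:15 wants to grab a new job - killing the old one: 993807",
]

# A host with more than one instance listed repeats the warning
# from different instances rather than from the same one.
for host, instances in instances_per_host(log_lines).items():
    print(host, instances)
```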

Actions #6

Updated by EDiGiacinto about 6 years ago

  • Status changed from In Progress to Resolved

I think this can be closed now: the code is not executing that path anymore [1] [2] (it was left in to keep the old behavior intact and to let us run tests against the scheduling logic/priority).

1: https://github.com/os-autoinst/openQA/blob/2dbaf4975c6a3e8f1639f34df4784e31c6ac4d72/lib/OpenQA/Scheduler/Scheduler.pm#L250
2: https://github.com/os-autoinst/openQA/blob/2dbaf4975c6a3e8f1639f34df4784e31c6ac4d72/lib/OpenQA/Scheduler/Scheduler.pm#L618

Actions #7

Updated by okurz about 6 years ago

  • Subject changed from [tools][openqa-monitoring] "openqaworker<> wants to grab a new job - killing the old one: …" to [tools][openqa-monitoring] "openqaworker<> wants to grab a new job - killing the old one: …"

Shouldn't we remove the check from the logwarn monitoring script then? If you are not interested in that kind of monitoring, we can also think about putting it to rest for good.

Actions #8

Updated by EDiGiacinto about 6 years ago

Yeah sure, we can remove it, but I'm not receiving emails from any monitoring service. :)

Actions #9

Updated by okurz about 6 years ago

Hm, I think we should be more explicit: would you like to update https://github.com/os-autoinst/openqa-logwarn ?

If you want, I could add you as an email recipient for the monitoring alerts. Side note: I don't recall receiving a message by email recently, so I think none are being sent at all at the moment. Probably the mail sending is broken or disabled on osd.
