action #19732
closed
[tools][openqa-monitoring] "openqaworker<> wants to grab a new job - killing the old one: …"
Added by okurz over 7 years ago.
Updated almost 7 years ago.
Category:
Regressions/Crashes
Description
observation
[Sun Jun 11 00:34:56 2017] [scheduler:warn] openqaworker5:20 wants to grab a new job - killing the old one: 993795
[Sun Jun 11 00:34:56 2017] [scheduler:warn] openqaworker7:20 wants to grab a new job - killing the old one: 993749
[Sun Jun 11 00:34:58 2017] [scheduler:warn] openqaworker7:10 wants to grab a new job - killing the old one: 993785
[Sun Jun 11 00:34:58 2017] [scheduler:warn] openqaworker6:15 wants to grab a new job - killing the old one: 993807
[and many more…]
suggestion
- check if it is really acceptable behaviour that we just do not care about the old job and kill it
- if acceptable, downgrade warn to info/debug
workaround
monitoring silenced with https://github.com/okurz/openqa_monitoring/pull/12
- Copied from action #19730: [tools][openqa-monitoring] "can't remove <needle_path>" added
- Description updated (diff)
- Status changed from New to Resolved
No, this is not acceptable, but when it happens no harm is done because the system recovers on its own. If this happens en masse, we do have a problem though.
I'm not sure how you want to handle such cases in the monitoring. I can tell you that it has not happened in the last few days, so I am closing the issue.
- Status changed from Resolved to In Progress
- Assignee set to okurz
OK, I will see what I can do about "forward a message only if it appears more than a few times".
I checked the monitoring log messages, and the same message very seldom or never appears twice. Each occurrence differs at least in the worker instance, even when it comes from the same worker.
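To illustrate the "forward only if it appears more than a few times" idea, here is a minimal sketch, not what the monitoring script actually does. It assumes the log line format shown in the description and groups messages by replacing the worker instance and the job ID with placeholders, since those are the parts that differ between occurrences. The function names, the placeholders and the default threshold are all hypothetical.

```python
import re
from collections import Counter

# Assumed line format, taken from the log excerpt in the description:
# [<timestamp>] [<component>:<level>] <message>
LINE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] "
    r"\[(?P<component>\w+):(?P<level>\w+)\] "
    r"(?P<message>.*)$"
)


def normalize(message):
    """Replace the parts that vary between occurrences with placeholders."""
    # Worker instance: "openqaworker5:20" becomes "openqaworker5:<N>"
    message = re.sub(r"\b(openqaworker\d+):\d+\b", r"\1:<N>", message)
    # Job ID: "killing the old one: 993795" becomes "killing the old one: <JOB>"
    message = re.sub(r"(killing the old one: )\d+", r"\1<JOB>", message)
    return message


def messages_to_forward(lines, threshold=3):
    """Return normalized warnings seen more than `threshold` times, with counts."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group("level") != "warn":
            continue  # skip unparsable lines and non-warnings
        counts[normalize(match.group("message"))] += 1
    return {msg: n for msg, n in counts.items() if n > threshold}
```

Called on the four lines quoted in the description with `threshold=1`, this would report only the openqaworker7 message, because it is the only worker that appears more than once after normalization. Whether to group per worker or across all workers is a design choice; the sketch groups per worker.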
- Status changed from In Progress to Resolved
- Subject changed from [tools][openqa-monitoring] "openqaworker<> wants to grab a new job - killing the old one: …" to [tools][openqa-monitoring] "openqaworker<> wants to grab a new job - killing the old one: …"
Shouldn't we remove the check from the logwarn monitoring script then? If you are not interested in that way of monitoring, we can also think about putting it to rest for good.
Yeah sure, we can remove it, but I'm not receiving emails from any monitoring service. :)
hm, I think we should be more explicit: Would you like to update https://github.com/os-autoinst/openqa-logwarn ?
If you want, I could add you as an email recipient for the monitoring alerts. Side note: I don't recall receiving a message by email recently, so I think none are currently sent at all. Probably the mail sending is broken or disabled on osd.