action #7190

closed

The worker died due to the scheduler not responding

Added by mlin7442 about 9 years ago. Updated almost 9 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2015-04-09
Due date:
% Done: 0%
Estimated time:

Description

See this issue on openqa.opensuse.org; the log can be found at https://openqa.opensuse.org/tests/55761 (the log is kept for 3 months).


Files

openqa-20150409.xz (1.24 MB), mlin7442, 2015-04-10 07:49

Actions #1

Updated by mlin7442 about 9 years ago

It looks like I pasted the wrong log; the correct log is here: http://susepaste.org/95635732

Actions #2

Updated by mlin7442 about 9 years ago

  • Subject changed from The worker died due to the scheduler doesn't reponding to The worker died due to the scheduler doesn't responding
Actions #3

Updated by oholecek about 9 years ago

https://github.com/os-autoinst/openQA/pull/329 should make the worker stop complaining about undefined variables and quit peacefully. The worker quitting is in fact by design once the scheduler starts to return 4XX result codes.
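
As a rough illustration of the quit-on-4XX behaviour, here is a minimal Python sketch. It is not the openQA worker code: `should_quit`, `run_worker` and the `poll` callback are hypothetical names, and the only thing taken from this ticket is the rule that a 4XX answer from the scheduler makes the worker stop cleanly.

```python
from typing import Callable


def should_quit(status: int) -> bool:
    """Treat any 4XX status as a signal to stop (rule taken from this ticket)."""
    return 400 <= status < 500


def run_worker(poll: Callable[[], int], max_polls: int) -> str:
    """Poll the scheduler until it answers with a 4XX code or the limit is hit.

    `poll` is a hypothetical callback returning the HTTP status of one request.
    """
    for _ in range(max_polls):
        status = poll()
        if should_quit(status):
            # Exit cleanly instead of dying on undefined values later on.
            return f"quit: scheduler returned {status}"
    return "stopped: poll limit reached"


if __name__ == "__main__":
    codes = iter([200, 200, 403])
    print(run_worker(lambda: next(codes), max_polls=10))
```

The loop returns a plain message instead of raising, which mirrors the idea of quitting peacefully once the scheduler rejects requests.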

To investigate further I would need the scheduler part of the logs. If this weren't on opensuse.o.o, I would guess that the Demo account API keys had timed out.

Actions #4

Updated by mlin7442 about 9 years ago

Assigned to Ondřej and attached the openQA log.

Actions #5

Updated by oholecek about 9 years ago

  • Subject changed from The worker died due to the scheduler doesn't responding to The worker died due to the scheduler not responding
  • Status changed from In Progress to Feedback

The scheduler logs rule out API key expiry, so the authentication failure must come from a failed HMAC validation. Given the inactivity timeout reported by both the worker and the scheduler, there were indeed network-related problems (possibly even a MITM attack?).
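
To show how an HMAC check can fail even though the API key itself is still valid, here is a generic Python sketch. It is an assumption-laden illustration, not openQA's actual signing scheme: the message layout (URL plus timestamp), the SHA-256 digest, the `MAX_SKEW_SECONDS` tolerance and the function names are all hypothetical.

```python
import hashlib
import hmac

MAX_SKEW_SECONDS = 300  # assumed tolerance, not an openQA setting


def sign(secret: str, url: str, timestamp: float) -> str:
    """Return a hex HMAC over the URL and timestamp (hypothetical scheme)."""
    message = f"{url}{timestamp}".encode()
    return hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()


def verify(secret: str, url: str, timestamp: float, signature: str, now: float) -> bool:
    """Accept a request only if it is fresh and its signature matches."""
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False  # request arrived too late, e.g. after a network stall
    expected = sign(secret, url, timestamp)
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    sig = sign("s3cret", "/api/v1/workers/1", 1000.0)
    print(verify("s3cret", "/api/v1/workers/1", 1000.0, sig, now=1010.0))
    print(verify("s3cret", "/api/v1/workers/1", 1000.0, sig, now=2000.0))
```

Under these assumptions a request that is delayed beyond the tolerance, or altered in transit, is rejected although the key has not expired, which would be consistent with network trouble showing up as an authentication failure.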

Because the worker is expected to quit when the scheduler returns 4XX results, I don't think there is anything more to do. Or maybe change the systemd unit to restart the worker after some time?

Actions #6

Updated by coolo almost 9 years ago

  • Status changed from Feedback to Resolved

I agree
