The worker died due to the scheduler not responding
See this issue on openqa.opensuse.org; the log can be found at https://openqa.opensuse.org/tests/55761 (the link stays valid for about 3 months).
#3 Updated by oholecek almost 8 years ago
This https://github.com/os-autoinst/openQA/pull/329 should make the worker stop complaining about undefined vars and quit peacefully. The worker quitting is in fact by design when the scheduler starts to return 4XX result codes.
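To illustrate the quit-on-4XX behaviour described above, here is a minimal sketch in Python. It is not the actual openQA worker code (which is written in Perl); the names `should_quit` and `polls_before_quit` are hypothetical and only model the decision, assuming the worker sees one HTTP status code per scheduler request.

```python
def should_quit(status_code: int) -> bool:
    """Return True for 4XX client errors, where the worker quits by design."""
    return 400 <= status_code < 500


def polls_before_quit(status_codes) -> int:
    """Count the scheduler responses handled before the first 4XX.

    `status_codes` is a hypothetical sequence standing in for the HTTP
    status of each scheduler reply the worker receives.
    """
    handled = 0
    for code in status_codes:
        if should_quit(code):
            break  # quit peacefully instead of retrying with bad credentials
        handled += 1
    return handled
```

In this sketch a 5XX or a successful reply keeps the worker running, while the first 4XX ends the loop.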
To investigate further I would need the scheduler part of the logs. If it weren't on opensuse.o.o, I would guess the Demo account API keys timed out.
#5 Updated by oholecek almost 8 years ago
- Subject changed from The worker died due to the scheduler doesn't responding to The worker died due to the scheduler not responding
- Status changed from In Progress to Feedback
Scheduler logs ruled out API key expiry, so the auth failure must come from an HMAC validation failure. Given the inactivity timeout reported by both worker and scheduler, there were indeed network-related problems (possibly even a MITM attack?).
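For context, a rough sketch of how an HMAC check on a request can fail even when the API key itself is still valid. This is an illustration only, not openQA's actual scheme: the message layout (URL plus timestamp), the SHA-256 digest, the `max_age` window, and the function names are all assumptions made for the example.

```python
import hashlib
import hmac


def sign(secret: str, url: str, timestamp: int) -> str:
    """Compute a hex HMAC over the URL and timestamp (hypothetical layout)."""
    message = f"{url}{timestamp}".encode()
    return hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()


def verify(secret: str, url: str, timestamp: int, signature: str,
           now: int, max_age: int = 300) -> bool:
    """Accept the request only if it is fresh and the signature matches."""
    if abs(now - timestamp) > max_age:
        return False  # too old or too far in the future, e.g. after a long stall
    expected = sign(secret, url, timestamp)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

Under these assumptions, a request altered in transit, or one delayed beyond the freshness window, would be rejected with an auth error even though the key has not expired.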
Because the worker is expected to quit when the scheduler returns 4XX results, I don't think there is anything more to do. Or maybe change the systemd unit to restart the worker after some time?