openSUSE Project Management Tool
openQA Project - action #7190: The worker died due to the scheduler not responding | https://progress.opensuse.org/issues/7190?journal_id=15672 | 2015-04-09T12:25:55Z | mlin7442 (mlin@suse.com)
<ul></ul><p>It looks like I pasted the wrong log; the correct log is here: <a href="http://susepaste.org/95635732" class="external">http://susepaste.org/95635732</a></p>
openQA Project - action #7190: The worker died due to the scheduler not responding | https://progress.opensuse.org/issues/7190?journal_id=15676 | 2015-04-09T12:33:02Z | mlin7442 (mlin@suse.com)
<ul><li><strong>Subject</strong> changed from <i>The worker died due to the scheduler doesn't reponding</i> to <i>The worker died due to the scheduler doesn't responding</i></li></ul>
openQA Project - action #7190: The worker died due to the scheduler not responding | https://progress.opensuse.org/issues/7190?journal_id=15686 | 2015-04-09T13:48:38Z | oholecek (oholecek@suse.com)
<ul></ul><p>This pull request, <a href="https://github.com/os-autoinst/openQA/pull/329" class="external">https://github.com/os-autoinst/openQA/pull/329</a>, should make the worker stop complaining about undefined variables and quit gracefully. The worker quitting is in fact by design once the scheduler starts returning 4XX status codes.</p>
<p>To investigate further, I would need the scheduler part of the logs. If this weren't on opensuse.o.o, I would guess that the Demo account API keys had expired.</p>
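<p>The "quit by design on 4XX" behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the actual openQA worker code; <code>run_worker</code>, <code>is_client_error</code>, and the <code>poll</code> callable are hypothetical names introduced for this example.</p>

```python
from typing import Callable


def is_client_error(status: int) -> bool:
    """Return True for HTTP 4XX status codes."""
    return 400 <= status < 500


def run_worker(poll: Callable[[], int], max_polls: int = 100) -> str:
    """Poll the scheduler and stop cleanly on the first 4XX answer.

    `poll` is a hypothetical callable that performs one request and
    returns its HTTP status code. The function returns a short reason
    string instead of raising, mirroring a graceful shutdown.
    """
    for _ in range(max_polls):
        status = poll()
        if is_client_error(status):
            # The scheduler rejected the request (for example, failed auth),
            # so the worker gives up rather than retrying forever.
            return f"quit: scheduler returned {status}"
    return "stopped: poll limit reached"
```

<p>In this sketch a 5XX answer or a successful response keeps the loop going; only a 4XX answer ends it.</p>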
openQA Project - action #7190: The worker died due to the scheduler not responding | https://progress.opensuse.org/issues/7190?journal_id=15694 | 2015-04-10T07:50:18Z | mlin7442 (mlin@suse.com)
<ul><li><strong>File</strong> <a href="/attachments/1406">openqa-20150409.xz</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/1406/openqa-20150409.xz">openqa-20150409.xz</a> added</li><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li><li><strong>Assignee</strong> set to <i>oholecek</i></li><li><strong>Target version</strong> set to <i>Sprint 16</i></li></ul><p>Assigning to Ondřej.</p>
<p>I have also attached the openQA log.</p>
openQA Project - action #7190: The worker died due to the scheduler not responding | https://progress.opensuse.org/issues/7190?journal_id=15698 | 2015-04-10T12:49:34Z | oholecek (oholecek@suse.com)
<ul><li><strong>Subject</strong> changed from <i>The worker died due to the scheduler doesn't responding</i> to <i>The worker died due to the scheduler not responding</i></li><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Feedback</i></li></ul><p>The scheduler logs rule out API key expiry, so the authentication failure must come from a failed HMAC validation. Given the inactivity timeouts reported by both the worker and the scheduler, there were indeed network-related problems (could it even have been a MITM attack?).</p>
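<p>For readers unfamiliar with HMAC-signed API requests, here is a minimal Python sketch of how such a validation can fail even when the key itself is still valid. It is not openQA's actual scheme: the message layout (URL plus timestamp), the SHA-256 digest, the 300-second window, and the names <code>sign</code> and <code>verify</code> are all assumptions made for illustration.</p>

```python
import hashlib
import hmac


def sign(secret: bytes, url: str, timestamp: int) -> str:
    """Compute a hex HMAC over the request URL and timestamp (assumed layout)."""
    message = f"{url}{timestamp}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, url: str, timestamp: int, signature: str,
           now: int, max_skew: int = 300) -> bool:
    """Accept the request only if it is fresh and the signature matches."""
    if abs(now - timestamp) > max_skew:
        # A request delayed too long (for example, by network trouble)
        # is rejected even though the key is valid.
        return False
    expected = sign(secret, url, timestamp)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

<p>Under these assumptions, a request that is altered in transit or arrives outside the accepted time window fails validation, and the server would answer with a 4XX status.</p>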
<p>Because the worker is expected to quit when the scheduler returns 4XX results, I don't think there is anything more to do. Or should we perhaps change the systemd unit to restart the worker after some time?</p>
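<p>The systemd idea could look something like the drop-in below. This is only a sketch: the unit name <code>openqa-worker@.service</code>, the drop-in path, and the 300-second delay are assumptions, not settings taken from this ticket. <code>Restart=always</code> is used because a worker that quits gracefully may exit with status 0, which <code>Restart=on-failure</code> would not restart.</p>

```ini
# /etc/systemd/system/openqa-worker@.service.d/restart.conf
# (hypothetical path and unit name)
[Service]
# Restart the worker regardless of its exit status.
Restart=always
# Wait 300 seconds before restarting (value chosen arbitrarily here).
RestartSec=300
```

<p>After adding such a drop-in, <code>systemctl daemon-reload</code> is needed for systemd to pick up the change.</p>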
openQA Project - action #7190: The worker died due to the scheduler not responding | https://progress.opensuse.org/issues/7190?journal_id=15760 | 2015-04-23T08:14:48Z | coolo (coolo@suse.com)
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li></ul><p>I agree</p>