openSUSE Project Management Tool https://progress.opensuse.org/ 2018-03-29T13:25:24Z
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=107059 2018-03-29T13:25:24Z EDiGiacinto edigiacinto@suse.com
<ul><li><strong>Subject</strong> changed from <i>[tools] Worker goes to infinite loop during upload phase in case of writing failure</i> to <i>[tools] Worker goes to infinite loop during upload of screenshots in case of writing failure</i></li></ul>
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=107068 2018-03-29T13:32:04Z EDiGiacinto edigiacinto@suse.com
<ul><li><strong>Priority</strong> changed from <i>Normal</i> to <i>Low</i></li></ul>
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=107071 2018-03-29T13:34:50Z EDiGiacinto edigiacinto@suse.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/107071/diff?detail_id=106900">diff</a>)</li></ul>
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=109786 2018-04-05T07:26:41Z szarate
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-7 priority-highest closed" href="/issues/34267">action #34267</a>: osd instance unresponsive (HTTP 502)</i> added</li></ul>
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=109789 2018-04-05T07:27:28Z EDiGiacinto edigiacinto@suse.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/109789/diff?detail_id=109669">diff</a>)</li></ul>
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=109792 2018-04-05T07:43:21Z szarate
<ul></ul><p>EDiGiacinto wrote:</p>
<blockquote>
<p>Aside from the disk full issue, imo the worker should not loop forever (maybe set a retry limit?) in case of a persistent error, as it prevented the job from being cleaned up and left the worker in an inconsistent state.</p>
<p>Also possibly: catch errors in all our log_* functions, as they are relied on throughout the code and a failure in them would prevent cleanup phases from being executed.</p>
</blockquote>
<p>+1 </p>
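<p>The two suggestions quoted above (a retry limit for uploads and log functions that cannot abort the caller) can be sketched as follows. This is only an illustration in Python, not openQA code: <code>upload_with_retries</code>, <code>safe_log</code> and <code>UploadFailed</code> are hypothetical names, and the limit of three attempts is an arbitrary assumption.</p>

```python
import logging


class UploadFailed(Exception):
    """Raised when an upload still fails after the retry limit (hypothetical)."""


def safe_log(logger, level, message):
    """Log a message without letting a logging failure reach the caller."""
    try:
        logger.log(level, message)
    except Exception:
        # Assumption: a failed log write (e.g. disk full) must not stop cleanup.
        pass


def upload_with_retries(upload, max_attempts=3, logger=None):
    """Call upload() until it succeeds, giving up after max_attempts tries."""
    logger = logger or logging.getLogger("worker")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return upload()
        except OSError as err:
            last_error = err
            safe_log(logger, logging.WARNING,
                     f"upload attempt {attempt}/{max_attempts} failed: {err}")
    # A persistent error ends the loop instead of retrying forever.
    raise UploadFailed(f"giving up after {max_attempts} attempts") from last_error
```

<p>With a bounded loop, the caller gets an exception after the last attempt and can still run its cleanup phase, even if logging itself is failing.</p>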
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=145508 2018-08-27T08:17:12Z EDiGiacinto edigiacinto@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-6 priority-4 priority-default closed" href="/issues/40220">action #40220</a>: Scheduler died due to Mojo::Log failing to write to log</i> added</li></ul>
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=148394 2018-09-11T11:02:24Z EDiGiacinto edigiacinto@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-6 priority-4 priority-default closed" href="/issues/40862">action #40862</a>: Out of disk space killed the webui (on osd)</i> added</li></ul>
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=149546 2018-09-14T13:10:51Z coolo coolo@suse.com
<ul><li><strong>Target version</strong> set to <i>Current Sprint</i></li></ul>
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=152360 2018-09-26T14:40:05Z mkittler marius.kittler@suse.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li><li><strong>Assignee</strong> set to <i>mkittler</i></li></ul><p>I suppose the worker can be restarted by systemd if it crashes. So I would just make it stop if an unhandled exception occurs (as most scripts/applications do).</p>
<p>PR: <a href="https://github.com/os-autoinst/openQA/pull/1809" class="external">https://github.com/os-autoinst/openQA/pull/1809</a></p>
<p>I tested this by adding a die at certain locations.</p>
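<p>The testing approach mentioned above, forcing a failure at a chosen location, could look roughly like this. It is a Python sketch under assumptions, not the code from the PR: <code>FAIL_AT</code>, <code>maybe_die</code>, <code>run_job</code> and the step names are all hypothetical.</p>

```python
# Hypothetical registry of locations at which a failure should be injected.
FAIL_AT = set()


def maybe_die(location):
    """Raise at the given location if a failure was requested for it."""
    if location in FAIL_AT:
        raise RuntimeError(f"injected failure at {location}")


def run_job(steps):
    """Run the named steps in order; an injected failure stops the job."""
    done = []
    for step in steps:
        maybe_die(step)  # stands in for a temporary "die" placed in real code
        done.append(step)
    return done
```

<p>Adding <code>"upload_screenshots"</code> to <code>FAIL_AT</code> before calling <code>run_job(["setup", "upload_screenshots", "cleanup"])</code> makes the call raise instead of continuing, which is the behaviour one would want to observe in such a test.</p>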
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=152624 2018-09-28T07:07:54Z okurz okurz@suse.com
<ul></ul><p>It's not that "systemd crashes" but I guess systemd can be configured to restart crashing services :)</p>
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=153512 2018-10-02T14:16:06Z EDiGiacinto edigiacinto@suse.com
<ul></ul><p>okurz wrote:</p>
<blockquote>
<p>It's not that "systemd crashes" but I guess systemd can be configured to restart crashing services :)</p>
</blockquote>
<p>In this case the worker went into an infinite loop and the process was still alive, so systemd would not even notice the problem, as it sees the service as running.</p>
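<p>The difference described here can be reproduced with a small Python sketch (an illustration only, not openQA code; <code>run_child</code> is a hypothetical helper). A child process that dies from an unhandled exception exits with a non-zero status, which a service manager can react to (for example systemd with <code>Restart=on-failure</code>), whereas a child stuck in a loop never exits and therefore looks healthy.</p>

```python
import subprocess
import sys


def run_child(code, timeout):
    """Run Python code in a child process and report what a supervisor sees."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            timeout=timeout,
            stderr=subprocess.DEVNULL,  # hide the child's traceback
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child once the timeout expires.
        return "still running"
    return f"exited with status {result.returncode}"


# A worker that stops on an unhandled exception versus one that loops forever.
crashing = "raise RuntimeError('log write failed')"
looping = "while True: pass"
```

<p>Calling <code>run_child(crashing, timeout=30)</code> reports an exit status, while <code>run_child(looping, timeout=1)</code> reports that the process is still running when the timeout expires.</p>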
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=153521 2018-10-02T15:46:53Z mkittler marius.kittler@suse.com
<ul></ul><p>I suppose <a class="user active user-mention" href="https://progress.opensuse.org/users/17668">@okurz</a> thought "it" in "[...] the worker can be restarted by systemd if it crashes." refers to systemd. (But of course it refers to the worker.)</p>
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=154877 2018-10-09T14:28:52Z mkittler marius.kittler@suse.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Feedback</i></li></ul><p>PR is merged. Let's see how well it works in production.</p>
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=160502 2018-10-26T13:03:55Z mkittler marius.kittler@suse.com
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li></ul><p>Seems like it didn't break anything. If logging fails in production again and my changes turn out to be insufficient, we can reopen the ticket.</p>
openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure https://progress.opensuse.org/issues/34042?journal_id=165515 2018-11-16T12:55:27Z coolo coolo@suse.com
<ul><li><strong>Target version</strong> changed from <i>Current Sprint</i> to <i>Done</i></li></ul>