openSUSE Project Management Tool (https://progress.opensuse.org/)
openQA Infrastructure - action #89419: Incomplete jobs after OSD deployment
https://progress.opensuse.org/issues/89419?journal_id=391211
2021-03-12T14:01:27Z, mkittler (marius.kittler@suse.com)
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Rejected</i></li><li><strong>Assignee</strong> set to <i>mkittler</i></li></ul><p>Incomplete jobs with the reason <code>quit: worker has been stopped or restarted</code> are <em>expected</em> when a job has been cancelled because the worker was restarted. The jobs you've found have also been cloned automatically, which is expected as well. Hence this behavior does not trigger the alerts either.</p>
<p>The only odd thing is that I've changed the deployment to avoid this kind of interruption, so jobs like these shouldn't appear anymore unless someone really stops a worker manually. The jobs you've found were stopped because <code>openqa-worker.target</code> was still active at the time on <code>grenache-1</code>, as that host hasn't been restarted recently:</p>
<pre><code>martchus@grenache-1:~> systemctl status openqa-worker.target
● openqa-worker.target - openQA Worker
Loaded: loaded (/usr/lib/systemd/system/openqa-worker.target; disabled; vendor preset: disabled)
Active: inactive (dead) since Wed 2021-03-03 16:18:21 CET; 1 weeks 1 days ago
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
martchus@grenache-1:~> uptime
14:56:04 40 Tage 11:21 an, 1 Benutzer, Durchschnittslast: 1,43, 1,39, 1,50
</code></pre>
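<p>A minimal sketch for running the same check on another worker host, assuming only that <code>systemctl is-active</code> is available there. The function names <code>worker_target_state</code> and <code>report_worker_target</code> are made up for illustration and are not part of openQA:</p>

```shell
#!/bin/sh
# Hypothetical helpers (not part of openQA): report the state of
# openqa-worker.target on the current host.

worker_target_state() {
    # `systemctl is-active` prints the unit state (e.g. "active", "inactive")
    # and exits non-zero unless the unit is active, hence the `|| true`.
    state=$(systemctl is-active openqa-worker.target 2>/dev/null) || true
    # Fall back to "unknown" if nothing was printed (e.g. no systemd here).
    printf '%s\n' "${state:-unknown}"
}

report_worker_target() {
    state=$(worker_target_state)
    if [ "$state" = "active" ]; then
        echo "openqa-worker.target is still active"
    else
        echo "openqa-worker.target is $state"
    fi
}

report_worker_target
```

<p>On a host in the state shown above, this should report the target as inactive.</p>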
<p>But the target is inactive (dead) now, and I've also taken care of the other workers in the meantime. So there's really nothing to improve here at this point.</p>