openSUSE Project Management Tool
openQA Project - action #124274: openQA reports non-sporadic issue when retry job just softfailed size:M
okurz (okurz@suse.com), 2023-02-12T20:51:02Z, https://progress.opensuse.org/issues/124274?journal_id=600797
<ul><li><strong>Tags</strong> set to <i>reactive work</i></li><li><strong>Target version</strong> set to <i>Ready</i></li></ul>
mkittler (marius.kittler@suse.com), 2023-02-13T17:12:42Z, https://progress.opensuse.org/issues/124274?journal_id=601211
<ul></ul><p>We have <code>job_done_hook = env host=openqa.suse.de exclude_group_regex='.*(Development|Public Cloud|Released|Others|Kernel|Virtualization).*' grep_timeout=60 nice ionice -c idle /opt/os-autoinst-scripts/openqa-label-known-issues-and-investigate-hook</code> configured <del>so the hook script runs regardless of the job's result. I'm wondering where we take care <em>not</em> to run into the "Investigate retry job: … failed" assumption for passed/softfailed jobs.</del> <del>Since we have no <code>job_done_hook_enable_… = 1</code> settings the hook script is actually only running for <code>failed</code>, <code>incomplete</code> or <code>timeout_exceeded</code> results.</del></p>
<p>Since the job has <code>_TRIGGER_JOB_DONE_HOOK=1</code> the generic hook script is triggered for this particular job after all (regardless of the result). We apparently don't do any extra checks in <code>openqa-label-known-issues-and-investigate-hook</code> to avoid running into the "Investigate retry job: … failed" assumption so this is what's happening. Supposedly we should have an extra check there. I'm not sure where the <code>_TRIGGER_JOB_DONE_HOOK=1</code> job setting comes from and why it was added.</p>
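<p>To illustrate the behaviour described above, here is a minimal Python sketch of when the hook fires. It is a model of this discussion, not openQA's implementation: the function <code>hook_runs</code>, the set <code>DEFAULT_HOOK_RESULTS</code> and the dictionary representation of job settings are hypothetical; only the three default results and the <code>_TRIGGER_JOB_DONE_HOOK=1</code> override are taken from the comments here.</p>

```python
# Results for which the generic hook runs when no per-result
# enable setting is configured (as stated in this thread).
DEFAULT_HOOK_RESULTS = frozenset({"failed", "incomplete", "timeout_exceeded"})


def hook_runs(result, job_settings):
    """Return True if the job-done hook would run for this job.

    Hypothetical model: `job_settings` is a plain dict of the job's
    settings, with values stored as strings.
    """
    # The per-job override triggers the hook regardless of the result.
    if job_settings.get("_TRIGGER_JOB_DONE_HOOK") == "1":
        return True
    # Otherwise only the default "not ok" results trigger it.
    return result in DEFAULT_HOOK_RESULTS


if __name__ == "__main__":
    retry_job = {"_TRIGGER_JOB_DONE_HOOK": "1"}
    print(hook_runs("softfailed", retry_job))  # hook runs due to the override
    print(hook_runs("softfailed", {}))         # hook does not run by default
```

<p>Under these assumptions a softfailed <code>investigate:retry</code> job reaches the hook script, so the script itself has to distinguish the result.</p>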
tinita (tina.mueller+trick-redmine@suse.com), 2023-02-13T17:27:35Z, https://progress.opensuse.org/issues/124274?journal_id=601214
<ul></ul><p>mkittler wrote:</p>
<blockquote>
<p>I'm not sure where the <code>_TRIGGER_JOB_DONE_HOOK=1</code> job setting comes from and why it was added.</p>
</blockquote>
<p>_TRIGGER_JOB_DONE_HOOK=1 was added by me as part of <a class="issue tracker-4 status-3 priority-5 priority-high3 closed child" title="action: Comment about intermittent/sporadic test issues on original job if openqa-investigate retry job p... (Resolved)" href="https://progress.opensuse.org/issues/98862">#98862</a> for <code>investigate:retry</code> jobs.</p>
<p>We need to run the hook script in order to report when a retry job passed.<br>
For that I also needed to enable <code>job_done_hook</code>, and I guess this is now also called for softfailed. Should I rather configure <code>job_done_hook_passed</code> instead?</p>
tinita (tina.mueller+trick-redmine@suse.com), 2023-02-13T17:27:50Z, https://progress.opensuse.org/issues/124274?journal_id=601220
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed child" href="/issues/98862">action #98862</a>: Comment about intermittent/sporadic test issues on original job if openqa-investigate retry job passes size:M</i> added</li></ul>
mkittler (marius.kittler@suse.com), 2023-02-14T10:40:25Z, https://progress.opensuse.org/issues/124274?journal_id=601619
<ul><li><strong>Assignee</strong> set to <i>mkittler</i></li></ul><blockquote>
<p>For that I also needed to enable job_done_hook, and I guess this is now also called for softfailed. Should I rather configure job_done_hook_passed instead?</p>
</blockquote>
<p>I don't think so. The "if openqa-investigate retry job passes" part in <a class="issue tracker-4 status-3 priority-5 priority-high3 closed child" title="action: Comment about intermittent/sporadic test issues on original job if openqa-investigate retry job p... (Resolved)" href="https://progress.opensuse.org/issues/98862">#98862</a> is likely also supposed to include softfails.</p>
<p>I suppose I will just add a check to skip writing these comments for passed/softfailed jobs.</p>
mkittler (marius.kittler@suse.com), 2023-02-14T10:41:01Z, https://progress.opensuse.org/issues/124274?journal_id=601620
<ul><li><strong>Assignee</strong> deleted (<del><i>mkittler</i></del>)</li></ul><p>Or maybe let's estimate it first.</p>
mkittler (marius.kittler@suse.com), 2023-03-09T10:56:05Z, https://progress.opensuse.org/issues/124274?journal_id=611363
<ul></ul><p>It would be good to estimate this with <a class="user active user-mention" href="https://progress.opensuse.org/users/17668">@okurz</a> to clarify whether we can really treat "softfailed" as "passed" here.</p>
okurz (okurz@suse.com), 2023-03-10T06:27:06Z, https://progress.opensuse.org/issues/124274?journal_id=611720
<ul></ul><p>In general, what users commonly expect is that the investigation jobs tell whether the <em>same</em> issue happens again. We make the assumption that if a job fails again, it is likely the same issue, even though that will not generally be true. IMHO that assumption is still fine for the sake of openqa-investigate. Regarding failed vs. softfailed: I assume we only trigger openqa-investigate in the first place for failed jobs, hence we want to know whether retry jobs <em>fail</em>. So in my understanding all jobs with an "ok" result should be treated the same.</p>
okurz (okurz@suse.com), 2023-03-10T11:10:00Z, https://progress.opensuse.org/issues/124274?journal_id=611897
<ul></ul><p>mkittler wrote:</p>
<blockquote>
<p>I suppose I will just add a check to skip writing these comments for passed/softfailed jobs.</p>
</blockquote>
<p>As discussed in the weekly on 2023-03-10, we clarified that we <em>do</em> have the feature to write a comment if a job passes, so we should ensure that any "ok" result is treated the same. Since a softfail effectively means "known issue", the reason for the job failure cannot be the same as the original "new, unreviewed issue".</p>
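<p>The rule agreed on above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the code of <code>openqa-investigate</code>: the function <code>classify_retry</code>, the set <code>OK_RESULTS</code> and the returned labels are hypothetical, and treating every other result as inconclusive is an assumption made for this sketch.</p>

```python
# Results that count as "ok" for a retry job, per the discussion:
# a softfail means "known issue", so it is treated like a pass.
OK_RESULTS = frozenset({"passed", "softfailed"})


def classify_retry(result):
    """Map the result of an investigate:retry job to a verdict label.

    The labels are placeholders for this sketch, not the comment
    text that the real hook script writes.
    """
    if result in OK_RESULTS:
        # The retry did not reproduce the new failure.
        return "likely-sporadic"
    if result == "failed":
        # The retry failed again; assume it is the same issue.
        return "likely-not-sporadic"
    # Assumption: anything else (e.g. incomplete) says nothing either way.
    return "inconclusive"


if __name__ == "__main__":
    for r in ("passed", "softfailed", "failed", "incomplete"):
        print(r, "->", classify_retry(r))
```

<p>Keeping the "ok" results in a single set means passed and softfailed cannot drift apart in how they are handled.</p>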
livdywan (liv.dywan@suse.com), 2023-03-23T10:50:53Z, https://progress.opensuse.org/issues/124274?journal_id=616625
<ul><li><strong>Subject</strong> changed from <i>openQA reports non-sporadic issue when retry job just softfailed</i> to <i>openQA reports non-sporadic issue when retry job just softfailed size:M</i></li><li><strong>Description</strong> updated (<a title="View differences" href="/journals/616625/diff?detail_id=579029">diff</a>)</li><li><strong>Status</strong> changed from <i>New</i> to <i>Workable</i></li></ul>
tinita (tina.mueller+trick-redmine@suse.com), 2023-03-23T14:20:11Z, https://progress.opensuse.org/issues/124274?journal_id=616736
<ul><li><strong>Status</strong> changed from <i>Workable</i> to <i>In Progress</i></li><li><strong>Assignee</strong> set to <i>tinita</i></li></ul>
tinita (tina.mueller+trick-redmine@suse.com), 2023-03-23T14:23:05Z, https://progress.opensuse.org/issues/124274?journal_id=616742
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Feedback</i></li></ul><p><a href="https://github.com/os-autoinst/scripts/pull/221" class="external">https://github.com/os-autoinst/scripts/pull/221</a> Treat softfailed as passed in openqa-investigate</p>
tinita (tina.mueller+trick-redmine@suse.com), 2023-03-24T08:34:14Z, https://progress.opensuse.org/issues/124274?journal_id=616988
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li></ul><p>The PR was merged, resolving. <a class="user active user-mention" href="https://progress.opensuse.org/users/28903">@clanig</a> let us know if you see something unexpected again, thanks</p>