action #152281

closed

coordination #102915: [saga][epic] Automated classification of failures

QA (public) - coordination #94105: [epic] Use feedback from openqa-investigate to automatically inform on github pull requests, open tickets, weed out automatically failed tests

Schedule openQA SLE maintenance bisect jobs with lower priority same as openqa-investigate

Added by okurz about 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Target version:
Start date: 2023-12-08
Due date:
% Done: 0%
Estimated time:
Description

Motivation

In openqa-investigate we already add +100 to the prio value so that production jobs take precedence. For openqa-trigger-bisect-jobs (https://github.com/os-autoinst/scripts/blob/master/openqa-trigger-bisect-jobs) we do not do that yet, which leads to problems as mentioned by mgriessmeier in https://suse.slack.com/archives/C02CANHLANP/p1702031247953039. So we should also ensure that we can adjust the priority of generated jobs in openqa-trigger-bisect-jobs.

Acceptance criteria

  • AC1: All jobs created by openqa-trigger-bisect-jobs have a prio value of at least 100

Suggestions

Actions #1

Updated by okurz about 1 year ago

  • Parent task set to #94105
Actions #2

Updated by mkittler about 1 year ago

  • Status changed from New to In Progress
Actions #3

Updated by mkittler about 1 year ago

  • Status changed from In Progress to Feedback
Actions #4

Updated by okurz about 1 year ago

  • Assignee set to mkittler
Actions #5

Updated by okurz about 1 year ago

  • Status changed from Feedback to Workable

https://openqa.suse.de/tests?match=:investigate: shows multiple jobs with the default prio 50, e.g. https://openqa.suse.de/tests/13042757, which is "qam_alpha_supportserver:investigate:last_good_tests_and_build:f69e77d29d96cab7c9a3e18c5cd2cfb73f371ee4+20231208-1", so not triggered by the bisect script but related. I assume this is related to jobs in a multi-machine cluster where maybe only the initial job gets the +100 prio value and the others do not. That problem might apply to both openqa-investigate and openqa-bisect.

Actions #6

Updated by mkittler about 1 year ago

  • Status changed from Workable to In Progress

No, this problem only applies to openqa-investigate.

In openqa-trigger-bisect-jobs I implemented setting the prio properly via a loop over all jobs:

        for job_id in sorted(created_job_ids):
            log.info(f"Created {job_id}")
            created += f"* **{test_name}**: {base_url}/t{job_id}\n"
            openqa_set_job_prio(job_id, args.url, prio, args.dry_run)
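
For illustration, the same per-job idea as a self-contained sketch. The function name `set_prio_for_all_jobs` and the injected `set_prio` callable are hypothetical stand-ins, not the script's actual helpers; the point is only that every created job id gets the new prio, not just the first one.

```python
from typing import Callable, Iterable, List


def set_prio_for_all_jobs(
    created_job_ids: Iterable[int],
    prio: int,
    set_prio: Callable[[int, int], None],
) -> List[int]:
    """Apply `prio` to every created job and return the ids that were updated.

    `set_prio` is a hypothetical callable (job_id, prio) standing in for
    whatever actually talks to openQA, e.g. a PUT on jobs/<id>.
    """
    updated = []
    for job_id in sorted(created_job_ids):
        set_prio(job_id, prio)
        updated.append(job_id)
    return updated


if __name__ == "__main__":
    # Made-up job ids; the setter only records the calls instead of contacting a server.
    calls = []
    set_prio_for_all_jobs({2002, 2001}, 150, lambda job_id, prio: calls.append((job_id, prio)))
    print(calls)
```

Passing the setter in as a callable keeps the loop testable without a running openQA instance.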

Only in openqa-investigate do we have code that does not take into account that we might have cloned multiple jobs:

    # output: { "$id": $clone_id }
    clone_id=$(echo "$out" | runjq -r ".\"$id\"")
    # Create markdown list entry
    echo "* *$name*: t#$clone_id"
    "${client_call[@]}" --json --data "{\"priority\": $((base_prio + prio_add))}" -X PUT jobs/"$clone_id" >/dev/null
Actions #7

Updated by mkittler about 1 year ago

  • Status changed from In Progress to Feedback
Actions #8

Updated by mkittler about 1 year ago

  • Status changed from Feedback to Resolved

The PR has been merged and it looks like it works in production, e.g. https://openqa.suse.de/tests/13098102 and https://openqa.suse.de/tests/13098107.

There are jobs with prio 80 but those have likely been changed manually (because 80 is also not the default prio):

openqa=> select id, priority from jobs where state = 'scheduled' and priority < 150 and test like '%:investigate:%' limit 10;
    id    | priority 
----------+----------
 13097441 |       80
 13097440 |       80
(2 rows)
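
The same check can be sketched in Python for rows that were already fetched. This assumes the expected prio is 150 (default 50 plus the +100 offset); the function name `jobs_below_prio` and the sample rows are hypothetical.

```python
from typing import Iterable, List, Tuple


def jobs_below_prio(
    rows: Iterable[Tuple[int, int]], expected_prio: int = 150
) -> List[Tuple[int, int]]:
    """Return the (id, priority) rows whose priority is below `expected_prio`."""
    return [(job_id, prio) for job_id, prio in rows if prio < expected_prio]


if __name__ == "__main__":
    # Made-up rows: one job with the raised prio, one manually changed, one at the default.
    sample = [(1, 150), (2, 80), (3, 50)]
    print(jobs_below_prio(sample))
```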