action #154900

open

Statistic frequency for the support images issues

Added by leli 3 months ago. Updated about 1 month ago.

Status:
New
Priority:
Low
Assignee:
-
Target version:
-
Start date:
2024-02-05
Due date:
% Done:

0%

Estimated time:

Description

Motivation

After the removal of the copied support images in OSD hdd/fixed/, we should track how often support image issues occur, such as incomplete images, images that were published but then cleaned up, and other problems. We can collect the problems we find in the comments, then evaluate their frequency and decide how to fix them.

Acceptance criteria

AC1: The frequency of support image issues is tracked.

Actions #1

Updated by leli 3 months ago

  • Description updated (diff)
Actions #2

Updated by syrianidou_sofia about 2 months ago

s390x published support images deleted 9 hours after publishing

Actions #3

Updated by lmanfredi about 2 months ago

A small automation (bash script) that gathers all missing qcow2 images by searching the incomplete jobs of all our groups:

#!/usr/bin/env bash
set -e

function GET_RESULT_INCOMPLETE_ALL_GROUPS() {
    declare -a groups=(
      265 # SLE15 / Migration
      535 # YaST & MMU
      510 # Migration
      129 # SLE15 / YaST
      421 # YaST MU
      478 # Migration Misc.
      266 # Migration Milestone
      520 # Yam Support Image
    )

    local params=''
    for group in "${groups[@]}"; do
      params="$params&groupid=$group"
    done

    local API_URL="https://openqa.suse.de/api/v1/jobs/overview?result=incomplete$params"
    declare -a ids=(
        $(curl -k -X GET "$API_URL" 2>/dev/null | jq '.[].id')
    )

    declare -A dict_qcow=()
    for id in "${ids[@]}"; do
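        # Extract the first *.qcow2 name from the job reason; yields nothing if the reason has none
        qcow=$(curl -k -X GET "https://openqa.suse.de/api/v1/jobs/${id}/details" 2>/dev/null | jq -r '.job.reason // "" | [scan("\\S+\\.qcow2")] | first // empty')
        # qcow=$(curl -k -X GET "https://openqa.suse.de/api/v1/jobs/${id}/details" 2>/dev/null | jq -r '.job.reason' | perl -ne 'print "$1" if /Failed to download (\S+)/')
        # Skip jobs whose reason does not mention a qcow2 image (an empty key would abort the script)
        [[ -n "$qcow" ]] && dict_qcow["$qcow"]=1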
    done

    local IFS=$'\n'
    s_qcow=($(sort <<<"${!dict_qcow[*]}"))
    for qcow in "${s_qcow[@]}"; do
        echo "- ${qcow}"
    done

}

GET_RESULT_INCOMPLETE_ALL_GROUPS

Actions #4

Updated by syrianidou_sofia about 2 months ago

We could write a script that lists every PUBLISH_HDD asset from successful jobs in our job groups and then checks /var/lib/openqa/factory/hdd/ on OSD to see which ones have been deleted.
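A minimal sketch of that idea, split so the file check can be tried without network access. The function names are hypothetical, and it assumes the job JSON exposes settings under .job.settings (as the scripts in this ticket do) and that the script runs on a host where the hdd directory is readable:

```bash
#!/usr/bin/env bash
# Sketch only: function names and the default directory argument are
# illustrative assumptions, not part of any existing openQA tooling.

OPENQA_HOST="https://openqa.suse.de"

# Print the PUBLISH_HDD* values of one job, one per line.
# Assumes the job JSON keeps its settings under .job.settings.
published_assets_of_job() {
    local id=$1
    curl -s "$OPENQA_HOST/api/v1/jobs/${id}" |
        jq -r '.job.settings // {} | to_entries[]
               | select(.key | startswith("PUBLISH_HDD")) | .value'
}

# Read asset names from stdin and print those missing from the given directory.
list_missing_assets() {
    local hdd_dir=${1:-/var/lib/openqa/factory/hdd}
    local asset
    while IFS= read -r asset; do
        [[ -z "$asset" ]] && continue            # ignore blank lines
        [[ -e "$hdd_dir/$asset" ]] || echo "$asset"
    done
}
```

For a list of passed job ids this could be combined as `for id in "${ids[@]}"; do published_assets_of_job "$id"; done | sort -u | list_missing_assets`. How to obtain those ids is left open here.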

Actions #5

Updated by JERiveraMoya about 2 months ago

syrianidou_sofia wrote in #note-2:

s390x published support images deleted 9 hours after publishing

The only root cause we found was that maintenance had been triggered several times since the day before.

Actions #6

Updated by JERiveraMoya about 2 months ago

syrianidou_sofia wrote in #note-4:

We could write a script that lists every PUBLISH_HDD asset from successful jobs in our job groups and then checks /var/lib/openqa/factory/hdd/ on OSD to see which ones have been deleted.

The goal of this ticket was to keep track of the problem and identify root causes, but I see that you are going one step further :) with automation to detect when this happens. I believe this problem happens in circumstances too rare to justify investing much time in it, but if you find it interesting, we should investigate WHERE to put those scripts, because I would guess that the tools team already has some solution for monitoring jobs and assets.

Actions #7

Updated by JERiveraMoya about 1 month ago

  • Priority changed from Normal to Low
Actions #8

Updated by tinawang123 about 1 month ago

Missing qcow image: "/var/lib/openqa/cache/openqa.suse.de/autoyast_SLEHPC-15-SP3-aarch64-DEV-gnome-defpatterns-updated.qcow2" failed: 404 Not Found
Failed job: https://openqa.suse.de/tests/13856251

Actions #9

Updated by lmanfredi about 1 month ago

As suggested by Sofia in #note-4, this version also lists the jobs whose PUBLISH_HDD_1 matches a missing image:

#!/usr/bin/env bash
set -e

function SEARCH_INCOMPLETE_ALL_GROUPS() {
    declare -a groups=(
      265 # SLE15 / Migration
      535 # YaST & MMU
      510 # Migration
      129 # SLE15 / YaST
      421 # YaST MU
      478 # Migration Misc.
      266 # Migration Milestone
      520 # Yam Support Image
    )

    local params=''
    for group in "${groups[@]}"; do
      params="$params&groupid=$group"
    done

    local API_URL="https://openqa.suse.de/api/v1/jobs/overview?result=incomplete$params"
    declare -a ids=(
        $(curl -k -X GET "$API_URL" 2>/dev/null | jq '.[].id')
    )

    echo "# List of incomplete jobs:"

    declare -A dict_qcow=()
    for id in "${ids[@]}"; do
        json="$(curl -k -X GET "https://openqa.suse.de/api/v1/jobs/${id}" 2>/dev/null)"
        reason="$(echo "$json" | jq -r '.job.reason')"
        echo -e "https://openqa.suse.de/tests/$id\t[$reason]"
        qcow=$(echo "$reason" | perl -ne 'print "$1" if /Failed to download (\S+)/')
        [[ -n "$qcow" ]] && dict_qcow["$qcow"]=1
    done

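    # Nothing failed to download: stop here (s_qcow is only filled further down, so test dict_qcow)
    [[ ${#dict_qcow[@]} -eq 0 ]] && return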

    echo -e "\n# Failed to download:"

    local IFS=$'\n'
    s_qcow=($(sort <<<"${!dict_qcow[*]}"))

    for qcow in "${s_qcow[@]}"; do
        echo "${qcow}"
    done

    echo -e "\n# Jobs to restart:"

    declare -a ids=(
      $(curl -k -X GET "https://openqa.suse.de/api/v1/jobs/overview?groupid=520" 2>/dev/null | jq '.[].id')
    )
    # declare -a ids=(
    #   $(curl -k -X GET "https://openqa.suse.de/api/v1/jobs/overview?groupid=520&groupid=446&groupid=265&groupid=129" 2>/dev/null | jq '.[].id')
    # )

    for id in "${ids[@]}"; do
      json="$(curl -k -X GET "https://openqa.suse.de/api/v1/jobs/${id}" 2>/dev/null)"
      PUBLISH_HDD_1="$(echo "$json" | jq -r '.job.settings.PUBLISH_HDD_1 | select( . != null)')"
      TEST="$(echo "$json" | jq -r '.job.settings.TEST | select( . != null)')"
      if [ -n "$PUBLISH_HDD_1" ]; then
        for qcow in "${s_qcow[@]}"; do
            [ "$PUBLISH_HDD_1" == "$qcow" ] && echo "[$TEST]: https://openqa.suse.de/tests/$id for $qcow"
        done
      fi
    done

}

SEARCH_INCOMPLETE_ALL_GROUPS
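Both scripts above only record whether an image was missing, not how often. Since this ticket is about frequency, a rough sketch of a counter follows. It reads job reasons from stdin, one per line, and assumes they contain the same "Failed to download <image>" text that the perl expression above matches. The helper name count_failed_downloads is hypothetical:

```bash
#!/usr/bin/env bash
# Sketch only: count_failed_downloads is a hypothetical helper name.

# Read job reasons from stdin (one per line) and print "<count> <image>"
# for every image named in a "Failed to download <image>" message,
# most frequent first.
count_failed_downloads() {
    local -A count=()
    local reason image
    local re='Failed to download ([^[:space:]]+)'
    while IFS= read -r reason; do
        if [[ $reason =~ $re ]]; then
            image=${BASH_REMATCH[1]}
            # Increment the per-image counter, starting from 0 on first sight
            count[$image]=$(( ${count[$image]:-0} + 1 ))
        fi
    done
    for image in "${!count[@]}"; do
        printf '%d %s\n' "${count[$image]}" "$image"
    done | sort -k1,1nr -k2
}
```

It could be fed by printing "$reason" for each incomplete job in the loop above and piping that output into count_failed_downloads.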
