action #97733

Bot fails on Failed to query latest publiccloud tools image using {settings['PUBLICCLOUD_TOOLS_IMAGE_QUERY']} and no aggregates are scheduled

Added by pdostal about 2 months ago. Updated about 2 months ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Target version:
Start date:
2021-08-30
Due date:
% Done:
0%

Estimated time:

Description

Here is an example of a failing run.

The PUBLICCLOUD_TOOLS_IMAGE_QUERY variable is set to https://openqa.suse.de/group_overview/276.json.
This variable is passed as a parameter to the get_latest_tools_image function, which then returns publiccloud_tools_0020.qcow2.

History

#1 Updated by pdostal about 2 months ago

Here is a possible hotfix.

#2 Updated by pdostal about 2 months ago

  • Priority changed from Normal to Urgent

#3 Updated by okurz about 2 months ago

  • Assignee set to okurz
  • Target version set to Ready

pdostal https://progress.opensuse.org/issues/97733 sounds really like a "QE Container & Public Cloud" internal issue. I don't know how SUSE QE Tools is supposed to help here?

#4 Updated by jbaier_cz about 2 months ago

Initial investigation

A subsequent run successfully scheduled at least the rest of the jobs. The public cloud error remains, now with a more specific error message:
ERROR: Failed to query latest publiccloud tools image using https://openqa.suse.de/group_overview/276.json
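The literal `{settings['PUBLICCLOUD_TOOLS_IMAGE_QUERY']}` in the issue title, compared with the real URL in the message above, looks like the difference between a plain Python string and an f-string. The thread does not show the bot's code, so this is only an assumption about the message formatting; the `settings` dictionary below is a stand-in built from the value given in the description.

```python
# Stand-in for the bot's settings; the value is taken from the issue description.
settings = {
    "PUBLICCLOUD_TOOLS_IMAGE_QUERY": "https://openqa.suse.de/group_overview/276.json"
}

# Plain string: the placeholder is not evaluated and appears literally in the log.
plain = "Failed to query latest publiccloud tools image using {settings['PUBLICCLOUD_TOOLS_IMAGE_QUERY']}"

# f-string: the placeholder is replaced by the configured URL.
formatted = f"Failed to query latest publiccloud tools image using {settings['PUBLICCLOUD_TOOLS_IMAGE_QUERY']}"

print(plain)
print(formatted)
```

The first line printed matches the wording in the issue title, the second matches the more specific message.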

#5 Updated by jbaier_cz about 2 months ago

okurz wrote:

pdostal https://progress.opensuse.org/issues/97733 sounds really like a "QE Container & Public Cloud" internal issue. I don't know how SUSE QE Tools is supposed to help here?

It is actually a bug in the bot; I just found the issue. The bug was introduced by https://gitlab.suse.de/qa-maintenance/bot-ng/-/commit/317bf0bbc011a6b1ce3a07de06d93ea0f430fa37

#6 Updated by jbaier_cz about 2 months ago

  • Status changed from New to Resolved
  • Assignee changed from okurz to jbaier_cz

#7 Updated by jbaier_cz about 2 months ago

  • Status changed from Resolved to Feedback

#8 Updated by jbaier_cz about 2 months ago

  • Status changed from Feedback to Resolved

I see several possible improvements here that we could evaluate:

  1. The CI pipeline is not yet ideal. As there are a lot of runs, we hit https://progress.opensuse.org/issues/96827 quite often, which unfortunately hides some of the problems.
  2. For public cloud, there are at least three different prefixes for variables: PUBLICCLOUD_, PUBLIC_CLOUD, PC_; that should be unified.
  3. It would be nice to have at least a basic test suite to better distinguish between metadata errors and code errors.
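As a rough illustration of point 3, a test suite could rely on a dedicated exception type for bad input data, so that anything else surfacing from the code is treated as a code error. This is a minimal sketch, not the bot's implementation: `MetadataError`, `parse_tools_image`, and the `image` field are hypothetical names chosen for the example.

```python
import json


class MetadataError(Exception):
    """Hypothetical: the input data is missing or malformed."""


def parse_tools_image(raw: str) -> str:
    """Hypothetical helper: extract an image name from a JSON document.

    The 'image' field is an assumption made for this sketch only.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise MetadataError(f"response is not valid JSON: {exc}") from exc
    try:
        return data["image"]
    except (KeyError, TypeError) as exc:
        raise MetadataError("response has no 'image' field") from exc


def test_bad_metadata_is_reported_as_metadata_error():
    # Any other exception type escaping here would point to a code error.
    for raw in ("not json", "{}", "[]"):
        try:
            parse_tools_image(raw)
        except MetadataError:
            continue
        raise AssertionError(f"expected MetadataError for {raw!r}")
```

With this split, a test that fails with `MetadataError` points at the data, while a `NameError` or `AttributeError` points at the code.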

#9 Updated by pdostal about 2 months ago

  • Assignee deleted (jbaier_cz)
  • Target version deleted (Ready)

jbaier_cz wrote:

  2. For public cloud, there are at least three different prefixes for variables: PUBLICCLOUD_, PUBLIC_CLOUD, PC_; that should be unified.

I created #97742 for this.

#10 Updated by pdostal about 2 months ago

The bug was in the public-cloud-specific part of the bot, but it affected all aggregates.

Thank you jbaier_cz for the fix!
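One way to keep a failure in one specific part from blocking everything else is to isolate each part behind its own error handling. The following is only a sketch under assumptions: `load_all` and the loader names are hypothetical and do not reflect how the bot is actually structured.

```python
import logging

log = logging.getLogger("bot.sketch")


def load_all(loaders):
    """Run every loader; a failing one is logged and skipped, not fatal."""
    results, failed = {}, []
    for name, loader in loaders.items():
        try:
            results[name] = loader()
        except Exception:
            # Record the failure with its traceback and continue with the rest.
            log.exception("Loader %s failed, continuing with the rest", name)
            failed.append(name)
    return results, failed


def broken_publiccloud_loader():
    # Hypothetical stand-in for the failing public cloud query.
    raise RuntimeError("Failed to query latest publiccloud tools image")


results, failed = load_all({
    "publiccloud": broken_publiccloud_loader,
    "other": lambda: ["job"],
})
```

The list of failed loaders is returned rather than swallowed, so the caller can still mark the run as failed after scheduling the unaffected parts.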

#11 Updated by pdostal about 2 months ago

  • Assignee set to jbaier_cz
  • Target version set to Ready
