action #116107

openQA-in-OpenQA openqa_from_containers test fails in build size:M

Added by tinita 3 months ago. Updated 3 months ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Concrete Bugs
Target version:
Start date:
2022-09-01
Due date:
2022-09-16
% Done:
0%

Estimated time:
Difficulty:

Description

Observation

openQA test in scenario openqa-Tumbleweed-dev-x86_64-openqa_from_containers@64bit-2G fails in build

# Test died: command 'for i in webui worker; do retry -s 30 -- docker build openQA/container/$i -t openqa_$i; done' failed at openqa//tests/containers/build.pm line 8.

Test suite description

Maintainer: okurz@suse.de. Test for running openQA itself from containers. To be used with the "openqa" distri. A retry on the job level was introduced due to https://progress.opensuse.org/issues/108665, as sporadic network issues can still occur.

Reproducible

Fails since (at least) Build :TW.12159

Expected result

Last good: :TW.12158 (or more recent)

Further details

Always latest result in this scenario: latest

Suggestions

  • Change the docker call to reveal the exit code
  • Reveal the return code via the testapi
  • Investigate this in a mob session
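A minimal sketch of the first suggestion, assuming a POSIX shell inside the test VM. The helper names run_and_report and build_images are hypothetical and not part of the test distribution; the retry and docker build invocation is copied from the failing command above. Running each build as a separate, labelled step prints the exit code per image instead of hiding it inside one combined loop.

```shell
#!/bin/sh
# Hypothetical helper (not part of os-autoinst-distri-openQA):
# run a command, print its exit code with a label, and pass the code on.
run_and_report() {
    label=$1
    shift
    rc=0
    "$@" || rc=$?
    echo "$label: exit code $rc"
    return "$rc"
}

# Build each image on its own so that a failure can be attributed to one image.
# The retry/docker arguments are the ones from the failing command in this ticket.
build_images() {
    failed=0
    for i in webui worker; do
        run_and_report "build $i" \
            retry -s 30 -- docker build "openQA/container/$i" -t "openqa_$i" \
            || failed=1
    done
    return "$failed"
}
```

Calling build_images would then print one "build webui: exit code N" and one "build worker: exit code N" line, and return non-zero if either build failed.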

History

#1 Updated by tinita 3 months ago

  • Subject changed from openQA-in-OpenQA test fails in to openQA-in-OpenQA test fails in build

#2 Updated by tinita 3 months ago

  • Description updated (diff)

#3 Updated by tinita 3 months ago

  • Subject changed from openQA-in-OpenQA test fails in build to openQA-in-OpenQA openqa_from_containers test fails in build

#4 Updated by cdywan 3 months ago

  • Subject changed from openQA-in-OpenQA openqa_from_containers test fails in build to openQA-in-OpenQA openqa_from_containers test fails in build size:M
  • Description updated (diff)
  • Status changed from New to Workable

#5 Updated by mkittler 3 months ago

  • Category deleted (Concrete Bugs)
  • Target version deleted (Ready)

The video shows that some install scripts failed during the installation with zypper. So while the container build itself exited successfully, the zypper command that ran within it may have returned a non-zero exit code.

#6 Updated by tinita 3 months ago

  • Category set to Concrete Bugs
  • Target version set to Ready

#7 Updated by tinita 3 months ago

Apparently openqa_install+publish also fails, and the error is visible:
https://openqa.opensuse.org/tests/2597042#step/openqa_webui/11
It's about an unsigned repomd.xml file.
Maybe that's also what's happening in the container test and we can't see it in the screenshots? Maybe that's what Marius was able to see in the video?

#8 Updated by mkittler 3 months ago

  • Status changed from Workable to In Progress
  • Assignee set to mkittler

PR for improving the error handling: https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/99

Not sure whether we can fix the root cause. Maybe it is also a good idea to increase the timeout or number of retries.

#9 Updated by openqa_review 3 months ago

  • Due date set to 2022-09-16

Setting due date based on mean cycle time of SUSE QE Tools

#10 Updated by mkittler 3 months ago

  • Status changed from In Progress to Resolved

More than 10 builds have run since the PR was merged, and all of them passed. I'm not sure whether it makes sense to increase the timeout (as mentioned in the previous comment), so I'll just leave it as-is. Note that in the passing jobs the result of each container build is now much easier to see, as each command has its own screenshot at the end. This should make future investigations easier if something goes wrong.
