action #104262

closed

o3 responds with 500 on softfailed for job that "failed to load modules"

Added by livdywan over 2 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
Regressions/Crashes
Target version:
Start date:
2021-12-22
Due date:
% Done:
0%
Estimated time:

Description

Observation

The openqa-review pipeline failed like so:

++ /usr/bin/openqa-review --host https://openqa.opensuse.org -n -r -T --query-issue-status --no-empty-sections --include-softfails --running-threshold=2 --exclude-job-groups '^(Released|Development|old|EOL)' --reminder-comment-on-issues --save --save-dir /tmp/tmp.JYZsgebM6
[...]
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openqa.opensuse.org', port=443): Max retries exceeded with url: /api/v1/jobs/2095686/details (Caused by ResponseError('too many 500 error responses'))
[...]
    raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPSConnectionPool(host='openqa.opensuse.org', port=443): Max retries exceeded with url: /api/v1/jobs/2095686/details (Caused by ResponseError('too many 500 error responses'))
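
The traceback shows the client retrying the same URL until it gives up because every attempt returned 500. The following is a minimal sketch of that retry-then-fail pattern, not the actual urllib3 or openqa-review implementation. The `fetch` callable, `get_with_retries` and `TooManyServerErrors` are hypothetical names introduced only for illustration.

```python
class TooManyServerErrors(Exception):
    """Raised when every attempt returned a retryable 5xx status."""


def get_with_retries(fetch, url, max_attempts=3,
                     retry_statuses=(500, 502, 503, 504)):
    """Call fetch(url) until it returns a status that is not retryable.

    `fetch` is a hypothetical callable returning (status, body); it is
    injected by the caller and is not part of openqa-review.
    """
    last_status = None
    for _ in range(max_attempts):
        status, body = fetch(url)
        if status not in retry_statuses:
            return status, body
        last_status = status
    # All attempts were answered with a retryable status: give up.
    raise TooManyServerErrors(
        f"Max retries exceeded with url: {url} (last status {last_status})"
    )
```

A server that consistently answers 500, as o3 did for this job, exhausts the attempts, so retrying on the client side cannot help.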

Suggestions

  • Re-try the job and see if the same problem occurs
  • Investigate why openqa.opensuse.org responded with 500

Related issues 1 (0 open, 1 closed)

Related to openQA Project - action #102332: Unable to read *.json: Can't open file in o3 openQA logs /var/log/openqa size:M | Resolved | kraih | 2021-11-12 | 2022-01-28

Actions #1

Updated by livdywan over 2 years ago

  • A re-try did not make a difference and the response on the affected jobs is consistent.
  • https://openqa.opensuse.org/api/v1/jobs/2095686/details yields 500
  • https://openqa.opensuse.org/tests/2095686 softfailed and shows Unable to load test modules
  • Previous jobs seem to be unaffected, e.g. https://openqa.opensuse.org//api/v1/jobs//2093638 is softfailed but the API route responds normally
  • For reference, jobs in other groups, i.e. not Tumbleweed-MicroOS-Image-ContainerHost, also look unaffected regardless of job result
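
The comparison above can be repeated for any set of job IDs. This is a rough sketch using only the Python standard library. It assumes the `/api/v1/jobs/<id>/details` route from the traceback is the one to probe for every job, and the helper names `details_status` and `compare_jobs` are made up for this example.

```python
import urllib.error
import urllib.request


def details_status(job_id, host="https://openqa.opensuse.org",
                   opener=urllib.request.urlopen):
    """Return the HTTP status of the details route for one job."""
    url = f"{host}/api/v1/jobs/{job_id}/details"
    try:
        with opener(url, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the status code is what we want.
        return err.code


def compare_jobs(job_ids, **kwargs):
    """Map each job ID to the status of its details route."""
    return {job_id: details_status(job_id, **kwargs) for job_id in job_ids}
```

For example, `compare_jobs([2095686, 2093638])` would query both jobs mentioned above. The `opener` parameter exists only so the function can be exercised without network access.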
Actions #2

Updated by livdywan over 2 years ago

  • Subject changed from openqa-review pipeline failed with RetryError: HTTPSConnectionPool(host='openqa.opensuse.org', port=443): Max retries exceeded with url to o3 responds with 500 on softfailed for job that "failed to load modules"

Renaming the ticket since this is an openQA issue on o3 which simply surfaced in the openqa-review pipeline.

Actions #3

Updated by okurz over 2 years ago

  • Tracker changed from coordination to action
  • Category set to Regressions/Crashes
  • Target version set to Ready

Maybe related to #102332?

Actions #4

Updated by okurz over 2 years ago

  • Related to action #102332: Unable to read *.json: Can't open file in o3 openQA logs /var/log/openqa size:M added
Actions #5

Updated by okurz over 2 years ago

The string "Unable to load" comes from the openQA JavaScript code:

assets/javascripts/test_result.js:        document.createTextNode('Unable to load ' + (tabConfig.descriptiveName || tabName) + '.')
Actions #6

Updated by okurz over 2 years ago

  • Priority changed from High to Urgent
Actions #7

Updated by okurz over 2 years ago

  • Status changed from New to In Progress
  • Assignee set to okurz
curl -s https://patch-diff.githubusercontent.com/raw/os-autoinst/openQA/pull/4412.patch | patch -R -p1 --dir /usr/share/openqa
systemctl restart openqa-webui

Reverting https://github.com/os-autoinst/openQA/pull/4412 on o3 with the commands above fixes it, so I am reverting the PR for now.

Actions #8

Updated by okurz over 2 years ago

Created https://github.com/os-autoinst/openQA/pull/4424 to merge a revert. Also hotpatched OSD with the same.

Actions #9

Updated by livdywan over 2 years ago

okurz wrote:

Also in https://openqa.opensuse.org/tests/2098355

Looking for potentially related errors, found these:

From worker-log.txt:

[2021-12-22T07:12:45.259301+01:00] [warn] Error uploading clamav-110.txt: Connection error: Can't connect: Temporary failure in name resolution

From autoinst-log.txt:

[2021-12-22T10:01:19.489135+01:00] [warn] !!! backend::baseclass::do_capture: There is some problem with your environment, we detected a stall for 19.3847260475159 seconds
[...]
Argument "\nyast2-kdump-status-0" isn't numeric in numeric eq (==) at opensuse/lib/kdump_utils.pm line 295.
    kdump_utils::configure_service("test_type", "function") called at opensuse/tests/console/kdump_and_crash.pm line 19
    kdump_and_crash::run(kdump_and_crash=HASH(0xaaab1ef363e0)) called at /usr/lib/os-autoinst/basetest.pm line 361
    eval {...} called at /usr/lib/os-autoinst/basetest.pm line 355
    basetest::runtest(kdump_and_crash=HASH(0xaaab1ef363e0)) called at /usr/lib/os-autoinst/autotest.pm line 372
    eval {...} called at /usr/lib/os-autoinst/autotest.pm line 372
    autotest::runalltests() called at /usr/lib/os-autoinst/autotest.pm line 242
    eval {...} called at /usr/lib/os-autoinst/autotest.pm line 242
    autotest::run_all() called at /usr/lib/os-autoinst/autotest.pm line 296
    autotest::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0xaaab1fe54b38)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
    eval {...} called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
    Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0xaaab1fe54b38), CODE(0xaaab1fdc3b20)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 477
    Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0xaaab1fe54b38)) called at /usr/lib/os-autoinst/autotest.pm line 298
    autotest::start_process() called at /usr/bin/isotovideo line 260
[...]
Use of uninitialized value $yast_pid in scalar chomp at opensuse/lib/y2_base.pm line 41.
    y2_base::save_strace_gdb_output(kdump_and_crash=HASH(0xaaab1ef363e0), "yast") called at opensuse/lib/y2_module_basetest.pm line 121
    y2_module_basetest::post_fail_hook(kdump_and_crash=HASH(0xaaab1ef363e0)) called at opensuse/tests/console/kdump_and_crash.pm line 34
    kdump_and_crash::post_fail_hook(kdump_and_crash=HASH(0xaaab1ef363e0)) called at /usr/lib/os-autoinst/basetest.pm line 323
    eval {...} called at /usr/lib/os-autoinst/basetest.pm line 323
    basetest::run_post_fail(kdump_and_crash=HASH(0xaaab1ef363e0), "test kdump_and_crash died") called at /usr/lib/os-autoinst/basetest.pm line 391
    basetest::runtest(kdump_and_crash=HASH(0xaaab1ef363e0)) called at /usr/lib/os-autoinst/autotest.pm line 372
    eval {...} called at /usr/lib/os-autoinst/autotest.pm line 372
    autotest::runalltests() called at /usr/lib/os-autoinst/autotest.pm line 242
    eval {...} called at /usr/lib/os-autoinst/autotest.pm line 242
    autotest::run_all() called at /usr/lib/os-autoinst/autotest.pm line 296
    autotest::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0xaaab1fe54b38)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
    eval {...} called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
    Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0xaaab1fe54b38), CODE(0xaaab1fdc3b20)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 477
    Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0xaaab1fe54b38)) called at /usr/lib/os-autoinst/autotest.pm line 298
Actions #10

Updated by livdywan over 2 years ago

Logs from o3:

grep 2098355 /var/log/openqa
[2021-12-22T09:45:08.185775Z] [error] [9Ds8h2dfGyEx] Can't open file "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra@aarch64-HD24G/clamav-110.txt": No such file or directory at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 116.
  "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G",
  "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/ulogs",
  path => "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/worker-log.txt",
[2021-12-22T09:45:33.727237Z] [error] [Ne7eS0f1sVji] Can't open file "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra@aarch64-HD24G/clamav-110.txt": No such file or directory at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 116.
[2021-12-22T09:45:35.787966Z] [error] [7_ZNRe9HOTmd] Can't open file "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra@aarch64-HD24G/clamav-110.txt": No such file or directory at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 116.
[2021-12-22T09:45:38.369045Z] [error] [rsPVNwsWwi9j] Can't open file "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra@aarch64-HD24G/clamav-110.txt": No such file or directory at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 116.
  "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G",
  "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/ulogs",
  path => "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/autoinst-log.txt",
  "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G",
  "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/ulogs",
  path => "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/journalctl-55.png",
  "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G",
  "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/ulogs",
  path => "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/worker-log.txt",
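
To see which files the web UI failed to open and how often, the error lines above can be tallied. This is a small sketch, assuming the log lines keep the `[error] [<request id>] Can't open file "<path>": No such file or directory` shape shown here. The function name `missing_files` is invented for this example.

```python
import re
from collections import Counter

# Matches the "[error] [<request id>] Can't open file "<path>": ..." lines.
MISSING_FILE = re.compile(
    r'\[error\] \[[^\]]+\] Can\'t open file "(?P<path>[^"]+)": '
    r"No such file or directory"
)


def missing_files(lines):
    """Count how often each path shows up in a "Can't open file" error."""
    counts = Counter()
    for line in lines:
        match = MISSING_FILE.search(line)
        if match:
            counts[match.group("path")] += 1
    return counts
```

Feeding it the lines of `/var/log/openqa` (for instance via `open(...)`) returns a `Counter` keyed by path. In the excerpt above, every error refers to the same `clamav-110.txt`.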
Actions #11

Updated by livdywan over 2 years ago

  • Status changed from In Progress to Resolved

okurz wrote:

Created https://github.com/os-autoinst/openQA/pull/4424 to merge a revert. Also hotpatched OSD with the same.

The PR got merged and the hot patch looks fine. A revised fix will be covered by #102332.
