action #104262
closed
o3 responds with 500 on softfailed for job that "failed to load modules"
Description
Observation
The openqa-review pipeline failed like so:
++ /usr/bin/openqa-review --host https://openqa.opensuse.org -n -r -T --query-issue-status --no-empty-sections --include-softfails --running-threshold=2 --exclude-job-groups '^(Released|Development|old|EOL)' --reminder-comment-on-issues --save --save-dir /tmp/tmp.JYZsgebM6
[...]
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openqa.opensuse.org', port=443): Max retries exceeded with url: /api/v1/jobs/2095686/details (Caused by ResponseError('too many 500 error responses'))
[...]
raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPSConnectionPool(host='openqa.opensuse.org', port=443): Max retries exceeded with url: /api/v1/jobs/2095686/details (Caused by ResponseError('too many 500 error responses'))
Suggestions
- Re-try the job and see if the same problem occurs
- Investigate why openqa.opensuse.org responded with 500
Updated by livdywan about 3 years ago
- A re-try did not make a difference and the response on the affected jobs is consistent.
- https://openqa.opensuse.org/api/v1/jobs/2095686/details yields 500
- https://openqa.opensuse.org/tests/2095686 softfailed and shows Unable to load test modules
- Previous jobs seem to be unaffected e.g. https://openqa.opensuse.org//api/v1/jobs//2093638 is softfailed but the API route responds normally
- For reference, jobs in other groups i.e. not Tumbleweed-MicroOS-Image-ContainerHost also look unaffected regardless of job result
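The comparison above (one job answering 500 while an older softfailed job responds normally) could be scripted roughly as follows. This is a minimal sketch using only the Python standard library, not what was actually run for this ticket. The helper names `details_url`, `probe_status` and `compare_jobs` are made up for illustration, and it assumes the `/details` route is the one worth comparing for both jobs.

```python
import urllib.error
import urllib.request

BASE = "https://openqa.opensuse.org"  # host taken from the ticket


def details_url(job_id, base=BASE):
    """Build the job details API URL seen in the openqa-review traceback."""
    return f"{base}/api/v1/jobs/{job_id}/details"


def probe_status(url, timeout=30):
    """Return the HTTP status code for url, including error codes such as 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is what we want to record
        return err.code


def compare_jobs(job_ids, probe=probe_status):
    """Map each job id to the status code of its details route.

    `probe` is injectable (an assumption of this sketch) so the logic can be
    exercised without network access.
    """
    return {job_id: probe(details_url(job_id)) for job_id in job_ids}
```

Calling `compare_jobs([2095686, 2093638])` needs network access to o3; the result depends on the current server state.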
Updated by livdywan about 3 years ago
- Subject changed from openqa-review pipeline failed with RetryError: HTTPSConnectionPool(host='openqa.opensuse.org', port=443): Max retries exceeded with url to o3 responds with 500 on softfailed for job that "failed to load modules"
Renaming the ticket since this is an openQA issue on o3, which simply surfaced in the openqa-review pipeline
Updated by okurz about 3 years ago
- Tracker changed from coordination to action
- Category set to Regressions/Crashes
- Target version set to Ready
Maybe related to #102332?
Updated by okurz about 3 years ago
- Related to action #102332: Unable to read *.json: Can't open file in o3 openQA logs /var/log/openqa size:M added
Updated by okurz about 3 years ago
The string "Unable to load" comes from the openQA JavaScript code:
assets/javascripts/test_result.js: document.createTextNode('Unable to load ' + (tabConfig.descriptiveName || tabName) + '.')
Updated by okurz about 3 years ago
- Priority changed from High to Urgent
Updated by okurz about 3 years ago
- Status changed from New to In Progress
- Assignee set to okurz
curl -s https://patch-diff.githubusercontent.com/raw/os-autoinst/openQA/pull/4412.patch | patch -R -p1 --dir /usr/share/openqa
systemctl restart openqa-webui
Running the two commands above fixes it, so reverting the PR for now.
Updated by okurz about 3 years ago
Created https://github.com/os-autoinst/openQA/pull/4424 to merge a revert. Also hotpatched OSD with the same.
Updated by livdywan about 3 years ago
okurz wrote:
Looking for potentially related errors, found these:
From worker-log.txt:
[2021-12-22T07:12:45.259301+01:00] [warn] Error uploading clamav-110.txt: Connection error: Can't connect: Temporary failure in name resolution
From autoinst-log.txt:
[2021-12-22T10:01:19.489135+01:00] [warn] !!! backend::baseclass::do_capture: There is some problem with your environment, we detected a stall for 19.3847260475159 seconds
[...]
Argument "\nyast2-kdump-status-0" isn't numeric in numeric eq (==) at opensuse/lib/kdump_utils.pm line 295.
kdump_utils::configure_service("test_type", "function") called at opensuse/tests/console/kdump_and_crash.pm line 19
kdump_and_crash::run(kdump_and_crash=HASH(0xaaab1ef363e0)) called at /usr/lib/os-autoinst/basetest.pm line 361
eval {...} called at /usr/lib/os-autoinst/basetest.pm line 355
basetest::runtest(kdump_and_crash=HASH(0xaaab1ef363e0)) called at /usr/lib/os-autoinst/autotest.pm line 372
eval {...} called at /usr/lib/os-autoinst/autotest.pm line 372
autotest::runalltests() called at /usr/lib/os-autoinst/autotest.pm line 242
eval {...} called at /usr/lib/os-autoinst/autotest.pm line 242
autotest::run_all() called at /usr/lib/os-autoinst/autotest.pm line 296
autotest::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0xaaab1fe54b38)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
eval {...} called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0xaaab1fe54b38), CODE(0xaaab1fdc3b20)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 477
Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0xaaab1fe54b38)) called at /usr/lib/os-autoinst/autotest.pm line 298
autotest::start_process() called at /usr/bin/isotovideo line 260
[...]
Use of uninitialized value $yast_pid in scalar chomp at opensuse/lib/y2_base.pm line 41.
y2_base::save_strace_gdb_output(kdump_and_crash=HASH(0xaaab1ef363e0), "yast") called at opensuse/lib/y2_module_basetest.pm line 121
y2_module_basetest::post_fail_hook(kdump_and_crash=HASH(0xaaab1ef363e0)) called at opensuse/tests/console/kdump_and_crash.pm line 34
kdump_and_crash::post_fail_hook(kdump_and_crash=HASH(0xaaab1ef363e0)) called at /usr/lib/os-autoinst/basetest.pm line 323
eval {...} called at /usr/lib/os-autoinst/basetest.pm line 323
basetest::run_post_fail(kdump_and_crash=HASH(0xaaab1ef363e0), "test kdump_and_crash died") called at /usr/lib/os-autoinst/basetest.pm line 391
basetest::runtest(kdump_and_crash=HASH(0xaaab1ef363e0)) called at /usr/lib/os-autoinst/autotest.pm line 372
eval {...} called at /usr/lib/os-autoinst/autotest.pm line 372
autotest::runalltests() called at /usr/lib/os-autoinst/autotest.pm line 242
eval {...} called at /usr/lib/os-autoinst/autotest.pm line 242
autotest::run_all() called at /usr/lib/os-autoinst/autotest.pm line 296
autotest::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0xaaab1fe54b38)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
eval {...} called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0xaaab1fe54b38), CODE(0xaaab1fdc3b20)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 477
Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0xaaab1fe54b38)) called at /usr/lib/os-autoinst/autotest.pm line 298
Updated by livdywan about 3 years ago
Logs from o3:
grep 2098355 /var/log/openqa
[2021-12-22T09:45:08.185775Z] [error] [9Ds8h2dfGyEx] Can't open file "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra@aarch64-HD24G/clamav-110.txt": No such file or directory at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 116.
"/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G",
"/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/ulogs",
path => "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/worker-log.txt",
[2021-12-22T09:45:33.727237Z] [error] [Ne7eS0f1sVji] Can't open file "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra@aarch64-HD24G/clamav-110.txt": No such file or directory at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 116.
[2021-12-22T09:45:35.787966Z] [error] [7_ZNRe9HOTmd] Can't open file "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra@aarch64-HD24G/clamav-110.txt": No such file or directory at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 116.
[2021-12-22T09:45:38.369045Z] [error] [rsPVNwsWwi9j] Can't open file "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra@aarch64-HD24G/clamav-110.txt": No such file or directory at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 116.
"/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G",
"/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/ulogs",
path => "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/autoinst-log.txt",
"/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G",
"/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/ulogs",
path => "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/journalctl-55.png",
"/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G",
"/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/ulogs",
path => "/var/lib/openqa/testresults/02098/02098355-opensuse-Tumbleweed-JeOS-for-AArch64-aarch64-Build20211221-jeos-extra\@aarch64-HD24G/worker-log.txt",
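To see which files the web UI repeatedly fails to open, the "Can't open file" lines could be tallied. This is a minimal sketch; the regular expression is derived only from the log lines quoted above, and `count_missing_files` is a hypothetical helper name, not part of openQA.

```python
import re
from collections import Counter

# Matches error lines of the form quoted above:
# [<timestamp>] [error] [<request id>] Can't open file "<path>": No such file or directory at ...
MISSING_FILE = re.compile(
    r'''\[error\] \[[^\]]+\] Can't open file "([^"]+)": No such file or directory'''
)


def count_missing_files(lines):
    """Count how often each path appears in "Can't open file" error lines."""
    counts = Counter()
    for line in lines:
        match = MISSING_FILE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Fed the excerpt above, this would count the `clamav-110.txt` path four times and ignore the `path => ...` dump lines.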
Updated by livdywan about 3 years ago
- Status changed from In Progress to Resolved
okurz wrote:
Created https://github.com/os-autoinst/openQA/pull/4424 to merge a revert. Also hotpatched OSD with the same.
The PR got merged, and the hot patch looks fine. A revised fix will be covered by #102332.