action #102578
closed coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes
coordination #98952: [epic] t/full-stack.t sporadically fails "clickElement: element not interactable" and other errors
[sporadic] t/full-stack.t Failed test 'Expected result for job 1 not found' size:M
Added by livdywan about 3 years ago. Updated about 3 years ago.
Description
Observation¶
t/full-stack.t is becoming more and more of a problem, e.g. see https://app.circleci.com/pipelines/github/os-autoinst/openQA/8392/workflows/6748c60b-4966-4db6-806b-415da997d321/jobs/79281 from https://github.com/os-autoinst/openQA/pull/4360. I suggest that more people collaborate on this ticket.
# Failed test 'Expected result for job 1 not found'
Acceptance criteria¶
- AC1: t/full-stack.t subtest 'schedule job' is stable
Suggestions¶
- a PR to expose the worker log in the 'schedule job' subtest
- Measure fail ratio e.g. with https://github.com/okurz/scripts/blob/master/count_fail_ratio
- Add done_testing; exit; after the subtest to accelerate testing
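The fail-ratio measurement suggested above can be sketched as a small shell loop. This is a hedged, minimal re-implementation of the idea behind okurz's count_fail_ratio script, not the actual script (which offers more features):

```shell
# Minimal sketch of a fail-ratio counter: run a command N times and
# report how often it fails. Illustrative only; the real count_fail_ratio
# from https://github.com/okurz/scripts does more.
count_fail_ratio() {
    runs="$1"; shift
    fails=0
    i=1
    while [ "$i" -le "$runs" ]; do
        echo "## Run $i"
        "$@" || fails=$((fails + 1))
        i=$((i + 1))
    done
    echo "## count_fail_ratio: Run: $runs. Fails: $fails. Fail ratio $((fails * 100 / runs))%"
}

# Example: 4 runs of a command that always fails
count_fail_ratio 4 false
```

With the real script the invocation looks like FULLSTACK=1 count_fail_ratio prove -l t/full-stack.t, as used later in this ticket.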
Updated by livdywan about 3 years ago
- Copied from coordination #98952: [epic] t/full-stack.t sporadically fails "clickElement: element not interactable" and other errors added
Updated by livdywan about 3 years ago
Suggestions¶
- a PR to expose the worker log in the 'schedule job' subtest
Updated by livdywan about 3 years ago
- Assignee deleted (livdywan)
The PR got merged. It might be best to come back to this once we see the failure on CircleCI or locally (I have not managed to reproduce a genuine failure with this fix yet).
Updated by okurz about 3 years ago
- Status changed from Workable to New
Wasn't estimated, back to "New"
Updated by livdywan about 3 years ago
- Subject changed from t/full-stack.t Failed test 'Expected result for job 1 not found' to t/full-stack.t Failed test 'Expected result for job 1 not found' [sporadic] size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by okurz about 3 years ago
- Subject changed from t/full-stack.t Failed test 'Expected result for job 1 not found' [sporadic] size:M to [sporadic] t/full-stack.t Failed test 'Expected result for job 1 not found' [sporadic] size:M
- Status changed from Workable to In Progress
- Assignee set to okurz
I followed the suggestion locally with
$ git diff
diff --git a/t/full-stack.t b/t/full-stack.t
index 7b858e8b5..470eaaf08 100644
--- a/t/full-stack.t
+++ b/t/full-stack.t
@@ -218,6 +218,8 @@ subtest 'schedule job' => sub {
or bail_with_log 1,
'Job 1 produced the wrong results';
+ done_testing;exit;
+
subtest 'clone job that crashes' => sub {
client_call('-X POST jobs/1/restart', qr|test_url.+1.+tests.+2|, 'client returned new test_url for test 2');
$driver->refresh();
and called FULLSTACK=1 count_fail_ratio prove -l t/full-stack.t
and got
...
## Run 20
t/full-stack.t .. ok
All tests successful.
Files=1, Tests=31, 33 wallclock secs ( 0.07 usr 0.01 sys + 15.37 cusr 2.52 csys = 17.97 CPU)
Result: PASS
## count_fail_ratio: Run: 20. Fails: 0. Fail ratio 0%
so the fail rate is below 1/20 on my local instance. I will test longer and also in CI.
Updated by okurz about 3 years ago
- Subject changed from [sporadic] t/full-stack.t Failed test 'Expected result for job 1 not found' [sporadic] size:M to [sporadic] t/full-stack.t Failed test 'Expected result for job 1 not found' size:M
Updated by okurz about 3 years ago
Local results from FULLSTACK=1 runs=100 count_fail_ratio prove -l t/full-stack.t:
...
All tests successful.
Files=1, Tests=31, 33 wallclock secs ( 0.08 usr 0.01 sys + 15.59 cusr 2.52 csys = 18.20 CPU)
Result: PASS
## count_fail_ratio: Run: 100. Fails: 0. Fail ratio 0%
so the fail rate is below 1%. The test in CI failed with another error in the 4th run (of max. 50): https://github.com/os-autoinst/openQA/pull/4373#issuecomment-979248913
Updated by okurz about 3 years ago
- Status changed from In Progress to Workable
- Assignee deleted (okurz)
Locally I am at "## count_fail_ratio: Run: 751. Fails: 0. Fail ratio 0%", so a fail rate below 0.13%. Within CircleCI I could reproduce what looks like the error.
https://app.circleci.com/pipelines/github/os-autoinst/openQA/8463/workflows/a8295d0c-d4b5-4c59-8d38-204524a8bcec/jobs/79954
shows
Rerun 4 of 50 …
Calling retry hook ./tools/delete-coverdb-folder
Deleting cover_db_fullstack/
[14:21:16] t/full-stack.t .. 26/? # Unexpected Javascript console errors: [
# {
# level => "SEVERE",
# message => "http://localhost:9526/tests/1/images/thumb/boot-1.png?1637850113525 - Failed to load resource: the server responded with a status of 404 (Not Found)",
# source => "network",
# timestamp => 1637850113984,
# },
# {
# level => "SEVERE",
# message => "http://localhost:9526/tests/1/images/thumb/boot-2.png?1637850113525 - Failed to load resource: the server responded with a status of 404 (Not Found)",
# source => "network",
# timestamp => 1637850113985,
# },
# ]
# full result panel contents:
# State: running, started less than a minute ago
# Scheduled product: job has not been created by posting an ISO
# Assigned worker: f2681b748598:1
# Failed test 'Expected result not found'
# at /home/squamata/project/t/lib/OpenQA/Test/FullstackUtils.pm line 84.
# Failed test 'job 1 passed'
# at t/full-stack.t line 197.
[14:21:16] t/full-stack.t .. 27/? # 0
# Failed test 'autoinst log file generated'
# at t/full-stack.t line 199.
# Looks like you failed 3 tests of 5.
# Failed test 'schedule job'
# at t/full-stack.t line 218.
# unable to read /tmp/XKQPZFxlD7/full-stack.d/openqa/testresults/00000/00000001-tinycore-1-flavor-i386-Build1-core@coolone/autoinst-log.txt: Can't open file "/tmp/XKQPZFxlD7/full-stack.d/openqa/testresults/00000/00000001-tinycore-1-flavor-i386-Build1-core@coolone/autoinst-log.txt": No such file or directory at t/full-stack.t line 161.
# /tmp/XKQPZFxlD7/full-stack.d/openqa/testresults/../pool/1/worker-log.txt:
# [2021-11-25T14:21:29.415011Z] [debug] [pid:1692] Preparing Mojo::IOLoop::ReadWriteProcess::Session
# [2021-11-25T14:21:29.415915Z] [debug] [pid:1692] Job settings:
...
# [2021-11-25T14:21:29.416569Z] [debug] [pid:1692] Linked asset "/tmp/XKQPZFxlD7/full-stack.d/openqa/share/factory/iso/Core-7.2.iso" to "/tmp/XKQPZFxlD7/full-stack.d/openqa/pool/1/Core-7.2.iso"
# [2021-11-25T14:21:29.416889Z] [debug] [pid:1692] Symlinked from "/tmp/XKQPZFxlD7/full-stack.d/openqa/share/tests/tinycore" to "/tmp/XKQPZFxlD7/full-stack.d/openqa/pool/1/tinycore"
# [2021-11-25T14:21:29.417208Z] [info] [pid:1692] Preparing cgroup to start isotovideo
# [2021-11-25T14:21:29.418208Z] [warn] [pid:1692] Disabling cgroup usage because cgroup creation failed: mkdir /sys/fs/cgroup/systemd/docker: Permission denied at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/File.pm line 84.
# [2021-11-25T14:21:29.418338Z] [info] [pid:1692] You can define a custom slice with OPENQA_CGROUP_SLICE or indicating the base mount with MOJO_CGROUP_FS.
# [2021-11-25T14:21:29.418630Z] [info] [pid:1692] Starting isotovideo container
# [2021-11-25T14:21:29.420870Z] [debug] [pid:1692] Registered process:1699
# [2021-11-25T14:21:29.421030Z] [info] [pid:1699] 1699: WORKING 1
# [2021-11-25T14:21:29.421250Z] [info] [pid:1692] isotovideo has been started (PID: 1699)
# [2021-11-25T14:21:29.421762Z] [debug] [pid:1692] Running job 1 from http://localhost:9526: 00000001-tinycore-1-flavor-i386-Build1-core@coolone.
# [2021-11-25T14:21:29.423493Z] [debug] [pid:1692] REST-API call: POST http://localhost:9526/api/v1/jobs/1/status
# [2021-11-25T14:21:29.584343Z] [debug] [pid:1692] Upload concluded (no current module)
# [2021-11-25T14:21:31.665487Z] [debug] [pid:1692] Starting livelog
# [2021-11-25T14:21:31.667572Z] [debug] [pid:1692] REST-API call: POST http://localhost:9526/api/v1/jobs/1/status
# [2021-11-25T14:21:31.786098Z] [debug] [pid:1692] Upload concluded (no current module)
# [2021-11-25T14:21:32.033447Z] [debug] [pid:1692] Stopping livelog
# [2021-11-25T14:21:32.826565Z] [debug] [pid:1692] REST-API call: POST http://localhost:9526/api/v1/jobs/1/status
# [2021-11-25T14:21:32.986391Z] [debug] [pid:1692] Upload concluded (no current module)
# [2021-11-25T14:21:42.994183Z] [debug] [pid:1692] REST-API call: POST http://localhost:9526/api/v1/jobs/1/status
# [2021-11-25T14:21:43.090318Z] [debug] [pid:1692] Upload concluded (at boot)
# [2021-11-25T14:21:53.094533Z] [debug] [pid:1692] REST-API call: POST http://localhost:9526/api/v1/jobs/1/status
# [2021-11-25T14:21:53.394912Z] [debug] [pid:1927] Optimizing /tmp/XKQPZFxlD7/full-stack.d/openqa/pool/1/testresults/boot-1.png
# [2021-11-25T14:21:53.632212Z] [debug] [pid:1927] Uploading artefact boot-1.png as f4184468dbb276a03240907d0e7e500d
# [2021-11-25T14:21:53.865849Z] [debug] [pid:1692] Starting livelog
# [2021-11-25T14:21:53.867995Z] [debug] [pid:1692] REST-API call: POST http://localhost:9526/api/v1/jobs/1/status
# [2021-11-25T14:21:53.996201Z] [debug] [pid:1927] Optimizing /tmp/XKQPZFxlD7/full-stack.d/openqa/pool/1/testresults/.thumbs/boot-1.png
# [2021-11-25T14:21:54.026295Z] [debug] [pid:1927] Uploading artefact boot-1.png as f4184468dbb276a03240907d0e7e500d
Bailout called. Further testing stopped: Job 1 produced the wrong results
FAILED--Further testing stopped: Job 1 produced the wrong results
That's the one, right?
Can someone else please try to reproduce locally?
EDIT: 2021-11-26: Based on a suggestion of tinita I ran
make coverage FULLSTACK=1 RETRY=50 STABILITY_TEST=1 TESTS=t/full-stack.t
and encountered
Rerun 16 of 50 …
Calling retry hook ./tools/delete-coverdb-folder
Deleting cover_db/
t/full-stack.t .. 17/? Bailout called. Further testing stopped: clickElement: element not interactable at /home/okurz/local/os-autoinst/openQA/t/lib/OpenQA/SeleniumTest.pm:81
FAILED--Further testing stopped: clickElement: element not interactable at /home/okurz/local/os-autoinst/openQA/t/lib/OpenQA/SeleniumTest.pm:81
make[2]: *** [Makefile:196: test-unit-and-integration] Error 255
This does not look like the same failure, and unfortunately there is no stack trace showing where in t/full-stack.t it failed.
Trying with
diff --git a/t/lib/OpenQA/SeleniumTest.pm b/t/lib/OpenQA/SeleniumTest.pm
index e05f977ed..1ae0f2e11 100644
--- a/t/lib/OpenQA/SeleniumTest.pm
+++ b/t/lib/OpenQA/SeleniumTest.pm
@@ -16,6 +16,7 @@ our @EXPORT = qw(driver_missing check_driver_modules enable_timeout
wait_until wait_until_element_gone wait_for_element
element_prop element_prop_by_selector map_elements);
+use Carp;
use Data::Dump 'pp';
use IPC::Run qw(start);
use Mojo::IOLoop::Server;
@@ -81,7 +82,7 @@ sub start_driver {
$err .= ' at ' . __FILE__ . ':' . __LINE__; # uncoverable statement
# prevent aborting the complete test when interactively debugging
- $INC{'perl5db.pl'} ? fail $err : BAIL_OUT($err); # uncoverable statement
+ $INC{'perl5db.pl'} ? fail $err : confess($err); # uncoverable statement
},
);
then I got
Rerun 2 of 50 …
Calling retry hook ./tools/delete-coverdb-folder
Deleting cover_db/
t/full-stack.t .. 31/? # Unexpected Javascript console errors: [
# {
# level => "SEVERE",
# message => "http://localhost:9526/tests/1/images/thumb/boot-1.png - Failed to load resource: the server responded with a status of 404 (Not Found)",
# source => "network",
# timestamp => 1637926799082,
# },
# ]
# full result panel contents:
# State: running, started less than a minute ago
# Scheduled product: job has not been created by posting an ISO
# Assigned worker: linux-28d7:1
# Failed test 'Expected result not found'
# at /home/okurz/local/os-autoinst/openQA/t/lib/OpenQA/Test/FullstackUtils.pm line 84.
# Failed test 'job 1 passed'
# at t/full-stack.t line 197.
# 0
# Failed test 'autoinst log file generated'
# at t/full-stack.t line 199.
# Looks like you failed 3 tests of 6.
t/full-stack.t .. 32/?
# Failed test 'schedule job'
# at t/full-stack.t line 218.
# unable to read /tmp/WlnajfL08s/full-stack.d/openqa/testresults/00000/00000001-tinycore-1-flavor-i386-Build1-core@coolone/autoinst-log.txt: Can't open file "/tmp/WlnajfL08s/full-stack.d/openqa/testresults/00000/00000001-tinycore-1-flavor-i386-Build1-core@coolone/autoinst-log.txt": No such file or directory at t/full-stack.t line 161.
# /tmp/WlnajfL08s/full-stack.d/openqa/testresults/../pool/1/worker-log.txt:
# [2021-11-26T12:39:40.799425+01:00] [debug] [pid:7029] Preparing Mojo::IOLoop::ReadWriteProcess::Session
# [2021-11-26T12:39:40.801345+01:00] [debug] [pid:7029] Job settings:
# [2021-11-26T12:39:40.801766+01:00] [debug] [pid:7029]
# ARCH=i386
# BUILD=1
# DISTRI=tinycore
# FLAVOR=flavor
# INTEGRATION_TESTS=1
# ISO=Core-7.2.iso
# JOBTOKEN=y3sRD950IaKP0uKl
# LOG_LEVEL=debug
# MACHINE=coolone
# NAME=00000001-tinycore-1-flavor-i386-Build1-core@coolone
# OPENQA_HOSTNAME=localhost:9526
# OPENQA_URL=http://localhost:9526
# PAUSE_AT=shutdown
# PRJDIR=/tmp/WlnajfL08s/full-stack.d/openqa/share
# PUBLISH_HDD_1=core-hdd.qcow2
# QEMU=i386
# QEMUPORT=20012
# QEMU_NO_FDC_SET=1
# QEMU_NO_TABLET=1
# RETRY_DELAY=5
# RETRY_DELAY_IF_WEBUI_BUSY=60
# TEST=core
# UEFI_PFLASH_VARS=/usr/share/qemu/ovmf-x86_64.bin
# VERSION=1
# VNC=91
# WORKER_CLASS=qemu_i386,qemu_x86_64
# WORKER_HOSTNAME=127.0.0.1
# WORKER_ID=1
# WORKER_INSTANCE=1
# [2021-11-26T12:39:40.803070+01:00] [debug] [pid:7029] Linked asset "/tmp/WlnajfL08s/full-stack.d/openqa/share/factory/iso/Core-7.2.iso" to "/tmp/WlnajfL08s/full-stack.d/openqa/pool/1/Core-7.2.iso"
# [2021-11-26T12:39:40.803850+01:00] [debug] [pid:7029] Symlinked from "/tmp/WlnajfL08s/full-stack.d/openqa/share/tests/tinycore" to "/tmp/WlnajfL08s/full-stack.d/openqa/pool/1/tinycore"
# [2021-11-26T12:39:40.804481+01:00] [info] [pid:7029] Preparing cgroup to start isotovideo
# [2021-11-26T12:39:40.806333+01:00] [warn] [pid:7029] Disabling cgroup usage because cgroup creation failed: mkdir /sys/fs/cgroup/systemd/user.slice/user-1000.slice/session-2.scope/1: Permission denied at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/File.pm line 84.
# [2021-11-26T12:39:40.806620+01:00] [info] [pid:7029] You can define a custom slice with OPENQA_CGROUP_SLICE or indicating the base mount with MOJO_CGROUP_FS.
# [2021-11-26T12:39:40.807416+01:00] [info] [pid:7029] Starting isotovideo container
# [2021-11-26T12:39:40.810056+01:00] [debug] [pid:7029] Registered process:7064
# [2021-11-26T12:39:40.810286+01:00] [info] [pid:7064] 7064: WORKING 1
# [2021-11-26T12:39:40.810724+01:00] [info] [pid:7029] isotovideo has been started (PID: 7064)
# [2021-11-26T12:39:40.811639+01:00] [debug] [pid:7029] Running job 1 from http://localhost:9526: 00000001-tinycore-1-flavor-i386-Build1-core@coolone.
# [2021-11-26T12:39:40.814680+01:00] [debug] [pid:7029] REST-API call: POST http://localhost:9526/api/v1/jobs/1/status
# [2021-11-26T12:39:40.968688+01:00] [debug] [pid:7029] Upload concluded (no current module)
# [2021-11-26T12:39:46.186887+01:00] [debug] [pid:7029] Starting livelog
# [2021-11-26T12:39:46.189172+01:00] [debug] [pid:7029] REST-API call: POST http://localhost:9526/api/v1/jobs/1/status
# [2021-11-26T12:39:46.457583+01:00] [debug] [pid:7029] Upload concluded (no current module)
# [2021-11-26T12:39:46.812095+01:00] [debug] [pid:7029] Stopping livelog
# [2021-11-26T12:39:47.459390+01:00] [debug] [pid:7029] REST-API call: POST http://localhost:9526/api/v1/jobs/1/status
# [2021-11-26T12:39:47.578622+01:00] [debug] [pid:7029] Upload concluded (no current module)
# [2021-11-26T12:39:57.591256+01:00] [debug] [pid:7029] Unable to read result-shutdown.json: Can't open file "/tmp/WlnajfL08s/full-stack.d/openqa/pool/1/testresults/result-shutdown.json": No such file or directory at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/File.pm line 132.
# Mojo::File::slurp(Mojo::File=SCALAR(0x5559bd2b7740)) called at /home/okurz/local/os-autoinst/openQA/script/../lib/OpenQA/Worker/Job.pm line 1098
# eval {...} called at /home/okurz/local/os-autoinst/openQA/script/../lib/OpenQA/Worker/Job.pm line 1098
# OpenQA::Worker::Job::_read_json_file(OpenQA::Worker::Job=HASH(0x5559bd264330), "result-shutdown.json") called at /home/okurz/local/os-autoinst/openQA/script/../lib/OpenQA/Worker/Job.pm line 1154
# OpenQA::Worker::Job::_read_module_result(OpenQA::Worker::Job=HASH(0x5559bd264330), "shutdown") called at /home/okurz/local/os-autoinst/openQA/script/../lib/OpenQA/Worker/Job.pm line 1116
# OpenQA::Worker::Job::_read_result_file(OpenQA::Worker::Job=HASH(0x5559bd264330), "shutdown", ARRAY(0x5559bd126d78)) called at /home/okurz/local/os-autoinst/openQA/script/../lib/OpenQA/Worker/Job.pm line 786
# OpenQA::Worker::Job::_upload_results_step_0_prepare(OpenQA::Worker::Job=HASH(0x5559bd264330), CODE(0x5559bd126c70)) called at /home/okurz/local/os-autoinst/openQA/script/../lib/OpenQA/Worker/Job.pm line 715
# OpenQA::Worker::Job::_upload_results(OpenQA::Worker::Job=HASH(0x5559bd264330), CODE(0x5559bd126c70)) called at /home/okurz/local/os-autoinst/openQA/script/../lib/OpenQA/Worker/Job.pm line 903
# OpenQA::Worker::Job::__ANON__(Mojo::IOLoop=HASH(0x5559baf466a0)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop.pm line 220
…
# [2021-11-26T12:39:58.926528+01:00] [debug] [pid:7172] Uploading artefact boot-2.png as e3b43b69c91fe4072dac1037f416c281
# [2021-11-26T12:39:58.992464+01:00] [debug] [pid:7172] Optimizing /tmp/WlnajfL08s/full-stack.d/openqa/pool/1/testresults/.thumbs/boot-2.png
# [2021-11-26T12:39:59.001311+01:00] [debug] [pid:7172] Uploading artefact boot-2.png as e3b43b69c91fe4072dac1037f416c281
# [2021-11-26T12:39:59.070054+01:00] [debug] [pid:7172] Optimizing /tmp/WlnajfL08s/full-stack.d/openqa/pool/1/testresults/boot-1.png
Bailout called. Further testing stopped: Job 1 produced the wrong results
so from the above logs it looks to me like the openQA job has not concluded yet, as uploading and optimizing are still going on. This is also why autoinst-log.txt cannot yet be in the testresults directory:
# unable to read /tmp/WlnajfL08s/full-stack.d/openqa/testresults/00000/00000001-tinycore-1-flavor-i386-Build1-core@coolone/autoinst-log.txt: Can't open file "/tmp/WlnajfL08s/full-stack.d/openqa/testresults/00000/00000001-tinycore-1-flavor-i386-Build1-core@coolone/autoinst-log.txt": No such file or directory
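Given that analysis, one possible mitigation is to poll for the log file with a timeout instead of reading it once and failing. The following is a hypothetical sketch only, not the fix that was merged; the function name, file path, and timeout are illustrative:

```shell
# Hypothetical mitigation sketch: poll for a file (e.g. autoinst-log.txt)
# until the worker has finished uploading, with a timeout in seconds,
# instead of failing on the first read attempt.
wait_for_file() {
    file="$1"
    timeout="${2:-30}"
    waited=0
    while [ ! -e "$file" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $file" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example: create a file after a delay in the background, then wait for it
tmpfile=$(mktemp -u)
( sleep 2; touch "$tmpfile" ) &
wait_for_file "$tmpfile" 10 && echo "found $tmpfile"
rm -f "$tmpfile"
```

The same retry-with-timeout idea applies inside the Perl test code, where the actual improvements landed.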
Updated by okurz about 3 years ago
- Status changed from Workable to In Progress
- Assignee set to okurz
Some ideas where I can continue.
So I am proposing some improvements; see the updated https://github.com/os-autoinst/openQA/pull/4373
Updated by openqa_review about 3 years ago
- Due date set to 2021-12-11
Setting due date based on mean cycle time of SUSE QE Tools
Updated by okurz about 3 years ago
- Status changed from In Progress to Feedback
Updated https://github.com/os-autoinst/openQA/pull/4373 with further changes. That should fix our recent problems where we failed too early on JavaScript errors.
Updated by okurz about 3 years ago
Another improvement PR https://github.com/os-autoinst/openQA/pull/4379 , merged.
Updated by okurz about 3 years ago
- Status changed from Feedback to In Progress
Another improvement PR https://github.com/os-autoinst/openQA/pull/4380 and a full stack stability test https://github.com/os-autoinst/openQA/pull/4381
Updated by okurz about 3 years ago
- Due date deleted (2021-12-11)
- Status changed from In Progress to Resolved
I have not found further problems with the specific subtest mentioned. I assume after #103422 we will find more.
Updated by livdywan about 3 years ago
okurz wrote:
I have not found further problems with the specific subtest mentioned. I assume after #103422 we will find more.
Can you clarify how the failure in os-autoinst / 13-osutils.t is related here?
Updated by okurz about 3 years ago
cdywan wrote:
okurz wrote:
I have not found further problems with the specific subtest mentioned. I assume after #103422 we will find more.
Can you clarify how the failure in os-autoinst / 13-osutils.t is related here?
The mentioned error revealed problems caused by the update of perl-Mojo-IOLoop-ReadWriteProcess, which we have since addressed by essentially taking over the project, fixing the bugs, improving the tests, etc. That can also impact the full-stack test.