coordination #98952
coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes
[epic] t/full-stack.t sporadically fails "clickElement: element not interactable" and other errors
100%
Description
Observation¶
Lately observed in multiple circleCI jobs, e.g.
- https://app.circleci.com/pipelines/github/os-autoinst/openQA/7747/workflows/d4262a25-6b1a-4890-85c0-5d8f6da343a0/jobs/73144
- https://app.circleci.com/pipelines/github/os-autoinst/openQA/7701/workflows/52d54d57-5bf0-46c2-a593-623c1cfc6330/jobs/72685
- https://app.circleci.com/pipelines/github/os-autoinst/openQA/7721/workflows/bc335d26-96ed-4b5d-be07-6bf6b95fc7e5/jobs/72894
Acceptance criteria¶
- AC1: t/full-stack.t is consistently stable again
Suggestions¶
- Add retries
- Identify the regression
- Fix
Rollback steps¶
- Remove retry
Subtasks
Related issues
History
#1
Updated by okurz over 1 year ago
- Description updated (diff)
https://github.com/os-autoinst/openQA/pull/4218 to add retries currently.
#2
Updated by okurz over 1 year ago
- Status changed from New to In Progress
- Assignee set to okurz
I consider the ticket "Workable" already as it was in before. Taking the ticket and setting to "In Progress" relating to the above PR to address urgency.
#3
Updated by okurz over 1 year ago
- Status changed from In Progress to New
- Assignee deleted (
okurz) - Priority changed from Urgent to High
https://github.com/os-autoinst/openQA/pull/4218 merged, back to backlog with lower prio to fix.
#4
Updated by okurz over 1 year ago
- Priority changed from High to Urgent
#5
Updated by cdywan over 1 year ago
- Subject changed from t/full-stack.t can timeout, "clickElement: element not interactable" to t/full-stack.t can timeout, "clickElement: element not interactable" size:M
- Status changed from New to Workable
#6
Updated by okurz over 1 year ago
https://app.circleci.com/pipelines/github/os-autoinst/openQA/7796/workflows/05bae277-19b8-450b-92d9-4f21eccb5ff4/jobs/73602/steps shows a failed overall test run after multiple consecutive failures:
[10:25:05] t/full-stack.t .. 19/? # javascript console output, waiting for connection opened: [ # { # level => "SEVERE", # message => "http://localhost:9526/asset/8e216fcf83/test_result.js 287 WebSocket connection to 'ws://localhost:9528/liveviewhandler/tests/1/developer/ws-proxy/status' failed: Close received after close", # source => "network", # timestamp => 1632392734180, # }, # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/usXVXLzXROijBH3t/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1632392734653, # }, # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/usXVXLzXROijBH3t/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1632392734835, # }, # ] # javascript console output, waiting for connection opened: [ # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/usXVXLzXROijBH3t/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1632392735967, # }, # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/usXVXLzXROijBH3t/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1632392735967, # }, # ] [10:25:05] t/full-stack.t .. 22/? # javascript console output, waiting for paused: [ # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/usXVXLzXROijBH3t/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1632392741038, # }, # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/usXVXLzXROijBH3t/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1632392741038, # }, # ] [10:25:05] t/full-stack.t .. 25/? # javascript console output: [ # { # level => "SEVERE", # message => "http://localhost:9526/asset/8e216fcf83/test_result.js 287 WebSocket connection to 'ws://localhost:9528/liveviewhandler/tests/1/developer/ws-proxy/status' failed: Close received after close", # source => "network", # timestamp => 1632392742964, # }, # ] [10:25:05] t/full-stack.t .. 33/? # stopped waiting for '(?^u:Result: passed)', result turned out to be 'failed' # full result panel contents: # 0 1 # Result: failed finished less than a minute ago ( 00:10 minutes ) # Reason: isotovideo done: Can't use an undefined value as a symbol reference at /home/squamata/os-autoinst/myjsonrpc.pm line 54. # Scheduled product: job has not been created by posting an ISO # Assigned worker: 03486c3d351a:1
Locally could not reproduce a failure:
$ make test KEEP_DB=1 TESTS=t/full-stack.t FULLSTACK=1 test -d /dev/shm/tpg && (pg_ctl -D /dev/shm/tpg -s status >&/dev/null || pg_ctl -D /dev/shm/tpg -s start) || ./t/test_postgresql /dev/shm/tpg make test-unit-and-integration TEST_PG="DBI:Pg:dbname=openqa_test;host=/dev/shm/tpg" make[1]: Entering directory '/home/okurz/local/os-autoinst/openQA' export GLOBIGNORE="";\ export DEVEL_COVER_DB_FORMAT=JSON;\ export PERL5OPT=" -It/lib -I/home/okurz/local/os-autoinst/openQA/t/lib -MOpenQA::Test::PatchDeparse";\ RETRY=0 timeout -s SIGINT -k 5 -v $((60 * 1 * (0 + 1) ))m tools/retry prove -l --trap t/full-stack.t t/full-stack.t .. ok All tests successful. Files=1, Tests=111, 176 wallclock secs ( 0.54 usr 0.02 sys + 51.49 cusr 11.38 csys = 63.43 CPU) Result: PASS make[1]: Leaving directory '/home/okurz/local/os-autoinst/openQA' [ 1 = 1 ] || pg_ctl -D /dev/shm/tpg stop
#7
Updated by okurz over 1 year ago
- Status changed from Workable to In Progress
- Assignee set to okurz
Gathering statistics
#8
Updated by okurz over 1 year ago
Running count_fail_ratio make test KEEP_DB=1 TESTS=t/full-stack.t FULLSTACK=1
with count_fail_ratio locally I have:
## count_fail_ratio: Run: 20. Fails: 0. Fail ratio 0%
with a mean runtime of t/full-stack.t of roughly 150s. Maybe reproducible with coverage analysis calling count_fail_ratio make coverage KEEP_DB=1 TESTS=t/full-stack.t FULLSTACK=1
:
## Run 4 export DEVEL_COVER_DB_FORMAT=JSON;\ COVERAGE=1 cover -test Deleting database /home/okurz/local/os-autoinst/openQA/cover_db cover: running make test "OPTIMIZE=-O0 -fprofile-arcs -ftest-coverage" "OTHERLDFLAGS=-fprofile-arcs -ftest-coverage" make[1]: Entering directory '/home/okurz/local/os-autoinst/openQA' test -d /dev/shm/tpg && (pg_ctl -D /dev/shm/tpg -s status >&/dev/null || pg_ctl -D /dev/shm/tpg -s start) || ./t/test_postgresql /dev/shm/tpg make test-unit-and-integration TEST_PG="DBI:Pg:dbname=openqa_test;host=/dev/shm/tpg" make[2]: Entering directory '/home/okurz/local/os-autoinst/openQA' export GLOBIGNORE="";\ export DEVEL_COVER_DB_FORMAT=JSON;\ export PERL5OPT="-mJSON::PP -MDevel::Cover=-select_re,'^/lib',+ignore_re,lib/perlcritic/Perl/Critic/Policy,-coverage,statement,-db,cover_db, -It/lib -I/home/okurz/local/os-autoinst/openQA/t/lib -MOpenQA::Test::PatchDeparse";\ RETRY=0 timeout -s SIGINT -k 5 -v $((60 * 1 * (0 + 1) ))m tools/retry prove -l --trap t/full-stack.t t/full-stack.t .. Failed 1/94 subtests Test Summary Report ------------------- t/full-stack.t (Wstat: 0 Tests: 93 Failed: 0) Parse errors: Tests out of sequence. Found (19) but expected (18) Tests out of sequence. Found (20) but expected (19) Tests out of sequence. Found (21) but expected (20) Tests out of sequence. Found (22) but expected (21) Tests out of sequence. Found (23) but expected (22) Displayed the first 5 of 77 TAP syntax errors. Re-run prove with the -p option to see them all. Files=1, Tests=93, 303 wallclock secs ( 1.01 usr 0.03 sys + 199.11 cusr 15.47 csys = 215.62 CPU) Result: FAIL
well, at least some failure. I don't see what step actually failed, strange. All other tests in this run passed so the overall statistics locally from the above command:
## count_fail_ratio: Run: 20. Fails: 1. Fail ratio 5%
Maybe this is about KVM? AFAIR in circleCI there is no KVM support so the full stack test sets the os-autoinst test variable QEMU_NO_KVM based on the line t/lib/OpenQA/Test/FullstackUtils.pm:$JOB_SETUP->{QEMU_NO_KVM} = '1' unless -r '/dev/kvm';
Running with QEMU_NO_KVM=1 count_fail_ratio make coverage KEEP_DB=1 TESTS=t/full-stack.t FULLSTACK=1
This was also fine with ## count_fail_ratio: Run: 7. Fails: 0. Fail ratio 0%
#9
Updated by openqa_review over 1 year ago
- Due date set to 2021-10-08
Setting due date based on mean cycle time of SUSE QE Tools
#10
Updated by okurz over 1 year ago
- Status changed from In Progress to Workable
- Assignee deleted (
okurz)
https://github.com/os-autoinst/openQA/pull/4233 clearly reproduces the problem in CI. The problem seems to be about recent software changes. Also the "dependency PR" in the past days as visible in https://github.com/os-autoinst/openQA/pull/4232 is causing problems trying to update os-autoinst and deleting some dependencies. The full-stack.t log file then says "Can't locate Inline.pm in @INC" so it can not work.
I suspect a relation to
https://github.com/os-autoinst/os-autoinst/pull/1784
or
https://github.com/os-autoinst/os-autoinst/pull/1782
#11
Updated by cdywan over 1 year ago
- Status changed from Workable to In Progress
- Assignee set to cdywan
I'm taking this since it's affecting a lot of PRs. Let's see what I can come up with/
#12
Updated by cdywan over 1 year ago
Not sure what's causing the Something Unexpected Happened
error on CircleCI. At least one resut I can still see failed with Reason: isotovideo done: Can't use an undefined value as a symbol reference at /home/squamata/os-autoinst/myjsonrpc.pm line 54.
so I'm adding DEBUG_JSON=1
to see if that makes the error more meaningful.
I'm also trying so far unsuccessfully to replicate the error locally (yes, I read all of the above comments).
#13
Updated by cdywan over 1 year ago
I managed to reproduce in the meanwhile (using for ((i=1; i <= 20; i++)); do EXTRA_PROVE_ARGS=-v QEMU_NO_KVM=1 FULLSTACK=1 DEBUG_JSON=1 OPENQA_TEST_TIMEOUT_DISABLE=1 make test-with-database TESTS=t/full-stack.t || break; done
) which makes me think the two container PR's are unrelated since the fullstack test isn't using containers afair.
#14
Updated by cdywan over 1 year ago
Proposed an attempt at improving diagnostics to add context:
- the job we're waiting on
- subtests to allow failures to retain context
- JSON errors from os-autoinst
#15
Updated by cdywan over 1 year ago
Another failure in CI:
- https://github.com/os-autoinst/openQA/pull/4244
Bailout called. Further testing stopped: URL for os-autoinst cmd srv not available[...]Reason: tests died: unable to load main.pm, check the log for the cause
I'm trying to reproduce this one also by running tests on 15.3 but couldn't so far (note: this is distinct from the isotovideo issue above which I can sometimes reproduce on Tumbleweed), but I guess I'll try to change back the container anyway and see what I can learn from that.
Edit: I can reproduce it. I was stupidly using the wrong container w/o realizing it (this is why I hate errors disguised as warnings that get lost in scrollback).
Interestingly on Tumbleweed Inline-Python gets installed, on Leap 15.3 it doesn't with the exact same packages. Doesn't seem to affect the failure, though.
#16
Updated by cdywan over 1 year ago
okurz wrote:
https://github.com/os-autoinst/openQA/pull/4233 clearly reproduces the problem in CI. The problem seems to be about recent software changes. Also the "dependency PR" in the past days as visible in https://github.com/os-autoinst/openQA/pull/4232 is causing problems trying to update os-autoinst and deleting some dependencies. The full-stack.t log file then says "Can't locate Inline.pm in @INC" so it can not work.
- As indicated before I can't confirm that InlinePython is preventing the test from passing.
- I proposed a PR to address the mismatching hash, although I'm not 100% sure why it happened.
#17
Updated by okurz over 1 year ago
cdywan wrote:
Not sure what's causing the
Something Unexpected Happened
error on CircleCI
this was a problem on 2021-09-27, see https://status.circleci.com/ for the entry of that day: "[…] Right now, you might be unable to access the CircleCI UI. This is unfortunately a side-effect of our effort to fix the initial incident."
I'm also trying so far unsuccessfully to replicate the error locally (yes, I read all of the above comments).
[…]
I managed to reproduce in the meanwhile (usingfor ((i=1; i <= 20; i++)); do EXTRA_PROVE_ARGS=-v QEMU_NO_KVM=1 FULLSTACK=1 DEBUG_JSON=1 OPENQA_TEST_TIMEOUT_DISABLE=1 make test-with-database TESTS=t/full-stack.t || break; done
)
just to verify, so you did manage to reproduce problems when running the above locally, right?
[…] which makes me think the two container PR's are unrelated since the fullstack test isn't using containers afair.
which "two container PR's"? And yes, if you just call make test TESTS=t/full-stack.t
(or as you did above) the test will run in your local environment, no container.
#18
Updated by cdywan over 1 year ago
okurz wrote:
cdywan wrote:
[…] which makes me think the two container PR's are unrelated since the fullstack test isn't using containers afair.
which "two container PR's"? And yes, if you just call
make test TESTS=t/full-stack.t
(or as you did above) the test will run in your local environment, no container.
I was referring to https://github.com/os-autoinst/os-autoinst/pull/1784 and https://github.com/os-autoinst/os-autoinst/pull/1782 respectively. I am able to reproduce all except for one error on Tumbleweed, with a certain probability after repeated executions.
I can reproduce tests died: unable to load main.pm, check the log for the cause
reliably on Leap, although I still don't understand the cause because e.g. missing Python support doesn't affect the result.
#19
Updated by okurz over 1 year ago
In https://app.circleci.com/pipelines/github/os-autoinst/openQA/7869/workflows/ea038df5-5d73-4559-8ac8-3ad19a68de73/jobs/74362?invite=true#step-104-963 for https://github.com/os-autoinst/openQA/pull/4251 I see that still an old version of os-autoinst-devel is installed
(263/264) Installing: os-autoinst-devel-4.6.1589880721.55a008dd-lp152.1.13.x86_64 [/](263/264) Installing: os-autoinst-devel-4.6.1589880721.55a008dd-lp152.1.13.x86_64 [done]
trying to reproduce this locally in the container registry.opensuse.org/devel/openqa/ci/containers/base:latest I see
2e14fdb3da3e:/opt/testing_area # zypper in --dry-run os-autoinst-devel Loading repository data... Reading installed packages... Resolving package dependencies... Problem: os-autoinst-devel-4.6.1632799442.f77d4e14-lp152.864.1.x86_64 requires qemu >= 4.0, but this requirement cannot be provided not installable providers: qemu-4.2.0-lp152.8.1.x86_64[repo-oss] qemu-4.2.1-lp152.9.12.1.x86_64[repo-update] qemu-4.2.1-lp152.9.16.2.x86_64[repo-update] qemu-4.2.1-lp152.9.20.1.x86_64[repo-update] qemu-4.2.1-lp152.9.3.1.x86_64[repo-update] qemu-4.2.1-lp152.9.6.1.x86_64[repo-update] qemu-4.2.1-lp152.9.9.2.x86_64[repo-update] Solution 1: Following actions will be done: install qemu-4.2.1-lp152.9.20.1.x86_64 (with vendor change) obs://build.opensuse.org/devel:openQA --> openSUSE install qemu-x86-4.2.1-lp152.9.20.1.x86_64 (with vendor change) obs://build.opensuse.org/devel:openQA --> openSUSE install qemu-kvm-4.2.1-lp152.9.20.1.x86_64 (with vendor change) obs://build.opensuse.org/devel:openQA --> openSUSE Solution 2: do not install os-autoinst-devel-4.6.1632799442.f77d4e14-lp152.864.1.x86_64 Solution 3: break os-autoinst-devel-4.6.1632799442.f77d4e14-lp152.864.1.x86_64 by ignoring some of its dependencies
so there seems to be a problem about the qemu package dependency.
The container image we use for generating dependency update PRs is https://build.opensuse.org/package/show/devel:openQA:ci/base but https://build.opensuse.org/project/show/devel:openQA:ci included a link to qemu from openSUSE:Leap:15.1:Update (yes, 15.1!), which obviously we did as workaround for some problem at that time but I don't remember which one. Now Leap 15.2 has newer versions for qemu but in the base we still had the old version qemu-3.1.1.1 from 15.1 installed which can explain the conflict asking for a vendor change. I removed the linked package and "base" is currently rebuilding.
EDIT: "base" image is rebuilt. I added a description to https://build.opensuse.org/package/show/devel:openQA:ci/base to explain my changes as well. Trying out the refreshed container image locally works fine. I retriggered the nightly pipeline on CI: https://app.circleci.com/pipelines/github/os-autoinst/openQA/7869/workflows/4ece0f1a-d3fb-4692-b856-584cf4d6dc66/jobs/74398
the job did not create a new specific pull request. Reading the code I assume that is because actually no dependencies change. Could that be?
EDIT: 2021-09-25 1340Z: We will see tomorrow if a new automatic dependency PR would be created. In the meantime the actual full-stack.t problems can further be investigated.
#20
Updated by cdywan over 1 year ago
Still locally re-running full-stack.t off my full_stack_t_results_panel_context branch since I don't think I've caught all possible issues but still can't force the failures. Meanwhile poking at #98898 since I re-considered if it's really blocked and I don't just want to spend time looking at passing tests
#21
Updated by cdywan over 1 year ago
- Status changed from In Progress to Feedback
cdywan wrote:
Still locally re-running full-stack.t off my full_stack_t_results_panel_context branch since I don't think I've caught all possible issues but still can't force the failures. Meanwhile poking at #98898 since I re-considered if it's really blocked and I don't just want to spend time looking at passing tests
Results are looking stable for me locally, and even in CI. I updated the branch and added missing details to the description, and I'd like to get this merged in any case.
#22
Updated by cdywan over 1 year ago
I didn't get to take review comments into account because I was investigating #99195, hoping to do that today
#23
Updated by cdywan over 1 year ago
Looks like the myjsonrpc error is back. Unfortunately despite setting DEBUG_JSON
in start_worker
now I don't see the ($$) send_json($fd) JSON=$copy
message I would expect 🤨️
# 0 FAILED--Further testing stopped. Retry 5 of 5 … [16:35:16] t/full-stack.t .. 21/? # javascript console output, waiting for connection opened: [ # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/D6Ug4fkKaODmApGW/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1633538136370, # }, # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/D6Ug4fkKaODmApGW/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1633538136559, # }, # ] # javascript console output, waiting for connection opened: [ # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/D6Ug4fkKaODmApGW/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1633538137700, # }, # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/D6Ug4fkKaODmApGW/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1633538137700, # }, # ] [16:35:16] t/full-stack.t .. 25/? # javascript console output, waiting for paused: [ # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/D6Ug4fkKaODmApGW/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1633538142773, # }, # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/D6Ug4fkKaODmApGW/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1633538142773, # }, # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/D6Ug4fkKaODmApGW/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1633538142773, # }, # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/D6Ug4fkKaODmApGW/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1633538142773, # }, # ] [16:35:16] t/full-stack.t .. 26/? # stopped waiting for '(?^u:Result: passed)', result turned out to be 'failed' # full result panel contents: # 0 1 # Result: failed finished less than a minute ago ( 00:10 minutes ) # Reason: isotovideo done: Can't use an undefined value as a symbol reference at /home/squamata/os-autoinst/myjsonrpc.pm line 42. # Scheduled product: job has not been created by posting an ISO # Assigned worker: d7cabfd85cc4:1 # Failed test 'Expected result for job 1 not found' # at /home/squamata/project/t/lib/OpenQA/Test/FullstackUtils.pm line 99. # Failed test 'job 1 passed' # at t/full-stack.t line 192. # 0 Bailout called. Further testing stopped: 0
#24
Updated by cdywan over 1 year ago
- find fullstack.t autoinst.txt and preserve it somewhere persistently
- attempt to reproduce "isotovideo done: Can't use an undefined value as a symbol" in os-autoinst
- maybe the console closing caused an invalid file handle
- shorten the fullstack test?
#25
Updated by cdywan over 1 year ago
I'm splitting out fixes now, according to specific issues:
https://github.com/os-autoinst/openQA/pull/4285
[13:04:01] t/full-stack.t .. 98/? Bailout called. Further testing stopped: clickElement: element not interactable at /home/squamata/project/t/lib/OpenQA/SeleniumTest.pm:81 FAILED--Further testing stopped: clickElement: element not interactable at /home/squamata/project/t/lib/OpenQA/SeleniumTest.pm:81 make[2]: *** [Makefile:195: test-unit-and-integration] Error 255 make[2]: Leaving directory '/home/squamata/project' make[1]: *** [Makefile:190: test-with-database] Error 2 make[1]: Leaving directory '/home/squamata/project' make: *** [Makefile:179: test-fullstack] Error 2 Exited with code exit status 2
https://github.com/os-autoinst/openQA/pull/4286
t/full-stack.t .. 36/? Bailout called. Further testing stopped: findElement: no such element: Unable to locate element: {"method":"css selector","selector":"#info_box .card-body"} at /home/kalikiana/openQA-dev/repos/openQA/t/lib/OpenQA/SeleniumTest.pm:81
#26
Updated by okurz over 1 year ago
- Due date changed from 2021-10-08 to 2021-10-22
- Priority changed from Urgent to High
Seems like in master there are next to no full-stack failures hitting non-related PRs so reducing prio. For the still open PRs I am confident we can get them merged within the next two weeks
#27
Updated by cdywan over 1 year ago
Split off another PR based on a commit adding an uncaught console error, but instead proposing to make console errors consistently fail tests: https://github.com/os-autoinst/openQA/pull/4288
#28
Updated by cdywan over 1 year ago
cdywan wrote:
[13:04:01] t/full-stack.t .. 98/? Bailout called. Further testing stopped: clickElement: element not interactable at /home/squamata/project/t/lib/OpenQA/SeleniumTest.pm:81 FAILED--Further testing stopped: clickElement: element not interactable at /home/squamata/project/t/lib/OpenQA/SeleniumTest.pm:81https://github.com/os-autoinst/openQA/pull/4286
t/full-stack.t .. 36/? Bailout called. Further testing stopped: findElement: no such element: Unable to locate element: {"method":"css selector","selector":"#info_box .card-body"} at /home/kalikiana/openQA-dev/repos/openQA/t/lib/OpenQA/SeleniumTest.pm:81
Alternative, global approach to element find timeouts https://github.com/os-autoinst/openQA/pull/4293
#29
Updated by okurz over 1 year ago
both last mentioned PRs still open. https://app.circleci.com/pipelines/github/os-autoinst/openQA/8095/workflows/51b61f49-36f3-43a9-961a-715ee2af25c3/jobs/76539/parallel-runs/0/steps/0-111 in one of my pull requests looks lkie related:
[17:00:39] t/full-stack.t .. 23/? # javascript console output, waiting for connection opened: [ # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/wUn0NgQsFzk9IVSB/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1634230882315, # }, # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/wUn0NgQsFzk9IVSB/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1634230882682, # }, # ] # javascript console output, waiting for connection opened: [ # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/wUn0NgQsFzk9IVSB/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1634230883918, # }, # { # level => "SEVERE", # message => "http://localhost:9526/asset/88ed2c606e/ws_console.js 8 WebSocket connection to 'ws://127.0.0.1:20013/wUn0NgQsFzk9IVSB/ws' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED", # source => "network", # timestamp => 1634230883918, # }, # ]
#30
Updated by okurz over 1 year ago
I found again annoying full-stack problems, e.g. in
https://app.circleci.com/pipelines/github/os-autoinst/openQA/8119/workflows/52e0fa2d-164f-475b-ab07-68129f483bca/jobs/76744/steps
I manually retriggered and tests worked so I created https://github.com/os-autoinst/openQA/pull/4314 to add even more retries.
I also created
https://github.com/os-autoinst/os-autoinst/pull/1827
to introduce signatures in os-autoinst myjsonrpc. This might help us with the ugly error
# Reason: isotovideo done: Can't use an undefined value as a symbol reference at /home/squamata/os-autoinst/myjsonrpc.pm line 42.
But likely it won't help because there will be an argument supplied only that the socket behind that is actually undef.
#31
Updated by cdywan over 1 year ago
https://github.com/os-autoinst/openQA/pull/4286
t/full-stack.t .. 36/? Bailout called. Further testing stopped: findElement: no such element: Unable to locate element: {"method":"css selector","selector":"#info_box .card-body"} at /home/kalikiana/openQA-dev/repos/openQA/t/lib/OpenQA/SeleniumTest.pm:81
Still scratching my head on how this somehow reduces coverage and trying things out. I don't even understand how this can affect coverage (see comments on the PR).
Alternative, global approach to element find timeouts https://github.com/os-autoinst/openQA/pull/4293
Changed this to drop implicit timeouts instead. Still confirming necessary fixes for tests that were relying on the 2s delay.
cdywan wrote:
Proposed an attempt at improving diagnostics to add context:
- the job we're waiting on
- subtests to allow failures to retain context
- JSON errors from os-autoinst
Merged. Most importantly this will reveal logs of failed "jobs" in full-stack.t in CI (and locally w/o special options).
#32
Updated by cdywan over 1 year ago
- Due date changed from 2021-10-22 to 2021-11-12
- Priority changed from High to Normal
As discussed briefly in the weekly, I'm reducing the priority since PRs don't fail due to retries. I plan to reduce retries again once my outstanding PRs addressing racy element interactions are merged.
#33
Updated by cdywan over 1 year ago
https://github.com/os-autoinst/os-autoinst/pull/1828 got merged meaning jobs should fail with myjsonprc: called on undefined file descriptor
rather than isotovideo done: Can't use an undefined value as a symbol reference at
. in combination with https://github.com/os-autoinst/openQA/pull/4243 I'm hoping to be able to see why
#34
Updated by okurz over 1 year ago
- Related to action #101734: Prevent "called on undefined file descriptor" in myjsonrpc.pm:40 in openQA t/full-stack.t added
#35
Updated by cdywan over 1 year ago
Based on an idea in conversation I prepared a proof of concept so we can more easily test stability w/o having to modify CI config files in separate branches that we won't want to merge and simply add e.g. #fullstack-stability
to unmodified PRs.
#36
Updated by cdywan over 1 year ago
- Status changed from Feedback to In Progress
Should set this to In Progress.
#37
Updated by cdywan over 1 year ago
- Due date changed from 2021-11-12 to 2021-11-19
cdywan wrote:
Based on an idea in conversation I prepared a proof of concept so we can more easily test stability w/o having to modify CI config files in separate branches that we won't want to merge and simply add e.g.
#fullstack-stability
to unmodified PRs.
Ended up preparing a reference PR anyway since the GHA isn't fully working. Was confused by errors also mentioned in #102332. But this is also because we want to be conservative here and I wanted to demonstrate that we do improve, and not somehow add new instabilities.
#38
Updated by okurz over 1 year ago
- Priority changed from Normal to High
t/full-stack.t is becoming more and more of a problem, e.g. see https://app.circleci.com/pipelines/github/os-autoinst/openQA/8392/workflows/6748c60b-4966-4db6-806b-415da997d321/jobs/79281 from https://github.com/os-autoinst/openQA/pull/4360 . I suggest more people to collaborate on this ticket.
#39
Updated by cdywan over 1 year ago
- Due date deleted (
2021-11-19) - Status changed from In Progress to New
- Assignee deleted (
cdywan)
okurz wrote:
t/full-stack.t is becoming more and more of a problem, e.g. see https://app.circleci.com/pipelines/github/os-autoinst/openQA/8392/workflows/6748c60b-4966-4db6-806b-415da997d321/jobs/79281 from https://github.com/os-autoinst/openQA/pull/4360 . I suggest more people to collaborate on this ticket.
# Failed test 'Expected result for job 1 not found'
And that's yet another failure. I'm setting this back to New now so it can be re-assessed. And either we agree on what exact error(s) will be fixed here, or it needs to be an epic.
#40
Updated by cdywan over 1 year ago
cdywan wrote:
okurz wrote:
t/full-stack.t is becoming more and more of a problem, e.g. see https://app.circleci.com/pipelines/github/os-autoinst/openQA/8392/workflows/6748c60b-4966-4db6-806b-415da997d321/jobs/79281 from https://github.com/os-autoinst/openQA/pull/4360 . I suggest more people to collaborate on this ticket.
# Failed test 'Expected result for job 1 not found'And that's yet another failure. I'm setting this back to New now so it can be re-assessed. And either we agree on what exact error(s) will be fixed here, or it needs to be an epic.
- merge https://github.com/os-autoinst/openQA/pull/4288 first
- a PR to expose worker log in the
schedule job
subtest - drop https://github.com/os-autoinst/openQA/pull/4293 since it might cause new issues and doesn't immediately help here
#41
Updated by cdywan over 1 year ago
- Tracker changed from action to coordination
- Subject changed from t/full-stack.t can timeout, "clickElement: element not interactable" size:M to [epic] t/full-stack.t sporadically fails "clickElement: element not interactable" and other errors
- Status changed from New to Blocked
- Assignee set to cdywan
Making this an epic now
#42
Updated by cdywan over 1 year ago
- Copied to action #102578: [sporadic] t/full-stack.t Failed test 'Expected result for job 1 not found' size:M added
#43
Updated by cdywan over 1 year ago
- Copied to coordination #102581: Proof of concept of t/full-stack.t on GitHubActions added
#44
Updated by cdywan over 1 year ago
- Status changed from Blocked to Workable
- Assignee deleted (
cdywan)
All subtasks resolved. Now to pick another one or prove that we're already good (not filing a ticket yet)
#45
Updated by mkittler over 1 year ago
- Assignee set to mkittler
Ok, so that's just about evaluating the current state at this point, possibly creating further tickets. I'll create a stability test PR.
#46
Updated by mkittler over 1 year ago
- Status changed from Workable to In Progress
#47
Updated by openqa_review over 1 year ago
Setting due date based on mean cycle time of SUSE QE Tools
#48
Updated by mkittler over 1 year ago
- Status changed from In Progress to Feedback
It seems stable, PR to remove retry: https://github.com/os-autoinst/openQA/pull/4463
#49
Updated by okurz over 1 year ago
- Status changed from Feedback to Resolved
merged. I can resolve because if this won't stay stable we will see
#50
Updated by mkittler over 1 year ago
- Status changed from Resolved to In Progress
It isn't stable after all, see: https://github.com/os-autoinst/openQA/pull/4468#issuecomment-1019201379
PR for reverting the retry: https://github.com/os-autoinst/openQA/pull/4474
# [2022-01-22T08:07:01.494359Z] [debug] [pid:488] REST-API call: POST http://localhost:9526/api/v1/jobs/1/status # [2022-01-22T08:07:01.528597Z] [info] [pid:831] Uploading video.ogv # [2022-01-22T08:07:01.529126Z] [debug] [pid:831] Uploading artefact video.ogv # [2022-01-22T08:07:01.566264Z] [info] [pid:831] Uploading vars.json # [2022-01-22T08:07:01.566529Z] [debug] [pid:831] Uploading artefact vars.json # [2022-01-22T08:07:01.594899Z] [info] [pid:831] Uploading autoinst-log.txt # [2022-01-22T08:07:01.595178Z] [debug] [pid:831] Uploading artefact autoinst-log.txt # [2022-01-22T08:07:01.621786Z] [info] [pid:831] Uploading worker-log.txt # [2022-01-22T08:07:01.622061Z] [debug] [pid:831] Uploading artefact worker-log.txt Bailout called. Further testing stopped: Job 1 produced the wrong results # Failed test 'no (unexpected) warnings (via END block)' # at /usr/lib/perl5/5.26.1/Test/Builder.pm line 135. # Got the following unexpected warnings: # 1: Use of uninitialized value in subroutine entry at t/full-stack.t line 213.
#51
Updated by openqa_review over 1 year ago
Setting due date based on mean cycle time of SUSE QE Tools
#52
Updated by mkittler over 1 year ago
- Status changed from In Progress to Blocked
I've created a separate ticket for that problem and added it as subtask here: #105429
#53
Updated by mkittler over 1 year ago
I found yet another different problem:
# [2022-02-03T10:15:13.295311Z] [debug] [pid:10464] Updating status so job 8 is not considered dead. # [2022-02-03T10:15:13.295796Z] [debug] [pid:10464] REST-API call: POST http://localhost:9526/api/v1/jobs/8/status # [2022-02-03T10:15:18.449034Z] [debug] [pid:10464] Updating status so job 8 is not considered dead. # [2022-02-03T10:15:18.449608Z] [debug] [pid:10464] REST-API call: POST http://localhost:9526/api/v1/jobs/8/status # [2022-02-03T10:15:18.509589Z] [error] [pid:10464] Unable to setup job 8: Failed to download non-existent.qcow2 to /tmp/kCSyGtlI6Z/full-stack.d/cache/localhost/non-existent.qcow2 # [2022-02-03T10:15:18.509862Z] [debug] [pid:10464] Stopping job 8 from http://localhost:9526: 00000008-tinycore-1-flavor-i386-Build1-core@coolone - reason: setup failure # [2022-02-03T10:15:18.510555Z] [debug] [pid:10464] REST-API call: POST http://localhost:9526/api/v1/jobs/8/status # [2022-02-03T10:15:18.559664Z] [info] [pid:10711] Uploading autoinst-log.txt # [2022-02-03T10:15:18.560230Z] [debug] [pid:10711] Uploading artefact autoinst-log.txt # [2022-02-03T10:15:18.599186Z] [info] [pid:10711] Uploading worker-log.txt # [2022-02-03T10:15:18.599561Z] [debug] [pid:10711] Uploading artefact worker-log.txt Bailout called. Further testing stopped: Job 8 produced the wrong results FAILED--Further testing stopped: Job 8 produced the wrong results
/tmp/kCSyGtlI6Z/full-stack.d/openqa/testresults/00000/00000008-tinycore-1-flavor-i386-Build1-core@coolone/autoinst-log.txt: [2022-02-03T10:15:13.277341Z] [info] [pid:10464] +++ setup notes +++ [2022-02-03T10:15:13.277703Z] [info] [pid:10464] Running on db196f50fa76:1 (Linux 5.11.0-1027-aws #30~20.04.1-Ubuntu SMP Thu Jan 13 11:46:53 UTC 2022 x86_64) [2022-02-03T10:15:13.287826Z] [debug] [pid:10464] Found HDD_1, caching non-existent.qcow2 [2022-02-03T10:15:13.295079Z] [info] [pid:10464] Downloading non-existent.qcow2, request #4 sent to Cache Service [2022-02-03T10:15:18.506652Z] [info] [pid:10464] Download of non-existent.qcow2 processed [2022-02-03T10:15:18.509292Z] [error] [pid:10464] Failed to download non-existent.qcow2 to /tmp/kCSyGtlI6Z/full-stack.d/cache/localhost/non-existent.qcow2 [2022-02-03T10:15:18.554671Z] [info] [pid:10464] +++ worker notes +++ [2022-02-03T10:15:18.554993Z] [info] [pid:10464] End time: 2022-02-03 10:15:18 [2022-02-03T10:15:18.555159Z] [info] [pid:10464] Result: setup failure [2022-02-03T10:15:18.559875Z] [info] [pid:10711] Uploading autoinst-log.txt /tmp/kCSyGtlI6Z/full-stack.d/openqa/testresults/00000/00000008-tinycore-1-flavor-i386-Build1-core@coolone/worker-log.txt: [2022-02-03T10:15:13.276895Z] [debug] [pid:10464] Preparing Mojo::IOLoop::ReadWriteProcess::Session [2022-02-03T10:15:13.278085Z] [debug] [pid:10464] Job settings: [2022-02-03T10:15:13.278316Z] [debug] [pid:10464] ARCH=i386 BUILD=1 CACHEDIRECTORY=/tmp/kCSyGtlI6Z/full-stack.d/cache CACHELIMIT=50 DISTRI=tinycore FLAVOR=flavor HDD_1=non-existent.qcow2 INTEGRATION_TESTS=1 ISO=Core-7.2.iso JOBTOKEN=uqtpW9NvKanR895n LOCAL_UPLOAD=0 LOG_LEVEL=debug MACHINE=coolone NAME=00000008-tinycore-1-flavor-i386-Build1-core@coolone OPENQA_HOSTNAME=localhost:9526 OPENQA_URL=http://localhost:9526 PRJDIR=/tmp/kCSyGtlI6Z/full-stack.d/openqa/share PUBLISH_HDD_1=core-hdd.qcow2 QEMU=i386 QEMUPORT=20012 QEMU_NO_FDC_SET=1 QEMU_NO_KVM=1 QEMU_NO_TABLET=1 RETRY_DELAY=5 RETRY_DELAY_IF_WEBUI_BUSY=60 TEST=core UEFI_PFLASH_VARS=/usr/share/qemu/ovmf-x86_64.bin VERSION=1 VNC=91 WORKER_CLASS=qemu_i386,qemu_x86_64 WORKER_ID=1 WORKER_INSTANCE=1 [2022-02-03T10:15:13.295311Z] [debug] [pid:10464] Updating status so job 8 is not considered dead. [2022-02-03T10:15:13.295796Z] [debug] [pid:10464] REST-API call: POST http://localhost:9526/api/v1/jobs/8/status [2022-02-03T10:15:18.449034Z] [debug] [pid:10464] Updating status so job 8 is not considered dead. [2022-02-03T10:15:18.449608Z] [debug] [pid:10464] REST-API call: POST http://localhost:9526/api/v1/jobs/8/status [2022-02-03T10:15:18.509589Z] [error] [pid:10464] Unable to setup job 8: Failed to download non-existent.qcow2 to /tmp/kCSyGtlI6Z/full-stack.d/cache/localhost/non-existent.qcow2 [2022-02-03T10:15:18.509862Z] [debug] [pid:10464] Stopping job 8 from http://localhost:9526: 00000008-tinycore-1-flavor-i386-Build1-core@coolone - reason: setup failure [2022-02-03T10:15:18.510555Z] [debug] [pid:10464] REST-API call: POST http://localhost:9526/api/v1/jobs/8/status [2022-02-03T10:15:18.559664Z] [info] [pid:10711] Uploading autoinst-log.txt [2022-02-03T10:15:18.560230Z] [debug] [pid:10711] Uploading artefact autoinst-log.txt [2022-02-03T10:15:18.599186Z] [info] [pid:10711] Uploading worker-log.txt [2022-02-03T10:15:18.599561Z] [debug] [pid:10711] Uploading artefact worker-log.txt ok 36 - no (unexpected) warnings (via END block)
#54
Updated by mkittler over 1 year ago
And yet another type of error:
[16:04:54] t/full-stack.t .. 35/? getElementText: stale element reference: element is not attached to the page document at /home/squamata/project/t/lib/OpenQA/SeleniumTest.pm:80 at /home/squamata/project/t/lib/OpenQA/SeleniumTest.pm line 83. OpenQA::SeleniumTest::__ANON__(Test::Selenium::Chrome=HASH(0x55903dd0fc40), "Error while executing command: getElementText: stale element "..., HASH(0x55903e1af408)) called at /usr/lib/perl5/vendor_perl/5.26.1/Selenium/Remote/Driver.pm line 356 Selenium::Remote::Driver::catch {...} ("Error while executing command: getElementText: stale element "...) called at /usr/lib/perl5/vendor_perl/5.26.1/Try/Tiny.pm line 123 Try::Tiny::try(CODE(0x55903df6f838), Try::Tiny::Catch=REF(0x55903e1b64c8)) called at /usr/lib/perl5/vendor_perl/5.26.1/Selenium/Remote/Driver.pm line 361 Selenium::Remote::Driver::__ANON__(CODE(0x55903bfbfd50), Test::Selenium::Chrome=HASH(0x55903dd0fc40), HASH(0x55903e1af408)) called at (eval 1653)[/usr/lib/perl5/vendor_perl/5.26.1/Class/Method/Modifiers.pm:89] line 1 Selenium::Remote::Driver::__ANON__(Test::Selenium::Chrome=HASH(0x55903dd0fc40), HASH(0x55903e1af408)) called at (eval 1655)[/usr/lib/perl5/vendor_perl/5.26.1/Class/Method/Modifiers.pm:148] line 2 Selenium::Remote::Driver::_execute_command(Test::Selenium::Chrome=HASH(0x55903dd0fc40), HASH(0x55903e1af408)) called at (eval 1614)[/usr/lib/perl5/vendor_perl/5.26.1/Sub/Quote.pm:3] line 17 Selenium::Remote::WebElement::_execute_command(Test::Selenium::Remote::WebElement=HASH(0x55903e1b63d8), HASH(0x55903e1af408)) called at /usr/lib/perl5/vendor_perl/5.26.1/Selenium/Remote/WebElement.pm line 358 Selenium::Remote::WebElement::get_text(Test::Selenium::Remote::WebElement=HASH(0x55903e1b63d8)) called at /home/squamata/project/t/lib/OpenQA/Test/FullstackUtils.pm line 64 OpenQA::Test::FullstackUtils::find_status_text(Test::Selenium::Chrome=HASH(0x55903dd0fc40)) called at /home/squamata/project/t/lib/OpenQA/Test/FullstackUtils.pm line 82 OpenQA::Test::FullstackUtils::wait_for_result_panel(Test::Selenium::Chrome=HASH(0x55903dd0fc40), qr(Result: passed)u, "job 6") called at t/full-stack.t line 389 main::__ANON__() called at /usr/lib/perl5/5.26.1/Test/Builder.pm line 309 eval {...} called at /usr/lib/perl5/5.26.1/Test/Builder.pm line 309 Test::Builder::subtest(Test::Builder=HASH(0x5590316b2748), "Cache tests", CODE(0x55903df774f0)) called at /usr/lib/perl5/5.26.1/Test/More.pm line 807 Test::More::subtest("Cache tests", CODE(0x55903df774f0)) called at t/full-stack.t line 438 [16:04:54] t/full-stack.t .. 36/? # Tests were run but no plan was declared and done_testing() was not seen. [16:04:54] t/full-stack.t .. Dubious, test returned 254 (wstat 65024, 0xfe00)
[2022-02-03T16:06:52.286318Z] [debug] [pid:2263] REST-API call: POST http://localhost:9526/api/v1/jobs/5/status [2022-02-03T16:06:52.319892Z] [info] [pid:2479] Uploading video.ogv [2022-02-03T16:06:52.321064Z] [debug] [pid:2479] Uploading artefact video.ogv [2022-02-03T16:06:52.363264Z] [info] [pid:2479] Uploading vars.json [2022-02-03T16:06:52.363544Z] [debug] [pid:2479] Uploading artefact vars.json [2022-02-03T16:06:52.390218Z] [info] [pid:2479] Uploading autoinst-log.txt [2022-02-03T16:06:52.390523Z] [debug] [pid:2479] Uploading artefact autoinst-log.txt [2022-02-03T16:06:52.684203Z] [info] [pid:2479] Uploading worker-log.txt [2022-02-03T16:06:52.684481Z] [debug] [pid:2479] Uploading artefact worker-log.txt /tmp/awiY7cjgy8/full-stack.d/openqa/testresults/00000/00000006-tinycore-1-flavor-i386-Build1-core@coolone/autoinst-log.txt: [2022-02-03T16:07:04.896658Z] [info] [pid:2487] +++ setup notes +++ [2022-02-03T16:07:04.896908Z] [info] [pid:2487] Running on ee00648f165c:1 (Linux 5.11.0-1027-aws #30~20.04.1-Ubuntu SMP Thu Jan 13 11:46:53 UTC 2022 x86_64) [2022-02-03T16:07:04.903808Z] [debug] [pid:2487] Found ISO, caching Core-7.2.iso [2022-02-03T16:07:04.910452Z] [info] [pid:2487] Downloading Core-7.2.iso, request #2 sent to Cache Service [2022-02-03T16:07:10.039168Z] [info] [pid:2487] Download of Core-7.2.iso processed [2022-02-03T16:07:10.049841Z] [debug] [pid:2504] +++ worker notes +++ [37m[2022-02-03T16:07:12.150802Z] [debug] Current version is a7103b16a88ac5669f56dab460989dea6c74d9fa [interface v24] [0m[37m[2022-02-03T16:07:12.160678Z] [debug] git hash in tinycore: a7103b16a88ac5669f56dab460989dea6c74d9fa [0m[37m[2022-02-03T16:07:12.258537Z] [debug] scheduling boot tests/boot.pm [0m[37m[2022-02-03T16:07:12.259277Z] [debug] scheduling assert_screen tests/assert_screen.pm [0m[37m[2022-02-03T16:07:12.259850Z] [debug] scheduling shutdown tests/shutdown.pm [0m[2022-02-03T16:07:12.266159Z] [info] cmdsrv: daemon reachable under http://*:20013/PHWjKijMCBhcl8mk/ [2022-02-03T16:07:12.268301Z] [info] Listening at "http://0.0.0.0:20013" Web application available at http://0.0.0.0:20013 [37m[2022-02-03T16:07:12.269858Z] [debug] git hash in tinycore/needles: a7103b16a88ac5669f56dab460989dea6c74d9fa [0m[37m[2022-02-03T16:07:12.269988Z] [debug] init needles from tinycore/needles [0m[37m[2022-02-03T16:07:12.271322Z] [debug] loaded 4 needles [0m[37m[2022-02-03T16:07:12.274362Z] [debug] Blocking SIGCHLD and SIGTERM [0m[37m[2022-02-03T16:07:12.449349Z] [debug] Unblocking SIGCHLD and SIGTERM [0m[37m[37m[2022-02-03T16:07:12.565572Z] [debug] 2504: channel_out 15, channel_in 14 [2022-02-03T16:07:12.565578Z] [debug] Blocking SIGCHLD and SIGTERM [0m[0m[37m[2022-02-03T16:07:12.737782Z] [debug] Unblocking SIGCHLD and SIGTERM [0m[37m[2022-02-03T16:07:12.739436Z] [debug] 2587: cmdpipe 13, rsppipe 16 [0m[37m[2022-02-03T16:07:12.739544Z] [debug] started mgmt loop with pid 2587 [0m[37m[2022-02-03T16:07:12.831209Z] [debug] qemu version detected: 5.2.0 [0m[37m[2022-02-03T16:07:12.832866Z] [debug] running `/usr/bin/chattr +C /tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/raid` [0m[37m[2022-02-03T16:07:12.854062Z] [debug] Fatal error in command `/usr/bin/chattr +C /tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/raid`: open3: exec of /usr/bin/chattr +C /tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/raid failed: No such file or directory at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 127. [0m[37m[2022-02-03T16:07:12.854171Z] [debug] Configuring storage controllers and block devices [0m[37m[2022-02-03T16:07:12.854760Z] [debug] running `/usr/bin/qemu-img info --output=json /tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/Core-7.2.iso` [0m[37m[2022-02-03T16:07:12.939700Z] [debug] Initializing block device images [0m[37m[2022-02-03T16:07:12.939937Z] [debug] running `/usr/bin/qemu-img create -f qcow2 /tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/raid/hd0 10G` [0m[37m[2022-02-03T16:07:12.954196Z] [debug] Formatting '/tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/raid/hd0', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=10737418240 lazy_refcounts=off refcount_bits=16 [0m[37m[2022-02-03T16:07:12.954325Z] [debug] running `/usr/bin/qemu-img create -f qcow2 -F raw -b /tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/Core-7.2.iso /tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/raid/cd0-overlay0 11116544` [0m[37m[2022-02-03T16:07:12.967019Z] [debug] Formatting '/tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/raid/cd0-overlay0', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=11116544 backing_file=/tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/Core-7.2.iso backing_fmt=raw lazy_refcounts=off refcount_bits=16 [0m[37m[2022-02-03T16:07:12.967116Z] [debug] init_blockdev_images: Finished creating block devices [0m[37m[2022-02-03T16:07:12.968127Z] [debug] starting: /usr/bin/qemu-system-i386 -only-migratable -chardev ringbuf,id=serial0,logfile=serial0,logappend=on -serial chardev:serial0 -audiodev none,id=snd0 -device intel-hda -device hda-output,audiodev=snd0 -m 1024 -netdev user,id=qanet0 -device virtio-net,netdev=qanet0,mac=52:54:00:12:34:56 -boot once=d -smp 1 -no-shutdown -vnc :91,share=force-shared -device virtio-serial -chardev pipe,id=virtio_console,path=virtio_console,logfile=virtio_console.log,logappend=on -device virtconsole,chardev=virtio_console,name=org.openqa.console.virtio_console -chardev pipe,id=virtio_console1,path=virtio_console1,logfile=virtio_console1.log,logappend=on -device virtconsole,chardev=virtio_console1,name=org.openqa.console.virtio_console1 -chardev socket,path=qmp_socket,server=on,wait=off,id=qmp_socket,logfile=qmp_socket.log,logappend=on -qmp chardev:qmp_socket -S -device virtio-scsi-pci,id=scsi0 -blockdev driver=file,node-name=hd0-file,filename=/tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/raid/hd0,cache.no-flush=on -blockdev driver=qcow2,node-name=hd0,file=hd0-file,cache.no-flush=on -device virtio-blk,id=hd0-device,drive=hd0,serial=hd0 -blockdev driver=file,node-name=cd0-overlay0-file,filename=/tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/raid/cd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=cd0-overlay0,file=cd0-overlay0-file,cache.no-flush=on -device scsi-cd,id=cd0-device,drive=cd0-overlay0,serial=cd0 [0m[37m[2022-02-03T16:07:12.972810Z] [debug] Waiting for 0 attempts [0m[37m[2022-02-03T16:07:13.973461Z] [debug] Waiting for 1 attempts [0m[37m[2022-02-03T16:07:14.973862Z] [debug] Finished after 2 attempts [0m[37m[2022-02-03T16:07:14.975814Z] [debug] Establishing VNC connection to localhost:5991 [0m[37m[2022-02-03T16:07:14.978549Z] [debug] pointer type 0 0 640 480 -257 [0m[37m[2022-02-03T16:07:14.978648Z] [debug] led state 0 1 1 -261 [0m[37m[2022-02-03T16:07:15.040928Z] [debug] Start CPU [0m[37m[2022-02-03T16:07:15.041399Z] [debug] EVENT {"data":{"client":{"family":"ipv4","host":"127.0.0.1","service":"46800","websocket":false},"server":{"auth":"none","family":"ipv4","host":"0.0.0.0","service":"5991","websocket":false}},"event":"VNC_CONNECTED","timestamp":{"microseconds":976477,"seconds":1643904434}} [0m[37m[2022-02-03T16:07:15.041581Z] [debug] EVENT {"data":{"client":{"family":"ipv4","host":"127.0.0.1","service":"46800","websocket":false},"server":{"auth":"none","family":"ipv4","host":"0.0.0.0","service":"5991","websocket":false}},"event":"VNC_INITIALIZED","timestamp":{"microseconds":977439,"seconds":1643904434}} [0m[37m[2022-02-03T16:07:15.041723Z] [debug] EVENT {"event":"RESUME","timestamp":{"microseconds":41397,"seconds":1643904435}} [0mGOT GO [37m[2022-02-03T16:07:15.052833Z] [debug] QEMU: QEMU emulator version 5.2.0 (SUSE Linux Enterprise 15) [0m[37m[2022-02-03T16:07:15.052915Z] [debug] QEMU: Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers [0m[37m[2022-02-03T16:07:15.057272Z] [debug] Snapshots are supported [0m[1;34m[2022-02-03T16:07:15.058198Z] [debug] ||| starting boot tests/boot.pm [0m[2022-02-03T16:07:15.060589Z] [debug] tests/boot.pm:9 called testapi::assert_screen [2022-02-03T16:07:15.060877Z] [debug] <<< testapi::assert_screen(mustmatch="core", no_wait=1, timeout=15) [37m[2022-02-03T16:07:15.077950Z] [debug] no match: 14.9s, best candidate: boot-core (0.00) [0m[37m[2022-02-03T16:07:15.178824Z] [debug] no change: 14.8s [0m[37m[2022-02-03T16:07:15.188925Z] [debug] no match: 14.8s, best candidate: boot-core (0.00) [0m[37m[2022-02-03T16:07:15.289769Z] [debug] no change: 14.7s [0m[37m[2022-02-03T16:07:15.296391Z] [debug] no match: 14.7s, best candidate: boot-core (0.00) [0m[37m[2022-02-03T16:07:15.397227Z] [debug] no change: 14.6s [0m[37m[2022-02-03T16:07:15.404120Z] [debug] no match: 14.6s, best candidate: boot-core (0.00) [0m[37m[2022-02-03T16:07:15.504853Z] [debug] no change: 14.5s [0m[37m[2022-02-03T16:07:15.511632Z] [debug] no match: 14.5s, best candidate: boot-core (0.00) [0m[32m[2022-02-03T16:07:15.716122Z] [debug] >>> testapi::_handle_found_needle: found boot-core, similarity 0.97 @ 1/74 [0m[2022-02-03T16:07:15.716663Z] [debug] tests/boot.pm:10 called testapi::send_key [2022-02-03T16:07:15.716797Z] [debug] <<< testapi::send_key(key="ret", wait_screen_change=0, do_wait=0) [2022-02-03T16:07:15.986151Z] [debug] tests/boot.pm:14 called testapi::assert_screen [2022-02-03T16:07:15.986329Z] [debug] <<< testapi::assert_screen(mustmatch="on_prompt", timeout=90) [37m[2022-02-03T16:07:16.617937Z] [debug] no match: 89.3s, best candidate: boot-on_prompt (0.00) [0m[37m[2022-02-03T16:07:17.614789Z] [debug] no match: 88.3s, best candidate: boot-on_prompt (0.00) [0m[37m[2022-02-03T16:07:18.615972Z] [debug] no match: 87.3s, best candidate: boot-on_prompt (0.00) [0m[37m[2022-02-03T16:07:19.617021Z] [debug] no match: 86.3s, best candidate: boot-on_prompt (0.00) [0m[37m[2022-02-03T16:07:20.678866Z] [debug] no match: 85.3s, best candidate: boot-on_prompt (0.00) [0m[37m[2022-02-03T16:07:21.618809Z] [debug] no match: 84.3s, best candidate: boot-on_prompt (0.00) [0m[37m[2022-02-03T16:07:22.625192Z] [debug] no match: 83.3s, best candidate: boot-on_prompt (0.00) [0m[37m[2022-02-03T16:07:23.621412Z] [debug] no match: 82.3s, best candidate: boot-on_prompt (0.00) [0m[37m[2022-02-03T16:07:24.631652Z] [debug] no match: 81.3s, best candidate: boot-on_prompt (0.00) [0m[37m[2022-02-03T16:07:25.676580Z] [debug] no match: 80.3s, best candidate: boot-on_prompt (0.00) [0m[37m[2022-02-03T16:07:26.635284Z] [debug] no match: 79.3s, best candidate: boot-on_prompt (0.00) [0m[32m[2022-02-03T16:07:27.709840Z] [debug] >>> testapi::_handle_found_needle: found boot-on_prompt, similarity 1.00 @ 2/64 [0m[1;34m[2022-02-03T16:07:27.713184Z] [debug] ||| finished boot tests (runtime: 12 s) [0m[1;34m[2022-02-03T16:07:27.714368Z] [debug] ||| starting assert_screen tests/assert_screen.pm [0m[2022-02-03T16:07:27.714949Z] [debug] tests/assert_screen.pm:9 called testapi::assert_screen [2022-02-03T16:07:27.715067Z] [debug] <<< testapi::assert_screen(mustmatch="on_prompt", timeout=60) [32m[2022-02-03T16:07:28.700955Z] [debug] >>> testapi::_handle_found_needle: found boot-on_prompt, similarity 1.00 @ 2/64 [0m[2022-02-03T16:07:28.701104Z] [debug] tests/assert_screen.pm:10 called testapi::assert_screen [2022-02-03T16:07:28.701229Z] [debug] <<< testapi::assert_screen(mustmatch="on_prompt", no_wait=1, timeout=30) [32m[2022-02-03T16:07:28.780114Z] [debug] >>> testapi::_handle_found_needle: found boot-on_prompt, similarity 1.00 @ 2/64 [0m[1;34m[2022-02-03T16:07:28.781313Z] [debug] ||| finished assert_screen tests (runtime: 1 s) [0m[1;34m[2022-02-03T16:07:28.782342Z] [debug] ||| starting shutdown tests/shutdown.pm [0m[2022-02-03T16:07:28.782961Z] [debug] tests/shutdown.pm:8 called testapi::type_string [2022-02-03T16:07:28.783123Z] [debug] <<< testapi::type_string(string="sudo su\n", max_interval=250, wait_screen_changes=0, wait_still_screen=0, timeout=30, similarity_level=47) [2022-02-03T16:07:29.059956Z] [debug] tests/shutdown.pm:9 called testapi::type_string [2022-02-03T16:07:29.060187Z] [debug] <<< testapi::type_string(string="poweroff\n", max_interval=250, wait_screen_changes=0, wait_still_screen=0, timeout=30, similarity_level=47) [2022-02-03T16:07:29.365904Z] [debug] tests/shutdown.pm:10 called testapi::assert_shutdown [2022-02-03T16:07:29.366066Z] [debug] <<< testapi::check_shutdown(timeout=90) [37m[2022-02-03T16:07:29.366966Z] [debug] QEMU status is not 'shutdown', it is 'running' [0m[37m[2022-02-03T16:07:30.368374Z] [debug] QEMU status is not 'shutdown', it is 'running' [0m[37m[2022-02-03T16:07:31.369858Z] [debug] QEMU status is not 'shutdown', it is 'running' [0m[37m[2022-02-03T16:07:32.371348Z] [debug] QEMU status is not 'shutdown', it is 'running' [0m[37m[2022-02-03T16:07:33.372743Z] [debug] QEMU status is not 'shutdown', it is 'running' [0m[37m[2022-02-03T16:07:34.373885Z] [debug] EVENT {"data":{"guest":true,"reason":"guest-shutdown"},"event":"SHUTDOWN","timestamp":{"microseconds":261986,"seconds":1643904454}} [0m[37m[2022-02-03T16:07:34.373977Z] [debug] EVENT {"event":"STOP","timestamp":{"microseconds":262632,"seconds":1643904454}} [0m[1;34m[2022-02-03T16:07:34.456451Z] [debug] ||| finished shutdown tests (runtime: 6 s) [0m[37m[2022-02-03T16:07:34.458685Z] [debug] stopping autotest process 2517 [0m[37m[2022-02-03T16:07:34.464505Z] [debug] [autotest] process exited: 0 [0m[37m[2022-02-03T16:07:34.565012Z] [debug] done with autotest process [0m[37m[2022-02-03T16:07:34.565636Z] [debug] stopping command server 2514 because test execution ended [0m[37m[2022-02-03T16:07:34.565748Z] [debug] isotovideo: informing websocket clients before stopping command server: http://127.0.0.1:20013/PHWjKijMCBhcl8mk/broadcast [0m[37m[2022-02-03T16:07:34.579909Z] [debug] commands process exited: 0 [0m[37m[2022-02-03T16:07:34.680350Z] [debug] done with command server [0m[37m[2022-02-03T16:07:34.680448Z] [debug] isotovideo done [0m[37m[2022-02-03T16:07:34.681403Z] [debug] backend shutdown state: 1 [0m[33m[2022-02-03T16:07:34.681857Z] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json [0m[37m[2022-02-03T16:07:35.691802Z] [debug] Passing remaining frames to the video encoder [0m[37m[2022-02-03T16:07:35.716400Z] [debug] Waiting for video encoder to finalize the video [0m[37m[2022-02-03T16:07:35.716484Z] [debug] The built-in video encoder (pid 2657) terminated [0m[37m[2022-02-03T16:07:35.716963Z] [debug] QEMU: qemu-system-i386: terminating on signal 15 from pid 2587 (/home/squamata/os-autoinst/isotovideo: backen) [0m[37m[2022-02-03T16:07:35.717841Z] [debug] sending magic and exit [0m[37m[2022-02-03T16:07:35.718080Z] [debug] received magic close [0m[37m[2022-02-03T16:07:35.732717Z] [debug] backend process exited: 0 [0m[37m[2022-02-03T16:07:35.833760Z] [debug] stopping backend process 2587 [0m[37m[2022-02-03T16:07:35.833835Z] [debug] done with backend process [0m2504: EXIT 0 [2022-02-03T16:07:35.852034Z] [info] [pid:2487] Isotovideo exit status: 0 [2022-02-03T16:07:35.882099Z] [info] [pid:2487] +++ worker notes +++ [2022-02-03T16:07:35.882341Z] [info] [pid:2487] End time: 2022-02-03 16:07:35 [2022-02-03T16:07:35.882437Z] [info] [pid:2487] Result: done [2022-02-03T16:07:35.887545Z] [info] [pid:2716] Uploading video.ogv [2022-02-03T16:07:35.921038Z] [info] [pid:2716] Uploading vars.json [2022-02-03T16:07:35.957560Z] [info] [pid:2716] Uploading autoinst-log.txt /tmp/awiY7cjgy8/full-stack.d/openqa/testresults/00000/00000006-tinycore-1-flavor-i386-Build1-core@coolone/serial0.txt: /tmp/awiY7cjgy8/full-stack.d/openqa/testresults/00000/00000006-tinycore-1-flavor-i386-Build1-core@coolone/serial_terminal.txt: /tmp/awiY7cjgy8/full-stack.d/openqa/testresults/00000/00000006-tinycore-1-flavor-i386-Build1-core@coolone/worker-log.txt: [2022-02-03T16:07:04.896231Z] [debug] [pid:2487] Preparing Mojo::IOLoop::ReadWriteProcess::Session [2022-02-03T16:07:04.897213Z] [debug] [pid:2487] Job settings: [2022-02-03T16:07:04.897376Z] [debug] [pid:2487] ARCH=i386 BUILD=1 CACHEDIRECTORY=/tmp/awiY7cjgy8/full-stack.d/cache CACHELIMIT=50 DISTRI=tinycore FLAVOR=flavor INTEGRATION_TESTS=1 ISO=Core-7.2.iso JOBTOKEN=PHWjKijMCBhcl8mk LOCAL_UPLOAD=0 LOG_LEVEL=debug MACHINE=coolone NAME=00000006-tinycore-1-flavor-i386-Build1-core@coolone OPENQA_HOSTNAME=localhost:9526 OPENQA_URL=http://localhost:9526 PRJDIR=/tmp/awiY7cjgy8/full-stack.d/openqa/share PUBLISH_HDD_1= QEMU=i386 QEMUPORT=20012 QEMU_NO_FDC_SET=1 QEMU_NO_KVM=1 QEMU_NO_TABLET=1 RETRY_DELAY=5 RETRY_DELAY_IF_WEBUI_BUSY=60 TEST=core UEFI_PFLASH_VARS=/usr/share/qemu/ovmf-x86_64.bin VERSION=1 VNC=91 WORKER_CLASS=qemu_i386,qemu_x86_64 WORKER_ID=1 WORKER_INSTANCE=1 [2022-02-03T16:07:04.910738Z] [debug] [pid:2487] Updating status so job 6 is not considered dead. [2022-02-03T16:07:04.911376Z] [debug] [pid:2487] REST-API call: POST http://localhost:9526/api/v1/jobs/6/status [2022-02-03T16:07:09.981437Z] [debug] [pid:2487] Updating status so job 6 is not considered dead. [2022-02-03T16:07:09.982041Z] [debug] [pid:2487] REST-API call: POST http://localhost:9526/api/v1/jobs/6/status [2022-02-03T16:07:10.041474Z] [debug] [pid:2487] Linked asset "/tmp/awiY7cjgy8/full-stack.d/cache/localhost/Core-7.2.iso" to "/tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/Core-7.2.iso" [2022-02-03T16:07:10.041919Z] [debug] [pid:2487] Symlinked from "/tmp/awiY7cjgy8/full-stack.d/openqa/share/tests/tinycore" to "/tmp/awiY7cjgy8/full-stack.d/openqa/pool/1/tinycore" [2022-02-03T16:07:10.042270Z] [info] [pid:2487] Preparing cgroup to start isotovideo [2022-02-03T16:07:10.045917Z] [warn] [pid:2487] Disabling cgroup usage because cgroup creation failed: mkdir /sys/fs/cgroup/systemd/docker: Permission denied at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/File.pm line 84. [2022-02-03T16:07:10.046093Z] [info] [pid:2487] You can define a custom slice with OPENQA_CGROUP_SLICE or indicating the base mount with MOJO_CGROUP_FS. [2022-02-03T16:07:10.046446Z] [info] [pid:2487] Starting isotovideo container [2022-02-03T16:07:10.049353Z] [debug] [pid:2487] Registered process:2504 [2022-02-03T16:07:10.049566Z] [info] [pid:2504] 2504: WORKING 6 [2022-02-03T16:07:10.049902Z] [info] [pid:2487] isotovideo has been started (PID: 2504) [2022-02-03T16:07:10.050549Z] [debug] [pid:2487] Running job 6 from http://localhost:9526: 00000006-tinycore-1-flavor-i386-Build1-core@coolone. [2022-02-03T16:07:10.053494Z] [debug] [pid:2487] REST-API call: POST http://localhost:9526/api/v1/jobs/6/status [2022-02-03T16:07:10.102369Z] [debug] [pid:2487] Upload concluded (no current module) [2022-02-03T16:07:20.110815Z] [debug] [pid:2487] REST-API call: POST http://localhost:9526/api/v1/jobs/6/status [2022-02-03T16:07:20.198628Z] [debug] [pid:2487] Upload concluded (at boot) [2022-02-03T16:07:30.200409Z] [debug] [pid:2487] REST-API call: POST http://localhost:9526/api/v1/jobs/6/status [2022-02-03T16:07:30.324345Z] [debug] [pid:2487] Upload concluded (up to boot) [2022-02-03T16:07:35.852430Z] [debug] [pid:2487] Stopping job 6 from http://localhost:9526: 00000006-tinycore-1-flavor-i386-Build1-core@coolone - reason: done [2022-02-03T16:07:35.853194Z] [debug] [pid:2487] REST-API call: POST http://localhost:9526/api/v1/jobs/6/status [2022-02-03T16:07:35.887363Z] [info] [pid:2716] Uploading video.ogv [2022-02-03T16:07:35.887876Z] [debug] [pid:2716] Uploading artefact video.ogv [2022-02-03T16:07:35.920868Z] [info] [pid:2716] Uploading vars.json [2022-02-03T16:07:35.921776Z] [debug] [pid:2716] Uploading artefact vars.json [2022-02-03T16:07:35.957447Z] [info] [pid:2716] Uploading autoinst-log.txt [2022-02-03T16:07:35.957739Z] [debug] [pid:2716] Uploading artefact autoinst-log.txt [2022-02-03T16:07:35.984632Z] [info] [pid:2716] Uploading worker-log.txt [2022-02-03T16:07:35.984885Z] [debug] [pid:2716] Uploading artefact worker-log.txt ok 36 - no (unexpected) warnings (via END block)
#55
Updated by mkittler over 1 year ago
- Status changed from Blocked to In Progress
I can't say what the problem in #note-53 was about because the logs of job 8 look fine. The test description is misleading, the problem could be anything within the cache sub-test. I've created a PR to improve this: https://github.com/os-autoinst/openQA/pull/4499
#56
Updated by openqa_review over 1 year ago
Setting due date based on mean cycle time of SUSE QE Tools
#57
Updated by mkittler over 1 year ago
The 2nd problem is luckily a bit easier. I've created https://github.com/os-autoinst/openQA/pull/4502 which should fix it.
#58
Updated by openqa_review over 1 year ago
Setting due date based on mean cycle time of SUSE QE Tools
#59
Updated by openqa_review over 1 year ago
Setting due date based on mean cycle time of SUSE QE Tools
#60
Updated by openqa_review over 1 year ago
Setting due date based on mean cycle time of SUSE QE Tools
#61
Updated by openqa_review over 1 year ago
Setting due date based on mean cycle time of SUSE QE Tools
#62
Updated by openqa_review over 1 year ago
Setting due date based on mean cycle time of SUSE QE Tools
#63
Updated by okurz over 1 year ago
mkittler please create according subtasks for the work on this epic
#64
Updated by mkittler over 1 year ago
- Status changed from In Progress to Feedback
The 2nd problem was quite obvious so I just created a PR. It has been merged so it should be done.
The 1st problem lacked information so I created a PR to generally improve the output in case of test failures. However, I couldn't reproduce the problem again in the CI (see https://github.com/os-autoinst/openQA/pull/4496) but luckily also not any further problems. So I'd just ignore it for now. If we encounter it again we can create a new ticket.
I'm setting the epic to feedback because I'm still waiting for the os-autoinst version to be updated (see #105429#note-18) to remove the retry.
#65
Updated by mkittler over 1 year ago
I could now reproduce a failure which is likely identical to #98952#note-53. So I created a new ticket (#106912) because this time we have the relevant logs (due to https://github.com/os-autoinst/openQA/pull/4499).
#66
Updated by mkittler over 1 year ago
- Status changed from Feedback to Blocked
#67
Updated by mkittler over 1 year ago
Let's give the whole thing a few more test runs (before removing the retry): https://github.com/os-autoinst/openQA/pull/4496
#68
Updated by mkittler over 1 year ago
Unfortunately the test runs revealed yet another occurrence: https://app.circleci.com/pipelines/github/os-autoinst/openQA/9082/workflows/e52c8126-88c9-4814-aa6e-e6152df202fd/jobs/86137/steps (PR: https://github.com/os-autoinst/openQA/pull/4496)
This time it is test 6 but it looks like the usual shutdown
problem:
# [2022-03-01T15:30:43.795768Z] [debug] QEMU status is not 'shutdown', it is 'running' # [2022-03-01T15:30:44.797156Z] [debug] QEMU status is not 'shutdown', it is 'running' # [2022-03-01T15:30:45.863300Z] [info] ::: basetest::runtest: # Test died: Machine didn't shut down! at tinycore/tests/shutdown.pm line 9. # # [2022-03-01T15:30:45.865418Z] [debug] ||| finished shutdown tests (runtime: 91 s) # [2022-03-01T15:30:45.865473Z] [debug] ||| post fail hooks runtime: 0 s
Unfortunately there's no video this time (so I only attached the detailed log). I'm not sure yet why that is. The history of my test commit https://github.com/os-autoinst/openQA/commits/90290d1d840dd7c9bc2d8180e640928e92fdad18 contains https://github.com/os-autoinst/openQA/commit/2850133d19f79ec54c1a1975d1f6fe7636f57d2a so it isn't just that I've forgotten to rebase.
#69
Updated by mkittler over 1 year ago
About the absent video: When rebasing my "test commit" I accidentally dropped some lines in the YAML config for the video upload. So it is just my mistake (and no general problem).
#70
Updated by mkittler about 1 year ago
Now I've got the video. I'll create a ticket about it. (ticket: #107941)
#71
Updated by okurz about 1 year ago
- Parent task set to #80142
#72
Updated by okurz 4 months ago
- Status changed from Blocked to Resolved
With https://github.com/os-autoinst/openQA/commit/832091b81f7fbdab599e6e0c71158697013b67ef t/full-stack.t is consistently stable again, otherwise we would notice from failing CI jobs.