action #95374
closed
All pull request openQA CI checks fail now in "webui-docker-compose": "Container is unhealthy" size:M
Description
Observation
All openQA pull request CI checks now fail in "webui-docker-compose", e.g. see
https://github.com/os-autoinst/openQA/pull/4032/checks?check_run_id=3044565112#step:3:2083
ERROR: for webui Container "3dc2c3dbe671" is unhealthy.
Encountered errors while bringing up the project.
Error executing a job
Steps to reproduce
DONE: Check whether this reproduces locally -> reproducible, confirmed by ilausuch
make test-containers-compose
or find failed examples in any open openQA pull request
Expected result
- According to https://github.com/os-autoinst/openQA/actions?query=workflow%3Acompose+branch%3Amaster the last good run was 4 days ago; last successful run: https://github.com/os-autoinst/openQA/runs/3020712220?check_suite_focus=true
Problem
The "last good" PR seems to be https://github.com/os-autoinst/openQA/pull/3987 . Maybe it introduced a problem that only took effect after the merge, once our package in devel:openQA was built, so that all subsequent tests were broken.
Suggestions
- DONE: Try to reproduce locally -> ilausuch confirmed that it is reproducible
- If not, then reproduce in CI
- Check whether reverting https://github.com/os-autoinst/openQA/pull/3987 helps. If it does not, try using an older version of the packages as the baseline in the docker compose test, if that is applicable at all
Impact
All PR checks fail, blocking clean merges of new changes.
Updated by okurz over 3 years ago
- Related to action #92092: containers: openQA test eventually fails because of timeouts added
Updated by ilausuch over 3 years ago
- Status changed from New to In Progress
- Assignee set to ilausuch
Updated by okurz over 3 years ago
- Subject changed from All pull request openQA CI checks fail now in "webui-docker-compose": "Container is unhealthy" to All pull request openQA CI checks fail now in "webui-docker-compose": "Container is unhealthy" size:M
- Description updated (diff)
Updated by okurz over 3 years ago
As dcermak is unavailable for some time, and because I suspect that this ticket might be caused by https://github.com/os-autoinst/openQA/pull/3987 as well, I merged the revert PR https://github.com/os-autoinst/openQA/pull/4033
Updated by ilausuch over 3 years ago
Reason for the failure:
webui_db_init_1 | /root/run_openqa.sh: line 6: This: command not found
Updated by ilausuch over 3 years ago
Continuing the investigation: the line that fails in run_openqa.sh is
webui_db_init_1 | ++ su geekotest -c 'PGPASSWORD=openqa psql -h db -U openqa --list | grep -qe openqa'
webui_db_init_1 | + This account is currently not available.
webui_db_init_1 | /root/run_openqa.sh: line 7: This: command not found
webui_db_init_1 | + sleep .1
Updated by okurz over 3 years ago
- Related to action #95296: openQA-in-openQA container tests fail with "/root/run_openqa.sh: line 6: This: command not found" added
Updated by ilausuch over 3 years ago
The problem is related to the geekotest user.
I removed the psql call that runs as geekotest and it worked, but it then fails at the next command executed as this user:
webui_db_init_1 | + su geekotest -c /usr/share/openqa/script/openqa-webui-daemon
webui_db_init_1 | This account is currently not available.
Updated by ilausuch over 3 years ago
We cannot use su to run commands as a non-login user:
webui_db_init_1 | geekotest:x:479:479:openQA user:/dev/null:/sbin/nologin
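To illustrate why the passwd entry above leads to "This account is currently not available.", here is a minimal sketch that inspects the login shell field. The helper names `passwd_shell` and `is_nologin_shell` are hypothetical and not part of openQA. The `su -s` hint is only an assumption about a possible workaround when the script runs as root with util-linux `su`; it is not what was done in this ticket, where the change was reverted instead.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of openQA.

# Print the login shell (7th colon-separated field) of a passwd-style entry.
passwd_shell() {
    printf '%s\n' "$1" | cut -d: -f7
}

# Succeed if the given shell refuses to run commands (nologin or false).
is_nologin_shell() {
    case "$1" in
        */nologin | */false) return 0 ;;
        *) return 1 ;;
    esac
}

# The entry quoted in the comment above.
entry='geekotest:x:479:479:openQA user:/dev/null:/sbin/nologin'
shell=$(passwd_shell "$entry")

if is_nologin_shell "$shell"; then
    # su hands the command to the account's login shell, so nologin prints
    # "This account is currently not available." instead of running it.
    echo "login shell is $shell; root could try: su -s /bin/sh geekotest -c '<command>'"
fi
```

Because `su geekotest -c ...` prints the nologin message on stdout, a script that captures and evaluates that output would then try to run `This` as a command, which matches the "This: command not found" error seen earlier.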
Updated by ilausuch over 3 years ago
I created this PR to provide more information when docker-compose up fails:
https://github.com/os-autoinst/openQA/pull/4038
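As a rough idea of what "more information on failure" can look like, the following sketch dumps container states and logs when bringing up the services fails. It is not taken from the PR; the function name `up_or_dump_logs` and the `COMPOSE` variable are hypothetical, and only the standard `up -d`, `ps` and `logs --no-color` subcommands are assumed.

```shell
#!/bin/sh
# Sketch only: up_or_dump_logs and COMPOSE are hypothetical names.

# Start the services; if that fails, print container states and logs
# before returning a non-zero status so the CI step still fails.
up_or_dump_logs() {
    # Left unquoted on purpose so that COMPOSE="docker compose" also works.
    compose=${COMPOSE:-docker-compose}
    if $compose up -d; then
        return 0
    fi
    $compose ps || true
    $compose logs --no-color || true
    return 1
}

# Usage in a CI step (not executed here):
#   up_or_dump_logs || exit 1
```

With output like this in the CI log, the "This: command not found" line from webui_db_init_1 would be visible directly in the failed check instead of only "Container is unhealthy".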
Updated by okurz over 3 years ago
https://github.com/os-autoinst/openQA/runs/3045779927?check_suite_focus=true shows success so it seems the revert helped.
Updated by ilausuch over 3 years ago
Thanks.
This could help in the future:
https://github.com/os-autoinst/openQA/pull/4039
Updated by okurz over 3 years ago
- Priority changed from Urgent to Normal
Multiple PRs have now been merged after I retriggered some failed jobs, which supports my hypothesis.
@ilausuch I consider the ticket done as we defined it originally. You can still keep it and try to find improvements, but we can definitely reduce the priority now.
Updated by openqa_review over 3 years ago
- Due date set to 2021-07-27
Setting due date based on mean cycle time of SUSE QE Tools
Updated by okurz over 3 years ago
- Copied to action #95437: The "webui-docker-compose" CI check should fail if the package is impacted by the PR itself in a harmful way added
Updated by ilausuch over 3 years ago
- Status changed from In Progress to Resolved
okurz wrote:
Multiple PRs have now been merged after I retriggered some failed jobs, which supports my hypothesis.
@ilausuch I consider the ticket done as we defined it originally. You can still keep it and try to find improvements, but we can definitely reduce the priority now.
Yes, let's close this ticket. Similar situations may come up in the future, but for other reasons.