action #96636
closed
static-check-containers is flaky on GHA size:M
Added by livdywan over 3 years ago.
Updated over 3 years ago.
Category:
Regressions/Crashes
Description
Observation
I've seen it fail twice now on unrelated PRs, neither of which touched container definitions.
Run make test-check-containers
tools/static_check_containers
# Static check of container/ci/Dockerfile
Unable to find image 'hadolint/hadolint:latest' locally
latest: Pulling from hadolint/hadolint
c0b7b9aa8e26: Pulling fs layer
c0b7b9aa8e26: Verifying Checksum
c0b7b9aa8e26: Download complete
c0b7b9aa8e26: Pull complete
Digest: sha256:a0e201515e9946d4b5bb04a44e4d498b6c5aefac99f1ad106e139a6ff63f6476
Status: Downloaded newer image for hadolint/hadolint:latest
# Static check of container/openqa/Dockerfile
# Static check of container/openqa_data/Dockerfile
# Static check of container/webui/Dockerfile
# Static check of container/webui/Dockerfile-lb
# Static check of container/worker/Dockerfile
make: *** [Makefile:291: test-check-containers] Error 127
Error: Process completed with exit code 2.
e.g. https://github.com/os-autoinst/openQA/runs/3262841890?check_suite_focus=true
Suggestion
- Increase verbosity of docker, e.g. -D
- Reproduce the error via tools/static_check_containers, which calls $cre run --rm -i $img < "$i" on each Dockerfile
- Description updated (diff)
- Description updated (diff)
- Priority changed from Normal to High
- Target version set to Ready
- Subject changed from static-check-containers is flaky on GHA to static-check-containers is flaky on GHA size:M
- Description updated (diff)
- Status changed from New to Workable
- Status changed from Workable to In Progress
- Assignee set to tinita
I was able to reproduce the error "reliably" with this code:
for i in container/*/Dockerfile*; do
    echo "# Static check of $i"
    rc=0
    for j in {1..200}; do
        echo "Loop $j"
        $cre -D --log-level=debug run --rm -i $img < "$i" || rc=$?
        if (( "$rc" != 0 )); then
            echo "Failed ($i) with rc=$rc"
            false
        fi
    done
    echo "# $i ok"
done
Output:
https://github.com/perlpunk/openQA/runs/3279348078?check_suite_focus=true
Loop 171
time="2021-08-09T10:46:52Z" level=debug msg="[hijack] End of stdin"
time="2021-08-09T10:46:53Z" level=debug msg="[hijack] End of stdout"
Loop 172
time="2021-08-09T10:46:53Z" level=debug msg="[hijack] End of stdin"
time="2021-08-09T10:46:53Z" level=debug msg="[hijack] End of stdout"
Failed (container/webui/Dockerfile-lb) with rc=127
make: *** [Makefile:291: test-check-containers] Error 1
Error: Process completed with exit code 2.
So none of the debug switches are revealing anything.
Running
$cre -D --log-level=debug run --rm --entrypoint /bin/echo alpine:latest $i
instead worked without errors.
I was able to reproduce it locally:
Loop 47
DEBU[0000] [hijack] End of stdout
Failed (container/worker/Dockerfile) with rc=127
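For context on the rc=127 above: a POSIX shell reports 127 when a command cannot be found, and docker run documents 127 for a contained command that cannot be found, while otherwise passing through the container's own exit status. The logs here do not show which of these applies. A minimal sketch of the shell-side convention only, using a deliberately missing command (the name `definitely_not_a_command_96636` is made up for illustration):

```shell
#!/bin/sh
# Run a command that does not exist in a child shell and capture its status.
# The command name is a made-up placeholder, not something from the repo.
rc=0
sh -c 'definitely_not_a_command_96636' 2>/dev/null || rc=$?
echo "rc=$rc"
```

This only demonstrates the convention; it does not tell us whether the 127 here comes from hadolint, the container runtime, or something else.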
- Due date set to 2021-08-24
Setting due date based on mean cycle time of SUSE QE Tools
- Status changed from In Progress to Resolved
I wasn't able to find out if there was an earlier hadolint version that didn't have this problem.
It just takes too many iterations until this occurs for me locally.
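Since the failure only shows up after many iterations, counting failures over a fixed number of runs (instead of stopping at the first one) might make it easier to compare image versions. A rough sketch, assuming a hypothetical helper `count_failures` that is not part of tools/static_check_containers; the `flaky` function is a stand-in for the real check so the example runs without a container runtime:

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]
# Runs CMD N times and prints how many runs exited non-zero.
# Hypothetical helper; uses plain global variables to stay POSIX sh compatible.
count_failures() {
    n=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Stand-in for the real check: fails on every third call.
calls=0
flaky() {
    calls=$((calls + 1))
    [ $((calls % 3)) -ne 0 ]
}

result=$(count_failures 9 flaky)
echo "failures: $result of 9"
```

For the real check, the command passed in would need to reopen the Dockerfile on every run, e.g. a small wrapper function (hypothetical name `lint_once`) that runs `$cre run --rm -i $img < "$1"`, called as `count_failures 200 lint_once container/worker/Dockerfile`. Redirecting stdin once around the whole loop would leave later iterations reading from an already consumed file.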