action #96636
closed
static-check-containers is flaky on GHA size:M
Description
Observation
I've seen it fail twice now on unrelated PRs, neither of which touched container definitions.
Run make test-check-containers
tools/static_check_containers
# Static check of container/ci/Dockerfile
Unable to find image 'hadolint/hadolint:latest' locally
latest: Pulling from hadolint/hadolint
c0b7b9aa8e26: Pulling fs layer
c0b7b9aa8e26: Verifying Checksum
c0b7b9aa8e26: Download complete
c0b7b9aa8e26: Pull complete
Digest: sha256:a0e201515e9946d4b5bb04a44e4d498b6c5aefac99f1ad106e139a6ff63f6476
Status: Downloaded newer image for hadolint/hadolint:latest
# Static check of container/openqa/Dockerfile
# Static check of container/openqa_data/Dockerfile
# Static check of container/webui/Dockerfile
# Static check of container/webui/Dockerfile-lb
# Static check of container/worker/Dockerfile
make: *** [Makefile:291: test-check-containers] Error 127
Error: Process completed with exit code 2.
e.g. https://github.com/os-autoinst/openQA/runs/3262841890?check_suite_focus=true
Suggestion
- Increase verbosity of docker, e.g. -D
- Reproduce the error via tools/static_check_containers, which calls $cre run --rm -i $img < "$i" on each Dockerfile
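As an aid for reading the "Error 127" above: Docker's documentation for docker run reserves a few exit codes for the runtime itself, and everything else is the exit code of the command inside the container. The helper below is only a sketch of that mapping. The name describe_run_rc is hypothetical, and whether the 127 seen here really comes from that path is not established in this ticket (a shell also returns 127 when it cannot find a command).

```shell
# describe_run_rc RC
# Hypothetical helper: maps an exit code from "docker run" to the meaning
# given in Docker's documentation for reserved codes.
describe_run_rc() {
    case "$1" in
        0)   echo "success" ;;
        125) echo "container engine error (docker run itself failed)" ;;
        126) echo "contained command could not be invoked" ;;
        127) echo "contained command could not be found" ;;
        *)   echo "exit code of the contained command" ;;
    esac
}
```

For example, describe_run_rc "$rc" could be called right after a failing $cre run to annotate the log.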
Updated by tinita over 3 years ago
- Priority changed from Normal to High
- Target version set to Ready
Updated by livdywan over 3 years ago
- Subject changed from static-check-containers is flaky on GHA to static-check-containers is flaky on GHA size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by tinita over 3 years ago
- Status changed from Workable to In Progress
- Assignee set to tinita
Updated by tinita over 3 years ago
I was able to reproduce the error "reliably" with this code:
for i in container/*/Dockerfile*; do
    echo "# Static check of $i"
    rc=0
    for j in {1..200}; do
        echo "Loop $j"
        $cre -D --log-level=debug run --rm -i $img < "$i" || rc=$?
        if (( "$rc" != 0 )); then
            echo "Failed ($i) with rc=$rc"
            false
        fi
    done
    echo "# $i ok"
done
Output:
https://github.com/perlpunk/openQA/runs/3279348078?check_suite_focus=true
Loop 171
time="2021-08-09T10:46:52Z" level=debug msg="[hijack] End of stdin"
time="2021-08-09T10:46:53Z" level=debug msg="[hijack] End of stdout"
Loop 172
time="2021-08-09T10:46:53Z" level=debug msg="[hijack] End of stdin"
time="2021-08-09T10:46:53Z" level=debug msg="[hijack] End of stdout"
Failed (container/webui/Dockerfile-lb) with rc=127
make: *** [Makefile:291: test-check-containers] Error 1
Error: Process completed with exit code 2.
So none of the debug switches are revealing anything.
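Since the debug output is interleaved with 200 passing iterations, it might help to keep only the complete output of the failing run. A minimal sketch, assuming the same $cre and $img as above; the function name repeat_until_fail and its arguments are hypothetical and not part of tools/static_check_containers.

```shell
# repeat_until_fail MAX LOGDIR INPUT CMD [ARGS...]
# Runs CMD with INPUT on stdin up to MAX times. Logs of passing runs are
# deleted; the first failing run stops the loop and its log is kept.
repeat_until_fail() {
    local max="$1" logdir="$2" input="$3" j=1 rc log
    shift 3
    mkdir -p "$logdir"
    while [ "$j" -le "$max" ]; do
        log="$logdir/run-$j.log"
        rc=0
        # stdin is reopened for every iteration so each run sees the full file
        "$@" < "$input" > "$log" 2>&1 || rc=$?
        if [ "$rc" -ne 0 ]; then
            echo "Loop $j failed with rc=$rc, output kept in $log"
            return "$rc"
        fi
        rm -f "$log"
        j=$((j + 1))
    done
    echo "All $max runs passed"
}
```

Possible usage inside the outer loop: repeat_until_fail 200 /tmp/hadolint-logs "$i" $cre -D --log-level=debug run --rm -i $img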
Updated by tinita over 3 years ago
Running
$cre -D --log-level=debug run --rm --entrypoint /bin/echo alpine:latest $i
instead worked without errors
Updated by tinita over 3 years ago
I was able to reproduce it locally:
Loop 47
DEBU[0000] [hijack] End of stdout
Failed (container/worker/Dockerfile) with rc=127
Updated by tinita over 3 years ago
PR: https://github.com/os-autoinst/openQA/pull/4113
I'm trying out older hadolint images to see if there is a difference.
Updated by openqa_review over 3 years ago
- Due date set to 2021-08-24
Setting due date based on mean cycle time of SUSE QE Tools
Updated by tinita over 3 years ago
I wasn't able to find out if there was an earlier hadolint version that didn't have this problem.
It just takes too many iterations until this occurs for me locally.
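If someone wants to pick up the version comparison, it could be left running unattended. This is a rough sketch under assumptions: compare_images is a hypothetical name, the image references are passed in by the caller (no specific hadolint tags are implied here), and defaulting $cre to docker is a guess based on the log output above.

```shell
# Container runtime; the ticket uses $cre, the docker default is an assumption.
cre="${cre:-docker}"

# compare_images RUNS DOCKERFILE IMAGE...
# Feeds DOCKERFILE to each IMAGE on stdin RUNS times and prints how many
# runs exited non-zero per image.
compare_images() {
    local runs="$1" file="$2" img i failures
    shift 2
    for img in "$@"; do
        failures=0
        i=1
        while [ "$i" -le "$runs" ]; do
            # Any non-zero exit counts as a failure, as in the loop above.
            "$cre" run --rm -i "$img" < "$file" > /dev/null 2>&1 || failures=$((failures + 1))
            i=$((i + 1))
        done
        echo "$img: $failures/$runs failed"
    done
}
```

Note that a genuine lint finding would also be counted as a failure, so this only makes sense for Dockerfiles that normally pass the check.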