action #96636

closed

static-check-containers is flaky on GHA size:M

Added by livdywan over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2021-08-06
Due date: 2021-08-24
% Done: 0%
Estimated time:

Description

Observation

I've seen it fail twice now on unrelated PRs, neither of which touched container definitions.

Run make test-check-containers
tools/static_check_containers
# Static check of container/ci/Dockerfile
Unable to find image 'hadolint/hadolint:latest' locally
latest: Pulling from hadolint/hadolint
c0b7b9aa8e26: Pulling fs layer
c0b7b9aa8e26: Verifying Checksum
c0b7b9aa8e26: Download complete
c0b7b9aa8e26: Pull complete
Digest: sha256:a0e201515e9946d4b5bb04a44e4d498b6c5aefac99f1ad106e139a6ff63f6476
Status: Downloaded newer image for hadolint/hadolint:latest
# Static check of container/openqa/Dockerfile
# Static check of container/openqa_data/Dockerfile
# Static check of container/webui/Dockerfile
# Static check of container/webui/Dockerfile-lb
# Static check of container/worker/Dockerfile
make: *** [Makefile:291: test-check-containers] Error 127
Error: Process completed with exit code 2.

e.g. https://github.com/os-autoinst/openQA/runs/3262841890?check_suite_focus=true

Suggestion

  • Increase the verbosity of docker, e.g. with -D
  • Reproduce the error via tools/static_check_containers, which calls $cre run --rm -i $img < "$i" on each Dockerfile
Actions #1

Updated by livdywan over 2 years ago

  • Description updated (diff)
Actions #2

Updated by tinita over 2 years ago

  • Description updated (diff)
Actions #3

Updated by tinita over 2 years ago

  • Priority changed from Normal to High
  • Target version set to Ready
Actions #4

Updated by livdywan over 2 years ago

  • Subject changed from static-check-containers is flaky on GHA to static-check-containers is flaky on GHA size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #5

Updated by tinita over 2 years ago

  • Status changed from Workable to In Progress
  • Assignee set to tinita
Actions #6

Updated by tinita over 2 years ago

I was able to reproduce the error "reliably" with this code:

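# $cre and $img come from tools/static_check_containers
for i in container/*/Dockerfile*; do
    echo "# Static check of $i"
    for j in {1..200}; do
        echo "Loop $j"
        rc=0  # reset per iteration so each run is judged on its own
        $cre -D --log-level=debug run --rm -i "$img" < "$i" || rc=$?
        if (( rc != 0 )); then
            echo "Failed ($i) with rc=$rc"
            exit 1  # stop at the first failure, even without set -e
        fi
    done
    echo "# $i ok"
done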

Output:
https://github.com/perlpunk/openQA/runs/3279348078?check_suite_focus=true

Loop 171
time="2021-08-09T10:46:52Z" level=debug msg="[hijack] End of stdin"
time="2021-08-09T10:46:53Z" level=debug msg="[hijack] End of stdout"
Loop 172
time="2021-08-09T10:46:53Z" level=debug msg="[hijack] End of stdin"
time="2021-08-09T10:46:53Z" level=debug msg="[hijack] End of stdout"
Failed (container/webui/Dockerfile-lb) with rc=127
make: *** [Makefile:291: test-check-containers] Error 1
Error: Process completed with exit code 2.

So none of the debug switches reveal anything about the cause.
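For reading the rc=127 above: docker run documents three reserved exit statuses (125 for an error in the runtime itself, 126 when the contained command cannot be invoked, 127 when the contained command cannot be found); anything else is the exit code of the contained command. A minimal sketch of that mapping follows. classify_rc is a hypothetical helper, not part of tools/static_check_containers, and it does not explain why the failure is intermittent.

```shell
# classify_rc is a hypothetical helper (an assumption for illustration).
# It maps an exit status of "docker run" to the meaning documented for it.
classify_rc() {
    case "$1" in
        0)   echo "ok" ;;
        125) echo "container runtime error" ;;
        126) echo "contained command could not be invoked" ;;
        127) echo "contained command not found" ;;
        *)   echo "exit code of the contained command" ;;
    esac
}
```

In the reproduction loop it could be used as echo "Failed ($i) with rc=$rc: $(classify_rc "$rc")".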

Actions #7

Updated by tinita over 2 years ago

Running

$cre -D --log-level=debug run --rm --entrypoint /bin/echo alpine:latest $i

instead (an alpine image with echo as the entrypoint and nothing fed on stdin) worked without errors.

Actions #8

Updated by tinita over 2 years ago

I was able to reproduce it locally:

Loop 47
DEBU[0000] [hijack] End of stdout                       
Failed (container/worker/Dockerfile) with rc=127

Actions #10

Updated by tinita over 2 years ago

PR: https://github.com/os-autoinst/openQA/pull/4113

I'm trying out older hadolint images to see if there is a difference.

Actions #11

Updated by openqa_review over 2 years ago

  • Due date set to 2021-08-24

Setting due date based on mean cycle time of SUSE QE Tools

Actions #12

Updated by tinita over 2 years ago

  • Status changed from In Progress to Resolved

PR was merged

Actions #13

Updated by tinita over 2 years ago

I wasn't able to find out whether an earlier hadolint version was free of this problem.
It simply takes too many iterations for the failure to occur locally.
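For anyone who wants to continue comparing images, here is a rough sketch of how the iteration count could be measured per image. first_failure and check are hypothetical names, and which image tags to try is left open; only docker run --rm -i and the Dockerfile paths come from this ticket.

```shell
# first_failure MAX CMD...  (hypothetical helper)
# Runs CMD up to MAX times. Prints "<iteration> <rc>" for the first failing
# run and returns 1, or prints "0 0" and returns 0 if no run failed.
# Output of CMD itself is sent to stderr so stdout carries only the result.
first_failure() {
    max=$1
    shift
    j=1
    while [ "$j" -le "$max" ]; do
        rc=0
        "$@" >&2 || rc=$?
        if [ "$rc" -ne 0 ]; then
            echo "$j $rc"
            return 1
        fi
        j=$((j + 1))
    done
    echo "0 0"
    return 0
}

# check IMAGE DOCKERFILE  (hypothetical wrapper)
# Re-opens the Dockerfile on every call, because stdin is consumed per run.
check() {
    docker run --rm -i "$1" < "$2"
}
```

A call such as first_failure 200 check "$img" container/worker/Dockerfile then reports after how many runs a given image first failed. Since the failure is rare, a "0 0" result only says it did not occur within that many runs.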
