coordination #55364: [epic] Let's make codecov reports reliable - openQA Project (public) - openSUSE Project Management Tool

Actions

coordination #55364

closed

coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes

[epic] Let's make codecov reports reliable

Added by okurz over 5 years ago. Updated over 3 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

Category:

Organisational

Target version:

Start date:

2020-09-24

Due date:

% Done:

100%

Estimated time:

(Total: 0.00 h)

Description

Observation¶

codecov reports often report about coverage changes which are obviously not related to the actual changes of a PR, e.g. when documentation is changed or as in https://github.com/os-autoinst/openQA/pull/2253#issuecomment-520042072 . It seems some of our tests – maybe the full stack test – introduced a flakyness in coverage.

Suggestions¶

We should try to make the reports more reliable and hence less annoying with big red color when it is not helpful.
Let's try to collect our observations here:

https://github.com/os-autoinst/openQA/pull/2253#issuecomment-520042072
- lib/OpenQA/Worker/Settings.pm 84.9% (-13.21%) arrow_down
- ~~lib/OpenQA/Worker/WebUIConnection.pm 84.42% (+2.51%) arrow_up~~ -> https://github.com/os-autoinst/openQA/pull/3718
https://github.com/os-autoinst/openQA/pull/2135#issuecomment-506641541
- lib/OpenQA/WebSockets.pm 88.4% (-5.8%) arrow_down
- lib/OpenQA/Scheduler/Model/Jobs.pm 86.11% (-2.78%) arrow_down
- lib/OpenQA/Worker/Commands.pm 81.17% (-2.36%) arrow_down
- lib/OpenQA/Utils.pm 95.81% (-0.2%) arrow_down
https://github.com/os-autoinst/openQA/pull/2161#issuecomment-513477665
- lib/OpenQA/Setup.pm 90.83% (ø) arrow_up
- lib/OpenQA/Worker/Settings.pm 84.31% (-13.73%) arrow_down
- lib/OpenQA/Worker/Engines/isotovideo.pm 94.66% (+0.48%) arrow_up
https://github.com/os-autoinst/openQA/pull/2161#issuecomment-513477665
- lib/OpenQA/Setup.pm 90.83% (ø) arrow_up
- lib/OpenQA/Worker/Settings.pm 84.31% (-13.73%) arrow_down
- lib/OpenQA/Worker/Engines/isotovideo.pm 94.66% (+0.48%) arrow_up
https://codecov.io/gh/os-autoinst/openQA/pull/2258/changes
https://github.com/os-autoinst/openQA/pull/2920#issuecomment-611509627
- OpenQA::Schema::Result::Jobs::done is only invoked in some testruns with $args{newbuild}
https://codecov.io/gh/os-autoinst/openQA/pull/3276/changes from https://github.com/os-autoinst/openQA/pull/3276#issuecomment-694707108
- t/lib/OpenQA/SeleniumTest.pm:371
- t/34-developer_mode-unit.t:100-102

Other ideas for improving¶

Mark individual sections as required to have 100% code coverage (how to do that?)
Aim for 100% statement coverage at least
Include coverage for javascript
Check coverage for perl template code

Subtasks 7 (0 open — 7 closed)

Related issues 2 (0 open — 2 closed)

Actions

#1

Updated by okurz over 5 years ago

Description updated (diff)

Actions

#2

Updated by okurz over 5 years ago

Description updated (diff)

Actions

#3

Updated by okurz over 5 years ago

It also seems like we are getting duplicate codecov reports which I understand can happen if both an old bot are activated along with the github app? As the bot runs under the account "coolo" we can look into this with him.

Actions

#4

Updated by okurz over 5 years ago

Copied to action #55643: Prevent duplicate codecov reports added

Actions

#5

Updated by andriinikitin over 5 years ago

I have strong impression that this happens when PR is not rebased to latest master.
I.e. codecov compares coverage stats in PR with current coverage on master, which will look incorrect if base of PR is behind several commits

Actions

#6

Updated by okurz over 5 years ago

yes, I thought the same but it still seems as if the "usual suspects" crop up, e.g. lib/OpenQA/Worker/Settings.pm even though we hardly ever touch it.

Actions

#7

Updated by okurz over 5 years ago

Related to action #57050: Turn off travis added

Actions

#8

Updated by okurz over 5 years ago

@andriinikitin I assume this is closely related to your work on circle CI setup. Can you comment on what are your recent experiences regarding codecov reliability?

Actions

#9

Updated by andriinikitin over 5 years ago

okurz wrote:

@andriinikitin I assume this is closely related to your work on circle CI setup. Can you comment on what are your recent experiences regarding codecov reliability?

I believe few lines are unstable, but overall I don't see it as a big problem.

Actions

#10

Updated by okurz about 5 years ago

Status changed from New to Resolved
Assignee set to andriinikitin

@andriinikitin thanks for your assessment. I personally trust coverage reports for openQA as well as os-autoinst lately so I guess we can call this done, probably mainly thanks to andriinikitin, hence assigning to him.

Actions

#11

Updated by okurz about 5 years ago

Status changed from Resolved to Workable
Assignee deleted (~~andriinikitin~~)

Actually I found out that recent pull requests seem to report often the same flaky, changing coverage diff, e.g. https://github.com/os-autoinst/openQA/pull/2782#issuecomment-591963100 :

lib/OpenQA/Worker/Settings.pm 84.9% (-13.21%)
lib/OpenQA/Utils.pm 96.84% (-0.23%)

-> https://codecov.io/gh/os-autoinst/openQA/pull/2782/changes

https://github.com/os-autoinst/openQA/pull/2783#issuecomment-592008196 :

lib/OpenQA/Worker/Settings.pm 84.9% (-13.21%)
lib/OpenQA/Scheduler/Model/Jobs.pm 91.49% (+2.83%)

-> https://codecov.io/gh/os-autoinst/openQA/pull/2783/changes

that is
https://github.com/os-autoinst/openQA/blob/master/lib/OpenQA/Worker/Settings.pm#L100
which should have been "uncoverable".

and https://github.com/os-autoinst/openQA/blob/master/lib/OpenQA/Scheduler/Model/Jobs.pm#L305 and following.

and https://github.com/os-autoinst/openQA/blob/master/lib/OpenQA/Utils.pm#L807

I guess for the latest at least it depends on the full stack test if that part is actually executed based on timing behaviour. Probably a more level or unit test to cover the lines explicitly can help.

Actions

#12

Updated by okurz about 5 years ago

Status changed from Workable to In Progress
Assignee set to okurz

https://github.com/os-autoinst/openQA/pull/2813 to cover https://codecov.io/gh/os-autoinst/openQA/pull/2765/changes about lib/OpenQA/Worker/Job.pm:315 which wasn't even found by me earlier.

Actions

#13

Updated by okurz about 5 years ago

https://github.com/os-autoinst/openQA/pull/2826 to cover uncovered parts of _pick_siblings_of_running in lib/OpenQA/Scheduler/Model/Jobs.pm

Actions

#14

Updated by okurz about 5 years ago

Status changed from In Progress to Feedback

PR merged, next stop: https://codecov.io/gh/os-autoinst/openQA/pull/2811/changes with lib/OpenQA/Worker/WebUIConnection.pm and lib/OpenQA/WebAPI/Controller/API/V1/Job.pm

But I could also wait for more results from next pull requests.

Actions

#15

Updated by andriinikitin about 5 years ago

Not sure if it is obvious, but will mention anyway: my understanding is that there is some race condition on end of some of test(s), when Worker shutdown doesn't always happen in time, so codecov stats don't always reported before test ends.

Actions

#16

Updated by okurz about 5 years ago

yes, good observation. What you mentioned is one option but I think the full stack tests that currently shows the coverage problems is doing the right thing on top level and actually pretty stable. On lower level we have different code paths that are there to provide robustness for different timing and network behaviour so I think we are better off covering these code branches with explicit low-level tests. As an alternative we could also run the full stack tests "often enough", e.g. with make variables STABILITY_TEST=1 RETRY=3 in the hope we at least once cover all lines but that still seems risky.

https://github.com/os-autoinst/openQA/pull/2815 is another PR, merged.

https://github.com/os-autoinst/openQA/pull/2835 was another PR for more coverage that is now merged. Let's see what the next PRs will tell.

Actions

#17

Updated by mkittler about 5 years ago

Description updated (diff)

Actions

#18

Updated by livdywan almost 5 years ago

https://github.com/os-autoinst/openQA/pull/2920#issuecomment-611509627

OpenQA::Schema::Result::Jobs::done is only invoked in some testruns with $args{newbuild}

I gather I'm currently improving coverage in that area.

Actions

#19

Updated by okurz over 4 years ago

Status changed from Feedback to Workable
Assignee deleted (~~okurz~~)
Target version set to Ready

The unreliability has decreased but there are still some areas that seem to report different results from time to time, e.g. see https://github.com/os-autoinst/openQA/pull/3142#issuecomment-656651494 mentioning lib/OpenQA/WebAPI/Plugin/Helpers.pm despite the PR having clearly no related changes. https://github.com/os-autoinst/openQA/pull/2897 is still the pending change planned to increase coverage.

Actions

#20

Updated by okurz over 4 years ago

Description updated (diff)

Actions

#21

Updated by okurz over 4 years ago

Description updated (diff)

Actions

#22

Updated by mkittler over 4 years ago

Seems like https://github.com/os-autoinst/openQA/pull/3399 didn't make this easier. Now we also see changes in code coverage within the test code itself, e.g. https://codecov.io/gh/os-autoinst/openQA/pull/3423/changes.

Loops like

    for (0 .. 5) {
        $ret = $driver->find_element($element)->get_text();
        return $ret if &$test_break($ret);
        sleep .1;
    }

within our test code are hard to make stable. The sleep may or may not be executed depending on the timing.

Actions

#23

Updated by okurz over 4 years ago

I do not think the inclusion of test code is a problem but actually helps. We should be able to specify individual lines with "uncoverable statements" and introduce reliable coverage for all other places. And unreliable test code can likely be linked to unreliable production code coverage.

Actions

#24

Updated by okurz over 4 years ago

Subject changed from Let's make codecov reports reliable to [epic] Let's make codecov reports reliable
Status changed from Workable to Blocked
Assignee set to okurz
Difficulty set to hard

Forced cdywan to discuss this with me ;) We agreed that it's actually not "Workable" for him and I should rework this, e.g. as an epic. There is already a subtask. That is not really the only task but we are not progressing anyway so I will block this epic based on the subtask.

Actions

#25

Updated by szarate over 4 years ago

Tracker changed from action to coordination
Status changed from Blocked to New
Difficulty deleted (~~hard~~)

Actions

#26

Updated by szarate over 4 years ago

See for the reason of tracker change: http://mailman.suse.de/mailman/private/qa-sle/2020-October/002722.html

Actions

#27

Updated by okurz over 4 years ago

Status changed from New to Blocked
Difficulty set to hard

Actions

#28

Updated by okurz over 4 years ago

I have looked into recent codecov reports on https://app.codecov.io/gh/os-autoinst/openQA/pulls?page=1&state=open&order=-pullid and often I found two files with unreliable coverage results: t/05-scheduler-full.t and t/lib/OpenQA/Test/Utils.pm , created two new subtasks.

There is also https://app.codecov.io/gh/os-autoinst/openQA/compare/3555/changes showing uncovered lines in t/lib/OpenQA/SeleniumTest.pm still which I addressed in #78043

https://codecov.io/gh/os-autoinst/openQA/pull/3467/changes shows in lib/OpenQA/WebSockets.pm

97        = "Unable to assign job to worker $worker_id: the worker has not established a ws connection";
98      return $result;

Actions

#29

Updated by okurz over 4 years ago

Estimated time set to 39719.00 h

Actions

#30

Updated by okurz over 4 years ago

Estimated time deleted (~~39719.00 h~~)

Actions

#31

Updated by okurz about 4 years ago

Description updated (diff)

Actions

#32

Updated by okurz almost 4 years ago

Parent task set to #80142

Actions

#33

Updated by okurz over 3 years ago

Status changed from Blocked to Resolved

I think we finally managed – at least for now – to have consistent, trustworthy codecov reports \o/

Actions

Also available in: Atom PDF