coordination #55364
closed
coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes
[epic] Let's make codecov reports reliable
Added by okurz over 4 years ago.
Updated over 2 years ago.
Description
Observation
codecov often reports coverage changes that are obviously unrelated to the actual changes of a PR, e.g. when only documentation is changed or as in https://github.com/os-autoinst/openQA/pull/2253#issuecomment-520042072 . It seems some of our tests, maybe the full stack test, introduce flakiness in coverage.
Suggestions
We should try to make the reports more reliable and hence less annoying, so that they do not show up in big red color when that is not helpful.
Let's try to collect our observations here:
Other ideas for improving
- Mark individual sections as required to have 100% code coverage (how to do that?)
- Aim for 100% statement coverage at least
- Include coverage for JavaScript
- Check coverage for Perl template code
- Description updated (diff)
- Description updated (diff)
It also seems like we are getting duplicate codecov reports, which as I understand can happen if both an old bot and the GitHub app are activated? As the bot runs under the account "coolo" we can look into this with him.
I have a strong impression that this happens when a PR is not rebased onto the latest master.
I.e. codecov compares the coverage stats of the PR with the current coverage on master, which will look incorrect if the base of the PR is several commits behind.
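To illustrate the suspicion above, here is a minimal sketch of how a stale base could produce a misleading diff. It assumes, as the comment suggests, that the comparison is made against the tip of master rather than the commit the PR branched from. The file names, line numbers and the `delta` helper are hypothetical and not taken from codecov or openQA.

```python
from typing import Dict, Set

# A coverage report: file name -> set of covered line numbers (hypothetical model)
Report = Dict[str, Set[int]]


def delta(base: Report, head: Report) -> Dict[str, int]:
    """Per-file change in the number of covered lines (head minus base)."""
    changes = {}
    for name in sorted(set(base) | set(head)):
        diff = len(head.get(name, set())) - len(base.get(name, set()))
        if diff != 0:
            changes[name] = diff
    return changes


# Hypothetical data: the PR branched from "merge_base"; afterwards new tests
# landed on master and covered two more lines of lib/Example/B.pm.
merge_base = {"lib/Example/A.pm": {1, 2, 3}, "lib/Example/B.pm": {1, 2}}
master_tip = {"lib/Example/A.pm": {1, 2, 3}, "lib/Example/B.pm": {1, 2, 3, 4}}
pr_head = dict(merge_base)  # documentation-only PR, coverage unchanged

print(delta(merge_base, pr_head))  # {}
print(delta(master_tip, pr_head))  # {'lib/Example/B.pm': -2}
```

Compared against its own merge base the PR changes nothing, while compared against the newer master it appears to lose coverage in a file it never touched.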
yes, I thought the same but it still seems as if the "usual suspects" crop up, e.g. lib/OpenQA/Worker/Settings.pm, even though we hardly ever touch it.
@andriinikitin I assume this is closely related to your work on circle CI setup. Can you comment on what are your recent experiences regarding codecov reliability?
okurz wrote:
@andriinikitin I assume this is closely related to your work on circle CI setup. Can you comment on what are your recent experiences regarding codecov reliability?
I believe a few lines are unstable, but overall I don't see it as a big problem.
- Status changed from New to Resolved
- Assignee set to andriinikitin
@andriinikitin thanks for your assessment. I personally trust coverage reports for openQA as well as os-autoinst lately so I guess we can call this done, probably mainly thanks to andriinikitin, hence assigning to him.
- Status changed from Resolved to Workable
- Assignee deleted (andriinikitin)
- Status changed from Workable to In Progress
- Assignee set to okurz
- Status changed from In Progress to Feedback
Not sure if it is obvious, but I will mention it anyway: my understanding is that there is some race condition at the end of some tests, where the worker shutdown doesn't always happen in time, so coverage stats aren't always reported before the test ends.
yes, good observation. What you mentioned is one option but I think the full stack test that currently shows the coverage problems is doing the right thing on the top level and is actually pretty stable. On a lower level we have different code paths that provide robustness against different timing and network behaviour, so I think we are better off covering these code branches with explicit low-level tests. As an alternative we could also run the full stack tests "often enough", e.g. with make variables STABILITY_TEST=1 RETRY=3, in the hope that we cover all lines at least once, but that still seems risky.
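A rough back-of-the-envelope sketch of why "often enough" still seems risky. It assumes that the coverage of all runs is merged, that each timing-dependent line is hit independently with the same probability per run, and that the numbers below are made up for illustration rather than measured on openQA. The function names are hypothetical.

```python
def p_covered_at_least_once(p_single: float, runs: int) -> float:
    """Probability that one flaky line is hit in at least one of `runs` runs."""
    return 1.0 - (1.0 - p_single) ** runs


def p_all_covered(p_single: float, flaky_lines: int, runs: int) -> float:
    """Probability that every one of `flaky_lines` independent flaky lines
    is hit at least once across `runs` merged runs."""
    return p_covered_at_least_once(p_single, runs) ** flaky_lines


# Assumed numbers: each flaky line is hit in 90% of single runs,
# three runs are merged, and 20 lines are timing dependent.
one_line = p_covered_at_least_once(0.9, 3)
all_lines = p_all_covered(0.9, 20, 3)
print(f"one line: {one_line:.4f}, all 20 lines: {all_lines:.4f}")
```

Under these assumptions a single line is covered with a probability of about 0.999, but the chance that all 20 lines are covered drops to about 0.98, so roughly one report in fifty would still show a spurious change. Explicit low-level tests avoid this dependence on luck.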
https://github.com/os-autoinst/openQA/pull/2815 is another PR, merged.
https://github.com/os-autoinst/openQA/pull/2835 was another PR for more coverage that is now merged. Let's see what the next PRs will tell.
- Description updated (diff)
I gather I'm currently improving coverage in that area.
- Status changed from Feedback to Workable
- Assignee deleted (okurz)
- Target version set to Ready
- Description updated (diff)
- Description updated (diff)
I do not think the inclusion of test code is a problem but actually helps. We should be able to specify individual lines with "uncoverable statements" and introduce reliable coverage for all other places. And unreliable test code can likely be linked to unreliable production code coverage.
- Subject changed from Let's make codecov reports reliable to [epic] Let's make codecov reports reliable
- Status changed from Workable to Blocked
- Assignee set to okurz
- Difficulty set to hard
Forced cdywan to discuss this with me ;) We agreed that it's actually not "Workable" for him and I should rework this, e.g. as an epic. There is already a subtask. That is not really the only task but we are not progressing anyway so I will block this epic based on the subtask.
- Tracker changed from action to coordination
- Status changed from Blocked to New
- Difficulty deleted (hard)
- Status changed from New to Blocked
- Difficulty set to hard
- Estimated time set to 39719.00 h
- Estimated time deleted (39719.00 h)
- Description updated (diff)
- Parent task set to #80142
- Status changed from Blocked to Resolved
I think we finally managed – at least for now – to have consistent, trustworthy codecov reports \o/