action #93799

closed

coordination #99303: [saga][epic] Future improvements for SUSE Maintenance QA workflows with fully automated testing, approval and release

coordination #110016: [epic][teregen] teregen (maintenance test report template generator) improvements

teregen: Improvement of usability of disabled testcases notification size:M

Added by vpelcak almost 3 years ago. Updated over 1 year ago.

Status: Resolved
Priority: Low
Assignee:
Target version:
Start date: 2021-06-10
Due date:
% Done: 0%
Estimated time:

Description

User story

The template generator has a recent new feature.
If the automation squad disables a testcase in aggregated tests (for example, in order to fix it), this is detected automatically so that testers can identify the missing testcase and cover the missing tests by manual testing.

As a tester, I then occasionally encounter output like:

Example:

http://qam.suse.de/testreports/SUSE:Maintenance:19738:241935/log

regression tests:

WARNING: One or more test suites are missing compared to last runs with these source rpms.
Please, check openQA manually. Missing test suites are:

  • Maintenance: SLE 12 SP3 SAP Incidents -> qam-sles4sap_hana_node01 / x86_64 sle-12-SP3-Server-DVD-SAP-Incidents-x86_64-Build:19600:libxml2-qam-sles4sap_hana_node01@64bit-sap-qam
  • Maintenance: SLE 12 SP3 SAP Incidents -> qam-create_hdd_sles4sap_gnome / x86_64 sle-12-SP3-Server-DVD-SAP-Incidents-x86_64-Build:19600:libxml2-qam-create_hdd_sles4sap_gnome@64bit
  • Maintenance: SLE 12 SP3 SAP Incidents -> qam-sles4sap_scc_gnome_hana_cli / x86_64 sle-12-SP3-Server-DVD-SAP-Incidents-x86_64-Build:19600:libxml2-qam-sles4sap_scc_gnome_hana_cli@64bit-sap-qam
  • Maintenance: SLE 12 SP3 Incidents -> qam-incidentinstall / ppc64le sle-12-SP3-Server-DVD-Incidents-Install-ppc64le-Build:19600:libxml2-qam-incidentinstall@ppc64le
  • Maintenance: SLE 12 SP3 Incidents -> mau-extratests2 / x86_64 sle-12-SP3-Server-DVD-Incidents-x86_64-Build:19600:libxml2-mau-extratests2@64bit
  • Maintenance: SLE 12 SP3 Incidents -> qam-incidentinstall / s390x sle-12-SP3-Server-DVD-Incidents-Install-s390x-Build:19600:libxml2-qam-incidentinstall@zkvm
  • Maintenance: SLE 12 SP3 Incidents -> mau-extratests-zypper / x86_64 sle-12-SP3-Server-DVD-Incidents-x86_64-Build:19600:libxml2-mau-extratests-zypper@64bit
  • Maintenance: SLE 12 SP3 SAP Incidents -> qam-incidentinstall-sap / x86_64 sle-12-SP3-Server-DVD-SAP-Incidents-Install-x86_64-Build:19600:libxml2-qam-incidentinstall-sap@64bit
  • Maintenance: SLE 12 SP3 Incidents -> mau-extratests-phub / x86_64 sle-12-SP3-Server-DVD-Incidents-x86_64-Build:19600:libxml2-mau-extratests-phub@64bit
  • Maintenance: SLE 12 SP3 SAP Incidents -> qam-sles4sap_scc_gnome_netweaver / x86_64 sle-12-SP3-Server-DVD-SAP-Incidents-x86_64-Build:19600:libxml2-qam-sles4sap_scc_gnome_netweaver@64bit
  • Maintenance: SLE 12 SP3 SAP Incidents -> qam-sles4sap_hana_node02 / x86_64 sle-12-SP3-Server-DVD-SAP-Incidents-x86_64-Build:19600:libxml2-qam-sles4sap_hana_node02@64bit-sap-qam
  • Maintenance: SLE 12 SP3 Incidents -> qam-allpatterns+addons / x86_64 sle-12-SP3-Server-DVD-Incidents-x86_64-Build:19600:libxml2-qam-allpatterns+addons@64bit
  • Maintenance: SLE 12 SP3 Incidents -> mau-webserver / x86_64 sle-12-SP3-Server-DVD-Incidents-x86_64-Build:19600:libxml2-mau-webserver@64bit
  • Maintenance: SLE 12 SP3 Incidents -> qam-incidentinstall / x86_64 sle-12-SP3-Server-DVD-Incidents-Install-x86_64-Build:19600:libxml2-qam-incidentinstall@64bit
  • Maintenance: SLE 12 SP3 SAP Incidents -> qam-incidentinstall-sap / ppc64le sle-12-SP3-Server-DVD-SAP-Incidents-Install-ppc64le-Build:19600:libxml2-qam-incidentinstall-sap@ppc64le
  • Maintenance: SLE 12 SP3 Incidents -> mau-filesystem / x86_64 sle-12-SP3-Server-DVD-Incidents-x86_64-Build:19600:libxml2-mau-filesystem@64bit
  • Maintenance: SLE 12 SP3 Incidents -> mau-extratests1 / x86_64 sle-12-SP3-Server-DVD-Incidents-x86_64-Build:19600:libxml2-mau-extratests1@64bit
  • Maintenance: SLE 12 SP3 Incidents -> mau-sles-robot-fw / x86_64 sle-12-SP3-Server-DVD-Incidents-x86_64-Build:19600:libxml2-mau-sles-robot-fw@64bit-2gbram
  • Maintenance: SLE 12 SP3 SAP Incidents -> qam-sles4sap_scc_gnome_saptune / x86_64 sle-12-SP3-Server-DVD-SAP-Incidents-x86_64-Build:19600:libxml2-qam-sles4sap_scc_gnome_saptune@64bit
  • Maintenance: SLE 12 SP3 Incidents -> mau-extratests-kdump / x86_64 sle-12-SP3-Server-DVD-Incidents-x86_64-Build:19600:libxml2-mau-extratests-kdump@64bit
  • Maintenance: SLE 12 SP3 SAP Incidents -> qam-sles4sap_hana_supportserver / x86_64 sle-12-SP3-Server-DVD-SAP-Incidents-x86_64-Build:19600:libxml2-qam-sles4sap_hana_supportserver@64bit
  • Maintenance: SLE 12 SP3 Incidents -> mru-install-minimal-with-addons / x86_64 sle-12-SP3-Server-DVD-Incidents-x86_64-Build:19600:libxml2-mru-install-minimal-with-addons@64bit

in the testreport.

Testers do not understand it and don't know how to interpret it.

Acceptance criteria

  • AC1: it is clear to the tester what he needs to do when he sees a message like the one above
  • AC2: ideally, information about what testing needs to be done to cover for the disabled testcase should be easily accessible

Suggestions

I gave this high priority because there is a danger that additional testing won't happen and updates are released with reduced test coverage, with a risk of regressions.


Related issues 3 (1 open, 2 closed)

Related to QA - action #90914: [teregen] Add overview for stored coverage data (New, 2021-04-09)
Related to QA - action #90401: [teregen] Integrate coverage information in a presentable way into test template (Resolved, jbaier_cz, 2021-03-29)
Related to QA - action #88127: [tools][qem] Test coverage DB for maintenance updates (Closed, jbaier_cz, 2021-02-08)

Actions #1

Updated by jbaier_cz almost 3 years ago

  • Related to action #90914: [teregen] Add overview for stored coverage data added
Actions #2

Updated by jbaier_cz almost 3 years ago

  • Assignee deleted (jbaier_cz)

Unassigning myself as there is nothing I can do at this moment; this needs more input from the user base first.

Actions #3

Updated by livdywan almost 3 years ago

Notes:

  • By definition the tests listed here are no longer there
  • It might be useful to look up previous jobs for these cases
  • A diff of the YAML (if the YAML was changed) and a GitLab diff, if applicable

Suggested approach:

  • Include descriptions of tests that previously passed but were removed
Actions #4

Updated by okurz almost 3 years ago

  • Due date set to 2021-07-01
  • Assignee set to okurz
  • Target version set to Ready

To be able to better understand and prioritize, could you try to adapt the description of the ticket according to the template https://progress.opensuse.org/projects/openqav3/wiki/#Feature-requests ? Also, can you explain why you see this as "High" priority?

Actions #5

Updated by vpelcak almost 3 years ago

  • Description updated (diff)
Actions #6

Updated by jbaier_cz almost 3 years ago

  • Related to action #90401: [teregen] Integrate coverage information in a presentable way into test template added
Actions #7

Updated by jbaier_cz almost 3 years ago

  • Related to action #88127: [tools][qem] Test coverage DB for maintenance updates added
Actions #8

Updated by okurz almost 3 years ago

vpelcak wrote:

Acceptance criteria

  • AC1: it is clear to the tester what he needs to do when he sees a message like the one above

This sounds like extending the message "Please, check openQA manually". You would only expect better instructions within that text, right? Or elsewhere as well?

  • AC2: ideally, information about what testing needs to be done to cover for the disabled testcase should be easily accessible

That implies that manual testing would be able to do what an automated openQA test scenario based on a complete test suite does. It would be possible for removed test modules, e.g. just the "firefox" test module, but AFAIK that is not handled at all by test report templates. But I see that as effectively impossible when complete test scenarios are unscheduled. I know that this comes up from time to time that people think they can do manually for single updates what an automated test in openQA can do but only a very tiny subset of tests would be possible to execute manually nowadays within a reasonable time.

I am just picking one of the above-mentioned test suites, possibly the smallest one, "mau-extratests-zypper". Take for example https://openqa.suse.de/tests/6224114 showing one successful run of such a scenario. The test is composed of test modules. Let's imagine we could ask a human tester to follow all instructions from the test modules. That effectively means following all instructions that also show up in the log file https://openqa.suse.de/tests/6224114/file/autoinst-log.txt . One could search for all "<<<" markers, which are commands that openQA types. But that accounts for 2300 lines in the mentioned example, unlikely to be followed by humans.

Instead let's look at the modules themselves. First "boot_to_desktop": ok, feasible for a tester to achieve just judging from the name. Then comes "zypper_lr_validate". In openQA this takes 8s. The instructions are visible in https://openqa.suse.de/tests/6224114/modules/zypper_lr_validate/steps/1/src corresponding to https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/tests/console/zypper_lr_validate.pm . They are very hard to read, as the test module needs to be very specific depending on SLE service packs. But a test for an incident update likely needs to understand these specifics as well. And then there are many more test modules, like "zypper_ref", "zypper_info", "check_interactive_flag", "validate_packages_and_patterns", "zypper_extend", "coredump_collect". Again, asking humans to redo what an automated test system does is impossible.
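To illustrate the "<<<" search mentioned above, here is a minimal sketch that assumes the log has already been downloaded into a string. The function name and the sample log lines are made up for illustration; only the "<<<" marker itself comes from this comment.

```python
def typed_command_lines(log_text: str, marker: str = "<<<") -> list[str]:
    """Return every log line that contains the marker.

    Per the comment above, lines with "<<<" are commands that openQA types.
    """
    return [line for line in log_text.splitlines() if marker in line]


# Made-up sample, NOT copied from a real autoinst-log.txt
sample_log = "\n".join([
    "[debug] <<< type_string(string='zypper lr')",
    "[debug] >>> wait_serial: ok",
    "[debug] <<< script_run(cmd='zypper ref')",
])

commands = typed_command_lines(sample_log)
print(f"{len(commands)} typed command line(s) found")
```

Run against the full log of the job above, a count like this is what gives the roughly 2300 lines mentioned.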

The best approach I see is:

  1. If tests produce false positives and it wasn't a mistake that the tests have been added, e.g. someone just added new tests that were never there, then revert what introduces the test regression and retrigger tests for the incident updates
  2. If, in the rare exception, one really really really thinks it's the right approach to revert to manual testing, e.g. an urgent security fix while all openQA tests are broken due to an infrastructure problem, then rely on the expertise of professional testers on how the update can best be tested, as we cannot formalize consistent and complete manual test instructions that could replace the automated tests

I gave this high priority because there is a danger that additional testing won't happen and updates are released with reduced test coverage, with a risk of regressions.

ok, but it's not a recent regression, is it? So basically as a safeguard or workaround until we find a better solution we/you should remind or explain manually what it means.

Actions #9

Updated by vpelcak almost 3 years ago

okurz wrote:

vpelcak wrote:

Acceptance criteria

  • AC1: it is clear to the tester what he needs to do when he sees a message like the one above

This sounds like extending the message "Please, check openQA manually". You would only expect better instructions within that text, right? Or elsewhere as well?

Yes. Having better instructions on what needs to be done in general would help. The block of text below it, while a great start, is hard for people to decode.

  • AC2: ideally, information about what testing needs to be done to cover for the disabled testcase should be easily accessible

That implies that manual testing would be able to do what an automated openQA test scenario based on a complete test suite does. It would be possible for removed test modules, e.g. just the "firefox" test module, but AFAIK that is not handled at all by test report templates. But I see that as effectively impossible when complete test scenarios are unscheduled. I know that this comes up from time to time that people think they can do manually for single updates what an automated test in openQA can do but only a very tiny subset of tests would be possible to execute manually nowadays within a reasonable time.

I see your point. OTOH, we used to work in that kind of mode already. And I certainly hope that this situation won't be too frequent.

The best approach I see is:

  1. If tests produce false positives and it wasn't a mistake that the tests have been added, e.g. someone just added new tests that were never there, then revert what introduces the test regression and retrigger tests for the incident updates
  2. If, in the rare exception, one really really really thinks it's the right approach to revert to manual testing, e.g. an urgent security fix while all openQA tests are broken due to an infrastructure problem, then rely on the expertise of professional testers on how the update can best be tested, as we cannot formalize consistent and complete manual test instructions that could replace the automated tests

You listed just a few of the possible options. There could be a version bump causing testcase not working, a long-time unstable test...

I gave this high priority because there is a danger that additional testing won't happen and updates are released with reduced test coverage, with a risk of regressions.

ok, but it's not a recent regression, is it? So basically as a safeguard or workaround until we find a better solution we/you should remind or explain manually what it means.

No, it is not a regression indeed.

Actions #10

Updated by okurz almost 3 years ago

The best approach I see is:

  1. If tests produce false positives and it wasn't a mistake that the tests have been added, e.g. someone just added new tests that were never there, then revert what introduces the test regression and retrigger tests for the incident updates
  2. If, in the rare exception, one really really really thinks it's the right approach to revert to manual testing, e.g. an urgent security fix while all openQA tests are broken due to an infrastructure problem, then rely on the expertise of professional testers on how the update can best be tested, as we cannot formalize consistent and complete manual test instructions that could replace the automated tests

You listed just a few of the possible options. There could be a version bump causing testcase not working, a long-time unstable test...

That makes no difference to me, as a human tester will still not be able to conduct the tests in the depth that the automated tests can. What I have not mentioned as an option is, of course, what should have been decided beforehand: that such a version update can only be approved once the automated tests have been updated, with the exception of what I described as option 2, which should not apply to a "version bump causing testcase not working". The risk of releasing a product regression breaking customer use cases is just too great.

As the list of missing tests can get long, I would suggest abbreviating it after e.g. 3 lines and pointing to a test overview page on openQA instead, e.g. https://openqa.suse.de/tests/overview?distri=sle&build=:19924:webkit2gtk3
but as we cannot store openQA jobs indefinitely, this has a limitation. All openQA results are deleted unless marked as "important". Rather than "saving the data elsewhere because we don't have enough space in openQA", I would say that if we need results for longer in openQA then we will find a solution in openQA. So, as an alternative to saving a list of tests elsewhere, we can discuss keeping results longer for certain job groups, or we can mark builds or jobs as "important" so that they are kept around longer. This needs corresponding space on OSD reserved for that purpose, of course.
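The abbreviation idea above could look roughly like the following. This is only a sketch and not how teregen does it: the function names, the default limit of 3 and the wording of the "more" line are assumptions, and the URL layout is copied from the example link in this comment.

```python
from urllib.parse import urlencode

OPENQA_HOST = "https://openqa.suse.de"  # host taken from the example link above


def overview_url(build: str, distri: str = "sle") -> str:
    """Build a test overview link shaped like the example in this comment."""
    # safe=":" keeps the colons of the build value unescaped, as in the example
    query = urlencode({"distri": distri, "build": build}, safe=":")
    return f"{OPENQA_HOST}/tests/overview?{query}"


def abbreviate_missing(suites: list[str], build: str, limit: int = 3) -> str:
    """Render at most `limit` missing suites, then point to the overview page."""
    lines = [f"  • {suite}" for suite in suites[:limit]]
    hidden = len(suites) - limit
    if hidden > 0:
        # Hypothetical wording for the pointer line
        lines.append(f"  ... and {hidden} more, see {overview_url(build)}")
    return "\n".join(lines)


print(abbreviate_missing(
    ["mau-webserver", "mau-filesystem", "mau-extratests1", "mau-extratests2"],
    build=":19600:libxml2",
))
```

The limitation described above still applies: the link is only useful as long as openQA keeps the jobs of that build.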

Actions #11

Updated by okurz almost 3 years ago

  • Subject changed from Improvement of usability of disabled testcases notification to teregen: Improvement of usability of disabled testcases notification
  • Assignee deleted (okurz)
  • Priority changed from High to Low
  • Target version changed from Ready to future

I agree that the mentioned ideas would help but currently we need to focus on other needed changes before we can go back to UX improvements in SUSE-internal-only tools.

Actions #12

Updated by okurz almost 3 years ago

  • Due date deleted (2021-07-01)

Points from #93799#note-4 have been addressed, hence removing the due date.

Actions #14

Updated by vpelcak over 2 years ago

Hello.
What is the status of this?
I feel that the priority is way too low.

When I said that it is not a regression, I think I was not correct.

Yes, it is not a regression in terms of notifications to people, but every unscheduled test poses a risk of regression in the test coverage and subsequently in the product.
That needs to be addressed.

OTOH some tests could be unscheduled for a good reason (a useless test, for example). Perhaps it is possible to somehow address that, too.

Maybe we should refine the ticket.

Actions #15

Updated by jbaier_cz about 2 years ago

  • Parent task set to #110016
Actions #17

Updated by jbaier_cz over 1 year ago

  • Target version changed from future to Ready

We can handle this as part of #108812: the details should go into an extra file which can be viewed if needed, and a simple summary with a reasonable warning should be present in the main template.
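A rough sketch of that split, assuming each entry keeps the "job group -> test suite / arch job name" layout from the example in the description. The regular expression, the function name and the summary wording are illustrative assumptions, not the actual change made in teregen.

```python
import re
from collections import Counter

# Assumed layout: "<job group> -> <test suite> / <arch> <full job name>"
LINE_RE = re.compile(r"^(?P<group>.+?) -> (?P<suite>\S+) / (?P<arch>\S+) (?P<job>\S+)$")


def summarize(missing_lines):
    """Return (summary, details): a short text for the main template and
    tab-separated rows that could be written to an extra file."""
    per_group = Counter()
    details = []
    for raw in missing_lines:
        match = LINE_RE.match(raw.strip().lstrip("•").strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        per_group[match["group"]] += 1
        details.append("\t".join(match[key] for key in ("group", "suite", "arch", "job")))
    total = sum(per_group.values())
    summary = [f"WARNING: {total} test suite(s) missing compared to last runs."]
    summary += [f"  {group}: {count}" for group, count in sorted(per_group.items())]
    return "\n".join(summary), "\n".join(details)


example = [
    "  • Maintenance: SLE 12 SP3 SAP Incidents -> qam-sles4sap_hana_node01 / x86_64 sle-12-SP3-Server-DVD-SAP-Incidents-x86_64-Build:19600:libxml2-qam-sles4sap_hana_node01@64bit-sap-qam",
    "  • Maintenance: SLE 12 SP3 SAP Incidents -> qam-create_hdd_sles4sap_gnome / x86_64 sle-12-SP3-Server-DVD-SAP-Incidents-x86_64-Build:19600:libxml2-qam-create_hdd_sles4sap_gnome@64bit",
    "  • Maintenance: SLE 12 SP3 Incidents -> qam-incidentinstall / ppc64le sle-12-SP3-Server-DVD-Incidents-Install-ppc64le-Build:19600:libxml2-qam-incidentinstall@ppc64le",
]
summary, details = summarize(example)
print(summary)
```

The summary stays a few lines long no matter how many suites are missing, while the full rows remain available for anyone who needs them.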

Actions #18

Updated by tinita over 1 year ago

  • Subject changed from teregen: Improvement of usability of disabled testcases notification to teregen: Improvement of usability of disabled testcases notification size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #19

Updated by jbaier_cz over 1 year ago

  • Assignee set to jbaier_cz
Actions #20

Updated by jbaier_cz over 1 year ago

  • Status changed from Workable to In Progress
Actions #21

Updated by jbaier_cz over 1 year ago

This issue is addressed by https://gitlab.suse.de/qa-maintenance/teregen/-/merge_requests/16 as explained inside #108812

Actions #22

Updated by jbaier_cz over 1 year ago

  • Status changed from In Progress to Feedback

The change is deployed

Actions #23

Updated by jbaier_cz over 1 year ago

  • Status changed from Feedback to Resolved