action #10884
closed
Save image on filesystem kernel bugs
Added by okurz almost 9 years ago.
Updated about 5 years ago.
Category:
Feature requests
Description
User story
As a kernel filesystem developer, I want access to the qcow2 image whenever a filesystem kernel bug occurs, so that I can debug what actually happened on the filesystem without subsequent test modules tainting it.
acceptance criteria
- AC1: If a filesystem-related kernel bug occurs on the qemu backend, the test run is aborted and the qcow2 image is stored
- AC2: A cleanup strategy for the images exists
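One possible shape for the cleanup strategy in AC2 is a retention rule based on age plus a size quota. The following is only a minimal standalone sketch, not the openQA gru cleanup task: the function name `cleanup_images`, its parameters, and the default limits are assumptions made for illustration.

```python
import time
from pathlib import Path


def cleanup_images(directory, max_age_days=14, max_total_bytes=50 * 1024**3, now=None):
    """Remove saved *.qcow2 images by age first, then by size quota.

    Hypothetical helper: the limits and the directory layout are assumptions.
    Returns the names of the removed files, in removal order.
    """
    now = time.time() if now is None else now
    # Oldest images first, so the quota pass below drops the oldest ones.
    images = sorted(Path(directory).glob("*.qcow2"), key=lambda p: p.stat().st_mtime)
    removed, kept = [], []

    # Pass 1: drop everything older than the age limit.
    for img in images:
        if now - img.stat().st_mtime > max_age_days * 86400:
            img.unlink()
            removed.append(img.name)
        else:
            kept.append(img)

    # Pass 2: drop the oldest remaining images until the quota is met.
    total = sum(p.stat().st_size for p in kept)
    for img in kept:
        if total <= max_total_bytes:
            break
        total -= img.stat().st_size
        img.unlink()
        removed.append(img.name)
    return removed
```

Running the age pass before the quota pass means recent images are only sacrificed when the storage limit is actually exceeded.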
tasks
- Research current post_failure_hooks in os-autoinst-distri-opensuse
- Grep for kernel bug notice, e.g. a hook into "wait_serial", like "infinite parallel wait serial"
- If bug is in "filesystem", immediately abort test run and tag image for saving
- Adapt openQA gru cleanup task
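The detection step in the tasks above could be sketched roughly as follows. This is an illustration in Python, not os-autoinst test code: the regular expressions are simplified guesses at what a kernel bug notice on the serial console looks like, and the names `classify_serial_line` and `should_save_image` are hypothetical.

```python
import re
from typing import Iterable, Optional

# Illustrative patterns only; real kernel messages vary by version and config.
KERNEL_BUG_RE = re.compile(r"kernel BUG at \S+|\bBUG: |\bOops: ")
# Heuristic hint that the bug is filesystem-related (assumed, not exhaustive).
FS_HINT_RE = re.compile(r"\bfs/|\b(btrfs|ext4|xfs)\b", re.IGNORECASE)


def classify_serial_line(line: str) -> Optional[str]:
    """Return 'filesystem', 'other', or None if the line is no kernel bug notice."""
    if not KERNEL_BUG_RE.search(line):
        return None
    return "filesystem" if FS_HINT_RE.search(line) else "other"


def should_save_image(serial_lines: Iterable[str]) -> bool:
    """True as soon as any line looks like a filesystem-related kernel bug."""
    return any(classify_serial_line(line) == "filesystem" for line in serial_lines)
```

A hook on the serial output could call `should_save_image` on newly read lines and, on a hit, abort the run and tag the image for saving; how that hook is wired into the test framework is left open here.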
further details
Initial idea was mentioned in https://bugzilla.suse.com/show_bug.cgi?id=963567#c7
- Category set to Enhancement to existing tests
- Subject changed from save image on filesystem kernel bugs to [sle][functional][kernel][opensuse]save image on filesystem kernel bugs
Adding the target version "future" as I would like to better plan it with the SLE Functional team.
- Target version set to future
- Target version changed from future to future
- Subject changed from [sle][functional][kernel][opensuse]save image on filesystem kernel bugs to [kernel] save image on filesystem kernel bugs
By now I think this is something for QSK?
- Subject changed from [kernel] save image on filesystem kernel bugs to Save image on filesystem kernel bugs
Does it make sense to start publishing qcow2 images of failed filesystem tests? There are plenty of them and we could run into storage resource problems...
I guess proper logging is more important than this. Anyone can re-run the tests and stop them when necessary to troubleshoot manually.
In any case, if the test fails because of a kernel crash or anything, PUBLISH_HDD won't work, so we would need additional work for these cases in post_fail_hook or something.
So, I remove the tag [kernel] there, because this feature is more generic than just kernel or filesystems.
- Status changed from New to Rejected
I think we can safely reject this.
- Project changed from openQA Tests to openQA Project
- Category changed from Enhancement to existing tests to Feature requests
- Status changed from Rejected to New
Why would you want to reject it? Do you think it's done or that the original use case wasn't relevant in the first place? If it was relevant then, it should still be relevant today. It might be OK to track it within a broader scope though, as I understand that both of your teams are not interested. I recently merged https://github.com/os-autoinst/os-autoinst/pull/897 adding the variable "FORCE_PUBLISH_HDD_", which could probably even be set during the test run.
@jlausuch maybe you are willing to pick it back now? :)
okurz wrote:
@jlausuch maybe you are willing to pick it back now? :)
Pick back what exactly?
No, we won't put this into our backlog unless we see a real need for this. Besides, you already implemented the FORCE_PUBLISH_HDD_ parameter so it's available if anyone wants to use it...
I doubt kernel folks will want to publish a qcow2 image for every failed test we have :)
- Status changed from New to Rejected
okurz wrote:
Why would you want to reject it? Do you think it's done or that the original use case wasn't relevant in the first place? If it was relevant then, it should still be relevant today. It might be OK to track it within a broader scope though, as I understand that both of your teams are not interested. I recently merged https://github.com/os-autoinst/os-autoinst/pull/897 adding the variable "FORCE_PUBLISH_HDD_", which could probably even be set during the test run.
@jlausuch maybe you are willing to pick it back now? :)
The use case is not relevant. QSFU doesn't work with filesystems, and afaict, since this wasn't picked up by anybody in 3 years, it's very likely that there is no interest.
- Status changed from Rejected to Resolved
- Assignee set to okurz
Fine, but let's not give the impression nothing has been done. So let's call it "done" as described in http://open.qa/docs/#_asset_handling : One just needs to set "FORCE_PUBLISH_HDD_1=…" and will receive an image if possible. This can be done case by case on individual jobs or also within test code, e.g. after detecting a bug situation.
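Setting the variable case by case could look roughly like this when job settings are prepared in a script. This is a minimal sketch under assumptions: `with_forced_publish` is a hypothetical helper, job settings are modelled as a plain dict, and the image name is whatever the caller chooses.

```python
def with_forced_publish(settings: dict, image_name: str, disk_index: int = 1) -> dict:
    """Return a copy of the job settings with FORCE_PUBLISH_HDD_<n> set.

    Hypothetical helper for illustration; image_name is supplied by the caller.
    """
    if disk_index < 1:
        raise ValueError("disk_index must be >= 1")
    updated = dict(settings)  # leave the caller's settings untouched
    updated[f"FORCE_PUBLISH_HDD_{disk_index}"] = image_name
    return updated
```

The returned dict can then be passed on to whatever mechanism schedules or clones the job; that part is intentionally not shown.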