action #78127
follow-up to #73633 - lessons learned and suggestions (closed)
Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
Start date:
Due date:
% Done:
0%
Estimated time:
Description
Suggestions from #73633#note-30
- A passive performance measurement regarding throughput on interfaces
- Whenever we apply changes to the infrastructure, we should have a ticket
- Whenever creating an external ticket, e.g. for EngInfra, also create an internal tracker ticket, because there might be more internal notes
- Same as in OSD deployment, we should look for failed Grafana
- Collect all the information between "last good" and "first bad" and then also find the git diff in openqa/salt-states-openqa
- Apply proper "scientific method" with written down hypotheses, experiments and conclusions in tickets, follow https://progress.opensuse.org/projects/openqav3/wiki#Further-decision-steps-working-on-test-issues
- Keep salt states to describe what should not be there
- Try out older btrfs snapshots in systems for crosschecking, and boot with salt disabled. To do so, append to the kernel cmdline:
systemd.mask=salt-minion.service
- The team should conduct a work backlog check on a daily basis
- nsinger does not mind if someone else provides a suggestion or takes over the ticket
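As an illustration of the first suggestion, a passive throughput measurement can be derived from the per-interface byte counters the Linux kernel already exposes, without generating any test traffic. The following is only a minimal sketch under assumptions: the ticket does not say how the measurement should be implemented, reading /proc/net/dev is one possible approach, and the function names parse_net_dev, throughput and sample are hypothetical.

```python
import time


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # unexpected layout, skip rather than guess
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def throughput(before, after, seconds):
    """Return {interface: (rx_bytes_per_s, tx_bytes_per_s)} between two samples."""
    rates = {}
    for name in before.keys() & after.keys():
        rx0, tx0 = before[name]
        rx1, tx1 = after[name]
        rates[name] = ((rx1 - rx0) / seconds, (tx1 - tx0) / seconds)
    return rates


def sample(interval=1.0, path="/proc/net/dev"):
    """Read the counters twice, `interval` seconds apart, and return the rates."""
    with open(path) as f:
        before = parse_net_dev(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_net_dev(f.read())
    return throughput(before, after, interval)
```

Calling sample() on a Linux host returns bytes per second for each interface over the chosen interval. Feeding those values into the existing monitoring would be a separate step that this sketch does not cover.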
Updated by okurz about 4 years ago
- Copied from action #73633: OSD partially unresponsive, triggering 500 responses, spotty response visible in monitoring panels but no alert triggered (yet) added
Updated by livdywan almost 4 years ago
I feel like this is not Workable since it's got all of the suggestions we came up with but it's not clear what result we expect yet i.e. ACs. And I'm not sure why it has an assignee... @okurz did you mean to define the ACs? Or maybe we should have another call to do that?
Updated by okurz almost 4 years ago
hm, actually I think all except the first point could be moved to the wiki as "best practices" or "good to know" as is.
Updated by okurz almost 4 years ago
- Status changed from Workable to Resolved
- created https://progress.opensuse.org/projects/openqav3/wiki/Wiki#Best-practices-for-infrastructure-work
- added comments to https://progress.opensuse.org/projects/qa/wiki/Wiki#How-we-work-on-our-backlog
- first point added to #65271