Project

General

Profile

action #122158

Updated by okurz over 1 year ago

## Observation 
 https://monitor.qa.suse.de/d/WDQA-Power8-4-kvm/worker-dashboard-qa-power8-4-kvm?editPanel=65105&tab=alert failed since 2022-12-18 03:00 so the weekly reboot that triggered this, see https://monitor.qa.suse.de/d/WDQA-Power8-4-kvm/worker-dashboard-qa-power8-4-kvm?editPanel=65105&tab=alert&orgId=1&from=1671320482076&to=1671348249406 in detail. The machine is reachable over IPMI but does not show anything obvious. 

 ## Suggestions 
 * Pause related alert(s) 
 * Remove from salt 
 * Trigger OSD deployment which failed due to this 
 * Trigger reboot 
 * Look into the other recent related tickets, e.g. about worker cache sqlite lookup something 
 * #114565 and other related 

 ## Rollback steps 
 * Bring back to salt 
 * Unpause alert after resolution

Back