action #112898

Updated by livdywan almost 2 years ago

### Observation 

 We've been getting minion worker alerts throughout the day. They usually calm down after a while, then fire again later. 

 `journalctl -fu openqa-gru.service` isn't showing anything that looks relevant, although I noticed a lot of `grep was killed, possibly timed out` messages. 
 `/var/log/openqa_gru` mostly contains `[debug] Process ... is performing job "..." with task "..."` type messages. 
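
 To check whether the `grep was killed, possibly timed out` messages line up with the alert times, a per-hour tally might help. The following is only a sketch: the helper name `count_killed_per_hour` is made up for illustration, and it assumes the journal is read with `-o short-iso`, so that every line starts with an ISO timestamp.

```shell
# Hypothetical helper: tally "grep was killed" messages per hour.
# Reads journal lines from stdin and assumes `journalctl -o short-iso`
# formatting, i.e. each line begins with YYYY-MM-DDTHH:MM:SS+ZZZZ.
count_killed_per_hour() {
    # Keep only the matching lines (fixed string, no regex),
    # cut the timestamp down to "YYYY-MM-DDTHH",
    # then count how many lines fall into each hour.
    grep -F 'grep was killed, possibly timed out' \
        | cut -c1-13 \
        | sort \
        | uniq -c
}

# Possible usage on the openQA host:
# journalctl -u openqa-gru.service --since today -o short-iso | count_killed_per_hour
```

 Each output line is a count followed by the hour it belongs to, which can then be compared by eye against the times the alert fired.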

 I paused the alert for now because we're way past alert fatigue. 

 ### Suggestions 
 - Research what's causing minion workers to disappear frequently 

 ### Rollback steps 
 - Unpause the alert in Grafana
