action #123508

Updated by okurz almost 2 years ago

## Motivation 

 Sometimes a running instance, i.e. a teregen process in the same container, gets stuck for a long time (until it is killed), probably because of an unresponsive socket to some 3rd-party data source such as IBS. This is most likely to happen during maintenance windows, as can be seen in the log. 

 ``` 
 2023/01/19 10:15:02 W [undef] TeReGen: One instance is already running under PID 19746. Exiting. 
 ... 
 2023/01/23 10:00:02 W [undef] TeReGen: One instance is already running under PID 19746. Exiting. 
 ``` 

 This effectively halts generation for all templates until the hanging process is killed and a new one is started (via cron or manually). It would be good to be able to forcefully abort generation if it takes too long. 

 ## Acceptance criteria 

 **AC1:** Templates are generated in a timely manner 
 **AC2:** Only one instance of the generator is actively trying to generate new templates at any time 
 **AC3:** The instance which is generating templates does not run indefinitely 
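
 AC2 could, for example, be covered by an advisory file lock instead of a PID check. The following is only a minimal Python sketch under assumptions: it does not reflect how teregen currently detects a running instance, and the helper name `try_acquire_lock` and the lock path are hypothetical. The kernel releases a `flock` lock as soon as the holding process exits or is killed, so killing a hung run leaves nothing to clean up.

```python
import fcntl


def try_acquire_lock(path):
    """Return an open file holding an exclusive lock on `path`,
    or None if another open file description already holds the lock.

    `path` is a hypothetical lock file location chosen by the caller.
    """
    lock_file = open(path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        lock_file.close()
        return None
    # Keep the returned file object open for as long as the lock is needed.
    return lock_file
```

 A generator run would call `try_acquire_lock(...)` once at startup, exit if it gets `None`, and otherwise keep the returned file open until it is done.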

 ## Suggestions 
 * source of teregen is in https://gitlab.suse.de/qa-maintenance/teregen/ 
 * teregen is running on qam, configured in https://gitlab.suse.de/qa-maintenance/qamops 
 * Convert the "… generate" command from cron to a systemd timer with a timeout plus restart (or simply wait for the next interval), e.g. based on the example MR https://gitlab.suse.de/qa-maintenance/qamops/-/merge_requests/25 
 * Mind the other teregen processes, which are related to the web interface 
 * See `crontab -l -u teregen`
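
 Independent of the scheduler used, the timeout idea from the suggestions above can be sketched as a small wrapper around the generate command. This is a hedged illustration, not the existing setup: the function name `run_with_timeout` is hypothetical, and the actual command line and a sensible limit would have to come from the crontab and from observed run times.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run `cmd` and return its exit code, or None if it was killed
    because it ran longer than `timeout_s` seconds.

    `cmd` is an argument list; the real generate command is not
    spelled out here and must be supplied by the caller.
    """
    try:
        # On timeout, subprocess.run kills the child, waits for it,
        # and then re-raises TimeoutExpired.
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None
```

 Note that only the direct child process is killed; if the command spawns further processes, those would need separate handling, which is one reason a systemd unit with a runtime limit may be the more robust choice.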
