action #68164

Updated by okurz over 4 years ago

## Observation 

 Reported in the IRC channel [#opensuse-factory](irc://chat.freenode.net/opensuse-factory):

 ``` 
 [17/06/2020 09:02:45] <Dimstar> Good morning all; anybody knows what's up with o3 not picking up the scheduled jobs? 
 [17/06/2020 09:04:44] <guillaume_g> Dimstar: Hi! :) Workers are reported broken 
 [17/06/2020 09:05:00] <guillaume_g> Dimstar: "No workers active in the cache service" 
 [17/06/2020 09:05:30] <guillaume_g> the only workers which are running are the one which are not auto-updated ;) 
 [17/06/2020 09:07:31] <guillaume_g> kraih: ^ could it be your MR https://github.com/os-autoinst/openQA/pull/3177 ? 
 [17/06/2020 09:07:32] <|Anna|> Github project os-autoinst/openQA pull request#3177: "Reset locks when restarting the cache service Minion worker", created on 2020-06-16, status: closed on 2020-06-16, https://github.com/os-autoinst/openQA/pull/3177 
 [17/06/2020 09:08:09] <fvogt> At least on openqa-aarch64 all services are up and running, according to systemctl 
 [17/06/2020 09:09:40] <guillaume_g> Dimstar: Could you abort openSUSE:Factory:ARM:Live/JeOS:GNOME-efi.aarch64 please? 
 [17/06/2020 09:09:50] <Dimstar> fun - worker info for e.g. ow1:1 is alive, last seen less than a minute ago, broken 
 [17/06/2020 09:10:06] <Dimstar> guillaume_g: done 
 [17/06/2020 09:12:02] <fvogt> openqa-worker-cacheservice-minion.service is dead - it printed usage info... 
 [17/06/2020 09:12:09] <fvogt> " See 'APPLICATION help COMMAND' for more information on a specific command." 
 [17/06/2020 09:12:24] <fvogt> For some reason that has exit code 0, which isn't helpful 
 [17/06/2020 09:13:49] <guillaume_g> Dimstar: thanks! :) 
 [17/06/2020 09:14:07] <fvogt> It's the order of arguments 
 [17/06/2020 09:14:15] <fvogt> It has to be "run -m production", not "-m production run" 
 [17/06/2020 09:16:41] <fvogt> Started it manually, worker is back. So confirmed to be that indeed 
 [17/06/2020 09:23:58] <guillaume_g> Great! 
 [17/06/2020 09:24:22] <fvogt> Now we just need someone to commit and push the fix 
 [17/06/2020 09:24:53] <Dimstar> fvogt: did you restart all workers for this? e.g. ow1, ow4 ow7, imagetester? 
 [17/06/2020 09:25:19] <fvogt> Dimstar: Where happened to your 'S'? 
 [17/06/2020 09:25:30] <fvogt> No, I only tried to prove the theory on openqa-aarch64 
 [17/06/2020 09:25:58] <fvogt> You can run su _openqa-worker -c '/usr/share/openqa/script/openqa-workercache run -m production --reset-locks' if you want to 
 [17/06/2020 09:26:17] <Dimstar> fvogt: ok; that's fine; just needed to know... I'll kick the x86_64 workers 
 ``` 
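
The argument-order problem described in the chat can be modelled with a toy dispatcher. This is only a sketch of the symptom, not how Mojolicious or the openQA cache service actually parse their arguments: `dispatch`, `COMMANDS` and the `strict` flag are hypothetical names, and the assumption is simply that the first argument has to be the command name.

```python
import sys

# Hypothetical command table; only the command name "run" is taken from the ticket.
COMMANDS = {"run": lambda options: 0}


def dispatch(argv, strict=False):
    """Toy dispatcher: the first argument must be the command name.

    Returns the process exit code. With strict=False an unknown command
    prints the usage line and still returns 0, mirroring the symptom seen here.
    """
    command = argv[0] if argv else None
    if command not in COMMANDS:
        print("Usage: APPLICATION COMMAND [OPTIONS]", file=sys.stderr)
        return 2 if strict else 0
    return COMMANDS[command](argv[1:])


if __name__ == "__main__":
    good = ["run", "-m", "production"]
    bad = ["-m", "production", "run"]
    print(dispatch(good))              # command found and executed
    print(dispatch(bad))               # usage printed, still reported as success
    print(dispatch(bad, strict=True))  # usage printed, reported as failure
```

In the lenient mode both argument orders yield the same exit code, so a supervisor cannot tell them apart. The strict mode is one possible way to make the wrong order visible.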

 Log output from openqaworker13 within the osd infrastructure:

 ``` 
 Jun 17 09:48:57 openqaworker13 systemd[1]: Started OpenQA Worker Cache Service Minion. 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]: [22459] [i] [0oHtg3mJ] Cache size of "/var/lib/openqa/cache" is 49GiB, with limit 50GiB 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]: Usage: APPLICATION COMMAND [OPTIONS] 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:     mojo version 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:     mojo generate lite-app 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:     ./myapp.pl daemon -m production -l http://*:8080 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:     ./myapp.pl get /foo 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:     ./myapp.pl routes -v 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]: Tip: CGI and PSGI environments can be automatically detected very often and 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:        work without commands. 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]: Options (for all commands): 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:     -h, --help            Get more information on a specific command 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:         --home <path>     Path to home directory of your application, defaults to 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:                         the value of MOJO_HOME or auto-detection 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:     -m, --mode <name>     Operating mode for your application, defaults to the 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:                         value of MOJO_MODE/PLACK_ENV or "development" 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]: Commands: 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    cgi         Start application with CGI 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    cpanify     Upload distribution to CPAN 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    daemon      Start application with HTTP and WebSocket server 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    eval        Run code against application 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    generate    Generate files and directories from templates 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    get         Perform HTTP request 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    inflate     Inflate embedded files to real files 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    minion      Minion job queue 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    prefork     Start application with pre-forking HTTP and WebSocket server 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    psgi        Start application with PSGI 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    routes      Show available routes 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    run         Start Minion worker 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]:    version     Show versions of available modules 
 Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]: See 'APPLICATION help COMMAND' for more information on a specific command. 
 ``` 

 So the service only prints the usage help and then exits with *success* (exit code 0).
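
Because the exit code alone does not reveal the problem, one option is to look at the output instead. The following is a minimal sketch, assuming the log lines are available as plain strings; `USAGE_MARKER` and `printed_usage` are hypothetical names, and the marker text is copied from the log above.

```python
# Usage banner as it appears in the log excerpt of this ticket.
USAGE_MARKER = "Usage: APPLICATION COMMAND [OPTIONS]"


def printed_usage(journal_lines):
    """Return True if any log line contains the generic usage banner."""
    return any(USAGE_MARKER in line for line in journal_lines)


sample = [
    "Jun 17 09:48:57 openqaworker13 systemd[1]: Started OpenQA Worker Cache Service Minion.",
    "Jun 17 09:48:58 openqaworker13 openqa-worker-cacheservice-minion[22459]: "
    "Usage: APPLICATION COMMAND [OPTIONS]",
]
print(printed_usage(sample))
```

Such a check would only catch this particular failure mode. It is meant as an illustration, not as a replacement for a proper exit code.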


 ## Lessons learned + TODOs 

 * Ask explicitly how changes to systemd files have been tested 
 * Add tests for systemd services and/or the daemon wrapper scripts 
 * Make the service exit with a failure instead of *success* when it is started with wrong arguments
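
A test for the systemd service files could, for example, check that `ExecStart` passes the command before any options. The sketch below rests on several assumptions: the unit file contents are made up for illustration (only the script path and the two argument orders come from the chat log), `KNOWN_COMMANDS` and `exec_start_command_first` are hypothetical names, and systemd-specific `ExecStart` prefixes such as `-` or `@` are not handled.

```python
import configparser
import shlex

# Assumption: commands accepted as first argument, taken from the help output.
KNOWN_COMMANDS = {"run", "daemon", "prefork"}


def exec_start_command_first(unit_text):
    """Return True if ExecStart passes a known command right after the executable."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names such as "ExecStart" case-sensitive
    parser.read_string(unit_text)
    argv = shlex.split(parser["Service"]["ExecStart"])
    return len(argv) > 1 and argv[1] in KNOWN_COMMANDS


# Hypothetical unit files; the real ones may differ.
BROKEN = """\
[Unit]
Description=OpenQA Worker Cache Service Minion

[Service]
ExecStart=/usr/share/openqa/script/openqa-workercache -m production run
"""
FIXED = BROKEN.replace("-m production run", "run -m production")

if __name__ == "__main__":
    print(exec_start_command_first(BROKEN))
    print(exec_start_command_first(FIXED))
```

A static check like this is cheap to run in CI, but it does not replace actually starting the service in a test environment.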
