action #95299

Updated by okurz 3 months ago

## Observation

I don't know why this ppc64le jobs timed out, cache service ?
Should the regex be more specific or can it be generic for any timeout ?

## Steps to reproduce

~~Find jobs referencing this ticket with the help of ,
call `openqa-query-for-job-label poo#95299`~~

ssh osd "sudo -u geekotest psql --command=\"select,result_dir,t_finished,host,instance from jobs join workers on where reason ~ 'timeout: setup exceeded' order by t_finished;\" openqa"

## Expected result
A log shows how it should look:

[2021-07-12T10:36:56.0008 CEST] [info] [pid:64283] Download of SLE-15-SP1-Installer-DVD-ppc64le-GM-DVD1.iso processed:
[info] [#106833]
Cache size of "/var/lib/openqa/cache" is 44GiB, with limit 50GiB
[info] [#106833]
Downloading "SLE-15-SP1-Installer-DVD-ppc64le-GM-DVD1.iso" from ""
[info] [#106833]
Content of "/var/lib/openqa/cache/" has not changed, updating last use

[2021-07-12T10:36:56.0100 CEST] [info] [pid:64283] Rsync from 'rsync://' to '/var/lib/openqa/cache/', request #106839 sent to Cache Service
[2021-07-12T10:37:01.0168 CEST] [info] [pid:64283] Output of rsync:
[info] [#106839] Calling: rsync -avHP rsync:// --delete /var/lib/openqa/cache/
receiving incremental file list

sent 1,992 bytes received 2,581,652 bytes 1,722,429.33 bytes/sec
total size is 12,978,495,757 speedup is 5,023.33

[2021-07-12T10:37:01.0168 CEST] [info] [pid:64283] Finished to rsync tests
[2021-07-12T10:37:01.0172 CEST] [debug] [pid:70271] +++ worker notes +++

where one can see the output from the rsync call.

Problem seems to be specific to ppc64le workers, maybe only "malbec" now. Maybe specific to test syncing. The cacheservice-minion logs do not mention the test rsync request at all.

## Suggestions
* Start with calling `git grep` on the source code of openQA for the log messages that we see in the jobs mentioned above (or see comments below), to see which debug messages should be expected before the timeout
* If you identify that there could be helpful log messages, e.g. to be able to distinguish if a request was received by the minion service or not, add it