action #107746
coordination #103962 (closed): [saga][epic] Easy multi-machine handling: MM-tests as first-class citizens
Some directly chained jobs are skipped by openQA
Description
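Our test consists of ten directly chained jobs: one parent, and the rest are children. Sometimes the last several child jobs are cancelled/skipped by openQA. It seems to be a websocket problem: in the log, the worker loses its websocket connection (close code 1006), cannot accept the next job, and then skips the remaining jobs in the chain. Below is the log of http://void.qam.suse.cz/tests/37703: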
2022-03-01T11:01:25.603001+01:00 void oqaqambot[14875]: INFO: Project 'SUSE:Maintenance:22978' without jobs done - no worth to comment
2022-03-01T11:01:25.603026+01:00 void oqaqambot[14875]: INFO: Project 'SUSE:Maintenance:23083' without jobs done - no worth to comment
2022-03-01T11:01:25.603050+01:00 void oqaqambot[14875]: INFO: Project 'SUSE:Maintenance:23020' without jobs done - no worth to comment
2022-03-01T11:01:25.648117+01:00 void systemd[1]: openqabot-incident.service: Succeeded.
2022-03-01T11:01:25.648232+01:00 void systemd[1]: Finished Shedule incidents in qam openQA.
2022-03-01T11:01:27.036237+01:00 void openqa-websockets-daemon[4935]: [info] Worker 2 websocket connection closed - 1006
2022-03-01T11:01:27.036551+01:00 void worker[22430]: [warn] Websocket connection to http://void.qam.suse.cz/api/v1/ws/2 finished by remote side with code 1006, no reason - trying again in 10 seconds
2022-03-01T11:01:28.673888+01:00 void worker[22430]: [warn] Websocket connection to http://quasar.suse.cz/api/v1/ws/40 finished by remote side with code 1006, no reason - trying again in 10 seconds
2022-03-01T11:01:29.783464+01:00 void worker[30530]: [info] Registering with openQA http://void.qam.suse.cz
2022-03-01T11:01:29.889641+01:00 void worker[30530]: [info] Establishing ws connection via ws://void.qam.suse.cz/api/v1/ws/16
2022-03-01T11:01:29.894146+01:00 void worker[30530]: [info] Registered and connected via websockets with openQA host http://void.qam.suse.cz and worker ID 16
2022-03-01T11:01:33.834397+01:00 void openqa-websockets-daemon[4935]: [info] Worker 6 websocket connection closed - 1006
2022-03-01T11:01:33.835030+01:00 void worker[22464]: [warn] Websocket connection to http://void.qam.suse.cz/api/v1/ws/6 finished by remote side with code 1006, no reason - trying again in 10 seconds
2022-03-01T11:01:37.037559+01:00 void worker[22430]: [info] Registering with openQA http://void.qam.suse.cz
2022-03-01T11:01:37.072457+01:00 void worker[22430]: [info] Establishing ws connection via ws://void.qam.suse.cz/api/v1/ws/2
2022-03-01T11:01:37.076563+01:00 void worker[22430]: [info] Registered and connected via websockets with openQA host http://void.qam.suse.cz and worker ID 2
2022-03-01T11:01:38.675876+01:00 void worker[22430]: [info] Registering with openQA http://quasar.suse.cz
2022-03-01T11:01:38.720954+01:00 void worker[22430]: [info] Establishing ws connection via ws://quasar.suse.cz/api/v1/ws/40
2022-03-01T11:01:38.725879+01:00 void worker[22430]: [info] Registered and connected via websockets with openQA host http://quasar.suse.cz and worker ID 40
2022-03-01T11:01:43.835993+01:00 void worker[22464]: [info] Registering with openQA http://void.qam.suse.cz
2022-03-01T11:01:43.939580+01:00 void worker[22464]: [info] Establishing ws connection via ws://void.qam.suse.cz/api/v1/ws/6
2022-03-01T11:01:43.943940+01:00 void worker[22464]: [info] Registered and connected via websockets with openQA host http://void.qam.suse.cz and worker ID 6
2022-03-01T11:02:05.663842+01:00 void openqa-websockets-daemon[4935]: [info] Worker 14 websocket connection closed - 1006
2022-03-01T11:02:05.664624+01:00 void worker[30543]: [warn] Websocket connection to http://void.qam.suse.cz/api/v1/ws/14 finished by remote side with code 1006, no reason - trying again in 10 seconds
2022-03-01T11:02:15.664907+01:00 void worker[30543]: [info] Registering with openQA http://void.qam.suse.cz
2022-03-01T11:02:15.693299+01:00 void worker[30543]: [info] Establishing ws connection via ws://void.qam.suse.cz/api/v1/ws/14
2022-03-01T11:02:15.697032+01:00 void worker[30543]: [info] Registered and connected via websockets with openQA host http://void.qam.suse.cz and worker ID 14
2022-03-01T11:02:29.965741+01:00 void openqa-websockets-daemon[4935]: [info] Worker 16 websocket connection closed - 1006
2022-03-01T11:02:29.966036+01:00 void worker[30530]: [warn] Websocket connection to http://void.qam.suse.cz/api/v1/ws/16 finished by remote side with code 1006, no reason - trying again in 10 seconds
2022-03-01T11:02:32.609299+01:00 void worker[30530]: [info] Isotovideo exit status: 0
2022-03-01T11:02:32.695720+01:00 void worker[30530]: [info] +++ worker notes +++
2022-03-01T11:02:32.695818+01:00 void worker[30530]: [info] End time: 2022-03-01 10:02:32
2022-03-01T11:02:32.695860+01:00 void worker[30530]: [info] Result: done
2022-03-01T11:02:32.715730+01:00 void worker[15116]: [info] Uploading vars.json
2022-03-01T11:02:32.717124+01:00 void worker[15116]: [error] REST-API error (POST http://void.qam.suse.cz/api/v1/jobs/37703/status): Connection error: Premature connection close (remaining tries: 59)
2022-03-01T11:02:32.730554+01:00 void worker[15116]: [info] Uploading autoinst-log.txt
2022-03-01T11:02:32.752175+01:00 void worker[15116]: [info] Uploading worker-log.txt
2022-03-01T11:02:32.763516+01:00 void worker[15116]: [info] Uploading serial0.txt
2022-03-01T11:02:32.776117+01:00 void worker[15116]: [info] Uploading video_time.vtt
2022-03-01T11:02:32.789181+01:00 void worker[15116]: [info] Uploading serial_terminal.txt
2022-03-01T11:02:33.344372+01:00 void worker[30530]: [info] Accepting job 37695 from queue
2022-03-01T11:02:33.344467+01:00 void worker[30530]: [error] Unable to accept job 37695 because the websocket connection to http://void.qam.suse.cz has been lost.
2022-03-01T11:02:33.344538+01:00 void worker[30530]: [info] Skipping job 37696 from queue (parent failed with result api-failure)
2022-03-01T11:02:33.393739+01:00 void worker[30530]: [info] Skipping job 37697 from queue (parent failed with result skipped)
2022-03-01T11:02:33.443233+01:00 void worker[30530]: [info] Skipping job 37698 from queue (parent failed with result skipped)
2022-03-01T11:02:33.485169+01:00 void worker[30530]: [info] Skipping job 37699 from queue (parent failed with result skipped)
2022-03-01T11:02:37.145947+01:00 void openqa-websockets-daemon[4935]: [info] Worker 2 websocket connection closed - 1006
2022-03-01T11:02:37.146319+01:00 void worker[22430]: [warn] Websocket connection to http://void.qam.suse.cz/api/v1/ws/2 finished by remote side with code 1006, no reason - trying again in 10 seconds
2022-03-01T11:02:38.776193+01:00 void worker[22430]: [warn] Websocket connection to http://quasar.suse.cz/api/v1/ws/40 finished by remote side with code 1006, no reason - trying again in 10 seconds
2022-03-01T11:02:39.967168+01:00 void worker[30530]: [info] Registering with openQA http://void.qam.suse.cz
2022-03-01T11:02:40.078797+01:00 void worker[30530]: [info] Establishing ws connection via ws://void.qam.suse.cz/api/v1/ws/16
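To check how often this happens, the failure sequence in the log above (websocket closed with code 1006, a job not accepted, the following chained jobs skipped) can be pulled out of the journal with a small script. This is only a rough sketch for triage: the function name `summarize` and the regular expressions are my own assumptions, derived from the message wording quoted above, and are not part of openQA.

```python
import re

# Patterns derived from the worker/websocket messages quoted in the log above.
# They are assumptions based on that wording, not an openQA API.
SKIP_RE = re.compile(
    r"Skipping job (\d+) from queue \(parent failed with result ([\w-]+)\)"
)
REJECT_RE = re.compile(
    r"Unable to accept job (\d+) because the websocket connection"
)
WS_CLOSED_RE = re.compile(r"Worker (\d+) websocket connection closed - (\d+)")


def summarize(lines):
    """Collect rejected jobs, skipped jobs and websocket closures from log lines."""
    summary = {"rejected": [], "skipped": [], "ws_closed": []}
    for line in lines:
        if m := REJECT_RE.search(line):
            # Job ID the worker could not accept.
            summary["rejected"].append(int(m.group(1)))
        elif m := SKIP_RE.search(line):
            # (job ID, result reported for the parent)
            summary["skipped"].append((int(m.group(1)), m.group(2)))
        elif m := WS_CLOSED_RE.search(line):
            # (worker ID, websocket close code)
            summary["ws_closed"].append((int(m.group(1)), int(m.group(2))))
    return summary
```

Passing an open journal file (or any iterable of log lines) to `summarize` returns the job IDs that were rejected or skipped together with the worker IDs whose websocket connection was closed, which makes it easier to correlate the skipped jobs with the preceding 1006 closures.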