action #32887

Worker version is reset -> no jobs scheduled

Added by coolo almost 2 years ago. Updated almost 2 years ago.

Status: Resolved
Start date: 08/03/2018
Priority: Immediate
Due date:
Assignee: -
% Done: 0%
Category: Concrete Bugs
Target version: Done
Difficulty:
Duration:

Description

We have 4 working workers and 248 idle, so I set the priority to Immediate.

Worker2:6

Mar 07 10:14:37 openqaworker2 worker[27246]: [info] registering worker openqaworker2 version 7 with openQA openqa.suse.de using protocol version [1]
Mar 07 10:14:43 openqaworker2 worker[27246]: [error] Unable to upgrade connection for host "openqa.suse.de" to WebSocket: 503. proxy_wstunnel enabled?

-> no jobs for 20 hours, but:

Alive: yes
Websocket connection: Active
Seen: about a minute ago
Status: Online

Version: unknown

I assume the problem is the unknown version, as the scheduler ignores all workers whose version is unknown.

Instance 21 on the same host worked for a while and then stopped:

Mar 07 20:54:51 openqaworker2 worker[30560]: [info] uploading autoinst-log.txt
Mar 07 20:54:51 openqaworker2 worker[30560]: [info] uploading worker-log.txt
Mar 07 20:54:53 openqaworker2 worker[30560]: [info] cleaning up 01525134-caasp-3.0-MS-HyperV-x86_64-Build11.33-MicroOS-vmx_hyperv@svirt-hyperv

Alive: yes
Websocket connection: Active
Seen: less than a minute ago
Status: Online

Version: unknown

So somehow the worker version is forgotten.
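
To illustrate the suspected effect, here is a minimal sketch of a scheduler-style filter that skips workers without a known version. It is not openQA's actual code: the `workers` data, the `schedulable` helper and the expected value `"1"` (taken from "protocol version [1]" in the log above) are assumptions made for illustration only.

```python
# Hypothetical sketch, not openQA's real scheduler logic.
# Assumed expected version, based on "protocol version [1]" in the worker log.
EXPECTED_API_VERSION = "1"

# Illustrative worker records: both look healthy, but one has lost its version.
workers = [
    {"id": 41, "alive": True, "websocket": "Active", "version": ""},
    {"id": 42, "alive": True, "websocket": "Active", "version": "1"},
]


def schedulable(worker, expected=EXPECTED_API_VERSION):
    """Return True only if the worker is alive and reports the expected version."""
    return worker["alive"] and worker["version"] == expected


# Only workers passing the version check would be offered jobs.
eligible = [w["id"] for w in workers if schedulable(w)]
print(eligible)
```

Under these assumptions a worker can be alive with an active websocket connection and still never receive a job, which matches the status output above.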

History

#1 Updated by coolo almost 2 years ago

  • Subject changed from Workers don't register and don't care to Worker version is reset -> no jobs scheduled

So https://openqa.opensuse.org/tests/628708 finished 6 hours ago, and at that time WEBSOCKET_API_VERSION was set to '':

openqa=> select * from worker_properties where worker_id=41;
  id  |             key              |              value              | worker_id |      t_created      |      t_updated      
------+------------------------------+---------------------------------+-----------+---------------------+---------------------
 9796 | WEBSOCKET_API_VERSION        |                                 |        41 | 2018-03-07 09:47:14 | 2018-03-08 00:21:53

#2 Updated by coolo almost 2 years ago

Yup, _finish in the WS server sets the version to '' and later the WS server says:

Received a message from an incompatible worker 41

as '' is not the right API.
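
The sequence can be sketched roughly as follows. This is a simplified model, not the real WS server code: `WorkerRecord`, `register`, `finish` and `handle_message` are hypothetical names, and the expected version `"1"` is an assumption based on the worker log in the description.

```python
# Hypothetical model of the failure sequence; names do not come from openQA.
EXPECTED_API_VERSION = "1"  # assumed, see "protocol version [1]" in the log


class WorkerRecord:
    """Stand-in for a worker row plus its worker_properties entries."""

    def __init__(self, worker_id):
        self.worker_id = worker_id
        self.properties = {}


def register(record, api_version):
    # Registration stores the version the worker announced.
    record.properties["WEBSOCKET_API_VERSION"] = str(api_version)


def finish(record):
    # Models the reported behaviour: the finish handler blanks the version.
    record.properties["WEBSOCKET_API_VERSION"] = ""


def handle_message(record, expected=EXPECTED_API_VERSION):
    # An empty string never equals the expected version, so the worker is rejected.
    if record.properties.get("WEBSOCKET_API_VERSION") != expected:
        return f"Received a message from an incompatible worker {record.worker_id}"
    return "ok"


worker = WorkerRecord(41)
register(worker, 1)
before = handle_message(worker)  # accepted while the version is still set
finish(worker)                   # version is reset to ''
after = handle_message(worker)   # rejected from now on
print(before)
print(after)
```

In this model the worker stays rejected until it registers again, since nothing else restores the version.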

#3 Updated by szarate almost 2 years ago

I guess this is already solved?

#4 Updated by coolo almost 2 years ago

  • Status changed from New to Resolved
  • Target version changed from Current Sprint to Done
