action #18684

closed

Jobs with worker class qemu_x86_64 are taken by machines without this class, causing incomplete jobs

Added by SLindoMansilla almost 7 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Target version: -
Start date: 2017-04-20
Due date:
% Done: 0%
Estimated time:

Description

observation

Jobs with worker class qemu_x86_64 (as set in settings and vars.json) are taken by machines without this class, causing incomplete jobs.

For the jobs taken by overdrive2, the worker class in settings and in vars.json is different (qemu_x86_64 in settings and qemu_aarch64_maintenance in vars.json).

problem

H1 When a large number of jobs (e.g. 100) is created by cloning a job, 5% of these jobs are taken by a worker without a matching worker class.
H2 workers.ini is not configured properly. REJECTED BY E2-1 and E2-2
H3 The different worker classes in settings and vars.json for the jobs taken by overdrive2 occur because the workers support multiple web UIs.

E1-1 Run the workers on overdrive2 and QA-Power8-5-kvm in verbose mode and clone a job 100 times.
R1-1 Not done yet

E2-1 Check that the worker classes are properly configured in workers.ini on overdrive2.
R2-1 overdrive2 uses: qemu_aarch64_maintenance

E2-2 Check that the worker classes are properly configured in workers.ini on QA-Power8-5-kvm.
R2-2 QA-Power8-5-kvm uses: qemu_ppc64le,qemu_ppc64le_no_tmpfs

E3-1 Check that the workers.ini configuration for overdrive2 is set for multiple web UIs.
R3-1 Two web UIs configured: http://openqa.suse.de http://lord.arch.suse.de

E3-2 Check that the workers.ini configuration for QA-Power8-5-kvm is set for one web UI.
R3-2 One web UI configured: http://openqa.suse.de
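
The checks in E2-1 and E3-1 can be sketched in a few lines of Python. This is only an illustration, not the openQA scheduler logic: the `[global]` section and the `HOST` / `WORKER_CLASS` key names are assumptions about the workers.ini layout, and `load_worker_config` and `worker_matches` are hypothetical helpers. The values are the ones reported in R2-1 and R3-1.

```python
import configparser

# Assumed workers.ini content for overdrive2, built from R2-1 and R3-1.
# The section name and key names are assumptions, not taken from the ticket.
OVERDRIVE2_INI = """
[global]
HOST = http://openqa.suse.de http://lord.arch.suse.de
WORKER_CLASS = qemu_aarch64_maintenance
"""


def load_worker_config(ini_text):
    """Return (web UI hosts, worker classes) parsed from workers.ini text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["global"]
    hosts = section.get("HOST", "").split()
    raw_classes = section.get("WORKER_CLASS", "")
    classes = [c.strip() for c in raw_classes.split(",") if c.strip()]
    return hosts, classes


def worker_matches(job_worker_class, worker_classes):
    """A worker should only take a job whose class it advertises."""
    return job_worker_class in worker_classes


hosts, classes = load_worker_config(OVERDRIVE2_INI)
print(len(hosts), classes, worker_matches("qemu_x86_64", classes))
```

With this configuration, overdrive2 does not advertise qemu_x86_64, which is consistent with H2 being rejected: the mismatch does not come from the configured classes.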


Related issues 3 (0 open, 3 closed)

Related to openQA Tests - action #18634: [sles][functional]textmode install_and_reboot fails to stop the reboot countdown (Resolved, SLindoMansilla, 2017-04-19)

Related to openQA Project - action #20002: [tools] openqa sometimes doesn't update job_dependencies table (Resolved, 2017-06-22)

Has duplicate openQA Tests - action #19376: [tools][sle][functional] test fails to load. ppc64le worker tries to load x86_64 image (Resolved, 2017-05-25)
Actions #1

Updated by coolo almost 7 years ago

What exactly did you do? WORKER_CLASS is different in vars.json than in settings: https://openqa.suse.de/tests/889178/file/vars.json
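
A comparison like the one described here could be scripted roughly as follows. This is a minimal sketch: `worker_class_mismatch` is a hypothetical helper, and the two dictionaries stand in for the job settings and the downloaded vars.json, using the values reported in the description.

```python
import json


def worker_class_mismatch(job_settings, vars_json_text):
    """Return (settings value, vars.json value) if they differ, else None."""
    vars_data = json.loads(vars_json_text)
    expected = job_settings.get("WORKER_CLASS")
    actual = vars_data.get("WORKER_CLASS")
    return None if expected == actual else (expected, actual)


# Stand-in data using the worker classes quoted in the description.
settings = {"WORKER_CLASS": "qemu_x86_64"}
vars_text = json.dumps({"WORKER_CLASS": "qemu_aarch64_maintenance"})
print(worker_class_mismatch(settings, vars_text))
```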

Actions #2

Updated by coolo almost 7 years ago

Somehow multiple webuis are related:

Apr 20 14:14:31 overdrive2 worker[19403]: [ERROR] 502 response: Proxy Error (remaining tries: 2)
Apr 20 14:32:03 overdrive2 worker[19403]: [INFO] registering worker with openQA http://lord.arch.suse.de...
Apr 20 14:32:03 overdrive2 worker[19403]: [INFO] got job 889178: 00889178-sle-12-SP3-Server-DVD-x86_64-Build0340-textmode_statistics_poo_18634@64bit
Apr 20 14:32:03 overdrive2 worker[19403]: [INFO] 2015: WORKING 889178
Apr 20 14:32:11 overdrive2 worker[19403]: child 2015 died with exit status 256
Actions #3

Updated by SLindoMansilla almost 7 years ago

  • Description updated (diff)
Actions #4

Updated by SLindoMansilla almost 7 years ago

  • Description updated (diff)
  • Status changed from New to In Progress
Actions #5

Updated by SLindoMansilla almost 7 years ago

  • Related to action #18634: [sles][functional]textmode install_and_reboot fails to stop the reboot countdown added
Actions #6

Updated by SLindoMansilla almost 7 years ago

  • Assignee deleted (SLindoMansilla)

Unassigned to keep only 2 assigned tickets

Actions #7

Updated by okurz almost 7 years ago

  • Has duplicate action #19376: [tools][sle][functional] test fails to load. ppc64le worker tries to load x86_64 image added
Actions #8

Updated by szarate almost 7 years ago

  • Related to action #20002: [tools] openqa sometimes doesn't update job_dependencies table added
Actions #9

Updated by szarate almost 7 years ago

  • Assignee set to szarate

This PR should take care of this. The problem is related to how the jobs are being added to the database; moving everything into a transaction, as coolo suggested, solves the problem (apparently).

Time to hunt for poo#20002, as I believe that the condition there is a bit different.
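
The idea of adding a job inside a single transaction can be illustrated with a small sketch. It uses Python's sqlite3 module and a made-up two-table schema (`jobs`, `job_settings`), not the openQA schema or the code from the PR. The point is that the job row and its settings, including WORKER_CLASS, become visible together or not at all, so nothing can observe a half-created job.

```python
import sqlite3


def create_job(conn, name, settings):
    """Insert a job and all of its settings in one transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        cur = conn.execute("INSERT INTO jobs (name) VALUES (?)", (name,))
        job_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO job_settings (job_id, key, value) VALUES (?, ?, ?)",
            [(job_id, key, value) for key, value in settings.items()],
        )
    return job_id


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE job_settings (
    job_id INTEGER NOT NULL REFERENCES jobs(id),
    key TEXT NOT NULL,
    value TEXT NOT NULL
);
""")
job_id = create_job(conn, "textmode", {"WORKER_CLASS": "qemu_x86_64"})
```

If inserting a setting fails, the job row is rolled back as well, so no job is left behind without its worker class.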

Actions #10

Updated by szarate almost 7 years ago

  • Status changed from In Progress to Resolved

I believe this is solved now.
