QA (public) &raquo; openQA Project (public) &raquo; openQA Infrastructure (public)

2020-04-01

2020-09-30

Related to openQA Project (public) - action #66376: MM tests fail in obscure way when tap device is not present

Resolved

2020-05-04

Has duplicate openQA Tests (public) - action #66907: Multimachine test fails in setup for ARM workers

Rejected

2020-05-15

Copied from openQA Tests (public) - action #63853: [tools] broken /etc/sysconfig/network/ifcfg-br1

Resolved

2020-02-26

Issue # Delay: days Cancel

History
Notes
Property changes

Actions

Updated by okurz about 5 years ago

Copied from action #63853: [tools] broken /etc/sysconfig/network/ifcfg-br1 added

Actions

Updated by pcervinka almost 5 years ago

Blocks action #66907: Multimachine test fails in setup for ARM workers added

Actions

Updated by okurz almost 5 years ago

Subject changed from ensure openqa worker instances are disabled and stopped when "numofworkers" is reduced in salt pillars to ensure openqa worker instances are disabled and stopped when "numofworkers" is reduced in salt pillars, e.g. causing non-obvious multi-machine failures

Actions

Updated by okurz almost 5 years ago

Blocks deleted (action #66907: Multimachine test fails in setup for ARM workers)

Actions

Updated by okurz almost 5 years ago

Has duplicate action #66907: Multimachine test fails in setup for ARM workers added

Actions

Updated by sebchlad almost 5 years ago

Just to make it clear I'm also adding the message as in poo#66907#note-10: 'And in the meantime I got access to OSD workers, so I will try to help by maintaining ARM workers and when needed, I will mask unwanted workers which should not be there or restart the network interfaces etc.'

Actions

Updated by okurz almost 5 years ago

Target version set to Ready

Actions

Updated by okurz almost 5 years ago

Tags changed from caching, openQA, sporadic, arm, ipmi, worker to worker

Actions

Updated by okurz almost 5 years ago

Related to coordination #65118: [epic] multimachine test fails with symptoms "websocket refusing connection" and other unclear reasons added

Actions

#10

Updated by okurz almost 5 years ago

Related to action #66376: MM tests fail in obscure way when tap device is not present added

Actions

#11

Updated by okurz over 4 years ago

Target version changed from Ready to future

Actions

see https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/438#note_293207

#12

Updated by mkittler over 4 years ago

Actions

#13

Updated by mkittler about 4 years ago

I'm wondering why the existing code doesn't not already cover https://progress.opensuse.org/issues/63874. It looks like it should do exactly what the ticket asks for. The code has already been present for 2 years: https://gitlab.suse.de/openqa/salt-states-openqa/-/commit/e80327e29fce8f6f39051167d389c3cf44099a45

That's maybe because openqa-worker.target still gets started¹ and it simply pulls as many worker slots in as there are pool directories. So the mentioned salt code might work but the effort could be neglected again by starting openqa-worker.target. Note that the number of worker slots for openqa-worker.target to pull in is determined by running a systemd generator which checks for the pool directories present under /var/lib/openqa/pool.

¹ It shouldn't be started anymore as it is disabled and no dependencies seem to pull it in. It nevertheless gets started and I still have to find out why.

Actions

#14

Updated by mkittler about 4 years ago

Assignee set to mkittler

After removing the worker target this might even work: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/454

I can try to activate an additional worker slot somewhere and check whether it'll be stopped and disabled on the next salt run.

Enabled/started openqa-worker-auto-restart@42 on openqaworker-arm-1. It should be disabled/stopped automatically on the next salt run.

Actions

#15

Updated by mkittler about 4 years ago

It didn't work. See https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/455 for details and a fix.

Actions