action #3136: support remote workers - openQA Project (public) - openSUSE Project Management Tool

action #3136

closed

support remote workers

Added by k0da over 10 years ago. Updated about 10 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

oholecek

Category:

Feature requests

Target version:

Sprint 12

Start date:

2014-02-25

Due date:

% Done:

100%

Estimated time:

(Total: 0.00 h)

Description

In order to have a pool of workers (like we have in OBS), openQA needs the ability to talk to workers via the network.

As a bonus, a single WebUI could combine results for different architectures such as aarch64, POWER and s390.

Subtasks 5 (0 open — 5 closed)

Updated by lnussel over 10 years ago

Project changed from openQA improvement to openQA Project (public)
Category deleted (QA)
Assignee deleted (~~_miska_~~)
Estimated time set to 40.00 h

Updated by coolo about 10 years ago

Category set to 132

Updated by coolo about 10 years ago

Assignee set to bmwiedemann

I assign this to Bernhard, because I don't know if the ondrej account is really the ondrej from the slepos team

Updated by oholecek about 10 years ago

coolo wrote:

I assign this to Bernhard, because I don't know if the ondrej account is really the ondrej from the slepos team

Yes, it's me.

My first ideas are:
1) Split the openQA package into openQA, openQA-worker and openQA-(common|base|etc.), still keeping os-autoinst as a separate package, now required only by openQA-worker.
2) Unify paths so that it is straightforward which data are meant to be shared (isos, tests, testresults for screenshots, ...) and which are not (vm pools).
3) Don't try to solve storage, only document what needs to be shared. Leave it to the admin (e.g. use NFS or some distributed storage to ease potential I/O load).

From my brief tests, the worker already communicates successfully with the web UI using only the openQA API, so I don't expect many code changes.

Updated by coolo about 10 years ago

ondrej != oholecek though ;)

Updated by coolo about 10 years ago

Assignee changed from bmwiedemann to oholecek
Target version set to Sprint 12

Updated by coolo about 10 years ago

Sounds good, we have to keep an eye on the performance of NFS though - it might be worth it to explicitly cache things on the workers.

Updated by oholecek about 10 years ago

Status changed from New to In Progress

I have split the openQA package into three in my home:oholecek:openQA project. I'll do some tests today, then take a look at those paths, and hopefully be done with this part by tomorrow.

coolo wrote:

Sounds good, we have to keep an eye on the performance of NFS though - it might be worth it to explicitly cache things on the workers.

We can still document using -o fsc with cachefilesd to enable client-side caching on NFS. Personally I would avoid writing a custom caching option because 1) it is usually not that easy and 2) it is redundant when e.g. DRBD or Ceph is used as the storage backend.
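
As a rough illustration of what such documentation could show, here is an /etc/fstab entry on a worker that mounts the shared data with the fsc option. The server name and both paths are made up for the example and are not taken from the openQA packages; the cachefilesd daemon also has to be running on the worker for the local cache to be used.

```
# /etc/fstab on a worker (hypothetical server name and paths)
openqa.example.com:/srv/openqa/share  /srv/openqa/share  nfs  fsc  0 0
```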

After all that comes the next part, not yet mentioned: the need for some capability advertising mechanism. Since workers will run on different hardware, we need to inform the scheduler about each worker's capabilities, so that tests requiring e.g. ppc will not run on x86 machines and vice versa.
This will require code changes and should probably be synced with what the scheduler guys come up with (so that it plugs seamlessly into test dependencies).

For now I imagine a simple key-value publication of a set of capabilities (CPU_ARCH, MEM_MAX (which may be useful once we start limiting worker memory, e.g. via cgroups), etc.) during worker registration. Changes in capabilities would be announced during job grab.
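
To make the idea concrete, here is a minimal sketch in Python of how key-value capabilities could be published at registration, updated at job grab, and matched against job requirements. It is not openQA code: the `Scheduler` class, its methods, the `requires` field on jobs, and the rule that integer values are treated as minimums are all assumptions made for illustration. Only the key names CPU_ARCH and MEM_MAX come from the proposal above.

```python
class Scheduler:
    """Toy in-memory scheduler; every name here is hypothetical."""

    def __init__(self):
        # host name -> capabilities dict advertised by that worker
        self.workers = {}

    def register(self, host, capabilities):
        # The worker publishes its key-value capabilities at registration.
        self.workers[host] = dict(capabilities)

    def grab_job(self, host, jobs, capabilities=None):
        # Changed capabilities are announced together with the job grab.
        if capabilities is not None:
            self.workers[host] = dict(capabilities)
        caps = self.workers[host]
        for job in jobs:
            if self._matches(caps, job.get("requires", {})):
                jobs.remove(job)  # safe: we return right after removing
                return job
        return None

    @staticmethod
    def _matches(caps, requires):
        for key, needed in requires.items():
            have = caps.get(key)
            if isinstance(needed, int):
                # Assumption: numeric requirements are minimums,
                # e.g. a job needing 2048 fits a worker with MEM_MAX 4096.
                if not isinstance(have, int) or have < needed:
                    return False
            elif have != needed:
                # Everything else (e.g. CPU_ARCH) must match exactly.
                return False
        return True
```

With this sketch, a worker registered as `{"CPU_ARCH": "x86_64", "MEM_MAX": 4096}` would skip a job requiring `{"CPU_ARCH": "ppc64"}` and take the next job in the list whose requirements it satisfies.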
