action #3136
closed
support remote workers
Description
In order to have a pool of workers (like we have in OBS), openQA needs the ability to talk to workers over the network.
As a bonus, a single WebUI could combine results for different architectures like aarch64, power and s390.
Updated by lnussel over 10 years ago
- Project changed from openQA improvement to openQA Project (public)
- Category deleted (QA)
- Assignee deleted (_miska_)
- Estimated time set to 40.00 h
Updated by coolo about 10 years ago
- Assignee set to bmwiedemann
I assign this to Bernhard, because I don't know if the ondrej account is really the ondrej from the slepos team
Updated by oholecek about 10 years ago
coolo wrote:
I assign this to Bernhard, because I don't know if the ondrej account is really the ondrej from the slepos team
Yes, it's me.
My first ideas are:
1) Split the openQA package into openQA, openQA-worker and openQA-(common|base|etc.). os-autoinst stays a separate package, now required only by openQA-worker.
2) Unify paths so that it is straightforward which data are meant to be shared (isos, tests, testresults for screenshots, ...) and which are not (vm pools).
3) Don't try to solve storage, only document what needs to be shared. Leave the rest to the admin (e.g. use NFS or some distributed storage to ease potential I/O load).
From my brief tests, the worker already communicates successfully with the web UI using only the openQA API, so I don't expect many code changes.
Updated by coolo about 10 years ago
- Assignee changed from bmwiedemann to oholecek
- Target version set to Sprint 12
Updated by coolo about 10 years ago
Sounds good, we have to keep an eye on the performance of NFS though - it might be worth it to explicitly cache things on the workers.
Updated by oholecek about 10 years ago
- Status changed from New to In Progress
I have split the openQA package into three in my home:oholecek:openQA project. I'll do some tests today, then take a look at those paths and hopefully be done with this part by tomorrow.
coolo wrote:
Sounds good, we have to keep an eye on the performance of NFS though - it might be worth it to explicitly cache things on the workers.
We can still document using -o fsc with cachefilesd to enable client-side caching when NFS is used. I personally would avoid writing a custom caching option because 1) it is usually not that easy to get right and 2) it is redundant when e.g. drbd or ceph is used as the storage backend.
Beyond that, there is a next part not yet mentioned: the need for some capability advertising mechanism. Since workers will run on different hardware, we need to inform the scheduler about each worker's capabilities, so that tests requiring e.g. ppc will not run on x86 machines and vice versa.
This will require code changes and should probably be synced with whatever the scheduler guys come up with (so that it plugs seamlessly into test dependencies).
For now I imagine a simple key-value publication of a set of capabilities (CPU_ARCH, MEM_MAX (which may be useful once we start limiting worker memory, e.g. via cgroups), etc.) during worker registration. Changes in capabilities would be announced during job grab.
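To make the idea concrete, here is a minimal sketch of key-value capability publication and matching. It is not openQA code: the Scheduler class, the register and grab_job methods, the "requires" field on jobs and the rule that keys ending in _MAX are compared numerically are all assumptions made for illustration. Only the CPU_ARCH and MEM_MAX keys come from the comment above.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    host: str
    # Free-form key-value capabilities, e.g. {"CPU_ARCH": "x86_64"}
    capabilities: dict = field(default_factory=dict)


def can_run(capabilities, requires):
    """Return True if a worker's capabilities satisfy a job's requirements."""
    for key, wanted in requires.items():
        have = capabilities.get(key)
        if have is None:
            return False
        if key.endswith("_MAX"):
            # Assumed convention: *_MAX keys are numeric upper limits.
            if int(have) < int(wanted):
                return False
        elif have != wanted:
            return False
    return True


class Scheduler:
    """Hypothetical scheduler side of the exchange."""

    def __init__(self):
        self.workers = {}

    def register(self, host, capabilities):
        # Capabilities are published once, during worker registration.
        self.workers[host] = Worker(host, dict(capabilities))

    def grab_job(self, host, jobs, capabilities=None):
        worker = self.workers[host]
        # Changed capabilities are announced together with the job grab.
        if capabilities is not None:
            worker.capabilities = dict(capabilities)
        for job in jobs:
            if can_run(worker.capabilities, job["requires"]):
                jobs.remove(job)
                return job
        return None


if __name__ == "__main__":
    sched = Scheduler()
    sched.register("worker1", {"CPU_ARCH": "x86_64", "MEM_MAX": "4096"})
    queue = [
        {"name": "ppc-test", "requires": {"CPU_ARCH": "ppc64"}},
        {"name": "x86-test", "requires": {"CPU_ARCH": "x86_64", "MEM_MAX": "2048"}},
    ]
    print(sched.grab_job("worker1", queue))
```

With this shape the scheduler never needs to know what a key means beyond the comparison rule, which keeps the door open for syncing with whatever the test dependency work ends up requiring.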
Updated by coolo about 10 years ago
Please create subtasks for things that take a considerable amount of time.
Updated by oholecek about 10 years ago
https://github.com/os-autoinst/openQA/pull/70
and
https://build.opensuse.org/request/show/261450
Maybe I should have sent them in a different order. One depends on the other.
Updated by oholecek about 10 years ago
- Status changed from In Progress to Resolved