action #3136

support remote workers

Added by k0da about 8 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Target version:
Start date: 2014-02-25
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Difficulty:

Description

In order to have a pool of workers (like we have in OBS), openQA needs the ability to talk to workers via the network.

As a bonus, a single WebUI could combine results for different architectures such as aarch64, power and s390.


Subtasks

action #1719: os-autoinst: get rid of isotovideo script (Resolved, coolo)

action #4680: livestream support for remote workers (Resolved, oholecek)

action #4682: workers capabilities advertisement (Resolved, mlin7442)

action #5296: store worker properties in openQA DB (Resolved, coolo)

action #4860: upload_logs needs to map openqa.suse.de only on local hosts (Resolved, coolo)

History

#1 Updated by lnussel about 8 years ago

  • Project changed from openQA improvement to openQA Project
  • Category deleted (QA)
  • Assignee deleted (_miska_)
  • Estimated time set to 40.00 h

#2 Updated by coolo almost 8 years ago

  • Category set to 132

#3 Updated by coolo almost 8 years ago

  • Assignee set to bmwiedemann

I assign this to Bernhard, because I don't know if the ondrej account is really the ondrej from the slepos team

#4 Updated by oholecek almost 8 years ago

coolo wrote:

I assign this to Bernhard, because I don't know if the ondrej account is really the ondrej from the slepos team

Yes, it's me.

My first ideas are:
1) Split the openQA package into openQA, openQA-worker and openQA-(common|base|etc..). os-autoinst stays a separate package, now required only by openQA-worker.
2) Unify paths so that it is straightforward which data are meant to be shared (isos, tests, testresults for screenshots, ...) and which are not (vm pools).
3) Don't try to solve storage, only document what needs to be shared. Leave the rest to the admin (e.g. use NFS or some distributed storage to ease potential I/O load).

In my brief tests, the worker already communicated successfully with the web UI using only the openQA API, so I don't expect many code changes.

#5 Updated by coolo almost 8 years ago

ondrej != oholecek though ;)

#6 Updated by coolo almost 8 years ago

  • Assignee changed from bmwiedemann to oholecek
  • Target version set to Sprint 12

#7 Updated by coolo almost 8 years ago

Sounds good, we have to keep an eye on the performance of NFS though - it might be worth it to explicitly cache things on the workers.

#8 Updated by oholecek almost 8 years ago

  • Status changed from New to In Progress

I have split the openQA packages into three in my home:oholecek:openQA project. I'll do some tests today, then take a look at those paths and hopefully be done with this part by tomorrow.

coolo wrote:

Sounds good, we have to keep an eye on the performance of NFS though - it might be worth it to explicitly cache things on the workers.

We can still document using -o fsc with cachefilesd to enable client-side caching when NFS is used. I personally would avoid writing a custom caching option, because 1) it is usually not that easy and 2) it is redundant when e.g. drbd or ceph is used as the storage backend.

Beyond that, there is a next part not yet mentioned: the need for some capability advertising mechanism. Since workers will run on different hardware, we need to inform the scheduler about each worker's capabilities, so that tests requiring e.g. ppc will not run on x86 machines and vice versa.
This will require code changes and should probably be synced with whatever the scheduler guys come up with (so that it plugs seamlessly into test dependencies).

For now I imagine a simple key-value publication of a set of capabilities (CPU_ARCH, MEM_MAX (which may be useful once we start limiting worker memory via cgroups, for example), etc.) during worker registration. Changes in capabilities would be announced during job grab.
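
The idea above can be sketched roughly as follows. This is only a toy in-memory illustration in Python, not openQA's actual scheduler or API: the `Scheduler` and `Worker` classes, the `register` and `grab_job` methods, and the per-job `MEM` key are all hypothetical names made up for the example. Only CPU_ARCH and MEM_MAX come from the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """A registered worker and the capabilities it advertised."""
    name: str
    capabilities: dict = field(default_factory=dict)


class Scheduler:
    """Toy in-memory scheduler (hypothetical, not openQA code)."""

    def __init__(self):
        self.workers = {}

    def register(self, name, capabilities):
        # The worker publishes its key-value capabilities at registration.
        self.workers[name] = Worker(name, dict(capabilities))

    def grab_job(self, name, jobs, capabilities=None):
        # Changed capabilities are announced together with the job grab.
        worker = self.workers[name]
        if capabilities is not None:
            worker.capabilities.update(capabilities)
        for job in jobs:
            if self._matches(worker, job):
                jobs.remove(job)  # safe: we return right after removing
                return job
        return None

    @staticmethod
    def _matches(worker, job):
        caps = worker.capabilities
        required_arch = job.get("CPU_ARCH")
        if required_arch is not None and required_arch != caps.get("CPU_ARCH"):
            return False
        # "MEM" (assumed to be in MB) is a hypothetical per-job requirement.
        return job.get("MEM", 0) <= caps.get("MEM_MAX", 0)
```

With this sketch, a ppc job is never handed to an x86 worker, and a worker that announces a smaller MEM_MAX on job grab stops receiving jobs that no longer fit.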

#9 Updated by coolo almost 8 years ago

please create subtasks for things that take considerable amount of time

#10 Updated by oholecek almost 8 years ago

https://github.com/os-autoinst/openQA/pull/70
and
https://build.opensuse.org/request/show/261450

Maybe I should have sent them in a different order. One depends on the other.

#11 Updated by coolo almost 8 years ago

  • Due date set to 2014-12-02

due to changes in a related task

#12 Updated by oholecek over 7 years ago

  • Status changed from In Progress to Resolved
