action #63373

closed

[o3][kernel][scheduler][x86_64] Dependent (child) jobs should start after uploading all of parent assets

Added by pvorel about 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: High
Assignee:
Category: Bugs in existing tests
Target version: QE Kernel - QE Kernel Done
Start date: 2020-02-11
Due date:
% Done: 0%
Estimated time:
Difficulty: medium

Description

LTP tests depend on install_ltp. On o3, child jobs start once the parent's tests have finished, but that is before the parent has uploaded the needed dependencies. The child job just does not wait until ltp-cpuhotplug-opensuse-Tumbleweed-x86_64-20200209-DVD@64bit-with-ltp-qcow2.txt has been uploaded. This file cannot be expressed as a dependency in vars.json. But opensuse-Tumbleweed-x86_64-20200209-DVD@64bit-with-ltp.qcow2, which is HDD_1 for the child test (PUBLISH_HDD_1 for the parent), has not been uploaded yet either. The parent uploaded the needed asset at 22:45:53 (stamped CET in its log), but the child test started at 22:10:24 (stamped UTC in its log). Is it a setup problem or a bug in the scheduler? A similar setup exists on osd, where it looks OK.

install_ltp.1169614.autoinst-log.txt (https://openqa.opensuse.org/tests/1169614/file/autoinst-log.txt)

[2020-02-10T22:40:43.0499 CET] [info] +++ setup notes +++
[2020-02-10T22:40:43.0499 CET] [info] Start time: 2020-02-10 21:40:43
...
[2020-02-10T22:45:52.0451 CET] [info] Isotovideo exit status: 0
[2020-02-10T22:45:52.0478 CET] [info] +++ worker notes +++
[2020-02-10T22:45:52.0478 CET] [info] End time: 2020-02-10 21:45:52
...
[2020-02-10T22:45:53.0303 CET] [info] Uploading ltp-cpuhotplug-opensuse-Tumbleweed-x86_64-20200209-DVD@64bit-with-ltp-qcow2.txt
...
[2020-02-10T22:46:01.0131 CET] [info] Uploading opensuse-Tumbleweed-x86_64-20200209-DVD@64bit-with-ltp.qcow2

install_ltp.1169614.worker-log.txt (https://openqa.opensuse.org/tests/1169614/file/worker-log.txt)

[2020-02-10T22:45:53.0303 CET] [info] Uploading ltp-cpuhotplug-opensuse-Tumbleweed-x86_64-20200209-DVD@64bit-with-ltp-qcow2.txt
[2020-02-10T22:45:53.0303 CET] [info] Uploading ltp-cpuhotplug-opensuse-Tumbleweed-x86_64-20200209-DVD@64bit-with-ltp-qcow2.txt using multiple chunks
[2020-02-10T22:45:53.0304 CET] [info] ltp-cpuhotplug-opensuse-Tumbleweed-x86_64-20200209-DVD@64bit-with-ltp-qcow2.txt: 1 chunks
[2020-02-10T22:45:53.0304 CET] [info] ltp-cpuhotplug-opensuse-Tumbleweed-x86_64-20200209-DVD@64bit-with-ltp-qcow2.txt: chunks of 1000000 bytes each
[2020-02-10T22:45:53.0383 CET] [info] ltp-cpuhotplug-opensuse-Tumbleweed-x86_64-20200209-DVD@64bit-with-ltp-qcow2.txt: Processing chunk 1/1 avg speed ~0.336KB/s
...
[2020-02-10T22:48:23.0775 CET] [info] opensuse-Tumbleweed-x86_64-20200209-DVD@64bit-with-ltp.qcow2: Processing chunk 1128/1128 avg speed ~342.062KB/s

ltp_cpuhotplug.1169623.autoinst-log.txt (https://openqa.opensuse.org/tests/1169623/file/autoinst-log.txt)

[2020-02-10T22:10:24.0956 UTC] [info] Start time: 2020-02-10 22:10:24
...
[2020-02-10T22:10:35.0098 UTC] [debug] Found ISO, caching openSUSE-Tumbleweed-DVD-x86_64-Snapshot20200209-Media.iso
[2020-02-10T22:10:35.0102 UTC] [info] Downloading openSUSE-Tumbleweed-DVD-x86_64-Snapshot20200209-Media.iso, request #27 sent to Cache Service
[2020-02-10T22:10:45.0203 UTC] [info] Download of openSUSE-Tumbleweed-DVD-x86_64-Snapshot20200209-Media.iso processed:
[info] [#27] Cache size of "/var/lib/openqa/cache" is 12GiB, with limit 50GiB
[info] [#27] Downloading "openSUSE-Tumbleweed-DVD-x86_64-Snapshot20200209-Media.iso" from "http://openqa1-opensuse/tests/1169623/asset/iso/openSUSE-Tumbleweed-DVD-x86_64-Snapshot20200209-Media.iso"
...
[2020-02-10T22:10:50.896 UTC] [debug] scheduling boot_ltp tests/kernel/boot_ltp.pm
Can not open runtest asset /var/lib/openqa/share/factory/other/ltp-cpuhotplug-opensuse-Tumbleweed-x86_64-20200209-DVD@64bit-with-ltp-qcow2.txt: No such file or directory at /var/lib/openqa/cache/openqa1-opensuse/tests/opensuse/lib/main_ltp.pm line 64.
Compilation failed in require at /usr/bin/isotovideo line 288.
[2020-02-10T22:10:50.896 UTC] [debug] terminating command server 4156 because test execution ended through exception
[2020-02-10T22:10:51.897 UTC] [debug] done with command server
4153: EXIT 1

ltp_cpuhotplug.1169623.worker-log.txt (https://openqa.opensuse.org/tests/1169623/file/worker-log.txt)

[2020-02-10T22:10:50.0261 UTC] [info] Preparing cgroup to start isotovideo
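
Note that the two jobs log in different timezones: the parent's log lines are stamped CET, the child's UTC. Below is a minimal Python sketch for putting the parent's upload and the child's start on a common clock. It assumes CET is a fixed UTC+1 offset (which holds in February) and drops the fractional seconds. `parse_log_time` and `ZONES` are hypothetical names used for illustration only; they are not part of openQA.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets for the two zone labels seen in the logs above.
ZONES = {"CET": timezone(timedelta(hours=1)), "UTC": timezone.utc}


def parse_log_time(stamp: str) -> datetime:
    """Parse a stamp such as '2020-02-10T22:45:53.0303 CET' into an aware datetime."""
    clock, zone = stamp.split()
    clock = clock.split(".")[0]  # drop the fractional part
    return datetime.strptime(clock, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=ZONES[zone])


# Timestamps copied from the parent worker log and the child autoinst log.
parent_upload = parse_log_time("2020-02-10T22:45:53.0303 CET")
child_start = parse_log_time("2020-02-10T22:10:24.0956 UTC")

gap = child_start - parent_upload
print(gap)  # 0:24:31
```

Under these assumptions the gap is positive: on a UTC clock the child's start comes after the parent began uploading the runtest file, so the ordering of the two events alone may not explain the missing file.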

Files

install_ltp.1169614.autoinst-log.txt (417 KB): parent's autoinst-log.txt, pvorel, 2020-02-11 07:19
install_ltp.1169614.worker-log.txt (221 KB): parent's worker-log.txt, pvorel, 2020-02-11 07:19
ltp_cpuhotplug.1169623.autoinst-log.txt (5.35 KB): child's autoinst-log.txt, pvorel, 2020-02-11 07:20
ltp_cpuhotplug.1169623.worker-log.txt (606 Bytes): child's worker-log.txt, pvorel, 2020-02-11 07:20
install_ltp.1169614.vars.json (3.81 KB): parent's vars, pvorel, 2020-02-11 07:32
ltp_cpuhotplug.1169623.vars.json (3.03 KB): child's vars, pvorel, 2020-02-11 07:33

Related issues 1 (0 open, 1 closed)

Related to openQA Tests - action #51743: [openqa] All LTP tests are failing on boot_ltp for openSUSE (o3) on [x86_64] (Resolved, pvorel, 2019-05-21)

Actions #1

Updated by pvorel about 4 years ago

  • Related to action #51743: [openqa] All LTP tests are failing on boot_ltp for openSUSE (o3) on [x86_64] added
Actions #2

Updated by pvorel about 4 years ago

  • Description updated (diff)
Actions #3

Updated by pvorel about 4 years ago

  • Subject changed from [o3][scheduler] Dependent (child) jobs should start after uploading all of parent assets to [o3][scheduler][x86_64] Dependent (child) jobs should start after uploading all of parent assets
Actions #4

Updated by rpalethorpe about 4 years ago

IIRC the problem here is that os-autoinst generates the list of assets from the SUT (specifically from the LTP package/source). OpenQA therefore does not know which assets to expect ahead of time.

Probably the easiest solution is to create a runtest archive, then OpenQA only needs to expect one asset and can wait for it. (this is also easier for OpenQA's database).

Another problem might be that OpenQA only waits for VM image type assets. So some extra work may be required in the asset handling code.
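
The "runtest archive" idea above could look roughly like the following Python sketch: bundle all generated runtest files into one tarball so that only a single asset needs to be uploaded and waited for. This is an illustration under assumptions, not the code from any openQA or test-distribution change; `pack_runtest_files`, the directory layout, and the archive name are hypothetical.

```python
import tarfile
from pathlib import Path
from typing import List


def pack_runtest_files(runtest_dir: Path, archive: Path) -> List[str]:
    """Bundle every regular file in runtest_dir into one gzip-compressed tar archive.

    Returns the member names in the order they were added, so the caller can
    log which runtest files ended up in the single uploaded asset.
    """
    names = []
    with tarfile.open(archive, "w:gz") as tar:
        for path in sorted(runtest_dir.iterdir()):
            if path.is_file():
                # Store files flat, without the leading directory path.
                tar.add(path, arcname=path.name)
                names.append(path.name)
    return names


# Hypothetical usage on the parent side, before the asset upload:
# pack_runtest_files(Path("/opt/ltp/runtest"), Path("runtest.tar.gz"))
```

With one archive per parent job, the child would need to wait for only one known file name instead of a list generated at run time.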

Actions #5

Updated by okurz about 4 years ago

  • Project changed from openQA Project to openQA Tests
  • Subject changed from [o3][scheduler][x86_64] Dependent (child) jobs should start after uploading all of parent assets to [o3][kernel][scheduler][x86_64] Dependent (child) jobs should start after uploading all of parent assets
  • Category changed from Regressions/Crashes to Bugs in existing tests

rpalethorpe pretty much nailed it. I don't see what openQA could do better when it doesn't know about the assets from the beginning. I'm not sure you can even call them "parent assets" in the current implementation. All the more, I see this as an issue for "openQA tests" rather than for openQA itself.

Actions #6

Updated by pvorel about 4 years ago

  • Priority changed from Normal to High
  • Difficulty set to medium

We will have to implement it to get testing on o3 on Intel back. BTW, I wonder what's different on o3 (all LTP run problems occur only on o3 on Intel).

BTW, it would be nice if the LTP way of testing were better integrated into openQA. I consider the LTP-related changes an improvement, and some other tests might benefit from them as well, but I understand that the tools team doesn't have the resources for this better integration, and neither does kernel-qa.

Actions #8

Updated by pvorel about 4 years ago

Although https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/9747 is a general improvement, it will probably not fix o3.
MDoucha noticed that the problem occurs only on openqaworker7; indeed, openqaworker1 and openqaworker4 are OK.

I wanted to check NFS on openqaworker7, but there is no password authentication (Permission denied (publickey)). That suggests there is something different on openqaworker7.

Actions #9

Updated by pvorel about 4 years ago

  • Status changed from New to Feedback

openqaworker7 was fixed by okurz (thanks!):

@MDoucha I have fixed the NFS mount on openqaworker7. This was an oversight by me when setting up the machine for o3 some days ago. The machine is within the o3 VLAN hence not reachable from internal SUSE same as openqaworker1 and openqaworker4 and others.

So it should be fixed in the next build.

Actions #10

Updated by pvorel about 4 years ago

  • Assignee set to pvorel
  • Target version set to 445
Actions #11

Updated by pvorel about 4 years ago

  • Status changed from Feedback to Resolved

Worker fixed: https://openqa.opensuse.org/tests/1200301
This also verifies that the new code works.

Actions #12

Updated by metan about 4 years ago

  • Target version changed from 445 to 457
Actions #13

Updated by pcervinka over 3 years ago

  • Target version changed from 457 to QE Kernel Done