Project

General

Profile

Actions

action #67462

closed

ppc64le jobs on o3 are slow due unintended VIRTIO_CONSOLE=0 setup (many ltp tests ppc64le are running more than 5 hours since TW 20200520 snapshot)

Added by michel_mno almost 4 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Bugs in existing tests
Target version:
QE Kernel - QE Kernel Done
Start date:
2020-05-29
Due date:
% Done:

0%

Estimated time:
Difficulty:

Description

Many ltp tests ppc64le are running more than 5 hours since TW 20200520 snapshot
There was normal timing before 20200517

https://openqa.opensuse.org/tests/1281822#next_previous => from 38mn01 to >6h
https://openqa.opensuse.org/tests/1281821#next_previous => from 2mn46 to 5h18
https://openqa.opensuse.org/tests/1281830#next_previous => from 24mn37 to 5h31
https://openqa.opensuse.org/tests/1281828#next_previous => from 25mn19 to >6h
https://openqa.opensuse.org/tests/1281827#next_previous => from 10mn15 to >6h

Observation

openQA test in scenario opensuse-Tumbleweed-DVD-ppc64le-ltp_net_stress_ipsec_udp@ppc64le fails in
boot_ltp

Test suite description

Reproducible

Fails since (at least) Build 20200528 (current job)

Expected result

Last good: 20200517 (or more recent)

Further details

Always latest result in this scenario: latest


Files

ltp_net_stress_ipsec_udp_autoinst.log (243 KB) ltp_net_stress_ipsec_udp_autoinst.log ltp_net_stress_ipsec_udp_autoinst.log michel_mno, 2020-05-29 15:41
Actions #1

Updated by michel_mno almost 4 years ago

  • work on one of failing test, ltp_net_stress_ipsec_udp: https://openqa.opensuse.org/tests/1279237#next_previous => from 38mn01 to >6h
  • retrieve associated autoinst-log.txt as attached ./ltp_net_stress_ipsec_udp_autoinst.log
  • The extracted lines show that the 301s of the "udp4_ipsec02" module is mainly elapsed in two calls:
    • wait_serial => 04:43:58.944 to 04:47:11.127 => 192s 3mn12
    • wait_serial => 04:47:15.195 to 04:48:46.344 => 91s 1mn31
  • so timing issue in called udp_ipsec.sh & vmstat calls for this module.

Seems that all longer tests are network related.

Probably not kernel related problem, because same 5.6.12-1-Default
for TW 20200517 that passed and 20200520 that failed:

Is it a problem on infra of Power8 hosts ?
How to execute one of them locally (outside openQA) ?

=== extract of ./ltp_net_stress_ipsec_udp_autoinst.log
[2020-05-28T04:43:45.220 UTC] [debug] ||| starting udp4_ipsec02 tests/kernel/run_ltp.pm
[2020-05-28T04:43:45.228 UTC] [debug] tests/kernel/run_ltp.pm:325 called testapi::type_string
[2020-05-28T04:43:45.228 UTC] [debug] <<< testapi::type_string(string="(udp_ipsec.sh -p ah -m transport -s 100:1000:65483:R65483; echo \"### TEST udp4_ipsec02 COMPLETE >>> \$?\") | tee /dev/hvc0\n", max_interval=250, wait_screen_changes=0, wait_still_screen=0, timeout=30, similarity_level=47)
[2020-05-28T04:43:58.943 UTC] [debug] tests/kernel/run_ltp.pm:327 called testapi::wait_serial
[2020-05-28T04:43:58.944 UTC] [debug] <<< testapi::wait_serial(buffer_size=undef, regexp=qr/### TEST udp4_ipsec02 COMPLETE >>> \d+/u, timeout=900, quiet=undef, record_output=1, expect_not_found=0, no_regex=0)
[2020-05-28T04:47:11.127 UTC] [debug] >>> testapi::wait_serial: (?^u:### TEST udp4_ipsec02 COMPLETE >>> \d+): ok
[2020-05-28T04:47:11.128 UTC] [debug] tests/kernel/run_ltp.pm:346 called testapi::script_run
[2020-05-28T04:47:11.128 UTC] [debug] <<< testapi::script_run(cmd="vmstat -w", quiet=undef, output="", timeout=undef)
[2020-05-28T04:47:11.128 UTC] [debug] tests/kernel/run_ltp.pm:346 called testapi::script_run
[2020-05-28T04:47:11.129 UTC] [debug] <<< testapi::type_string(string="vmstat -w", max_interval=250, wait_screen_changes=0, wait_still_screen=0, timeout=30, similarity_level=47)
[2020-05-28T04:47:12.032 UTC] [debug] tests/kernel/run_ltp.pm:346 called testapi::script_run
[2020-05-28T04:47:12.032 UTC] [debug] <<< testapi::type_string(string="; echo kQ8Ut-\$?- > /dev/hvc0\n", max_interval=250, wait_screen_changes=0, wait_still_screen=0, timeout=30, similarity_level=47)
[2020-05-28T04:47:15.195 UTC] [debug] tests/kernel/run_ltp.pm:346 called testapi::script_run
[2020-05-28T04:47:15.195 UTC] [debug] <<< testapi::wait_serial(buffer_size=undef, regexp=qr/kQ8Ut-\d+-/, timeout=30, quiet=undef, record_output=undef, no_regex=0, expect_not_found=0)
[2020-05-28T04:48:46.344 UTC] [debug] >>> testapi::wait_serial: (?^:kQ8Ut-\d+-): fail
[2020-05-28T04:48:46.347 UTC] [debug] ||| finished udp4_ipsec02 kernel at 2020-05-28 04:48:46 (301 s)
===
Actions #2

Updated by pvorel almost 4 years ago

There is VIRTIO_CONSOLE=0, no surprise for that slow down. Looking for where it's set (not in test suite nor in machine).

Actions #3

Updated by pvorel almost 4 years ago

  • Assignee set to pvorel
  • Target version set to 445
Actions #4

Updated by pvorel almost 4 years ago

  • Subject changed from many ltp tests ppc64le are running more than 5 hours since TW 20200520 snapshot to ppc64le jobs on o3 are slow due unintended VIRTIO_CONSOLE=0 setup (many ltp tests ppc64le are running more than 5 hours since TW 20200520 snapshot)
  • Status changed from New to In Progress
Actions #5

Updated by pvorel almost 4 years ago

  • Status changed from In Progress to Feedback
  • Priority changed from High to Normal

Fixed by jpupava. It was in medium types.

Actions #6

Updated by dzedro almost 4 years ago

I forgot to remove it. Added it there while fixing ppc64le failures [1] after changing console tests to use virtio_console

[1] https://openqa.opensuse.org/group_overview/4

Actions #7

Updated by pvorel almost 4 years ago

  • Status changed from Feedback to Resolved

@dzedro: thanks for explanation

New build is using virtio console https://openqa.opensuse.org/tests/1292396, running only 11 min instead of 28 min (3x speedup).
LTP tests will get fixed by this, closing.

Actions #8

Updated by pcervinka almost 4 years ago

  • Target version changed from 445 to 457
Actions #9

Updated by pcervinka over 3 years ago

  • Target version changed from 457 to QE Kernel Done
Actions

Also available in: Atom PDF