action #119278
closed [qem][qe-core] test fails in valgrind
0%
Description
Observation
# bash -e valgrind-test.sh; echo ZXZuM-$?-
Compiling test program ...
Testing valgrind ...
ZXZuM-1-
openQA test in scenario sle-15-SP4-Server-DVD-Updates-aarch64-mau-extratests2@aarch64-virtio fails in valgrind
Test suite description
Testsuite maintained at https://gitlab.suse.de/qa-maintenance/qam-openqa-yml. Run console tests against aggregated test repo
Reproducible
Fails since (at least) Build 20221019-1
Expected result
Last good: 20221018-1 (or more recent)
Further details
Always latest result in this scenario: latest
Updated by ph03nix over 1 year ago
This is likely a timeout issue. We might need to set a higher timeout on line 47, replacing assert_script_run 'bash -e valgrind-test.sh' with:
assert_script_run('bash -e valgrind-test.sh', timeout => 300);
Updated by rfan1 over 1 year ago
- Assignee set to rfan1
It might have something to do with bad performance on arm workers.
Updated by rfan1 over 1 year ago
- Status changed from New to In Progress
It is not a performance issue. After changing the valgrind call to print its logs, I can see the errors below:
-valgrind --tool=memcheck --trace-children=yes ./valgrind-test 2>/dev/null
+valgrind -v --tool=memcheck --trace-children=yes ./valgrind-test
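The change above exposes the failure by no longer sending valgrind's stderr to /dev/null. A minimal sketch of a middle ground, assuming the test script is free to wrap the call: keep stderr in a temporary file and replay it only when the command fails. The helper name `run_keep_stderr` is hypothetical and not part of valgrind-test.sh.

```shell
# Hypothetical helper: run a command, capture its stderr in a temp file,
# and print that file only if the command exits non-zero.
run_keep_stderr() {
    errlog=$(mktemp) || return 1
    rc=0
    # "|| rc=$?" keeps the script alive under "bash -e" long enough
    # to print the captured output before returning the failure.
    "$@" 2>"$errlog" || rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "command failed with exit code $rc: $*" >&2
        cat "$errlog" >&2
    fi
    rm -f "$errlog"
    return "$rc"
}

# Possible use in place of the original line (not run here):
# run_keep_stderr valgrind --tool=memcheck --trace-children=yes ./valgrind-test
```

With this shape a passing run stays quiet, while a failing run still returns the original exit code and shows what valgrind wrote to stderr.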
logs:
+ valgrind -v --tool=memcheck --trace-children=yes ./valgrind-test
==2050== Memcheck, a memory error detector
==2050== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2050== Using Valgrind-3.18.1-42b08ed5bd-20211015 and LibVEX; rerun with -h for copyright info
==2050== Command: ./valgrind-test
==2050==
--2050-- Valgrind options:
--2050-- -v
--2050-- --tool=memcheck
--2050-- --trace-children=yes
--2050-- Contents of /proc/version:
--2050-- Linux version 5.14.21-150400.24.28-default (geeko@buildhost) (gcc (SUSE Linux) 7.5.0, GNU ld (GNU Binutils; SUSE Linux Enterprise 15) 2.37.20211103-150100.7.37) #1 SMP PREEMPT_DYNAMIC Mon Oct 10 15:21:12 UTC 2022 (f82da2c)
--2050--
--2050-- Arch and hwcaps: ARM64, LittleEndian, v8-atomics
--2050-- Page sizes: currently 4096, max supported 65536
--2050-- Valgrind library directory: /usr/lib/valgrind
--2050-- Reading syms from /var/tmp/valgrind-test
--2050-- Reading syms from /lib64/ld-2.31.so
--2050-- Reading syms from /usr/lib/valgrind/memcheck-arm64-linux
--2050-- object doesn't have a symbol table
--2050-- object doesn't have a dynamic symbol table
--2050-- Scheduler: using generic scheduler lock implementation.
--2050-- Reading suppressions file: /usr/lib/valgrind/default.supp
==2050== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-2050-by-root-on-susetest
==2050== embedded gdbserver: writing to /tmp/vgdb-pipe-to-vgdb-from-2050-by-root-on-susetest
==2050== embedded gdbserver: shared mem /tmp/vgdb-pipe-shared-mem-vgdb-2050-by-root-on-susetest
==2050==
==2050== TO CONTROL THIS PROCESS USING vgdb (which you probably
==2050== don't want to do, unless you know exactly what you're doing,
==2050== or are doing some strange experiment):
==2050== /usr/lib/valgrind/../../bin/vgdb --pid=2050 ...command...
==2050==
==2050== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==2050== /path/to/gdb ./valgrind-test
==2050== and then give GDB the following command
==2050== target remote | /usr/lib/valgrind/../../bin/vgdb --pid=2050
==2050== --pid is optional if only one valgrind process is running
==2050==
VEX: Mismatch detected between RDMA and atomics features.
Found: v8-atomics
Cannot continue. Good-bye
vex storage: T total 0 bytes allocated
vex storage: P total 0 bytes allocated
valgrind: the 'impossible' happened:
LibVEX called failure_exit().
However, I can't reproduce the issue on my local setup, so it might have something to do with the hardware configuration.
My setup:
# lscpu|grep Flags
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
openQA workers:
# lscpu; echo 5GZT6-$?-
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Vendor ID: Cavium
Model name: ThunderX 88XX
Model: 1
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Stepping: 0x1
BogoMIPS: 200.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-3
Let me try to file a bug.
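The two Flags lines above are easy to misread side by side. A minimal sketch in POSIX shell that lists the flags present only on the openQA worker; the flag lists are copied from the lscpu output above, and the variable names are illustrative:

```shell
# Flags copied verbatim from the two lscpu outputs in this comment.
local_flags="fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid"
worker_flags="fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid"

# Collect every worker flag that does not appear in the local list.
only_on_worker=""
for flag in $worker_flags; do
    case " $local_flags " in
        *" $flag "*) ;;                                   # on both machines
        *) only_on_worker="$only_on_worker $flag" ;;      # worker only
    esac
done
only_on_worker="${only_on_worker# }"   # drop the leading space

echo "Flags only on the openQA worker: $only_on_worker"
# prints "Flags only on the openQA worker: atomics"
```

The only difference is `atomics`, which matches the "Found: v8-atomics" line in the VEX error. That the unmatched counterpart is the RDM feature (reported by Linux as `asimdrdm`) is an assumption based on the wording of the error message; it is not confirmed in this ticket.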
Updated by rfan1 over 1 year ago
- Status changed from In Progress to Blocked
Updated by mgrifalconi over 1 year ago
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/15779 removes the test for now, since it never worked on 15-SP4.
Before adding it back, and to debug further, we should test it in a development job group.
Updated by apappas over 1 year ago
- Tags changed from bugbusters to bugbusters, qe-core-coverage
Updated by rfan1 over 1 year ago
The bug is fixed; I will try to re-test it.
However, the parent job failed due to https://bugzilla.suse.com/show_bug.cgi?id=1204924.
Updated by rfan1 over 1 year ago
- Status changed from Blocked to In Progress
Check if the fix is checked in: http://openqa.suse.de/tests/10071023