action #65477
closed[qe-core][sles][functional][sporadic] test fails in clamav - start clamd', timeout => 400
0%
Description
Observation¶
openQA test in scenario sle-15-SP2-Online-aarch64-extra_tests_textmode@aarch64 fails in
clamav
Apr 08 22:47:22 susetest freshclam[11579]: ClamAV update process started at Wed Apr 8 22:47:22 2020
Apr 08 22:47:22 susetest freshclam[11579]: Your ClamAV installation is OUTDATED!
Apr 08 22:47:22 susetest freshclam[11579]: Local version: 0.100.3 Recommended version: 0.102.2
Apr 08 22:47:22 susetest freshclam[11579]: DON'T PANIC! Read https://www.clamav.net/documents/upgrading-clamav
Apr 08 22:47:54 susetest freshclam[11579]: Downloading main.cvd [100%]
Apr 08 22:48:20 susetest freshclam[11579]: main.cvd updated (version: 59, sigs: 4564902, f-level: 60, builder: sigmgr)
Apr 08 22:48:21 susetest freshclam[11579]: Can't query main.59.93.1.0.6810DB54.ping.clamav.net
Apr 08 22:48:39 susetest freshclam[11579]: Downloading daily.cvd [100%]
Apr 08 22:48:50 susetest wickedd-dhcp6[1037]: eth0: DHCPv6 is disabled by IPv6 router RA
Apr 08 22:48:50 susetest wickedd-dhcp6[1037]: eth0: DHCPv6 is disabled by IPv6 router RA
Apr 08 22:50:01 susetest CRON[11584]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 1 1)
Apr 08 22:57:25 susetest wickedd-dhcp6[1037]: eth0: DHCPv6 is disabled by IPv6 router RA
Apr 08 22:57:25 susetest wickedd-dhcp6[1037]: eth0: DHCPv6 is disabled by IPv6 router RA
Apr 08 22:57:41 susetest freshclam[11579]: daily.cvd updated (version: 25776, sigs: 2251437, f-level: 63, builder: raynman)
Apr 08 22:57:41 susetest freshclam[11579]: Can't query daily.25776.93.1.0.6810DB54.ping.clamav.net
Apr 08 22:57:42 susetest freshclam[11579]: Downloading bytecode.cvd [100%]
Apr 08 22:57:42 susetest freshclam[11579]: bytecode.cvd updated (version: 331, sigs: 94, f-level: 63, builder: anvilleg)
Apr 08 22:57:43 susetest freshclam[11579]: Can't query bytecode.331.93.1.0.6810DB54.ping.clamav.net
Apr 08 22:57:51 susetest freshclam[11579]: Database updated (6816433 signatures) from database.clamav.net (IP: 104.16.219.84)
Apr 08 22:57:51 susetest freshclam[11579]: Clamd was NOT notified: Can't connect to clamd through /var/run/clamav/clamd-socket: No such file or directory
Apr 08 22:58:28 susetest systemd[1]: Reloading.
Apr 08 22:58:41 susetest systemd[1]: Starting Freshclam virus definitions downloader...
Apr 08 22:58:41 susetest freshclam[11621]: freshclam daemon 0.100.3 (OS: linux-gnu, ARCH: aarch64, CPU: aarch64)
Apr 08 22:58:41 susetest freshclam[11621]: ClamAV update process started at Wed Apr 8 22:58:41 2020
Apr 08 22:58:41 susetest freshclam[11621]: Your ClamAV installation is OUTDATED!
Apr 08 22:58:41 susetest freshclam[11621]: Local version: 0.100.3 Recommended version: 0.102.2
Apr 08 22:58:41 susetest freshclam[11621]: DON'T PANIC! Read https://www.clamav.net/documents/upgrading-clamav
Apr 08 22:58:41 susetest freshclam[11621]: main.cvd is up to date (version: 59, sigs: 4564902, f-level: 60, builder: sigmgr)
Apr 08 22:58:41 susetest freshclam[11621]: daily.cvd is up to date (version: 25776, sigs: 2251437, f-level: 63, builder: raynman)
Apr 08 22:58:41 susetest freshclam[11621]: bytecode.cvd is up to date (version: 331, sigs: 94, f-level: 63, builder: anvilleg)
Apr 08 22:58:41 susetest freshclam[11621]: --------------------------------------
Apr 08 22:58:42 susetest systemd[1]: Started Freshclam virus definitions downloader.
Apr 08 22:58:42 susetest systemd[1]: Starting Clamav antivirus Deamon...
Apr 08 22:58:42 susetest clamd[11622]: Received 0 file descriptor(s) from systemd.
Apr 08 22:58:42 susetest clamd[11622]: clamd daemon 0.100.3 (OS: linux-gnu, ARCH: aarch64, CPU: aarch64)
Apr 08 22:58:42 susetest clamd[11622]: Running as user root (UID 0, GID 0)
Apr 08 22:58:42 susetest clamd[11622]: Log file size limited to 1048576 bytes.
Apr 08 22:58:42 susetest clamd[11622]: Reading databases from /var/lib/clamav
Apr 08 22:58:42 susetest clamd[11622]: Not loading PUA signatures.
Apr 08 22:58:42 susetest clamd[11622]: Bytecode: Security mode set to "TrustSigned".
Apr 08 23:00:01 susetest CRON[11625]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 1 1)
Apr 08 23:00:50 susetest wickedd-dhcp6[1037]: eth0: DHCPv6 is disabled by IPv6 router RA
Apr 08 23:00:50 susetest wickedd-dhcp6[1037]: eth0: DHCPv6 is disabled by IPv6 router RA
Apr 08 23:05:22 susetest systemd[1]: clamd.service: Start operation timed out. Terminating.
Apr 08 23:05:22 susetest systemd[1]: Failed to start Clamav antivirus Deamon.
Apr 08 23:05:22 susetest systemd[1]: clamd.service: Unit entered failed state.
Apr 08 23:05:22 susetest systemd[1]: clamd.service: Failed with result 'timeout'.
Suggestion¶
- Get more logs and file a product bug.
Reproducible¶
Fails since (at least) Build 161.1
Expected result¶
Last good: (unknown) (or more recent)
Further details¶
Always latest result in this scenario: latest
Investigations¶
this could be an issue of performance on aarch64, but 400 sec is still not enough?
Updated by SLindoMansilla about 4 years ago
- Description updated (diff)
- Status changed from New to Workable
- Target version set to Milestone 30
- Estimated time set to 42.00 h
Updated by zluo about 4 years ago
- Status changed from Workable to In Progress
- Assignee set to zluo
take over
Updated by zluo about 4 years ago
clamd.service could not be started. try to collect statics again latest build.
and increase timeout for aarch64 and collect statics as well.
Updated by zluo about 4 years ago
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/10081
let me try this att first.
http://f40.suse.de/tests/7449/file/autoinst-log.txt doesn't show delay issue.
Updated by zluo almost 4 years ago
https://openqa.suse.de/tests/4158606#next_previous shows clearly that we have sporadic issue with 'start clamd'.
all failed cases belong to openqaworker-arm-1. workers on openqaworker-arm-3 are stable, no failure at all.
Updated by zluo almost 4 years ago
boot_to_desktop has sometimes issue like: http://f40.suse.de/tests/7507#
rogue workqueue lockup bsc#1126782 - Serial error: [ 41.596623] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 32s!
or incomplete status:
Updated by zluo almost 4 years ago
add 'systemctl status clamd' to get more information at first and see what kind of issue could blocks starting clamd:
Updated by zluo almost 4 years ago
to check https://openqa.suse.de/tests/4161390
Updated by zluo almost 4 years ago
Apr 24 10:06:55 susetest systemd[1]: Starting Clamav antivirus Deamon...
Apr 24 10:06:55 susetest clamd[15757]: Received 0 file descriptor(s) from systemd.
Apr 24 10:06:55 susetest clamd[15757]: clamd daemon 0.100.3 (OS: linux-gnu, ARCH: aarch64, CPU: aarch64)
Apr 24 10:06:55 susetest clamd[15757]: Running as user root (UID 0, GID 0)
Apr 24 10:06:55 susetest clamd[15757]: Log file size limited to 1048576 bytes.
Apr 24 10:06:55 susetest clamd[15757]: Reading databases from /var/lib/clamav
Apr 24 10:06:55 susetest clamd[15757]: Not loading PUA signatures.
Apr 24 10:06:55 susetest clamd[15757]: Bytecode: Security mode set to "TrustSigned".
Apr 24 10:10:01 susetest CRON[15760]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 1 1)
Apr 24 10:11:54 susetest clamd[15757]: Loaded 6838621 signatures.
Apr 24 10:12:30 susetest wickedd-dhcp6[1040]: eth0: DHCPv6 is disabled by IPv6 router RA
Apr 24 10:12:30 susetest wickedd-dhcp6[1040]: eth0: DHCPv6 is disabled by IPv6 router RA
Apr 24 10:12:35 susetest clamd[15757]: TCP: Bound to [127.0.0.1]:3310
Apr 24 10:12:35 susetest clamd[15757]: TCP: Setting connection queue length to 200
Apr 24 10:12:35 susetest clamd[15757]: LOCAL: Unix socket file /var/run/clamav/clamd-socket
Apr 24 10:12:35 susetest clamd[15757]: LOCAL: Setting connection queue length to 200
Apr 24 10:12:35 susetest clamd[15944]: Limits: Global time limit set to 120000 milliseconds.
Apr 24 10:12:35 susetest clamd[15944]: Limits: Global size limit set to 104857600 bytes.
Apr 24 10:12:35 susetest clamd[15944]: Limits: File size limit set to 26214400 bytes.
Apr 24 10:12:35 susetest clamd[15944]: Limits: Recursion level limit set to 16.
Apr 24 10:12:35 susetest clamd[15944]: Limits: Files limit set to 10000.
Apr 24 10:12:35 susetest clamd[15944]: Limits: MaxEmbeddedPE limit set to 10485760 bytes.
Apr 24 10:12:35 susetest clamd[15944]: Limits: MaxHTMLNormalize limit set to 10485760 bytes.
Apr 24 10:12:35 susetest clamd[15944]: Limits: MaxHTMLNoTags limit set to 2097152 bytes.
Apr 24 10:12:35 susetest clamd[15944]: Limits: MaxScriptNormalize limit set to 5242880 bytes.
Apr 24 10:12:35 susetest clamd[15944]: Limits: MaxZipTypeRcg limit set to 1048576 bytes.
Apr 24 10:12:35 susetest clamd[15944]: Limits: MaxPartitions limit set to 50.
Apr 24 10:12:35 susetest clamd[15944]: Limits: MaxIconsPE limit set to 100.
Apr 24 10:12:35 susetest clamd[15944]: Limits: MaxRecHWP3 limit set to 16.
Apr 24 10:12:35 susetest clamd[15944]: Limits: PCREMatchLimit limit set to 100000.
Apr 24 10:12:35 susetest clamd[15944]: Limits: PCRERecMatchLimit limit set to 5000.
Apr 24 10:12:35 susetest clamd[15944]: Limits: PCREMaxFileSize limit set to 26214400.
Apr 24 10:12:35 susetest clamd[15944]: Archive support enabled.
Apr 24 10:12:35 susetest clamd[15944]: BlockMax heuristic detection disabled.
Apr 24 10:12:35 susetest clamd[15944]: Algorithmic detection enabled.
Apr 24 10:12:35 susetest clamd[15944]: Portable Executable support enabled.
Apr 24 10:12:35 susetest clamd[15944]: ELF support enabled.
Apr 24 10:12:35 susetest clamd[15944]: Mail files support enabled.
Apr 24 10:12:35 susetest clamd[15944]: OLE2 support enabled.
Apr 24 10:12:35 susetest clamd[15944]: PDF support enabled.
Apr 24 10:12:35 susetest clamd[15944]: SWF support enabled.
Apr 24 10:12:35 susetest clamd[15944]: HTML support enabled.
Apr 24 10:12:35 susetest clamd[15944]: XMLDOCS support enabled.
Apr 24 10:12:35 susetest clamd[15944]: HWP3 support enabled.
Apr 24 10:12:35 susetest clamd[15944]: Self checking every 600 seconds.
Apr 24 10:13:35 susetest systemd[1]: clamd.service: Start operation timed out. Terminating.
Apr 24 10:14:27 susetest rsyslogd[1030]: -- MARK --
found in
https://openqa.suse.de/tests/4161390/file/clamav-journal.log
Updated by zluo almost 4 years ago
http://f40.suse.de/tests/7623#next_previous shows no single failure at all.
Updated by zluo almost 4 years ago
https://openqa.suse.de/tests/4167250#step/clamav/10 shows even network issue, freshclam couldn't finish in 700 seconds...
Updated by zluo almost 4 years ago
to check https://openqa.suse.de/tests/4167635
Updated by zluo almost 4 years ago
use QEMURAM=2048
https://openqa.suse.de/tests/4173417
Updated by zluo almost 4 years ago
http://f40.suse.de/tests/7647#step/boot_to_desktop/8
rogue workqueue lockup bsc#1126782 - Serial error: [ 231.471267] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 151s!
suddenly all tests failed at boot_to_desktop
Updated by zluo almost 4 years ago
https://openqa.suse.de/tests/4177938 new try because old qcow2 got removed so that my test runs got failed...
Updated by zluo almost 4 years ago
https://openqa.suse.de/tests/4179769 failed even when I assign QEMURAM=4096, QEMUCPUS=4
[2020-04-29T12:37:56.236 UTC] [debug] starting: /usr/bin/qemu-system-aarch64 -device virtio-gpu-pci -only-migratable -chardev ringbuf,id=serial0,logfile=serial0,logappend=on -serial chardev:serial0 -soundhw hda -m 4096 -machine virt,usb=off,gic-version=3,its=off -cpu host -netdev user,id=qanet0 -device virtio-net,netdev=qanet0,mac=52:54:00:12:34:56 -boot menu=on,splash-time=5000 -device nec-usb-xhci -device usb-tablet -device usb-kbd -smp 4 -enable-kvm -no-shutdown -vnc :114,share=force-shared -device virtio-serial -chardev pipe,id=virtio_console,path=virtio_console,logfile=virtio_console.log,logappend=on -device virtconsole,chardev=virtio_console,name=org.openqa.console.virtio_console -chardev socket,path=qmp_socket,server,nowait,id=qmp_socket,logfile=qmp_socket.log,logappend=on -qmp chardev:qmp_socket -S -device virtio-scsi-pci,id=scsi0 -blockdev driver=file,node-name=hd0-overlay0-file,filename=/var/lib/openqa/pool/24/raid/hd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=hd0-overlay0,file=hd0-overlay0-file,cache.no-flush=on -device virtio-blk-device,id=hd0-device,drive=hd0-overlay0,bootindex=0,serial=hd0 -blockdev driver=file,node-name=hd1-file,filename=/var/lib/openqa/pool/24/raid/hd1,cache.no-flush=on -blockdev driver=qcow2,node-name=hd1,file=hd1-file,cache.no-flush=on -device virtio-blk-device,id=hd1-device,drive=hd1,serial=hd1 -blockdev driver=file,node-name=cd0-overlay0-file,filename=/var/lib/openqa/pool/24/raid/cd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=cd0-overlay0,file=cd0-overlay0-file,cache.no-flush=on -device scsi-cd,id=cd0-device,drive=cd0-overlay0,serial=cd0 -drive id=pflash-code-overlay0,if=pflash,file=/var/lib/openqa/pool/24/raid/pflash-code-overlay0,unit=0,readonly=on -drive id=pflash-vars-overlay0,if=pflash,file=/var/lib/openqa/pool/24/raid/pflash-vars-overlay0,unit=1
but it works on seattle:
http://f40.suse.de/tests/7652
Updated by zluo almost 4 years ago
https://bugzilla.suse.com/show_bug.cgi?id=1170825 for record and hope developer can take a look at it and provide feedback on this issue.
Updated by zluo almost 4 years ago
workers on openqaworker-arm-2 looks bad while workers openqaworker-arm-1 still perform good.
it looks we have problem with openqaworker-arm-2 on osd.
Updated by zluo almost 4 years ago
https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/241/diffs
this is clearly now. I will check that this issue later on osd.
Updated by okurz almost 4 years ago
For additional information: With https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/241 I reduced the number of worker instances for openqaworker-arm-2 from 30 to 20 which might help with stabilization. Please keep in mind that I also increased the number of worker instances on openqaworker-arm-1 and openqaworker-arm-2 as we have a minimum need of available worker instances to be able to complete our regular test schedule. If you find good indications backed by proper statistics we can also change worker instance numbers again. The "clamav" test seems to be a very good test because the systemd service times out internally so not "test instability" on the top level. For any statistical evaluation please keep in mind to provide sensible numbers, e.g. fail-ratio, mean value, standard deviation, and also counter-experiments, e.g. compare to stability on x86_64 or to a local, stable environment.
Updated by zluo almost 4 years ago
arm-2 workers:
https://openqa.suse.de/tests/4181888#next_previous
arm-3 workers:
https://openqa.suse.de/tests/4181938#next_previous
arm-1 workers:
https://openqa.suse.de/tests/4181988#next_previous
x86_64 workers:
https://openqa.suse.de/tests/4182218#next_previous
Updated by SLindoMansilla almost 4 years ago
- Assignee changed from zluo to SLindoMansilla
Taking over. We need to address hardware impediments.
Updated by okurz almost 4 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: extra_tests_textmode
https://openqa.suse.de/tests/4246796
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by dzedro almost 4 years ago
passing with virtio https://openqa.suse.de/tests/4260683#step/clamav/93
Updated by okurz almost 4 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: extra_tests_textmode
https://openqa.suse.de/tests/4308011
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by okurz almost 4 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: extra_tests_textmode
https://openqa.suse.de/tests/4341542
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by okurz almost 4 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: extra_tests_textmode
https://openqa.suse.de/tests/4341542
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by okurz almost 4 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: extra_tests_textmode
https://openqa.suse.de/tests/4341542
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by okurz over 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: extra_tests_textmode
https://openqa.suse.de/tests/4341542
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by SLindoMansilla over 3 years ago
- Status changed from In Progress to Workable
Updated by tjyrinki_suse over 3 years ago
- Subject changed from [sles][functional][u]test fails in clamav - start clamd', timeout => 400 to [qe-core][sles][functional]test fails in clamav - start clamd', timeout => 400
Updated by SLindoMansilla about 3 years ago
- Subject changed from [qe-core][sles][functional]test fails in clamav - start clamd', timeout => 400 to [qe-core][sles][functional][sporadic] test fails in clamav - start clamd', timeout => 400
- Status changed from Workable to New
- Assignee deleted (
SLindoMansilla)
No time to work on this :(
Updated by slo-gin over 1 year ago
This ticket was set to Normal priority but was not updated within the SLO period. Please consider picking up this ticket or just set the ticket to the next lower priority.
Updated by zluo over 1 year ago
- Status changed from New to Rejected
- Assignee set to zluo
since this is a very old ticket, and we don't have the issue for now, reject it