action #163391

open

test fails in clamav: test needs to be optimized / aligned with clamav system reqs

Added by dimstar 6 months ago. Updated 5 months ago.

Status: New
Priority: Normal
Assignee: -
Category: Bugs in existing tests
Target version: -
Start date: 2024-07-05
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

While debugging https://bugzilla.opensuse.org/show_bug.cgi?id=1227404 (originally filed as a product bug), it turned out that 'just' a few things in the test are not 'right'.

Please make sure to reach out to Reinhard Max for input on the optimizations! He has a good understanding of how clamav acts/reacts, and this knowledge should be used.

A few things we identified:

  • https://docs.clamav.net/Introduction.html?highlight=system%20requirements#recommended-system-requirements - The recommended requirements are 3GB+ (we run on a 2GB machine, and the test fails on OOM, with clamscan being killed)
  • While clamscan runs (and consumes a lot of memory), there is a 2nd instance of clamav running as 'clamd' (the daemon), adding further memory pressure
  • The clamscan calls redirect their output to a log file, which is only printed in success cases; this makes debugging needlessly complicated (the log is not even uploaded on failure). Proposal: send output to tty
  • On successful runs, clamd and freshclam are stopped twice (at the end of run() plus in post_run_hook)

And Max certainly has more input than that
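
The memory point above could be checked up front, before any scan starts. A minimal Python sketch, assuming the input is the text of a Linux /proc/meminfo; the function names and the threshold constant are illustrative (the threshold only mirrors the 3GB+ figure quoted above), and this is not the openQA test code itself:

```python
import re

# Assumption: threshold mirrors the "3GB+" recommendation cited in this ticket.
RECOMMENDED_KIB = 3 * 1024 * 1024


def mem_total_kib(meminfo_text: str) -> int:
    """Return the MemTotal value (in KiB) from the contents of /proc/meminfo."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB$", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal not found in meminfo text")
    return int(match.group(1))


def meets_recommendation(meminfo_text: str, required_kib: int = RECOMMENDED_KIB) -> bool:
    """True if the machine has at least the required amount of RAM."""
    return mem_total_kib(meminfo_text) >= required_kib
```

On a real machine the text would come from reading /proc/meminfo. Whether the test should skip, fail early, or just log a warning on a small machine is a design decision left open here.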

openQA test in scenario opensuse-Tumbleweed-JeOS-for-kvm-and-xen-x86_64-jeos-extra@64bit_virtio-2G fails in clamav

Test suite description

Same as jeos, plus some more tests.

Reproducible

Fails since (at least) Build 20240612

Expected result

Last good: 20240611 (or more recent)

Further details

Always latest result in this scenario: latest

Actions #1

Updated by dimstar 6 months ago

dimstar wrote:

And Max certainly has more input than that

That's supposed to be Reinhard Max of course

Actions #2

Updated by rmax 6 months ago

Here are my recommendations for cleaning up and improving the tests in addition to what dimstar already wrote:

  • Raising the start timeout of clamd for the test is not needed anymore, because clamd.service now sets it to 5 minutes.
  • In scan_and_parse, remove all command line switches so that the output goes to the terminal (even if it gets ugly due to the progress bar).
  • For the purpose of this test, after running freshclam in the foreground to initialize the database, there is no need to also start it as a daemon. Additionally, freshclam.service does not support starting it as a daemon anymore, because it now runs in oneshot mode and gets triggered by a systemd timer.
  • If the intention is to test the service file for freshclam as well, then the service could be started instead of running the binary manually (it blocks until the database download is complete).
  • Given that the files that are being tested are quite small, 2G of RAM would be enough for the short term, if we stop running the ClamAV engine (and loading the large virus database) twice at the same time. This means after the setup stuff the clamscan test should run first, then start clamd (it blocks systemctl until the virus database has loaded) and then run the clamdscan test.
  • I noticed that, at least in some runs, the clamdscan test failed (probably when clamd had crashed due to OOM), but the failure was not recognized by the testsuite.
  • The test should not create a swap file to get around limited RAM, because it makes no sense to run a service like ClamAV that is both memory and CPU intensive on a machine that is so low on RAM that it needs to swap to get the virus database loaded.
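
The ordering and failure detection recommended above can be sketched as follows. This is a hedged Python illustration, not the actual openQA test: the step list, the `/path/to/sample` placeholder, and the helper names are all hypothetical, and accepting exit code 1 is an assumption that depends on what the test expects to find. It relies on the documented clamscan/clamdscan exit codes (0 = no virus found, 1 = virus found, 2 = error) and on `subprocess` reporting a process killed by signal N as return code -N.

```python
import subprocess
from typing import Callable, List, Sequence, Set, Tuple

Step = Tuple[List[str], Set[int]]      # (command, accepted exit codes)
Runner = Callable[[Sequence[str]], int]

SAMPLE = "/path/to/sample"  # placeholder, not a path from the real test

# Only one ClamAV engine is loaded at a time: clamscan first, then clamd.
STEPS: List[Step] = [
    (["clamscan", SAMPLE], {0, 1}),           # standalone engine runs alone
    (["systemctl", "start", "clamd"], {0}),   # blocks until the database is loaded
    (["clamdscan", SAMPLE], {0, 1}),          # scan through the running daemon
    (["systemctl", "stop", "clamd"], {0}),    # stop the daemon exactly once
]


def real_runner(cmd: Sequence[str]) -> int:
    # Output is not captured, so it goes straight to the terminal.
    return subprocess.run(list(cmd)).returncode


def run_steps(steps: Sequence[Step], runner: Runner = real_runner) -> List[int]:
    """Run steps in order; stop at the first unexpected exit code."""
    codes: List[int] = []
    for cmd, accepted in steps:
        code = runner(cmd)
        if code not in accepted:
            # Covers scanner errors (2) and kills by signal (negative codes, e.g. OOM).
            raise RuntimeError(f"{' '.join(cmd)} exited with {code}")
        codes.append(code)
    return codes
```

Passing the runner as a parameter keeps the sequencing logic testable without ClamAV installed; a fake runner that returns -9 for clamscan shows that an OOM kill stops the sequence instead of passing silently.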

Actions #3

Updated by openqa_review 5 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: jeos-extra@64bit_virtio-2G
https://openqa.opensuse.org/tests/4363497#step/clamav/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
