action #65109

[tools] backend setting & API improvement

Added by dzedro 4 months ago. Updated 14 days ago.

Status: Feedback
Priority: Normal
Assignee:
Category: Enhancement to existing tests
Start date: 2020-03-31
Due date:
% Done: 0%
Estimated time:
Difficulty:
Duration:

Description

There are some things that could be improved.

One option is decreasing IO load on workers: snapshots on the qemu backend are taken basically everywhere, even where they don't make much sense, e.g. small jobs (one or a few tests), multimachine tests and installations.

I add QEMU_DISABLE_SNAPSHOTS=1 to most QAM suites. There are installations followed by further tests, but failures there are mostly random timeout, typing or similar issues, and the test is restarted anyway. It could make sense to disable snapshots by default in suites where snapshots are a waste of IO; exceptions could override this with QEMU_DISABLE_SNAPSHOTS=0.
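
The default-plus-exception idea can be sketched as a simple settings merge. This is only an illustration in Python, not how openQA resolves settings: the function names and the two-layer model (suite defaults, job overrides) are assumptions, as is treating any value other than "0" or empty as "snapshots disabled".

```python
def effective_settings(suite_defaults, job_overrides):
    """Merge settings; values set on the job win over suite defaults."""
    merged = dict(suite_defaults)
    merged.update(job_overrides)
    return merged


def snapshots_enabled(settings):
    # Assumption: unset, empty or "0" means snapshots stay enabled.
    return settings.get("QEMU_DISABLE_SNAPSHOTS", "0") in ("", "0")


# Suite-wide default: no snapshots, to save IO on the worker.
suite_defaults = {"QEMU_DISABLE_SNAPSHOTS": "1"}

# A regular job inherits the default; an exception opts back in.
regular_job = effective_settings(suite_defaults, {})
exception_job = effective_settings(
    suite_defaults, {"QEMU_DISABLE_SNAPSHOTS": "0"}
)

print(snapshots_enabled(regular_job))
print(snapshots_enabled(exception_job))
```

With this shape only the exceptions need an explicit setting, so the common case stays cheap without touching every job.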

Review testapi: remove or fix inconsistent functions and add functions for combinations that are widely used together?
E.g. wait_screen_change is widely used and fails where the function is expected to actually wait. The wait may need to cover more than just a screen change, yet in general the function waits 0 seconds. Thus I don't use it, because when you add a wait, mostly in gui/ncurses tests, you expect it to wait some minimal time, not zero.

I guess there is some screen change, like a button animation, but that does not mean the action is finished and the next step can safely continue.
This is exactly why I prefer wait_still_screen, as you can define the min/max wait time based on screen activity.
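
The difference between the two waits can be illustrated with a small simulation. This is a simplified Python model, not the testapi implementation: the screen is a list of frames sampled once per tick (an assumed 0.1 s sampling rate), and both function bodies are illustrative re-implementations that only share their names with the real functions.

```python
def wait_screen_change(frames, start, timeout_ticks):
    """Ticks waited until the screen first differs from the start frame."""
    reference = frames[start]
    end = min(start + timeout_ticks, len(frames) - 1)
    for tick in range(start + 1, end + 1):
        if frames[tick] != reference:
            return tick - start
    return None  # no change within the timeout


def wait_still_screen(frames, start, still_ticks, timeout_ticks):
    """Ticks waited until the screen was unchanged for still_ticks."""
    last_change = start
    end = min(start + timeout_ticks, len(frames) - 1)
    for tick in range(start + 1, end + 1):
        if frames[tick] != frames[tick - 1]:
            last_change = tick
        elif tick - last_change >= still_ticks:
            return tick - start
    return None  # screen never settled within the timeout


# Key press at tick 0: short button animation, loading, then the next dialog.
frames = (
    ["dialog A"]
    + ["button pressed"] * 2
    + ["loading"] * 10
    + ["dialog B"] * 40
)

changed_after = wait_screen_change(frames, 0, 50)
still_after = wait_still_screen(frames, 0, 20, 50)

print(changed_after, frames[changed_after])  # 1 button pressed
print(still_after, frames[still_after])      # 33 dialog B
```

In this model wait_screen_change returns on the button animation after a single tick, while wait_still_screen only returns once "dialog B" has been stable for 20 ticks.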

send_key is often combined with wait_still_screen or wait_screen_change, even though send_key already accepts wait_screen_change as a parameter to wait after the key press. I guess most people don't know that send_key has a wait_screen_change parameter, and it is still questionable whether that is a sufficient wait function.

Btw, I thought the parameter was not working, since it used wait_idle, which became deprecated. In one test a send_key_and_wait helper was created [1]. Could wait_still_screen become a parameter of send_key, either replacing wait_screen_change or as a second parameter alongside it?

Acceptance criteria

  • AC1: Unify implementation of max_interval/wait_still_screen/wait_screen_change for send_key and other testapi functions.
  • AC2: Reduce I/O overhead proactively

Suggestions

  • AC1:
    • Extend send_key with a wait_still_screen option
    • Refactor wait logic for better re-use
  • AC2:
    • Enable and disable QEMU_DISABLE_SNAPSHOTS systematically
    • Evaluate resource usage e.g. CPU load, I/O causing timeouts

[1] https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/tests/yast2_gui/yast2_instserver.pm#L27

History

#1 Updated by cdywan 4 months ago

Note for the benefit of drive-by readers: This is based on previous notes, and will need some refinement. I intend to identify some actionable points from this.

#2 Updated by cdywan 4 months ago

  • Description updated (diff)

#3 Updated by cdywan 4 months ago

dzedro wrote:

send_key is often combined with wait_still_screen or wait_screen_change, even though send_key already accepts wait_screen_change as a parameter to wait after the key press. I guess most people don't know that send_key has a wait_screen_change parameter, and it is still questionable whether that is a sufficient wait function.

Btw, I thought the parameter was not working, since it used wait_idle, which became deprecated. In one test a send_key_and_wait helper was created [1]. Could wait_still_screen become a parameter of send_key, either replacing wait_screen_change or as a second parameter alongside it?

[1] https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/tests/yast2_gui/yast2_instserver.pm#L27

So the current version of send_key_and_wait is basically send_key $key, $stilltime, $timeout//=5; wait_still_screen $stilltime, $timeout;. On average it's called with a $stilltime of 2 seconds... so I see two attractive options:

  1. Extending send_key with a new $wait_still_screen option, like type_string/password, would make sense for consistency/simplicity
  2. Adding send_key_still_screen which defaults to 2 seconds stilltime and 5 seconds timeout.

To further put this to the test, we could merge all calls of the send_key_and_wait function and see if that works well.
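
Both options can be sketched side by side. This is a Python stand-in rather than the Perl testapi: RecordingApi is a hypothetical fake that only records calls, and the keyword names and the way the wait is chained after the key press are assumptions for illustration. Only the defaults of 2 seconds stilltime and 5 seconds timeout come from the numbers above.

```python
class RecordingApi:
    """Hypothetical stand-in that records calls instead of driving a VM."""

    def __init__(self):
        self.calls = []

    def press(self, key):
        self.calls.append(("press", key))

    def wait_still_screen(self, stilltime, timeout):
        self.calls.append(("wait_still_screen", stilltime, timeout))


def send_key(api, key, wait_still_screen=None, timeout=5):
    """Option 1: send_key grows an optional wait_still_screen argument."""
    api.press(key)
    if wait_still_screen is not None:
        api.wait_still_screen(wait_still_screen, timeout)


def send_key_still_screen(api, key, stilltime=2, timeout=5):
    """Option 2: a dedicated helper with 2 s stilltime and 5 s timeout."""
    send_key(api, key, wait_still_screen=stilltime, timeout=timeout)


option1 = RecordingApi()
send_key(option1, "alt-n", wait_still_screen=2)

option2 = RecordingApi()
send_key_still_screen(option2, "alt-n")

plain = RecordingApi()
send_key(plain, "ret")

print(option1.calls)
print(option2.calls)
print(plain.calls)
```

In this sketch both options end up issuing the same calls, so the choice is mostly about API surface: option 1 keeps a single entry point, option 2 gives the common case a name with sensible defaults.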

#4 Updated by okurz 4 months ago

dzedro should we move this to "openQA project" or is your intention to discuss this within the scope of "test maintainers" mainly?

I think you are raising many important points and it is good to see that you care about general performance and stability which I only see in a very limited scope among users of openQA.

In general I think we have a lot of potential for improvement, as we are mainly using a system from around 2013-2015 and there have been a lot of improvements in the domain of mainly virtualized environments since then, e.g. newer qemu features, virtio, qcow tuning parameters, the filesystems from which machines are started, etc.

#5 Updated by cdywan 4 months ago

dzedro wrote:

One option is decreasing IO load on workers: snapshots on the qemu backend are taken basically everywhere, even where they don't make much sense, e.g. small jobs (one or a few tests), multimachine tests and installations.

I add QEMU_DISABLE_SNAPSHOTS=1 to most QAM suites. There are installations followed by further tests, but failures there are mostly random timeout, typing or similar issues, and the test is restarted anyway. It could make sense to disable snapshots by default in suites where snapshots are a waste of IO; exceptions could override this with QEMU_DISABLE_SNAPSHOTS=0.

Maybe the setting should be moved to a different layer? The job group, not the test, for instance.

#6 Updated by dzedro 4 months ago

okurz wrote:

dzedro should we move this to "openQA project" or is your intention to discuss this within the scope of "test maintainers" mainly?

I think you are raising many important points and it is good to see that you care about general performance and stability which I only see in a very limited scope among users of openQA.

In general I think we have a lot of potential for improvement, as we are mainly using a system from around 2013-2015 and there have been a lot of improvements in the domain of mainly virtualized environments since then, e.g. newer qemu features, virtio, qcow tuning parameters, the filesystems from which machines are started, etc.

Anybody is welcome to discuss such a change.

cdywan, I would vote for option 1: extend the existing send_key.
I was disabling it in specific tests where snapshots are pointless. When you have a big group and don't know exactly where you want snapshots, then set it per test.

#7 Updated by cdywan 2 months ago

  • Status changed from New to In Progress
  • Target version set to Current Sprint

dzedro wrote:

cdywan I would vote for option 1. extend existing send_key.

https://github.com/os-autoinst/os-autoinst/pull/1418

#8 Updated by cdywan 2 months ago

  • Status changed from In Progress to Feedback

I closed the above PR for now since I don't necessarily want to address just the obvious symptoms, but it already helped inspire some interesting conversations.

#9 Updated by cdywan 14 days ago

  • Description updated (diff)
