action #28633

closed

[CaaSP] Increase the timeout for updates

Added by pgeorgiadis over 6 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Enhancement to existing tests
Target version: -
Start date: 2017-11-30
Due date:
% Done: 100%
Estimated time:
Difficulty:
Description

Observation

openQA test in scenario caasp-2.0-CaaSP-DVD-Incidents-x86_64-QAM-CaaSP-controller@qam-caasp_x86_64 fails in stack_update

Reproducible

Fails since (at least) Build :5901:cloud-init.1511368937

Expected result

Last good: :5887:python-PyJWT.1510348854 (or more recent)

Further details

Always latest result in this scenario: latest

This maintenance update contains updates for all the containers. As a result, it needs more time than the current timeout allows.

# Test died: command 'ssh admin.openqa.test './update.sh -s http://download.suse.de/ibs/SUSE:/Maintenance:/6053/SUSE_Updates_SUSE-CAASP_ALL_x86_64/' | tee /dev/ttyS0 | grep EXIT_OK' timed out at /var/lib/openqa/cache/tests/caasp/tests/caasp/stack_update.pm line 34.

My proposal is to increase this timeout (currently 120 seconds) significantly:

assert_script_run "ssh admin.openqa.test './update.sh -s $repo' | tee /dev/$serialdev | grep EXIT_OK", 120;

To give you an estimate, this is going to download nearly 1 GiB per node:

The following 25 packages are going to be upgraded:
file file-magic kernel-firmware kubernetes-salt libgcc_s1 libgcrypt20 libmagic1 libstdc++6 perl perl-base sles12-caasp-dex-image sles12-dnsmasq-nanny-image sles12-haproxy-image sles12-kubedns-image
sles12-mariadb-image sles12-openldap-image sles12-pause-image sles12-pv-recycler-node-image sles12-salt-api-image sles12-salt-master-image sles12-salt-minion-image sles12-sidecar-image sles12-tiller-image
sles12-velum-image timezone
25 packages to upgrade.
Overall download size: 983.9 MiB
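
As a rough back-of-the-envelope check of why 120 seconds is too short, the sketch below computes the sustained download rate a node would need to fetch 983.9 MiB within a given timeout. It deliberately ignores installation and node restart time, so the real requirement is stricter. The function name and the list of candidate timeouts are made up for illustration.

```python
def required_rate_mib_s(download_mib: float, timeout_s: float) -> float:
    """Minimum sustained download rate (MiB/s) needed to finish within timeout_s."""
    if timeout_s <= 0:
        raise ValueError("timeout must be positive")
    return download_mib / timeout_s


DOWNLOAD_MIB = 983.9  # "Overall download size" reported by the update above

# Candidate timeouts in seconds: the current value plus two larger ones (illustrative).
for timeout_s in (120, 600, 1200):
    rate = required_rate_mib_s(DOWNLOAD_MIB, timeout_s)
    print(f"{timeout_s:>5} s timeout -> at least {rate:.2f} MiB/s for the download alone")
```

With the current 120-second timeout the download alone would need more than 8 MiB/s sustained, before any package is installed.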

I would increase the timeout to at least ~10 minutes, going by this timed run:

real    9m31.274s
user    0m0.017s
sys     0m0.011s

I would like to increase the timeout to 20 minutes.
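
To make the comparison concrete, here is a minimal sketch that converts the `real` line of the `time` output into seconds and checks it against the current 120-second timeout and a 20-minute (1200-second) one. The helper name is hypothetical, and the figures only reflect the single timed run quoted above.

```python
import re


def parse_real_time(text: str) -> float:
    """Return the 'real' duration in seconds from bash `time` output."""
    match = re.search(r"^real\s+(\d+)m([\d.]+)s", text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'real' line found")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Output of the timed run quoted in this ticket.
TIME_OUTPUT = """\
real    9m31.274s
user    0m0.017s
sys     0m0.011s
"""

elapsed = parse_real_time(TIME_OUTPUT)
for timeout_s in (120, 1200):
    verdict = "enough" if elapsed < timeout_s else "too short"
    print(f"timeout {timeout_s} s: {verdict} "
          f"(run took {elapsed:.0f} s, timeout/run = {timeout_s / elapsed:.2f})")
```

A 1200-second timeout leaves roughly a factor of two over this one measurement; slower mirrors or larger updates would eat into that margin.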

Actions #2

Updated by pgeorgiadis over 6 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 50

Hm, 20 minutes does not seem to be enough for the whole update. The process is split into two parts:

  1. transactional update
  2. velum + salt + restart of the nodes

I would like to increase this and find a sweet spot. Testing ...

Actions #3

Updated by pgeorgiadis over 6 years ago

This one also needs adjustment: assert_screen 'velum-bootstrap-done', $nodes * 150;
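
For reference, this timeout scales linearly with the number of nodes. A small sketch of what `$nodes * 150` works out to for a few cluster sizes follows; the node counts are hypothetical examples, not the sizes used in the failing scenario.

```python
PER_NODE_S = 150  # per-node multiplier taken from the assert_screen call above


def bootstrap_timeout_s(nodes: int, per_node_s: int = PER_NODE_S) -> int:
    """Timeout in seconds for the 'velum-bootstrap-done' check, linear in node count."""
    if nodes < 1:
        raise ValueError("at least one node is required")
    return nodes * per_node_s


# Hypothetical cluster sizes, for illustration only.
for nodes in (1, 2, 4, 6):
    timeout = bootstrap_timeout_s(nodes)
    print(f"{nodes} node(s): {timeout} s ({timeout / 60:.1f} min)")
```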

Actions #4

Updated by mkravec over 6 years ago

I agree with this change: (120 -> 1200)
assert_script_run "ssh admin.openqa.test './update.sh -s $repo' | tee /dev/$serialdev | grep EXIT_OK", 1200;

I don't agree with changing this (it depends on the number of nodes and the reboot time; until there is a failed job with this problem, we don't need to increase it):
assert_screen 'velum-bootstrap-done', $nodes * 150;
