action #33319

closed

[sle][functional][u][hard] test fails in hyperv_upload_assets - timeout failing, likely network issue

Added by JERiveraMoya over 6 years ago. Updated over 6 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
michalnowak
Category:
Bugs in existing tests
Start date:
2018-03-15
Due date:
2018-06-05
% Done:
0%
Estimated time:
Difficulty:
hard

Description

Observation

openQA test in scenario sle-12-SP4-Server-DVD-x86_64-create_hdd_textmode@svirt-hyperv fails in hyperv_upload_assets

We will use this ticket to track how often this happens, but it is probably a network issue, so increasing the timeout will not help.

Reproducible

Fails since (at least) Build 0152

Expected result

Last good: (unknown) (or more recent)

Further details

Always latest result in this scenario: latest

Actions #1

Updated by okurz over 6 years ago

  • Target version set to future

ok, let's track.

Actions #2

Updated by okurz over 6 years ago

  • Subject changed from [sle][functional] test fails in hyperv_upload_assets - timeout failing, likely network issue to [sle][functional][u] test fails in hyperv_upload_assets - timeout failing, likely network issue
  • Status changed from New to Feedback
  • Assignee set to okurz

Actions #3

Updated by okurz over 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: create_hdd_textmode@svirt-hyperv
https://openqa.suse.de/tests/1543334

Actions #4

Updated by okurz over 6 years ago

  • Subject changed from [sle][functional][u] test fails in hyperv_upload_assets - timeout failing, likely network issue to [sle][functional][u][fast] test fails in hyperv_upload_assets - timeout failing, likely network issue
  • Due date set to 2018-04-24
  • Status changed from Feedback to In Progress
  • Target version changed from future to Milestone 15

So we still see this appearing. The last failing job in the mentioned scenario is https://openqa.suse.de/tests/1588299

I am not sure what I can understand from the logfiles. I see the following in os-autoinst:

[2018-04-05T15:12:22.0961 CEST] [debug] no change: 1.4s
[2018-04-05T15:12:23.0962 CEST] [debug] no change: 0.4s
[2018-04-05T15:12:25.0041 CEST] [debug] MATCH(hyperv_upload_assets-hyperv_image_uploaded-20180228:0.00)
[2018-04-05T15:12:25.0265 CEST] [debug] >>> testapi::_check_backend_response: match=hyperv_image_uploaded timed out after 3000
[2018-04-05T15:12:25.0337 CEST] [debug] test hyperv_upload_assets failed
…
[2018-04-05T15:12:32.0501 CEST] [info] uploading sle-12-SP4-x86_64-0236-textmode@svirt-hyperv-uefi.vhdx
[2018-04-05T15:23:10.0361 CEST] [info] uploading video.ogv
[2018-04-05T15:23:11.0870 CEST] [info] uploading vars.json
[2018-04-05T15:23:11.0898 CEST] [info] uploading serial0.txt
[2018-04-05T15:23:11.0932 CEST] [info] uploading autoinst-log.txt

and from the worker log:

[2018-04-05T12:54:38.0236 CEST] [info] 26272: WORKING 1588299
[2018-04-05T15:12:32.0410 CEST] [info] uploading logs_from_installation_system-y2logs.tar.bz2
[2018-04-05T15:12:32.0501 CEST] [info] uploading sle-12-SP4-x86_64-0236-textmode@svirt-hyperv-uefi.vhdx
[2018-04-05T15:12:32.0501 CEST] [info] sle-12-SP4-x86_64-0236-textmode@svirt-hyperv-uefi.vhdx multi-chunk upload
[2018-04-05T15:12:53.0666 CEST] [info] sle-12-SP4-x86_64-0236-textmode@svirt-hyperv-uefi.vhdx: 3360 chunks
[2018-04-05T15:12:53.0666 CEST] [info] sle-12-SP4-x86_64-0236-textmode@svirt-hyperv-uefi.vhdx: chunks of 1000000 bytes each
[2018-04-05T15:12:53.0834 CEST] [info] sle-12-SP4-x86_64-0236-textmode@svirt-hyperv-uefi.vhdx: Processing chunk 1/3360 avg speed ~976.562KB/s
…
[2018-04-05T15:23:10.0358 CEST] [info] sle-12-SP4-x86_64-0236-textmode@svirt-hyperv-uefi.vhdx: Processing chunk 3360/3360 avg speed ~622.562KB/s
[2018-04-05T15:23:10.0361 CEST] [info] uploading video.ogv
[2018-04-05T15:23:11.0869 CEST] [info] uploading vars.json
[2018-04-05T15:23:11.0898 CEST] [info] uploading serial0.txt
[2018-04-05T15:23:11.0932 CEST] [info] uploading autoinst-log.txt
[2018-04-05T15:23:12.0013 CEST] [info] uploading worker-log.txt

So am I reading this right: the upload of the SUT image over the jump host fails on a timeout, but then the same image is uploaded like a "logfile" by the worker to the openQA webui afterwards? Does this make sense?

The screen https://openqa.suse.de/tests/1588299#step/hyperv_upload_assets/12 shows that the image upload within the hyperv host worked fine, but there was simply no sle12sp4 needle. I saw that a new "upload finished" needle, hyperv_upload_assets-hyperv_image_uploaded-20180411, was created by michalnowak just today, so let's see if a retrigger works better -> https://openqa.suse.de/tests/1610410 and https://openqa.suse.de/tests/1610408
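As a rough cross-check of the worker log above, the chunk count, chunk size and the first and last "Processing chunk" timestamps can be turned into a total size and an end-to-end rate. This is only a sketch: the constants are copied from the log excerpt, the helper names are made up for illustration, and the sub-second part of the timestamps is ignored because its format is not clear from the excerpt.

```python
from datetime import datetime

# Values copied from the worker log excerpt above.
CHUNKS = 3360
CHUNK_BYTES = 1_000_000
FIRST_CHUNK = "2018-04-05T15:12:53.0834"
LAST_CHUNK = "2018-04-05T15:23:10.0358"


def parse_log_time(stamp):
    """Parse a log timestamp, dropping the (ambiguous) sub-second part."""
    return datetime.strptime(stamp.split(".")[0], "%Y-%m-%dT%H:%M:%S")


def upload_summary():
    """Return (total bytes, elapsed seconds, mean rate in MB/s)."""
    total_bytes = CHUNKS * CHUNK_BYTES
    elapsed = (parse_log_time(LAST_CHUNK) - parse_log_time(FIRST_CHUNK)).total_seconds()
    return total_bytes, elapsed, total_bytes / elapsed / 1e6


if __name__ == "__main__":
    total, elapsed, rate = upload_summary()
    print(f"{total / 1e9:.2f} GB in {elapsed:.0f} s, about {rate:.1f} MB/s")
```

This only describes the worker's upload to the openQA webui, which took roughly ten minutes. It says nothing about the transfer within the Hyper-V host, and the rate is not necessarily comparable to the per-chunk "avg speed" figure printed in the log.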

Actions #5

Updated by michalnowak over 6 years ago

Yes, I think that it is what it says it is - a missing needle :).

I created the needle in https://openqa.suse.de/tests/1588299 then rescheduled it to https://openqa.suse.de/tests/1610408 (where it passed). I just did not bother to reschedule https://openqa.suse.de/tests/1588299 because (1) I assume it will pass as well, (2) it's a week-old build, and (3) I am looking for a time window to restart the Hyper-V host so that the April updates are applied. Sorry for the confusion :).

Actions #6

Updated by okurz over 6 years ago

  • Status changed from In Progress to Feedback
  • Assignee changed from okurz to michalnowak

And now it fails way earlier, on a black screen: https://openqa.suse.de/tests/1610410#step/bootloader/2 shows "[2018-04-11T22:40:20.0433 CEST] [debug] >>> testapi::_check_backend_response: match=inst-bootmenu timed out after 90". Any ideas what it could be now?

Actions #7

Updated by michalnowak over 6 years ago

  • Status changed from Feedback to Workable

What is really important here is: "Attempt 1/7: Command failed with Start-VM : The operation cannot be performed while the object is in use." That usually means that another job is downloading resources (here the ISO) that we also require, and we can't use them before they are downloaded. In this line of thinking, if the other job was downloading it slowly (the 12 SP4 image is 3.7 GB), the timeout of 7 * 5m was correctly triggered.
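The "7 * 5m" budget described above can be pictured as a bounded retry loop. The following is a minimal sketch, not the actual os-autoinst or svirt backend code: `run_with_retries`, its parameters and the choice to wait after every failed attempt are assumptions made for illustration. Only the attempt count (7) and the 5-minute wait come from this comment.

```python
import time

ATTEMPTS = 7            # from "Attempt 1/7" in the job output
WAIT_SECONDS = 5 * 60   # from the "7 * 5m" budget mentioned above


def run_with_retries(command, attempts=ATTEMPTS, wait_seconds=WAIT_SECONDS,
                     sleep=time.sleep):
    """Call command() until it returns True; return the successful attempt number.

    Assumption: the loop waits wait_seconds after every failed attempt and
    raises TimeoutError once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        if command():
            return attempt
        print(f"Attempt {attempt}/{attempts}: Command failed")
        sleep(wait_seconds)
    raise TimeoutError(f"gave up after {attempts * wait_seconds / 60:.0f} minutes")
```

Under these assumptions, a resource that stays locked by another job's download for longer than 35 minutes exhausts all seven attempts, which matches the reading that the timeout was triggered correctly.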

Actions #8

Updated by mgriessmeier over 6 years ago

  • Due date changed from 2018-04-24 to 2018-05-08
  • Target version changed from Milestone 15 to Milestone 16

Actions #9

Updated by mgriessmeier over 6 years ago

  • Due date changed from 2018-05-08 to 2018-05-22

Actions #10

Updated by riafarov over 6 years ago

  • Subject changed from [sle][functional][u][fast] test fails in hyperv_upload_assets - timeout failing, likely network issue to [sle][functional][u][hard] test fails in hyperv_upload_assets - timeout failing, likely network issue
  • Difficulty set to hard

Actions #11

Updated by mgriessmeier over 6 years ago

  • Due date changed from 2018-05-22 to 2018-06-05

@michal: can you give us an update here?

Actions #12

Updated by michalnowak over 6 years ago

  • Status changed from Workable to Rejected

Infra issue.
