action #16320

closed

Random timeouts while waiting for serial output when using the virtio backend

Added by rpalethorpe over 7 years ago. Updated almost 7 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Feature requests
Target version:
-
Start date:
2017-01-30
Due date:
% Done:
0%

Estimated time:

Description

Observation

Tests time out while waiting for output from an LTP test, for example https://openqa.suse.de/tests/743383.

It appears that the command text is sent to the SUT, but no response is received. The serial log [1] for the above test shows that the last test ran and returned a result. However, nothing is read by the virtio console backend.

In this test, https://openqa.opensuse.org/tests/342884 [2], one call to wait_serial fails, the next succeeds, and the one after that fails again. The calls that pass do not use regular expressions for matching.

As a rough estimate, this bug occurs in 1%-5% of tests.

Problem

  • H1, QEMU is writing bytes to the log, but not to the socket.
  • H2, The virtio backend function read_until is not reading bytes from the socket correctly.
  • H3, One or more of the read buffers in read_until are being dropped.
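
To make H3 concrete, here is a minimal Python sketch of how dropping a read buffer could lose a match. It is not the os-autoinst code (the real read_until is part of the virtio console backend and has not been inspected here); the function names, the chunk contents and the marker pattern are all assumptions made for illustration.

```python
import re


def match_dropping_buffers(chunks, pattern):
    """Match each chunk on its own and discard it afterwards (the H3 failure mode)."""
    regex = re.compile(pattern)
    for chunk in chunks:
        if regex.search(chunk):
            return True
    return False


def match_accumulating(chunks, pattern):
    """Append every chunk to one buffer and match against the whole buffer."""
    regex = re.compile(pattern)
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        if regex.search(buffer):
            return True
    return False


# Hypothetical marker "LTP-DONE-0" that arrives split across two reads.
chunks = ["test output\nLTP-", "DONE-0\n"]
dropped = match_dropping_buffers(chunks, r"LTP-DONE-\d+")
kept = match_accumulating(chunks, r"LTP-DONE-\d+")
print(dropped, kept)  # False True
```

If something like this is happening, the failure would depend on where the read boundaries fall, which would fit a bug that only shows up in a small fraction of tests.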

Suggestions

  • A0, Inspect more test failures.
  • A1, Run the virtio terminal unit tests repeatedly.
  • A2, Modify the virtio test module to perform a stress test.
  • A3, Investigate how QEMU passes the data.

I am currently waiting for a crash dump of the SUT to be attempted after a freeze.

Workaround

  • W0, Retrigger the job manually.
  • W1, Retrigger the job automatically after a timeout.
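
W1 amounts to a bounded retry loop. The Python sketch below shows only that control flow; run_job is a hypothetical callable standing in for whatever actually triggers a job, and the attempt limit is an arbitrary assumption. It does not use any openQA API.

```python
def run_with_retries(run_job, max_attempts=3):
    """Call run_job until it succeeds or max_attempts timeouts have occurred."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_job(), attempt
        except TimeoutError:
            if attempt == max_attempts:
                raise  # give up and report the timeout


# Hypothetical job that times out on the first attempt only.
calls = {"count": 0}


def flaky_job():
    calls["count"] += 1
    if calls["count"] == 1:
        raise TimeoutError("no serial output")
    return "passed"


result, attempts = run_with_retries(flaky_job)
print(result, attempts)  # passed 2
```

Capping the number of attempts matters because a genuine hang in the SUT would otherwise be retried forever and hidden from the results.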

[1] The serial log is written by QEMU.
[2] There is no virtio serial log for this test, possibly because O3 (openqa.opensuse.org) needs updating.


Related issues 1 (0 open, 1 closed)

Related to openQA Tests - action #12350: [tools] version of os-autoinst on malbec+overdrive2 should be same as other workers (using salt) (was: looks like old version), Resolved, RBrownSUSE, 2016-09-09
