action #41480

closed

[sle][functional][u][ipmi] Malfunction of openqaworker2:25 - Investigate, bring it back or repair it (WAS: remove openqaworker2:25 (IPMI machine) from OSD testing)

Added by zluo over 5 years ago. Updated over 4 years ago.

Status: Resolved
Priority: High
Category: -
Target version: SUSE QA - Milestone 27
Start date: 2018-09-24
Due date:
% Done: 0%
Estimated time:
Description

Observation

Please see https://progress.opensuse.org/issues/31375, comments #57 and #61, for details.
openqaworker2:25 causes trouble every time it is used, and first_boot fails whenever it runs on OSD.

Investigation

Hypotheses

  • H1 Issues caused by IPMI SUT machine (sp.fozzie.qa.suse.de [IPMI interface], fozzie-1.qa.suse.de [SUT]) - Rejected by E1-1
  • H2 Issues caused by IPMI WORKER machine (openqaworker2, jump host) - Rejected by E2-1
  • H3 Issues caused by openQA test module first_boot
  • H4 Issues caused by openQA's IPMI backend

Experiments

  • E1-1 Install SLE15-SP1 manually on sp.fozzie.qa.suse.de / fozzie-1.qa.suse.de mimicking openQA (ipmitool).
  • R1-1 The installation can be performed. Some impediments were confirmed when typing linuxrc parameters in the PXE boot menu through SOL.
  • E2-1 Install SLE15-SP1 manually from openqaworker2 into sp.fozzie.qa.suse.de / fozzie-1.qa.suse.de. (in progress)
  • R2-1 The installation can be performed. Some impediments were confirmed when typing linuxrc parameters in the PXE boot menu through SOL.
  • E3-1 Perform 10 local openQA job runs for scenario BTRFS to get statistics.
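
The manual steps in E1-1 and E2-1 could be scripted roughly as below. This is only a sketch: it assumes the BMC is reached over the `lanplus` interface and that a PXE boot override, a power reset and an SOL session are enough to mimic what openQA does. The user name and password are placeholders, and the helper names `ipmitool_cmd` and `manual_pxe_install_steps` are made up for illustration. The commands are only built and printed, not executed.

```python
import shlex
from typing import List


def ipmitool_cmd(host: str, user: str, password: str, *args: str) -> List[str]:
    """Build an ipmitool argv list for a remote BMC (assumes the lanplus interface)."""
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password, *args]


def manual_pxe_install_steps(host: str, user: str, password: str) -> List[List[str]]:
    """Assumed sequence: boot from PXE once, reset the machine, attach to the serial console."""
    return [
        ipmitool_cmd(host, user, password, "chassis", "bootdev", "pxe"),
        ipmitool_cmd(host, user, password, "chassis", "power", "reset"),
        ipmitool_cmd(host, user, password, "sol", "activate"),
    ]


if __name__ == "__main__":
    # Host name taken from the ticket; credentials are placeholders.
    for cmd in manual_pxe_install_steps("sp.fozzie.qa.suse.de", "USER", "PASSWORD"):
        print(shlex.join(cmd))
```

Running the same sequence once from a workstation (E1-1) and once from openqaworker2 (E2-1) keeps the two experiments comparable.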

Suggestions

  • Conduct a proper statistical analysis on this specific machine and find out which component fails most often. For this, the special worker class 64bit-ipmi_disabled_investigate_poo41480 can be used together with our approach for statistical investigation.
  • Identify differences of that worker to others and see if the worker/machine/backend/test is special
  • Come up with a fix in code or settings, or decide in what respect the hardware is broken/unusable and must be decommissioned or repaired (beware: costly decision!)
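
The statistical analysis suggested above (and the 10 runs in E3-1) could be tallied with something like the following sketch. It assumes each job run is available as an ordered list of (module, result) pairs, which is a made-up input shape and not the openQA API. The sample runs are placeholder data for illustration only, not real results from this machine.

```python
from collections import Counter
from typing import List, Optional, Tuple

# (module name, result), e.g. ("first_boot", "failed"); hypothetical input shape
ModuleResult = Tuple[str, str]


def first_failed_module(modules: List[ModuleResult]) -> Optional[str]:
    """Return the first module that failed in a run, or None if the run passed."""
    for name, result in modules:
        if result == "failed":
            return name
    return None


def failure_counts(runs: List[List[ModuleResult]]) -> Counter:
    """Count, per module, how many runs first failed there; passing runs go under '(passed)'."""
    counts: Counter = Counter()
    for modules in runs:
        counts[first_failed_module(modules) or "(passed)"] += 1
    return counts


# Placeholder data, NOT real results: three invented runs.
example_runs = [
    [("boot_from_pxe", "passed"), ("await_install", "passed"), ("first_boot", "failed")],
    [("boot_from_pxe", "failed")],
    [("boot_from_pxe", "passed"), ("await_install", "passed"), ("first_boot", "passed")],
]

if __name__ == "__main__":
    for module, count in failure_counts(example_runs).most_common():
        print(f"{module}: {count}/{len(example_runs)}")
```

Counting only the first failing module per run avoids blaming later modules for a failure that started earlier in the same job.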

Further information

Known pitfalls

  • When pressing ESC during the PXE boot menu countdown, a boot prompt appears where you can type linuxrc parameters. However, neither delete (->) nor backspace (<-) works there.
  • The PXE boot menu used by fozzie-1.qa.suse.de [SUT] has 3 levels:
    1. OS version menu (12, 15, 15-SP1...)
    2. Installation media source (NFS, FTP, HTTP...)
    3. Installation mode (SSH, VNC...)
  • When selecting a final entry and pressing TAB, the boot line appears and can be edited. However, backspace (<-) does not work, and typing or deleting characters at the right margin produces unexpected broken characters. CTRL-A (beginning of line) and CTRL-E (end of line) work. The cursor keys can be used to move, but pressing them too many times in quick succession, or holding them down, also produces unexpected broken characters.
  • The needed serial device to see the installer output is /dev/ttyS1
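
Given these pitfalls, one way to reduce typing errors over SOL is to prepare the boot parameters up front and send them slowly, one character at a time. The sketch below illustrates that idea. The baud rate 115200, the delay value and the helper names are assumptions, not values taken from this setup, and the demo writes to an in-memory buffer rather than a real SOL session.

```python
import io
import time
from typing import Dict, TextIO


def build_boot_params(params: Dict[str, str]) -> str:
    """Join key=value pairs into a single boot parameter string."""
    return " ".join(f"{key}={value}" for key, value in params.items())


def type_slowly(text: str, out: TextIO, delay: float = 0.05) -> None:
    """Write text one character at a time, pausing between characters.

    Backspace does not work at the boot prompt, so non-printable
    characters are rejected before anything is sent.
    """
    if not text.isprintable():
        raise ValueError("boot parameters must contain printable characters only")
    for char in text:
        out.write(char)
        out.flush()
        time.sleep(delay)


# /dev/ttyS1 is the serial device named above; the baud rate is an assumption.
boot_params = build_boot_params({"console": "ttyS1,115200"})

if __name__ == "__main__":
    buffer = io.StringIO()  # stand-in for the SOL session
    type_slowly(boot_params, buffer, delay=0.0)
    print(buffer.getvalue())
```

Validating the whole string before the first character is sent matters here because a typo cannot be corrected at the prompt.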

Related issues: 2 (0 open, 2 closed)

Related to openQA Tests - action #36027: [sle][functional][u][ipmi] test fails in boot_from_pxe - pxe boot menu doesn't show up at all (Resolved, xlai, 2017-10-20)

Blocks openQA Tests - action #31375: [sle][functional][ipmi][u][hard] test fails in first_boot - VNC installation on SLE 15 failed because of various issues (ipmi worker, first_boot, boot_from_pxe, await_install) (Rejected, SLindoMansilla, 2018-02-05)
