action #18144

closed

[tools] restart ipmi management controller before every ipmi job

Added by RBrownSUSE about 7 years ago. Updated about 7 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
Start date: 2017-03-29
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)

Description

IPMI cards are unreliable, but they seem to be quite reliable if you restart them.

Restarting the card with "ipmitool mc reset hot" before running a test would be a sensible way to keep them reliable.
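The proposal above could be sketched as a small pre-job hook. This is only an illustration, not what os-autoinst does: the function names, the `lanplus` interface, and the retry and delay values are assumptions. Note that ipmitool documents only `warm` and `cold` as arguments to `mc reset`, so the sketch accepts just those two modes.

```python
import subprocess
import time

# ipmitool documents `mc reset <warm|cold>`; other modes are rejected here.
VALID_RESET_MODES = ("warm", "cold")


def build_ipmitool_command(host, user, password, *args):
    """Return an ipmitool argv list for a remote BMC reached over lanplus."""
    return ["ipmitool", "-I", "lanplus", "-H", host,
            "-U", user, "-P", password, *args]


def reset_management_controller(host, user, password, mode="cold",
                                run=subprocess.run, sleep=time.sleep,
                                retries=30, delay=10):
    """Reset the BMC, then poll `mc info` until it answers again.

    `run` and `sleep` are injectable so the logic can be exercised
    without real hardware. Returns True once the controller responds,
    False if it never does within `retries` attempts.
    """
    if mode not in VALID_RESET_MODES:
        raise ValueError(f"unsupported reset mode: {mode!r}")
    reset_cmd = build_ipmitool_command(host, user, password, "mc", "reset", mode)
    run(reset_cmd, check=True)
    info_cmd = build_ipmitool_command(host, user, password, "mc", "info")
    for _ in range(retries):
        sleep(delay)  # the controller needs time to come back up
        result = run(info_cmd, check=False)
        if result.returncode == 0:
            return True
    return False
```

A job runner would call `reset_management_controller(...)` before starting a test and skip or fail the job if it returns False. The polling step matters because a freshly reset controller refuses sessions for a while.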


Subtasks 1 (0 open, 1 closed)

action #18826: [tools] Investigate serial over lan disconnects for ipmi (Resolved, RBrownSUSE, 2017-03-29)

Related issues 2 (1 open, 1 closed)

Related to openQA Infrastructure - action #17816: [functional][u][ipmi] Error: Unable to establish IPMI v2 / RMCP+ session (Rejected, 2017-03-20)

Related to openQA Tests - action #13914: [qe-core][functional][ipmi] wait_serial does not get expected output because ipmi console connection is closed (New, 2016-09-27)

Actions #1

Updated by okurz about 7 years ago

  • Related to action #17816: [functional][u][ipmi] Error: Unable to establish IPMI v2 / RMCP+ session added
Actions #2

Updated by okurz about 7 years ago

  • Related to action #13914: [qe-core][functional][ipmi] wait_serial does not get expected output because ipmi console connection is closed added
Actions #3

Updated by RBrownSUSE about 7 years ago

  • Assignee set to RBrownSUSE
  • Priority changed from Normal to Urgent
  • Target version set to Milestone 7

Evaluated as important for Milestone 7

Actions #4

Updated by RBrownSUSE about 7 years ago

Still investigating. https://github.com/os-autoinst/os-autoinst/pull/767 should let os-autoinst give us more helpful information.

Actions #5

Updated by okurz about 7 years ago

the PR should be in. Is https://openqa.suse.de/tests/901944#step/textinfo/3 related to that maybe?

Actions #6

Updated by RBrownSUSE about 7 years ago

> the PR should be in

No, who are you to say what should or shouldn't be in?

> Is https://openqa.suse.de/tests/901944#step/textinfo/3 related to that maybe?

No, but thank you for your opinion

Actions #7

Updated by okurz about 7 years ago

RBrownSUSE wrote:

> > the PR should be in
>
> No, who are you to say what should or shouldn't be in?

"should be in" as in: "Hm, based on what I heard about when the installation was last deployed, the content of the pull request is probably already in effect on the corresponding workers."

Actions #9

Updated by coolo about 7 years ago

Could you guys please stop abusing this issue? I already removed okurz's tag - see #6

Actions #10

Updated by RBrownSUSE about 7 years ago

  • Status changed from New to Resolved

Serial disconnect issues were resolved by https://github.com/os-autoinst/os-autoinst/pull/777.

The serial keep-alive workaround has been removed, as it is no longer beneficial now that the console reconnects automatically.

No iKVM issues have been reported since the regular nightly restart of the card was introduced, so there is no evidence that the management controller needs to be restarted on every job.
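To illustrate why automatic reconnection makes a keep-alive unnecessary, here is a minimal generic sketch of a reader that reopens a dropped console session. It is not the code from the pull request above: `connect` is a hypothetical callable returning an iterable of console lines, and the retry limit and delay are assumed values.

```python
import time


def read_with_reconnect(connect, max_reconnects=5, delay=1.0, sleep=time.sleep):
    """Yield lines from a console, reopening the session when it drops.

    `connect` is a hypothetical factory that opens a new console session
    and returns an iterable of lines; it is expected to raise
    ConnectionError (while iterating) when the session is closed.
    """
    reconnects = 0
    while True:
        console = connect()
        try:
            for line in console:
                yield line
            return  # the console ended cleanly, nothing left to read
        except ConnectionError:
            reconnects += 1
            if reconnects > max_reconnects:
                raise  # give up and let the caller see the failure
            sleep(delay)  # brief pause before opening a new session
```

With this pattern the reader recovers from a disconnect after the fact, so there is no need to send periodic traffic just to keep the session open.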
