QA (public) &raquo; openQA Project (public) &raquo; openQA Infrastructure (public)

openQA Project (public) - Ready

Category:

Target version:

Start date:

2022-12-19

Due date:

% Done:

100%

Estimated time:

(Total: 0.00 h)

Tags:

infra

Description

Observation¶

After upgrade to Leap 15.4 seems like qa-power8-4 wasn't properly rebooting. okurz could connect over ssh but asked for a password where normally we should have SSH keys. mkittler had varying success with "power reset" and "power cycle". Over SoL mkittler saw petitboot

Acceptance criteria¶

AC1: Both qa-power8-4 and qa-power8-5 are used for production openQA jobs again
AC2: Stable over reboot
AC3: Alerts unpaused

Further information¶

At this point a product issue has been created, it contains a summary of what has already been tried and found out as of 2022-08-04: https://bugzilla.opensuse.org/show_bug.cgi?id=1202138

Suggestions¶

Refresh memory about "petitboot" in https://progress.opensuse.org/projects/openqav3/wiki/#PPC-specific-configurations :)
Fix it on qa-power8-5
Then on qa-power8-4 (currently powered off, see #114379)

Rollback steps¶

After https://bugzilla.opensuse.org/show_bug.cgi?id=1202138 is resolved remove kernel-default and util-linux zypper package locks on qa-power8-4, qa-power8-5, power8.openqanet.opensuse.org
Upgrade kernel+OS on qa-power8-4, qa-power8-5, power8.openqanet.opensuse.org

Files

Download all files

power8-5.log.txt (2.07 MB) power8-5.log.txt		mkittler, 2022-07-22 14:09
ipmi-qa-power8-5-boot-loop.txt (1.08 MB) ipmi-qa-power8-5-boot-loop.txt		mkittler, 2022-10-24 14:07

Subtasks 1 (0 open — 1 closed)

action #122158: [alert] qa-power8-4-kvm host up alert - machine not up, nothing obvious on SoL but IPMI works size:M

Resolved

dheidler

2022-12-19

Related issues 10 (1 open — 9 closed)

Related to openQA Infrastructure (public) - action #115208: failed-systemd-services: logrotate-openqa alerting on and off size:M

Resolved

livdywan

Related to openQA Infrastructure (public) - action #116437: Recover qa-power8-5 size:M

Resolved

Related to openQA Infrastructure (public) - action #116473: Add OSD PowerPC workers to automatic recovery we already have for ARM workers

New

2022-09-12

Related to openQA Infrastructure (public) - action #116743: [alert] QA-Power8-5-kvm: host up alert

Resolved

nicksinger

2022-09-19

2022-10-04

Related to openQA Infrastructure (public) - action #117229: [tools] openqa failing on worker QA-Power8-5-kvm

Resolved

2022-09-26

2022-10-13

Related to openQA Infrastructure (public) - coordination #117268: [epic] Handle reduced PowerPC ressources

Resolved

2022-07-21

Related to openQA Infrastructure (public) - action #118024: Ensure all PPC workers are upgraded after kernel regression resolved size:M

Resolved

2022-10-11

Related to openQA Infrastructure (public) - action #119290: [alert] Packet loss between worker hosts and other hosts alert

Rejected

2022-10-24

Related to openQA Infrastructure (public) - action #116078: Recover o3 worker kerosene formerly known as power8, restore IPMI access size:M

Resolved

2022-08-31

Copied from openQA Infrastructure (public) - action #114526: recover openqaworker14

Resolved