action #132860

closed

openqa-piworker is unstable and needs regular power-cycles size:M

Added by osukup about 1 year ago. Updated 3 months ago.

Status: Resolved
Priority: Low
Assignee:
Category: -
Target version:
Start date: 2023-07-17
Due date: 2024-02-27
% Done: 0%
Estimated time:
Description

Observation

https://gitlab.suse.de/openqa/salt-pillars-openqa/-/jobs/1694765

The only thing found in the logs:
salt_ping.log:

Currently the following minions are down:
8d7
< "openqa-piworker.qa.suse.de"
===================
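
The affected hostnames can be pulled out of such a log with a small filter. The following is a minimal sketch, assuming that down minions always appear as diff-style lines of the form `< "hostname"` as in the excerpt above; the function name `down_minions` is hypothetical and not part of the existing pipeline.

```shell
#!/bin/sh
# Print the hostnames that a diff-style salt_ping.log reports as down.
# Assumption: each down minion is listed on a line of the form: < "hostname"
down_minions() {
    sed -n 's/^< "\(.*\)"$/\1/p'
}

# Example using the log excerpt from this ticket
down_minions <<'EOF'
Currently the following minions are down:
8d7
< "openqa-piworker.qa.suse.de"
===================
EOF
```

Lines that do not match the assumed pattern are ignored, so an empty result means the log lists no down minions in that format.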

Acceptance criteria

  • AC1: we are able to process openQA Raspberry Pi bare-metal jobs consistently over some days

Suggestions

  • Identify the cause of the regression
    • likely something related to the hardware RTC
    • check whether it just works with Leap 15.5, since we wanted to upgrade anyway
    • could be a recent kernel update, so try a downgrade
  • If it is really necessary and you have exhausted all other remote-controllable options, go to the office, unplug the RTC and reinstall the system, assuming the installation was broken or corrupted, or apply whatever other on-site fix is needed
  • As Plan Y (if options A to X failed), buy a Wi-Fi and Bluetooth adapter for an IPMI-controllable server and use that instead to connect to the Raspberry Pi bare-metal test instances

Rollback steps

  • Add back salt key with ssh osd "sudo salt-key -y -a openqa-piworker.qa.suse.de"

Related issues 3 (0 open, 3 closed)

Related to openQA Infrastructure - action #132902: Check and document PDU connection of nibali.qe.nue2.suse.org (Resolved, okurz, 2023-07-17)
Related to openQA Infrastructure - action #134735: [alert] openQA piworker openqa-piworker: host up alert (Resolved, dheidler, 2023-08-28)
Related to openQA Project - action #160089: Handle uncommented package lock on "kernel-default" and "kernel-default-base" on openqa-piworker (Resolved, jbaier_cz, 2024-05-08)
