action #166136

closed

s390 LPAR s390ZL12 down and unable to boot - potential corrupted filesystem

Added by mgriessmeier 3 months ago. Updated 2 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2024-09-02
Due date: 2024-09-17
% Done: 0%
Estimated time:
Tags:

Description

Motivation

Since Sep 1st, around 03:30 AM, there has been an issue with s390 LPAR ZL12: it is down and unable to boot.

Initially reported in https://suse.slack.com/archives/C02CANHLANP/p1725240063815389
Also tracked in https://sd.suse.com/servicedesk/customer/portal/1/SD-166885

https://stats.openqa-monitor.qa.suse.de/d/GDs390zl12/dashboard-for-s390zl12?orgId=1&from=1725109294762&to=1725257278550 shows the timeframe of the event

Hardware messages in the zhmc could point to a potentially corrupted filesystem:

Central processor (CP) 0 in partition ZL12, entered disabled wait state. 
The disabled wait program status word (PSW) is 00020001800000000000000000012370. 
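
For reference, the PSW above can be decoded by hand. The following is a minimal sketch, assuming the standard 128-bit z/Architecture PSW layout (bit 0 is the most significant bit; bits 6, 7 and 13 are the I/O, external and machine-check masks; bit 14 is the wait bit; bits 31 and 32 select the addressing mode; bits 64-127 hold the instruction address). The function name `decode_psw` is hypothetical and not part of any tool used in this ticket. It only splits the word into fields and does not interpret what the address means for this particular boot failure.

```python
# Sketch: decode a 128-bit z/Architecture PSW given as 32 hex digits.
# Bit numbering follows the IBM convention: bit 0 is the most significant bit.
# decode_psw is a hypothetical helper written for illustration only.

def decode_psw(psw_hex: str) -> dict:
    if len(psw_hex) != 32:
        raise ValueError("expected 32 hex digits (128 bits)")
    value = int(psw_hex, 16)

    def bit(n: int) -> int:
        # Extract bit n, counting from the most significant bit.
        return (value >> (127 - n)) & 1

    io_mask, ext_mask, mchk_mask = bit(6), bit(7), bit(13)
    wait = bit(14)
    modes = {(1, 1): "64-bit", (0, 1): "31-bit", (0, 0): "24-bit"}
    return {
        "wait": bool(wait),
        "io_enabled": bool(io_mask),
        "external_enabled": bool(ext_mask),
        "machine_check_enabled": bool(mchk_mask),
        # A disabled wait is a wait state with all three interruption
        # classes masked off, so nothing can wake the CPU up again.
        "disabled_wait": bool(wait) and not (io_mask or ext_mask or mchk_mask),
        "addressing_mode": modes.get((bit(31), bit(32)), "invalid"),
        "instruction_address": hex(value & ((1 << 64) - 1)),
    }


if __name__ == "__main__":
    for field, val in decode_psw("00020001800000000000000000012370").items():
        print(f"{field}: {val}")
```

Under these assumptions the PSW from the hardware message decodes to a disabled wait in 64-bit addressing mode with the address field set to 0x12370, which matches the "entered disabled wait state" wording of the message.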

Serial output:

uncompression error
 -- System halted

Mitigations

  • Unsilence host_up alert

Related issues 1 (0 open, 1 closed)

Related to openQA Infrastructure - action #163778: [alert] host_up & Average Ping time (ms) alert for s390zl12&s390zl13 size:S | Resolved | nicksinger | 2024-07-11 | 2024-08-06
