action #112196

[alert][sporadic] QA-Power8-4-kvm: Disk I/O time alert size:M

Added by okurz 2 months ago. Updated 2 months ago.

Status: Resolved
Priority: High
Assignee:
Target version:
Start date: 2022-06-08
Due date:
% Done: 0%
Estimated time:
Description

Observation

[Alerting] QA-Power8-4-kvm: Disk I/O time alert

Metric name sdj
Value 26600.000

http://stats.openqa-monitor.qa.suse.de/d/WDQA-Power8-4-kvm/worker-dashboard-qa-power8-4-kvm?tab=alert&viewPanel=56720&orgId=1

Problem

We only recently worked on Disk I/O time alerts in #110269, including for QA-Power8-4-kvm. Either we need to relax the alert values even more, there is a real hardware problem, or we need to find a different solution, e.g. a longer pending time.

Acceptance criteria

  • AC1: No more Disk I/O time alerts for QA-Power8-4-kvm

Suggestions

  • Check that there are no actual hardware issues, e.g. using smartctl; follow the steps in https://progress.opensuse.org/issues/110269#note-12
  • Bump the values again
  • Why do we monitor the disk sdj? The machine seems to have only two real physical devices, sda and sdb. journalctl | grep sdj reports:

    May 22 03:33:08 QA-Power8-4-kvm kernel: sd 7:0:0:3: [sdj] Attached SCSI removable disk
    May 29 03:33:08 QA-Power8-4-kvm kernel: sd 7:0:0:3: [sdj] Attached SCSI removable disk
    Jun 05 03:33:07 QA-Power8-4-kvm kernel: sd 8:0:0:3: [sdj] Attached SCSI removable disk
    

We should make sure that we either do not monitor such devices or that they are not present on the machine at all. The devices always show up during boot.
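
To find all such devices rather than only sdj, the journal output can be filtered for the kernel message shown above. The following is a minimal sketch, assuming the message format stays as in the excerpt; the function name removable_devices and the script name are hypothetical and not part of any existing tooling.

```python
import re
import sys

# Matches kernel messages of the form seen in the journal excerpt, e.g.
#   sd 7:0:0:3: [sdj] Attached SCSI removable disk
# Assumption: the wording of the message is stable across kernel versions.
REMOVABLE_RE = re.compile(r"\[(sd[a-z]+)\] Attached SCSI removable disk")


def removable_devices(log_lines):
    """Return the sorted, de-duplicated device names reported as removable."""
    found = set()
    for line in log_lines:
        match = REMOVABLE_RE.search(line)
        if match:
            found.add(match.group(1))
    return sorted(found)


if __name__ == "__main__":
    # Reads journal lines from standard input and prints one device per line.
    for name in removable_devices(sys.stdin):
        print(name)
```

It could be run as journalctl | python3 removable_devices.py on the worker. Any device it lists is a candidate for exclusion from the Disk I/O time alert, but that decision still needs a check against the actual hardware.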


Related issues

Related to openQA Infrastructure - action #110269: [alert] QA-Power8-4-kvm + QA-Power8-5-kvm: Disk I/O time alert size:M (Resolved)

History

#1 Updated by okurz 2 months ago

  • Related to action #110269: [alert] QA-Power8-4-kvm + QA-Power8-5-kvm: Disk I/O time alert size:M added

#2 Updated by cdywan 2 months ago

  • Subject changed from [alert][sporadic] QA-Power8-4-kvm: Disk I/O time alert to [alert][sporadic] QA-Power8-4-kvm: Disk I/O time alert size:M
  • Description updated (diff)
  • Status changed from New to Workable

#3 Updated by okurz 2 months ago

  • Description updated (diff)

#4 Updated by okurz 2 months ago

  • Status changed from Workable to In Progress
  • Assignee set to okurz

#5 Updated by okurz 2 months ago

  • Due date set to 2022-06-22
  • Status changed from In Progress to Feedback

#6 Updated by okurz 2 months ago

  • Due date deleted (2022-06-22)
  • Status changed from Feedback to Resolved
