action #112196

closed

[alert][sporadic] QA-Power8-4-kvm: Disk I/O time alert size:M

Added by okurz over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Start date: 2022-06-08
Due date:
% Done: 0%
Estimated time:

Description

Observation

[Alerting] QA-Power8-4-kvm: Disk I/O time alert

Metric name sdj
Value 26600.000

http://stats.openqa-monitor.qa.suse.de/d/WDQA-Power8-4-kvm/worker-dashboard-qa-power8-4-kvm?tab=alert&viewPanel=56720&orgId=1

Problem

We recently worked on Disk I/O time alerts in #110269, including for QA-Power8-4-kvm. Either we need to relax the thresholds even further, there is a real hardware problem, or we need a different solution, e.g. a longer pending time.
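To illustrate why a longer pending time could help with sporadic alerts, here is a minimal sketch, not the actual alert rule. It assumes the alert is evaluated at regular intervals and fires only once the value has stayed above the threshold for a number of consecutive evaluations. The function name `alert_fires`, the threshold of 10000 and the sample series are hypothetical; only the value 26600 comes from the alert above.

```python
from typing import Sequence


def alert_fires(samples: Sequence[float], threshold: float, pending_samples: int) -> bool:
    """Return True if the value stays above `threshold` for at least
    `pending_samples` consecutive evaluations (a simplified pending time)."""
    consecutive = 0
    for value in samples:
        if value > threshold:
            consecutive += 1
            if consecutive >= pending_samples:
                return True
        else:
            # Any reading at or below the threshold resets the pending state.
            consecutive = 0
    return False


# Hypothetical series: a single spike at the reported value 26600,
# and a sustained period of high I/O time for comparison.
spike = [1200.0, 900.0, 26600.0, 1100.0, 800.0]
sustained = [1200.0, 26600.0, 25000.0, 27000.0, 900.0]

ASSUMED_THRESHOLD = 10000.0  # assumption, not the configured value

print(alert_fires(spike, ASSUMED_THRESHOLD, pending_samples=1))      # spike alone triggers
print(alert_fires(spike, ASSUMED_THRESHOLD, pending_samples=3))      # spike is ignored
print(alert_fires(sustained, ASSUMED_THRESHOLD, pending_samples=3))  # sustained load still triggers
```

With a pending window of three evaluations the single spike no longer fires, while a sustained problem still does. The trade-off is that real problems are reported later.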

Acceptance criteria

  • AC1: No more alerts

Suggestions

  • Check that there are no actual hardware issues, e.g. using smartctl, and follow the steps in https://progress.opensuse.org/issues/110269#note-12
  • Bump the values again
  • Why do we monitor the disk sdj? The machine seems to have only two real physical devices, sda and sdb. journalctl | grep sdj reports:

    May 22 03:33:08 QA-Power8-4-kvm kernel: sd 7:0:0:3: [sdj] Attached SCSI removable disk
    May 29 03:33:08 QA-Power8-4-kvm kernel: sd 7:0:0:3: [sdj] Attached SCSI removable disk
    Jun 05 03:33:07 QA-Power8-4-kvm kernel: sd 8:0:0:3: [sdj] Attached SCSI removable disk
    

We should make sure that we either do not monitor such devices or do not have them at all. The devices always show up during boot.
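One possible way to find out which block devices are worth monitoring is to read the kernel's `removable` flag from sysfs, which should be set for devices the kernel logs as "removable disk". The following is a minimal sketch under that assumption. The helper name `non_removable_disks` is hypothetical and this is not how the monitoring is currently configured. It filters only on the removable flag, so other virtual devices (e.g. loop devices) would still need to be excluded separately.

```python
from pathlib import Path
from typing import List


def non_removable_disks(sys_block: Path = Path("/sys/block")) -> List[str]:
    """List block devices whose sysfs `removable` attribute is 0.

    `sys_block` is a parameter so the function can be pointed at a
    test directory instead of the real /sys/block.
    """
    names = []
    for dev in sorted(sys_block.iterdir()):
        flag = dev / "removable"
        if not flag.is_file():
            # Entry without a removable attribute: skip it.
            continue
        if flag.read_text().strip() == "0":
            names.append(dev.name)
    return names


if __name__ == "__main__":
    # On the affected worker one would expect sda and sdb to be listed
    # and sdj to be missing; this has not been verified on the machine.
    print(non_removable_disks())
```

The resulting list could then serve as an explicit allow-list for the disk I/O panels instead of monitoring every device that shows up during boot.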


Related issues 1 (0 open, 1 closed)

Related to openQA Infrastructure (public) - action #110269: [alert] QA-Power8-4-kvm + QA-Power8-5-kvm: Disk I/O time alert size:M (Resolved, kraih)

Actions #1

Updated by okurz over 2 years ago

  • Related to action #110269: [alert] QA-Power8-4-kvm + QA-Power8-5-kvm: Disk I/O time alert size:M added
Actions #2

Updated by livdywan over 2 years ago

  • Subject changed from [alert][sporadic] QA-Power8-4-kvm: Disk I/O time alert to [alert][sporadic] QA-Power8-4-kvm: Disk I/O time alert size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #3

Updated by okurz over 2 years ago

  • Description updated (diff)
Actions #4

Updated by okurz over 2 years ago

  • Status changed from Workable to In Progress
  • Assignee set to okurz
Actions #5

Updated by okurz over 2 years ago

  • Due date set to 2022-06-22
  • Status changed from In Progress to Feedback
Actions #6

Updated by okurz over 2 years ago

  • Due date deleted (2022-06-22)
  • Status changed from Feedback to Resolved