action #94312

closed

[Alerting] web UI: Too many Minion job failures alert - likely due to openqa-client declared deprecated

Added by okurz almost 3 years ago. Updated almost 3 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
Start date: 2021-06-21
Due date: 2021-07-06
% Done: 0%
Estimated time:

Description

Observation

Alert email received on 2021-06-20 01:39Z. Details about the alert are at https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?tab=alert&viewPanel=19&orgId=1

Too many Minion jobs have failed on openqa.suse.de. Review the failed jobs on https://openqa.suse.de/minion/jobs?state=failed and create a ticket if there is not already one and the failed jobs aren't just a symptom of a bigger problem (e.g. database outage). After investigation, remove the failed jobs (possibly keeping one instance of each kind of failure around). For the general log of the Minion job queue, check out journalctl -fu openqa-gru.service and /var/log/openqa_gru on openqa.suse.de.

Details from openqa-gru on osd:

-- Logs begin at Sun 2021-06-20 03:30:00 CEST, end at Mon 2021-06-21 10:29:02 CEST. --
Jun 21 05:46:46 openqa openqa-gru[1446]: WARNING: openqa-client is deprecated and planned to be removed in the future. Please use openqa-cli>
Jun 21 05:46:48 openqa openqa-gru[1446]: https://openqa.suse.de/tests/6297977 : Unknown issue, to be reviewed -> https://openqa.suse.de/test>
Jun 21 05:46:48 openqa openqa-gru[1446]: Likely the error is within this log excerpt, last lines before shutdown:
Jun 21 05:46:48 openqa openqa-gru[1446]: ---
Jun 21 05:46:48 openqa openqa-gru[1446]: [2021-06-21T05:46:17.919 CEST] [info] ::: backend::baseclass::die_handler: Backend process died, ba>
Jun 21 05:46:48 openqa openqa-gru[1446]:   Virtio terminal and svirt serial terminal do not support send_key. Use
Jun 21 05:46:48 openqa openqa-gru[1446]:   type_string (possibly with an ANSI/XTERM escape sequence), or switch to a
Jun 21 05:46:48 openqa openqa-gru[1446]:   console which sends key presses, not terminal codes.
Jun 21 05:46:48 openqa openqa-gru[1446]:    at /usr/lib/os-autoinst/consoles/serial_screen.pm line 68.
Jun 21 05:46:48 openqa openqa-gru[1446]:           consoles::serial_screen::send_key(consoles::serial_screen=HASH(0x56014f31dde0), HASH(0x56>
Jun 21 05:46:48 openqa openqa-gru[1446]:           backend::baseclass::bouncer(backend::qemu=HASH(0x560150b54ce0), "send_key", HASH(0x56014f>
Jun 21 05:46:48 openqa openqa-gru[1446]:           backend::baseclass::send_key(backend::qemu=HASH(0x560150b54ce0), HASH(0x56014f27f2c0)) ca>
Jun 21 05:46:48 openqa openqa-gru[1446]:           backend::baseclass::handle_command(backend::qemu=HASH(0x560150b54ce0), HASH(0x56014f2ebb5>
Jun 21 05:46:48 openqa openqa-gru[1446]:           backend::baseclass::check_socket(backend::qemu=HASH(0x560150b54ce0), IO::Handle=GLOB(0x56>
Jun 21 05:46:48 openqa openqa-gru[1446]:           backend::qemu::check_socket(backend::qemu=HASH(0x560150b54ce0), IO::Handle=GLOB(0x56014ef>
Jun 21 05:46:48 openqa openqa-gru[1446]:           eval {...} called at /usr/lib/os-autoinst/backend/baseclass.pm line 191
Jun 21 05:46:48 openqa openqa-gru[1446]: ---
Jun 21 05:46:48 openqa openqa-gru[1446]: 1 unknown issues to be reviewed:
Jun 21 05:46:48 openqa openqa-gru[1446]:  - https://openqa.suse.de/tests/6297977 backend died: Virtio terminal and svirt serial ter
Jun 21 05:47:25 openqa openqa-gru[1446]: WARNING: openqa-client is deprecated and planned to be removed in the future. Please use openqa-cli>
Jun 21 05:47:25 openqa openqa-gru[1446]: https://openqa.suse.de/tests/6299083 : Unknown issue, to be reviewed -> https://openqa.suse.de/test>
Jun 21 05:47:25 openqa openqa-gru[1446]: Likely the error is within this log excerpt, last lines before shutdown:
Jun 21 05:47:25 openqa openqa-gru[1446]: ---
Jun 21 05:47:25 openqa openqa-gru[1446]: [2.0K blob data]
Jun 21 05:47:25 openqa openqa-gru[1446]: [1.2K blob data]
Jun 21 05:47:25 openqa openqa-gru[1446]:   
Jun 21 05:47:25 openqa openqa-gru[1446]: [2021-06-21T05:47:22.783 CEST] [debug] git fetch: remote: Total 0 (delta 0), reused 0 (delta 0), pa>
Jun 21 05:47:25 openqa openqa-gru[1446]:   
Jun 21 05:47:25 openqa openqa-gru[1446]: Could not find '7fe5802d21307dc25430282daefba72e8342a49f' in complete history at /usr/lib/os-autoin>
Jun 21 05:47:25 openqa openqa-gru[1446]: ---

So there are likely two problems:

  • openqa-client is used where openqa-cli should be used instead
  • git hashes cannot be found in the history
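
To gauge how often each of the two problems shows up, the openqa-gru journal can be filtered for the two messages seen in the excerpt above. The following is only a minimal sketch, assuming the message texts stay exactly as logged here; the function name summarize_gru_log and the output labels are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of openQA): count the two messages from the
# log excerpt. Reads journal text on stdin, prints one "label: count" line
# per problem.
summarize_gru_log() {
    log=$(cat)
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    deprecated=$(printf '%s\n' "$log" | grep -c 'openqa-client is deprecated' || true)
    missing=$(printf '%s\n' "$log" | grep -c "Could not find '[0-9a-f]*' in complete history" || true)
    printf 'deprecation warnings: %s\n' "$deprecated"
    printf 'missing git hashes: %s\n' "$missing"
}
```

On osd this could be fed with something like `journalctl -u openqa-gru.service --no-pager | summarize_gru_log`. Note that lines cut off by the pager (ending in ">") may hide parts of a message, so --no-pager is assumed.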

Acceptance criteria

  • AC1: Find where openqa-client is used and replace it with openqa-cli, likely in github.com/os-autoinst/scripts/
  • AC2: Ensure no out-of-the-ordinary Minion alerts remain
  • AC3: Alert fixed

Suggestions

  • Start by replacing uses of openqa-client in github.com/os-autoinst/scripts/ with openqa-cli
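
A possible first step for this suggestion is to list the files that still mention openqa-client. This is a rough sketch, assuming a local checkout of the scripts repository and a grep that supports -r and --exclude-dir (e.g. GNU grep); the function name find_openqa_client_uses is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list files below a directory that still mention
# openqa-client. Files that only mention "openqa-cli" are not listed, because
# the pattern requires the full string "openqa-client".
find_openqa_client_uses() {
    grep -rl --exclude-dir=.git -e 'openqa-client' "$1" | sort
}
```

Called as `find_openqa_client_uses path/to/scripts`, it prints one file path per line. Each hit still needs a manual look, since the arguments of the old and the new client may differ.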

Rollback
