action #94312
closed: [Alerting] web UI: Too many Minion job failures alert - likely due to openqa-client declared deprecated
Start date: 2021-06-21
Due date: 2021-07-06
% Done: 0%
Estimated time:
Description
Observation
Alert email received on 2021-06-20 01:39Z. Details about the alert: https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?tab=alert&viewPanel=19&orgId=1
Too many Minion jobs have failed on openqa.suse.de. Review the failed jobs on https://openqa.suse.de/minion/jobs?state=failed and create a ticket if there's not already one and the failed jobs aren't just a symptom of a bigger problem (e.g. database outage). After investigation, remove the failed jobs (possibly keeping one instance of each kind of failure around). For the general log of the Minion job queue, check out journalctl -fu openqa-gru.service and /var/log/openqa_gru on openqa.suse.de.
Details from openqa-gru on osd:
-- Logs begin at Sun 2021-06-20 03:30:00 CEST, end at Mon 2021-06-21 10:29:02 CEST. --
Jun 21 05:46:46 openqa openqa-gru[1446]: WARNING: openqa-client is deprecated and planned to be removed in the future. Please use openqa-cli>
Jun 21 05:46:48 openqa openqa-gru[1446]: https://openqa.suse.de/tests/6297977 : Unknown issue, to be reviewed -> https://openqa.suse.de/test>
Jun 21 05:46:48 openqa openqa-gru[1446]: Likely the error is within this log excerpt, last lines before shutdown:
Jun 21 05:46:48 openqa openqa-gru[1446]: ---
Jun 21 05:46:48 openqa openqa-gru[1446]: [2021-06-21T05:46:17.919 CEST] [info] ::: backend::baseclass::die_handler: Backend process died, ba>
Jun 21 05:46:48 openqa openqa-gru[1446]: Virtio terminal and svirt serial terminal do not support send_key. Use
Jun 21 05:46:48 openqa openqa-gru[1446]: type_string (possibly with an ANSI/XTERM escape sequence), or switch to a
Jun 21 05:46:48 openqa openqa-gru[1446]: console which sends key presses, not terminal codes.
Jun 21 05:46:48 openqa openqa-gru[1446]: at /usr/lib/os-autoinst/consoles/serial_screen.pm line 68.
Jun 21 05:46:48 openqa openqa-gru[1446]: consoles::serial_screen::send_key(consoles::serial_screen=HASH(0x56014f31dde0), HASH(0x56>
Jun 21 05:46:48 openqa openqa-gru[1446]: backend::baseclass::bouncer(backend::qemu=HASH(0x560150b54ce0), "send_key", HASH(0x56014f>
Jun 21 05:46:48 openqa openqa-gru[1446]: backend::baseclass::send_key(backend::qemu=HASH(0x560150b54ce0), HASH(0x56014f27f2c0)) ca>
Jun 21 05:46:48 openqa openqa-gru[1446]: backend::baseclass::handle_command(backend::qemu=HASH(0x560150b54ce0), HASH(0x56014f2ebb5>
Jun 21 05:46:48 openqa openqa-gru[1446]: backend::baseclass::check_socket(backend::qemu=HASH(0x560150b54ce0), IO::Handle=GLOB(0x56>
Jun 21 05:46:48 openqa openqa-gru[1446]: backend::qemu::check_socket(backend::qemu=HASH(0x560150b54ce0), IO::Handle=GLOB(0x56014ef>
Jun 21 05:46:48 openqa openqa-gru[1446]: eval {...} called at /usr/lib/os-autoinst/backend/baseclass.pm line 191
Jun 21 05:46:48 openqa openqa-gru[1446]: ---
Jun 21 05:46:48 openqa openqa-gru[1446]: 1 unknown issues to be reviewed:
Jun 21 05:46:48 openqa openqa-gru[1446]: - https://openqa.suse.de/tests/6297977 backend died: Virtio terminal and svirt serial ter
Jun 21 05:47:25 openqa openqa-gru[1446]: WARNING: openqa-client is deprecated and planned to be removed in the future. Please use openqa-cli>
Jun 21 05:47:25 openqa openqa-gru[1446]: https://openqa.suse.de/tests/6299083 : Unknown issue, to be reviewed -> https://openqa.suse.de/test>
Jun 21 05:47:25 openqa openqa-gru[1446]: Likely the error is within this log excerpt, last lines before shutdown:
Jun 21 05:47:25 openqa openqa-gru[1446]: ---
Jun 21 05:47:25 openqa openqa-gru[1446]: [2.0K blob data]
Jun 21 05:47:25 openqa openqa-gru[1446]: [1.2K blob data]
Jun 21 05:47:25 openqa openqa-gru[1446]:
Jun 21 05:47:25 openqa openqa-gru[1446]: [2021-06-21T05:47:22.783 CEST] [debug] git fetch: remote: Total 0 (delta 0), reused 0 (delta 0), pa>
Jun 21 05:47:25 openqa openqa-gru[1446]:
Jun 21 05:47:25 openqa openqa-gru[1446]: Could not find '7fe5802d21307dc25430282daefba72e8342a49f' in complete history at /usr/lib/os-autoin>
Jun 21 05:47:25 openqa openqa-gru[1446]: ---
So there are likely two problems:
- openqa-client is still used where openqa-cli should be used instead
- git hashes cannot be found in the history
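To get a rough idea of how often each of the two problems shows up, the journal output could be tallied with a small script. The following is only a sketch: the two patterns are copied from the log excerpt above, while the category names and the `classify` helper are made up for illustration and are not part of openQA.

```python
import re
from collections import Counter

# Patterns copied from the log excerpt; the category names are hypothetical.
PATTERNS = {
    "deprecated_client": re.compile(r"openqa-client is deprecated"),
    "missing_git_hash": re.compile(r"Could not find '[0-9a-f]{40}' in complete history"),
}


def classify(lines):
    """Count how many lines match each known problem pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Two sample lines taken from the excerpt (truncated as shown there).
SAMPLE = [
    "Jun 21 05:46:46 openqa openqa-gru[1446]: WARNING: openqa-client is deprecated and planned to be removed in the future. Please use openqa-cli>",
    "Jun 21 05:47:25 openqa openqa-gru[1446]: Could not find '7fe5802d21307dc25430282daefba72e8342a49f' in complete history at /usr/lib/os-autoin>",
]

for name, count in sorted(classify(SAMPLE).items()):
    print(f"{name}: {count}")
```

The same function could be fed the lines of `journalctl --no-pager -u openqa-gru.service` instead of the sample list. `--no-pager` avoids the pager and the line truncation it causes, which is visible as the trailing `>` in the excerpt.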
Acceptance criteria
- AC1: Find where openqa-client is used and replace it with openqa-cli, likely in github.com/os-autoinst/scripts/
- AC2: Ensure no out-of-the-ordinary Minion alerts remain
- AC3: Alert fixed
Suggestions
- Start by replacing uses of openqa-client in github.com/os-autoinst/scripts/ with openqa-cli
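One possible way to locate the remaining uses is to scan a local checkout for the string `openqa-client`. The sketch below assumes the repository has already been cloned. The `find_usages` helper and the output format are hypothetical, and the script only lists candidates; it does not rewrite anything, since the openqa-cli arguments for each call have to be worked out by hand.

```python
import re
import sys
from pathlib import Path

# Whole-word match; "openqa-cli" alone does not match this pattern.
NEEDLE = re.compile(r"\bopenqa-client\b")


def find_usages(root):
    """Return (relative path, line number, line) for every openqa-client mention."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and anything inside the .git metadata folder.
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file
        for lineno, line in enumerate(text.splitlines(), start=1):
            if NEEDLE.search(line):
                hits.append((str(path.relative_to(root)), lineno, line.strip()))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 find_usages.py path/to/checkout
    for rel_path, lineno, line in find_usages(sys.argv[1]):
        print(f"{rel_path}:{lineno}: {line}")
```

Each reported line is a candidate for manual replacement with an equivalent openqa-cli call.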
Rollback
- Unpause alert "web UI: Too many Minion job failures alert" again on https://stats.openqa-monitor.qa.suse.de/alerting/list?state=not_ok