action #105828

closed

4-7 logreport emails a day cause alert fatigue size:M

Added by livdywan over 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
Start date: 2022-02-03
Due date: 2022-02-17
% Done: 0%
Estimated time:
Description

Observation

Thanks to #80812, o3 can send out emails. Unfortunately, we're now getting 4-7 logreport emails a day from openqa-monitor@ariel.suse-dmz.opensuse.org, and we're not keeping up with handling all of them.
The emails are sent by a cronjob running https://github.com/os-autoinst/openqa-logwarn

Examples:

[2022-02-02T09:44:45.023821Z] [error] [pid:6229] Cannot read symbolic link (/opt/openqa-trigger-from-obs/openSUSE:Leap:15.4:ARM:Images:ToTest/.run_last): No such file or directory
[2022-02-02T08:07:52.883567Z] [warn] [pid:22053] Ignoring invalid group {"name":"38"} when creating new job 2172324
[2022-02-02T02:30:10.097604Z] [warn] [pid:10722] Unable to wakeup scheduler: Request timeout
[2022-02-02T02:30:14.810226Z] [error] [pid:13594] Publishing opensuse.openqa.job.restart failed: Connect timeout (9 attempts left)
[2022-02-01T15:38:12.281868Z] [warn] [pid:28556] fatal: Invalid revision range 745485c7527687dab875e0ab0f4c96f730e26dea..8f56d6708e2211a41fe189635a3bbebd2f9d0be8
[2022-02-01T15:38:12.282093Z] [error] [pid:28556] cmd returned 32768
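
The examples above appear to share one layout: a timestamp, a level, a pid and a free-form message. As a rough aid for triaging a report, here is a minimal Python sketch that splits such lines into fields and counts them per level. The layout is only inferred from the examples in this ticket, and `parse_line` and `count_levels` are hypothetical helper names, not part of openqa-logwarn.

```python
import re
from collections import Counter

# Assumed layout, inferred from the examples in this ticket:
# [timestamp] [level] [pid:N] message
LINE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] \[(?P<level>\w+)\] "
    r"\[pid:(?P<pid>\d+)\] (?P<message>.*)$"
)


def parse_line(line):
    """Return timestamp, level, pid and message as a dict, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["pid"] = int(entry["pid"])
    return entry


def count_levels(lines):
    """Count parsed lines per log level, skipping lines that do not match the layout."""
    parsed = (parse_line(line) for line in lines)
    return Counter(entry["level"] for entry in parsed if entry is not None)


# Two lines copied from the examples in this ticket.
SAMPLE = [
    "[2022-02-02T02:30:10.097604Z] [warn] [pid:10722] Unable to wakeup scheduler: Request timeout",
    "[2022-02-01T15:38:12.282093Z] [error] [pid:28556] cmd returned 32768",
]

if __name__ == "__main__":
    for level, count in count_levels(SAMPLE).items():
        print(level, count)
```

Grouping by message instead of level would be a small change and might show which issues dominate a given report.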

Acceptance criteria

Suggestions

  • Team up to investigate all of the current issues
  • Create individual tickets for the issues and blocklist them by proposing changes to https://github.com/os-autoinst/openqa-logwarn (changes are effective ~10 minutes after a merge)
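
To illustrate the blocklisting idea from the suggestions above, here is a minimal Python sketch that drops log lines matching patterns for issues that already have a ticket. It is not how openqa-logwarn is implemented: the `KNOWN_ISSUES` list, its regular expressions and the helper names are assumptions made for illustration, and the variable parts (job IDs, group names, commit hashes) are generalized by hand.

```python
import re

# Hypothetical patterns, written for this sketch and not copied from
# openqa-logwarn. Each one covers an issue tracked in a related ticket.
KNOWN_ISSUES = [
    # #105900
    re.compile(r"Unable to wakeup scheduler: Request timeout"),
    # #105909
    re.compile(r'Ignoring invalid group \{"name":"\d+"\} when creating new job \d+'),
    # #105918
    re.compile(r"fatal: Invalid revision range [0-9a-f]+\.\.[0-9a-f]+"),
]


def is_known(line):
    """Return True if the line matches any pattern for an already tracked issue."""
    return any(pattern.search(line) for pattern in KNOWN_ISSUES)


def unreported(lines):
    """Keep only the lines that match no known-issue pattern."""
    return [line for line in lines if not is_known(line)]


# Lines copied from the examples in this ticket.
SAMPLE = [
    '[2022-02-02T08:07:52.883567Z] [warn] [pid:22053] Ignoring invalid group {"name":"38"} when creating new job 2172324',
    "[2022-02-02T02:30:10.097604Z] [warn] [pid:10722] Unable to wakeup scheduler: Request timeout",
    "[2022-02-01T15:38:12.282093Z] [error] [pid:28556] cmd returned 32768",
]

if __name__ == "__main__":
    for line in unreported(SAMPLE):
        print(line)
```

With these patterns only the `cmd returned 32768` line would still be reported. Keeping the patterns narrow should avoid hiding new, unrelated errors that happen to share a prefix.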

Related issues 18 (10 open, 8 closed)

Related to openQA Project - action #105930: o3 logreports - empty warnings/errors (New, 2022-02-03)
Related to openQA Project - action #105924: o3 logreports - Template was modified (Rejected, mkittler, 2022-02-03)
Related to openQA Project - action #105921: o3 logreports - Cannot read symbolic link (/opt/openqa-trigger-from-obs/.../.run_last): No such file or directory (New, 2022-02-03)
Related to openQA Project - action #105918: o3 logreports - fatal: Invalid revision range sha1..sha2 (New, 2022-02-03)
Related to openQA Project - action #105915: o3 logreports - Needle file <filename>.json not found within /var/.../opensuse/needles (New, 2022-02-03)
Related to openQA Project - action #105909: o3 logreports - Ignoring invalid group {"name":"123"} when creating new job (Resolved, okurz, 2022-02-03)
Related to openQA Project - action #105903: o3 logreports - Publishing opensuse.openqa.job.restart failed: Connect timeout (9 attempts left) (New, 2022-02-03)
Related to openQA Project - action #105900: o3 logreports - Unable to wakeup scheduler: Request timeout (New, 2022-02-03)
Related to openQA Project - action #106245: o3 logreports - Testsuite 'xyz' is invalid (Rejected, mkittler)
Related to openQA Project - action #106613: o3 logreports DBIx::Class::Row::update(): Can't update OpenQA::Schema::Result::JobLocks row not found (Workable, 2022-02-10)
Related to openQA Infrastructure - action #106880: Job template name ... is already used in job group error logged on o3 size:M (Resolved, mkittler, 2022-02-16)
Related to openQA Infrastructure - action #107023: cmd returned 31744 repeatedly reported on o3 (New, 2022-02-03)
Related to openQA Project - action #137765: logwarn does not work on new o3 (anymore?) size:M (Resolved, 2023-10-11)
Copied from openQA Infrastructure - action #95293: Monitoring alerts on errors in logs on o3 (was: followup to: error on "Next & previous results": ajax error message and no results showing up) size:M (Resolved, okurz, 2021-07-09)
Copied to openQA Infrastructure - action #106756: cmd returned 32768 repeatedly reported on o3 (New, 2022-02-03)
Copied to openQA Project - action #106759: Worker xyz has no heartbeat (400 seconds), restarting repeatedly reported on o3 size:M (Resolved, livdywan, 2022-02-03)
Copied to openQA Infrastructure - action #106760: DBI Exception: DBD::Pg::st execute failed: number of parameters must be between 0 and 65535 repeatedly reported on o3 (New, 2022-02-03)
Copied to openQA Infrastructure - action #108533: o3 logreports DBI Exception: DBD::Pg::st execute failed: ERROR: invalid input syntax for type integer (Resolved, tinita, 2022-03-31)