action #106880
closed
Job template name ... is already used in job group error logged on o3 size:M
Added by livdywan almost 3 years ago.
Updated almost 3 years ago.
Description
Observation
[2022-02-16T06:49:08.407202Z] [error] Job template name 'security_tpm2_swtpm' with opensuse-Tumbleweed-DVD-x86_64 and 64bit is already used in job group 'Development Tumbleweed'
Acceptance criteria
- AC1: User errors are not logged (only reported in the ui to the user)
- AC2: Internal error messages are not shown to the user (just generic messages)
Suggestions
- Remove the error on the openQA side, since this looks like it should be surfaced in API responses and UX as a user error
- Distinguish different error classes that are both logged and used in API error responses
- Priority changed from Normal to High
- Target version set to Ready
Adding to backlog with "High" to address urgency of alerting, i.e. exclude from alerting with openqa-logwarn
okurz wrote:
Adding to backlog with "High" to address urgency of alerting, i.e. exclude from alerting with openqa-logwarn
Imho dropping the message from openQA is what we should do right away if we agree that it's the right solution, otherwise we're just doubling the work
cdywan wrote:
Imho dropping the message from openQA is what we should do right away if we agree that it's the right solution, otherwise we're just doubling the work
I think that might not be trivial though.
The message comes from a die in create_or_update_job_template, which is called inside a try block; the catch block collects all errors, logs them and returns them via the API.
We still should log unexpected errors (e.g. from the database), but errors like this should not be logged.
So we should differentiate between user errors and unexpected errors.
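A minimal sketch of that distinction, written in Python for illustration only. This is not the openQA code (which is Perl) and not what the eventual fix looks like; `UserError`, `apply_changes` and the in-memory `existing` mapping are hypothetical names, and `create_or_update_job_template` here is only a stand-in for the real function. The idea is that a dedicated error class marks user errors, so the handler can return them verbatim without logging, while anything else is logged and replaced by a generic message:

```python
import logging

log = logging.getLogger("job_templates")


class UserError(Exception):
    """Hypothetical error class: caused by user input, safe to show verbatim."""


def create_or_update_job_template(existing, group, name, product, machine):
    """Stand-in for the real function: reject a template used in another group."""
    key = (name, product, machine)
    owner = existing.get(key)
    if owner is not None and owner != group:
        raise UserError(
            f"Job template name '{name}' with {product} and {machine} "
            f"is already used in job group '{owner}'"
        )
    existing[key] = group


def apply_changes(existing, group, templates):
    """Collect the error messages that should be returned to the user."""
    errors = []
    for name, product, machine in templates:
        try:
            create_or_update_job_template(existing, group, name, product, machine)
        except UserError as e:
            # AC1: user errors are reported to the user but not logged.
            errors.append(str(e))
        except Exception:
            # AC2: unexpected errors are logged in full; the user only
            # sees a generic message.
            log.exception("Unexpected error while updating job templates")
            errors.append("Internal error, see server log")
    return errors
```

With this shape, the duplicate-name case from the observation ends up in the returned error list only, whereas a database failure would still reach the log.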
- Related to action #105828: 4-7 logreport emails a day cause alert fatigue size:M added
- Subject changed from Job template name ... is already used in job group error logged on o3 to Job template name ... is already used in job group error logged on o3 size:M
- Description updated (diff)
- Description updated (diff)
- Status changed from New to Workable
- Related to action #105909: o3 logreports - Ignoring invalid group {"name":"123"} when creating new job added
- Related to action #105924: o3 logreports - Template was modified added
- Related to action #106245: o3 logreports - Testsuite 'xyz' is invalid added
- Status changed from Workable to In Progress
- Due date set to 2022-03-08
Setting due date based on mean cycle time of SUSE QE Tools
- Status changed from In Progress to Feedback
- Related to deleted (action #105909: o3 logreports - Ignoring invalid group {"name":"123"} when creating new job)
- Due date deleted (2022-03-08)
- Status changed from Feedback to Resolved
https://github.com/os-autoinst/openQA/pull/4520 is merged as well as your https://github.com/os-autoinst/openqa-logwarn/pull/28
As the logwarn change is deployed within minutes I triggered a manual deployment on o3 (zypper dup) so that we will not run into this message overnight. I tested the changed functionality by trying to put duplicate job template definitions into https://openqa.opensuse.org/admin/job_templates/74 ("Development Other") of a job template that is already defined in https://openqa.opensuse.org/admin/job_templates/38 ("Development Tumbleweed"):
defaults:
  x86_64:
    machine: 64bit
    priority: 55
products:
  opensuse-Tumbleweed-KDE-Live-x86_64:
    distri: opensuse
    flavor: KDE-Live
    version: Tumbleweed
scenarios:
  x86_64:
    opensuse-Tumbleweed-KDE-Live-x86_64:
    - kde_live_upgrade_leap_15.2:
        machine: uefi
and I received no log message in /var/log/openqa at all but a good message in the UI telling us:
There was a problem applying the changes:
Job template name 'kde_live_upgrade_leap_15.2' with opensuse-Tumbleweed-KDE-Live-x86_64 and uefi is already used in job group 'Development Tumbleweed'
so I consider this story successfully completed as well \o/