action #121588

open

Jobs shown on qem-dashboard are missing in OSD

Added by martinsmac over 1 year ago. Updated over 1 year ago.

Status: New
Priority: Normal
Assignee: -
Target version:
Start date: 2022-12-06
Due date:
% Done: 0%
Estimated time:

Description

Observation

Missing tests for SLES 15-SP3 on dashboard:
http://dashboard.qam.suse.de/incident/26100

When using the dashboard, it shows the icon for SLES-15-SP3 in RED, but when opening the link, the page is empty:

https://openqa.suse.de/tests/overview?build=%3A26100%3Atcl&distri=sle&groupid=367


Files

dashboard-blocked-show.png (89.4 KB) martinsmac, 2022-12-06 14:03
Actions #1

Updated by kraih over 1 year ago

  • Subject changed from incident jobs are not created to Jobs shown on qem-dashboard are missing in OSD
  • Description updated (diff)
Actions #2

Updated by kraih over 1 year ago

I've looked at the dashboard database and these are the OSD jobs in question:

dashboard_db=# SELECT job_group, group_id, job_id, status, updated
dashboard_db-#   FROM openqa_jobs oj
dashboard_db-#   JOIN incident_openqa_settings os ON oj.incident_settings = os.id
dashboard_db-#   WHERE os.incident = 5449942 AND group_id = 367;
             job_group             | group_id | job_id  | status  |            updated
-----------------------------------+----------+---------+---------+-------------------------------
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587249 | failed  | 2022-11-01 05:46:32.415023+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587251 | failed  | 2022-11-01 05:46:32.431005+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587252 | stopped | 2022-11-01 05:46:32.43856+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587250 | failed  | 2022-11-01 05:46:32.422772+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587253 | stopped | 2022-11-01 05:46:32.446451+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587254 | failed  | 2022-11-01 05:46:32.374261+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587265 | passed  | 2022-11-01 05:46:32.707045+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587256 | passed  | 2022-11-01 05:46:32.62426+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587258 | passed  | 2022-11-01 05:46:32.644131+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587262 | passed  | 2022-11-01 05:46:32.680608+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587261 | passed  | 2022-11-01 05:46:32.672063+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587257 | failed  | 2022-11-01 05:46:32.63335+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587260 | passed  | 2022-11-01 05:46:32.663346+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587255 | passed  | 2022-11-01 05:46:32.610727+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587259 | passed  | 2022-11-01 05:46:32.654128+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587264 | passed  | 2022-11-01 05:46:32.69896+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587266 | passed  | 2022-11-01 05:46:32.716857+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587263 | passed  | 2022-11-01 05:46:32.689074+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587269 | failed  | 2022-11-01 05:46:31.736583+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587268 | failed  | 2022-11-01 05:46:31.757022+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587270 | failed  | 2022-11-01 05:46:32.043626+01
 Maintenance: SLE 15 SP3 Incidents |      367 | 9587267 | failed  | 2022-11-01 05:46:31.766053+01
(22 rows)

Looking at the OSD database, they are indeed not present there.

Edit: Updated data with states and timestamps.

Actions #3

Updated by okurz over 1 year ago

  • Target version set to future

Well, the jobs have likely just been pruned in the meantime. We can check whether we can keep results for longer, based on available space; that would help to mitigate. But we should also think of solutions that make sure the dashboard really only reflects jobs that are present in OSD. Well, OK, there can be a few minutes of grace time until the new data is synchronized.
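
A minimal sketch of the kind of consistency check suggested here, in Python. Everything in it is hypothetical: `DashboardJob`, `find_stale_jobs` and the 10-minute grace period are illustrative choices, and how the set of job IDs currently present in OSD would be obtained is left open.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DashboardJob:
    """Hypothetical stand-in for a row of the dashboard's openqa_jobs table."""
    job_id: int
    status: str
    updated: datetime


def find_stale_jobs(dashboard_jobs, osd_job_ids, now, grace=timedelta(minutes=10)):
    """Return dashboard jobs that are absent from OSD and older than the grace period.

    Jobs updated within the grace period are skipped, because the two systems
    may simply not have synchronized yet (assumed value: 10 minutes).
    """
    cutoff = now - grace
    return [
        job for job in dashboard_jobs
        if job.job_id not in osd_job_ids and job.updated <= cutoff
    ]
```

Under these assumptions, a job like 9587249 (last updated on 2022-11-01 and no longer in OSD) would be reported as stale, while a job updated two minutes ago would be left alone until the next check.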

Actions #4

Updated by kraih over 1 year ago

  • Target version deleted (future)

okurz wrote:

Well, the jobs have likely just been pruned in the meantime. We can check whether we can keep results for longer, based on available space; that would help to mitigate. But we should also think of solutions that make sure the dashboard really only reflects jobs that are present in OSD. Well, OK, there can be a few minutes of grace time until the new data is synchronized.

That's something the new /api/v1/job_settings/jobs API could be used for. Although I suspect that its current limit to the 20k most recent job IDs would cause new cases of missing data. So maybe that should be addressed first. :)
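
To illustrate the concern: if a lookup only covers a window of the most recent job IDs, older dashboard jobs cannot be confirmed either way and should not be reported as missing. The sketch below assumes the window can be described by a lowest covered job ID. `classify_jobs` and `window_min_id` are hypothetical names, and the actual behavior of the API's limit may differ.

```python
def classify_jobs(dashboard_job_ids, osd_job_ids, window_min_id):
    """Split dashboard job IDs into 'present', 'missing' and 'unknown'.

    osd_job_ids:   job IDs the lookup returned (assumed to cover only IDs
                   greater than or equal to window_min_id).
    window_min_id: lowest job ID the lookup can see (assumption).
    """
    result = {"present": [], "missing": [], "unknown": []}
    for job_id in sorted(dashboard_job_ids):
        if job_id in osd_job_ids:
            result["present"].append(job_id)
        elif job_id >= window_min_id:
            # Inside the window but not returned: the job is really gone.
            result["missing"].append(job_id)
        else:
            # Outside the window: the lookup cannot tell us anything.
            result["unknown"].append(job_id)
    return result
```

Treating the "unknown" bucket as missing is what would produce the new cases of missing data mentioned above, so a caller would need a separate way to verify those IDs.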

Actions #5

Updated by kraih over 1 year ago

  • Target version set to future