action #62984: Fix problem with job-worker assignment resulting in API errors - openQA Project (public) - openSUSE Project Management Tool

action #62984

closed

coordination #39719: [saga][epic] Detection of "known failures" for stable tests, easy test results review and easy tracking of known issues

coordination #62420: [epic] Distinguish all types of incompletes

coordination #61922: [epic] Incomplete jobs with no logs at all

Fix problem with job-worker assignment resulting in API errors

Added by mkittler over 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Normal
Assignee: mkittler
Category: Regressions/Crashes
Target version:
Start date: 2020-02-03
Due date:
% Done:
Estimated time:

Description

Now that the reason is passed from the worker to the web UI, we can query the database for jobs that ended up incomplete due to API errors. Unfortunately, the following query usually returns some jobs on OSD:

openqa=> select id, t_started, state, reason from jobs where reason like '%Got status update%' and result = 'incomplete' and t_finished >= (NOW() - interval '12 hour') order by id;
   id    |      t_started      | state |                                                                            reason                                                                             
---------+---------------------+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------
 3856945 | 2020-02-03 00:52:54 | done  | api failure: 400 response: Got status update for job 3856945 with unexpected worker ID 679 (expected no updates anymore, job is done with result incomplete)
 3856983 |                     | done  | api failure: 400 response: Got status update for job 3856983 and worker 1310 but there is not even a worker assigned to this job (job is scheduled)
 3857048 | 2020-02-03 01:36:32 | done  | api failure: 400 response: Got status update for job 3857048 with unexpected worker ID 679 (expected no updates anymore, job is done with result incomplete)
 3857698 | 2020-02-03 09:08:04 | done  | api failure: 400 response: Got status update for job 3857698 with unexpected worker ID 1030 (expected no updates anymore, job is done with result incomplete)
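
To make the pattern in these rows easier to see, the reason strings can be grouped by error kind and by worker ID. The following is only a minimal sketch: the regular expression, the `classify` helper and the two kind labels are assumptions made up for illustration and are not part of openQA; the input strings are copied from the output above.

```python
import re
from collections import Counter

# Reason strings copied verbatim from the query output above.
REASONS = [
    "api failure: 400 response: Got status update for job 3856945 with unexpected worker ID 679 (expected no updates anymore, job is done with result incomplete)",
    "api failure: 400 response: Got status update for job 3856983 and worker 1310 but there is not even a worker assigned to this job (job is scheduled)",
    "api failure: 400 response: Got status update for job 3857048 with unexpected worker ID 679 (expected no updates anymore, job is done with result incomplete)",
    "api failure: 400 response: Got status update for job 3857698 with unexpected worker ID 1030 (expected no updates anymore, job is done with result incomplete)",
]

# Hypothetical pattern: matches the two message variants seen above and
# captures the job ID, the worker ID and the parenthesised detail at the end.
PATTERN = re.compile(
    r"Got status update for job (?P<job>\d+) "
    r"(?:with unexpected worker ID|and worker) (?P<worker>\d+)"
    r".*\((?P<detail>[^()]*)\)$"
)


def classify(reason):
    """Return a dict describing one reason string, or None if it does not match."""
    m = PATTERN.search(reason)
    if m is None:
        return None
    # The kind labels are made up for this sketch.
    if "unexpected worker ID" in reason:
        kind = "unexpected_worker"
    else:
        kind = "no_worker_assigned"
    return {
        "job": int(m["job"]),
        "worker": int(m["worker"]),
        "kind": kind,
        "detail": m["detail"],
    }


parsed = [classify(r) for r in REASONS]
by_kind = Counter(p["kind"] for p in parsed)
by_worker = Counter(p["worker"] for p in parsed)

print(by_kind)
print(by_worker)
```

For the four rows above, this groups three jobs under the "unexpected worker ID" variant and one under the "not even a worker assigned" variant, with worker 679 appearing twice.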

I suppose https://github.com/os-autoinst/openQA/pull/2667 helps only a little, since it fixes just one small race condition, but there is apparently a bigger problem.

Note that these jobs might have been marked as incomplete by the web UI first, with the reason then overridden by the worker, so the reason shown might be misleading. That should be fixed, too.
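
The overriding problem described here can be pictured with a small guard. This is a hypothetical sketch in Python, not openQA's actual code (openQA is written in Perl); the `Job` class, its fields and `apply_worker_reason` are invented for illustration. It assumes that a reason should be kept once the job is done, and that only the assigned worker may set one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    # Hypothetical, simplified job record; field names are assumptions.
    id: int
    state: str = "scheduled"
    result: Optional[str] = None
    reason: Optional[str] = None
    assigned_worker_id: Optional[int] = None


def apply_worker_reason(job: Job, worker_id: int, reason: str) -> bool:
    """Store a worker-supplied reason unless that would hide an earlier one.

    Returns True if the reason was stored, False if it was rejected.
    """
    # Keep the first reason recorded for a finished job, e.g. one set by
    # the web UI when it marked the job as incomplete.
    if job.state == "done" and job.reason is not None:
        return False
    # Ignore updates from a worker that is not assigned to this job.
    if job.assigned_worker_id != worker_id:
        return False
    job.reason = reason
    return True


# The web UI has already marked the job as incomplete with its own reason.
job = Job(id=3856945, state="done", result="incomplete",
          reason="set by web UI", assigned_worker_id=679)

# A late update from the worker no longer replaces that reason.
accepted = apply_worker_reason(job, 679, "api failure: 400 response")
print(accepted, job.reason)
```

Whether the earlier or the later reason should win is a design decision; the sketch only shows that the choice has to be made explicitly rather than left to update order.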

Files

fedorastg20200207.zip (451 KB), AdamWill, 2020-02-07 20:25

Related issues 2 (0 open — 2 closed)

