action #94084

closed

POST /job_templates_scheduling fails silently when the template contains an undefined machine

Added by dancermak over 3 years ago. Updated over 2 years ago.

Status: Rejected
Priority: Normal
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2021-06-16
Due date:
% Done: 0%
Estimated time:

Description

I started with a completely empty openQA instance and tried to apply a job template via a script that posts to the /job_templates_scheduling route. The template includes the following defaults:

defaults:
  x86_64:
    machine: 64bit
    priority: 50

But as the instance is completely fresh, it does not have the 64bit machine defined, so the above template fails to apply.

Unfortunately, the API swallows this failure completely and still responds with a 204. The failure only becomes apparent when checking the logs of the webui, which include:

webui_1          | [error] Machine '64bit' is invalid
webui_2          | [error] Machine '64bit' is invalid
webui_1          | [error] Machine '64bit' is invalid
webui_1          | [error] Machine '64bit' is invalid
Actions #1

Updated by okurz over 3 years ago

  • Category set to Regressions/Crashes
  • Target version set to future

Could you please update your ticket description based on the template at https://progress.opensuse.org/projects/openqav3/wiki/#Defects

Actions #2

Updated by dancermak over 3 years ago

Observation

The POST /job_templates_scheduling route responds with 204 even if the submitted job template is invalid because it references a nonexistent machine. The job template is then not applied, but the failure is not evident from the HTTP status code. It is only visible to someone with access to the log of the webui.

Steps to reproduce

  • pick an arbitrary machine name that does not exist on your openQA instance
  • send the following template via POST /job_templates_scheduling:
defaults:
  x86_64:
    machine: FOOBAR_I_DO_NOT_EXIST
    priority: 50
  • openQA responds with 204, but the job template is not applied

Problem

The template is verified and because the machine is invalid, it is not applied. However, the result of the verification is not taken into account when responding to the API call.

Suggestion

The error is definitely logged by openQA, as the log of the webui reveals, so it should be possible to report it in the API response as well:

webui_1          | [error] Machine 'FOOBAR_I_DO_NOT_EXIST' is invalid
webui_2          | [error] Machine 'FOOBAR_I_DO_NOT_EXIST' is invalid
webui_1          | [error] Machine 'FOOBAR_I_DO_NOT_EXIST' is invalid
webui_1          | [error] Machine 'FOOBAR_I_DO_NOT_EXIST' is invalid

Workaround

Check that the job template got applied and manually error out if it did not.
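
A minimal sketch of this workaround in Python, assuming the calling script can both submit the template and read it back afterwards. `post_template` and `fetch_template` are hypothetical caller-supplied helpers (they are not part of openQA or openqa-cli), and how they talk to the instance is deliberately left open:

```python
from typing import Callable


class TemplateNotAppliedError(RuntimeError):
    """Raised when the template was rejected or was not stored by the server."""


def apply_and_verify(
    template: str,
    post_template: Callable[[str], int],
    fetch_template: Callable[[], str],
) -> None:
    """Post a template, then confirm that the server actually stored it.

    post_template (hypothetical) sends the YAML and returns the HTTP status.
    fetch_template (hypothetical) returns the YAML currently stored for the
    job group.
    """
    status = post_template(template)
    if status >= 400:
        raise TemplateNotAppliedError(
            f"server rejected the template with HTTP {status}"
        )
    # A 2xx status alone is not trusted: compare what is stored with what was sent.
    stored = fetch_template()
    if stored.strip() != template.strip():
        raise TemplateNotAppliedError(
            f"server answered HTTP {status}, but the stored template "
            "differs from the one that was sent"
        )
```

Comparing the stored text with the submitted text is the simplest possible check. It assumes the server returns the template unmodified, which may not hold on every instance.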

Actions #3

Updated by tinita over 3 years ago

I can't reproduce this.

If the YAML only contains this:

defaults:
  x86_64:
    machine: FOOBAR_I_DO_NOT_EXIST
    priority: 50

then products and scenarios are missing.

When I add them:

defaults:
  x86_64:
    machine: FOOBAR_I_DO_NOT_EXIST
    priority: 50
products: {}
scenarios: {}

And post this to a newly created job group:

./script/openqa-cli api -X POST --host http://localhost:9526 /job_templates_scheduling/85 schema=JobTemplates-01.yaml preview=0 template="$(cat /tmp/non-existing-machine.yaml)" -v

then it gets accepted, and I see no error and get a 200 OK.

But that is because there are no scenarios, so the machine is never checked.
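
To illustrate why an empty scenarios mapping hides the problem, here is a toy model in Python. It is not openQA's actual validation code. It only assumes that machines are looked up while walking the scenarios, with the per-architecture default used when a test suite does not set its own machine. The function name `machines_referenced` is made up for this sketch:

```python
def machines_referenced(template: dict) -> set:
    """Collect the machine names that the scenarios of a parsed template use."""
    defaults = template.get("defaults") or {}
    used = set()
    for arch, products in (template.get("scenarios") or {}).items():
        default_machine = (defaults.get(arch) or {}).get("machine")
        for suites in (products or {}).values():
            for suite in suites or []:
                # A suite is either a plain name or {name: {settings}}.
                settings = {}
                if isinstance(suite, dict):
                    settings = next(iter(suite.values())) or {}
                machine = settings.get("machine", default_machine)
                if machine is None:
                    continue
                names = machine if isinstance(machine, list) else [machine]
                used.update(names)
    return used


no_scenarios = {
    "defaults": {"x86_64": {"machine": "FOOBAR_I_DO_NOT_EXIST", "priority": 50}},
    "products": {},
    "scenarios": {},
}
# Nothing walks the defaults on their own, so no machine is ever looked up.
print(machines_referenced(no_scenarios))
```

In this model the bogus default is only reached once at least one scenario for the same architecture exists.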

When I post this:

defaults:
  i586:
    machine: FOOBAR_I_DO_NOT_EXIST
    priority: 50

products:
  opensuse-Tumbleweed-DVD-i586:
    distri: opensuse
    flavor: DVD
    version: Tumbleweed

scenarios:
  i586:
    opensuse-Tumbleweed-DVD-i586:
    - textmode:
        priority: 40

Then I get a 400 Bad Request:

{"error":["Machine 'FOOBAR_I_DO_NOT_EXIST' is invalid"],"error_status":400,"id":85,"job_group_id":85}

So there must be some difference in what you do.
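
For scripted callers, a small Python helper can make a rejected template hard to miss. This is a sketch that assumes the response body has the shape of the 400 response shown above (an `error` list next to `error_status`). The function name `scheduling_errors` is made up for this example, and obtaining the status and body is left to the caller:

```python
import json


def scheduling_errors(status: int, body: str) -> list:
    """Return the error messages carried by a scheduling response, if any."""
    if not body.strip():
        # For example a 204 without content: only the status is available.
        return [] if status < 400 else [f"HTTP {status} without a response body"]
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return [] if status < 400 else [f"HTTP {status}: {body.strip()}"]
    errors = payload.get("error") if isinstance(payload, dict) else None
    if errors is None:
        return [] if status < 400 else [f"HTTP {status}"]
    # Accept both a list of messages and a single message.
    return [str(e) for e in errors] if isinstance(errors, list) else [str(errors)]
```

A script would then exit with a non-zero status whenever the returned list is not empty. Note that this cannot detect the situation from the original report, where the response was a 204 without any error body.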

Actions #4

Updated by mkittler over 3 years ago

We also have unit tests for this (see t/api/08-jobtemplates.t). Can you share the concrete API calls you're doing?

Actions #5

Updated by dancermak over 2 years ago

I completely forgot about this one and unfortunately don't have the setup anymore to reproduce this issue. So feel free to close it.

Actions #6

Updated by okurz over 2 years ago

  • Status changed from New to Rejected
  • Assignee set to okurz