coordination #64322
Parent: coordination #103962 (closed): [saga][epic] Easy multi-machine handling: MM-tests as first-class citizens
[epic] Improve feedback on multi-machine API errors
Description
Problem
Currently the multi-machine API gives poor error feedback. For example, when the creation of a mutex fails, one only gets "return code 400" with no further information about what the problem could be:
[2020-03-04T15:17:02.724 CET] [debug] mutex create 'radvd.mutex'
[2020-03-04T15:17:03.902 CET] [warn] !!! lockapi::mutex_create: Unknown return code 400 for lock api
[2020-03-04T15:17:03.983 CET] [debug] Create mutex failed at /var/lib/openqa/pool/22/os-autoinst-distri-opensuse/lib/wickedbase.pm line 552.
In this case it would be good to know that the mutex name is considered invalid. Other multi-machine APIs likely have bad error feedback, too.
Further information
When looking into the error mentioned above I came up with the following suggestion:
The regex for the validation is defined in mutex_create in openQA/lib/OpenQA/WebAPI/Controller/API/V1/Locks.pm.
The _validation_error helper defined in openQA/lib/OpenQA/WebAPI/Plugin/Helpers.pm would actually return a semi-useful error message. However, api_call in os-autoinst/mmapi.pm or _try_lock in os-autoinst/lockapi.pm does not log the error (just the return code 400). It would be good if the validation error were logged, so that at least "Invalid request parameters (name)" showed up somewhere in the os-autoinst log. The message from the validation helper could be improved as well, to make it clear why the name is considered invalid.
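To illustrate the suggestion, here is a minimal sketch of a client-side handler that logs the server's reply together with the return code. It is written in Python purely for illustration and is not the os-autoinst code. The function names, the logger name, and the assumption that the server replies with a JSON object carrying an "error" key are all hypothetical.

```python
import json
import logging

# Hypothetical logger name; the real client writes to the os-autoinst log.
logger = logging.getLogger("lockapi_sketch")


def describe_api_error(action, status, body):
    """Build a log message containing the return code and the server reply.

    Assumption: the server may answer with a JSON object that has an
    "error" key. Any other non-empty body is included verbatim.
    """
    try:
        payload = json.loads(body) if body else {}
    except ValueError:
        # Body is not valid JSON; fall back to the raw text below.
        payload = {}

    detail = ""
    if isinstance(payload, dict) and payload.get("error"):
        detail = str(payload["error"])
    elif body:
        detail = body.strip()

    message = f"{action}: unexpected return code {status}"
    if detail:
        message += f": {detail}"
    return message


def handle_response(action, status, body):
    """Return True on a 2xx status; otherwise log the full reply and return False."""
    if 200 <= status < 300:
        return True
    logger.warning(describe_api_error(action, status, body))
    return False
```

Under these assumptions, calling handle_response("mutex_create", 400, '{"error": "Invalid request parameters (name)"}') would log the validation message next to the return code instead of the bare code alone.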
Attempt to fix the problem: https://github.com/os-autoinst/os-autoinst/pull/1359
Updated by mkittler over 4 years ago
- Related to action #64075: Use validation consistently in WebAPI controllers added
Updated by okurz over 4 years ago
https://github.com/os-autoinst/os-autoinst/pull/1359 has meanwhile been closed unmerged. Can we call this a duplicate of #32545? Simply to reduce the number of tickets, I suggest merging the relevant parts into #32545 and then closing this one as a duplicate. Can you do that?
Updated by asmorodskyi over 4 years ago
https://progress.opensuse.org/issues/32545 - after clarification from coolo, it is about the need to catch improper combinations of PARALLEL_WITH / START_AFTER.
This one is about the fact that the client and server of the lock API have a different understanding of the expected HTTP request/response. IMO merging these two would create confusion.
Updated by mkittler over 4 years ago
@okurz These issues look completely different to me. One is about the multi-machine API of openQA and its client within os-autoinst (the "mutex stuff") and the other one about openQA's scheduling API and dependency management.
Updated by okurz over 4 years ago
I know. But the point where they come together is the (implicit) epic about "improve error handling and UX for multi-machine test handling".
Updated by okurz almost 3 years ago
- Subject changed from Improve feedback on multi-machine API errors to [epic] Improve feedback on multi-machine API errors
- Parent task set to #103962
Updated by mkittler over 2 years ago
- Status changed from New to Resolved
Looks like I've already implemented this: https://github.com/Martchus/os-autoinst/commit/0afd919ded64fa5cecbba8ee15d9c803d3218ccb
All functions in mmapi/lockapi already use, directly or indirectly, an error handler which logs the server reply (and not only the return code). So I'm just considering this resolved.