action #169144
coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes
coordination #96263: [epic] Exclude certain Minion tasks from "Too many Minion job failures alert" alert
coordination #99831: [epic] Better handle minion tasks failing with "Job terminated unexpectedly"
Link to minion job from openQA job with waiting task
Description
Motivation
When a job is waiting for a GruTask, e.g. https://openqa.opensuse.org/tests/4604726 with
State: scheduled, waiting for background tasks (id: 20718867, name: download_asset), created about 24 hours ago
it is hard to find the corresponding minion job because the gru_task_id 20718867 is not searchable from the web interface.
In this case the minion job failed because of a SIGTERM, "Job terminated unexpectedly (exit code: 0, signal: 15)" (we have #108980 for that). In any case, it should be possible to find the minion job quickly in order to fix the problem.
Suggestions
- @kraih said you can only search for key names in the webapi notes field, so instead of the notes {"gru_id": 20718867} we should have {"gru_id_20718867": 1}. Then we could link to a minion search query from the job. Of course, several places rely on the current content of the notes, so we could write both keys for a certain time.
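
The suggestion above could be sketched roughly as follows. This is only an illustration in Python, not openQA code: the helper names `build_notes` and `minion_search_url` are hypothetical, and the `/minion/jobs` path with a `notes` query parameter is an assumption about how the minion search would be reached, which needs to be checked against the actual deployment.

```python
from urllib.parse import urlencode


def build_notes(gru_id: int) -> dict:
    """Return notes carrying both the current and the proposed key.

    "gru_id" keeps existing consumers working during a transition period;
    "gru_id_<id>" is the new key that can be found by a key-name search.
    """
    return {"gru_id": gru_id, f"gru_id_{gru_id}": 1}


def minion_search_url(base_url: str, gru_id: int) -> str:
    """Build a link to a minion job search for the given gru task id.

    Assumption: the minion dashboard lives under /minion/jobs and accepts
    a "notes" query parameter holding the note key name to search for.
    """
    query = urlencode({"notes": f"gru_id_{gru_id}"})
    return f"{base_url.rstrip('/')}/minion/jobs?{query}"


if __name__ == "__main__":
    print(build_notes(20718867))
    print(minion_search_url("https://openqa.opensuse.org", 20718867))
```

Writing both keys means nothing that reads `gru_id` today has to change immediately; the old key could be dropped once all consumers use the new one.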