Job life cycle not always covered by events
I'm currently working with events again. It would help this work if every job life cycle were fully covered by events, unless something truly weird happens: for every openqa_job_create event, there should be a corresponding job-went-away event (at least one of openqa_job_done, or possibly openqa_job_delete). However, I don't believe this is currently the case.
To give a specific example: cancelling (or, I think, restarting or duplicating) a job with children that are scheduled, but not running. Any children that are running should get an openqa_job_done (I think?), but I don't believe scheduled children do. If I'm following the flow correctly, their state just gets changed in the database, but no event is emitted. So anything that's trying to follow the life cycle of a given job by events will be left hanging, wondering what happened to it.
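To illustrate what "left hanging" means for a consumer, here is a minimal Python sketch of a tracker that follows jobs purely by events. The event names are the ones discussed above; the LifecycleTracker class, the (event, job id) call shape, and the simulated event sequence are hypothetical and only mirror the behaviour I suspect, not confirmed openQA behaviour.

```python
# Hypothetical consumer that follows job life cycles purely by event name.
# Assumption: every event can be reduced to (event name, job id).

CREATE_EVENT = "openqa_job_create"
END_EVENTS = {"openqa_job_done", "openqa_job_delete"}


class LifecycleTracker:
    """Remembers every job that was created but never reported as gone."""

    def __init__(self):
        self.open_jobs = set()

    def handle(self, event, job_id):
        if event == CREATE_EVENT:
            self.open_jobs.add(job_id)
        elif event in END_EVENTS:
            self.open_jobs.discard(job_id)
        # Any other event name is ignored by this sketch.


def simulate_parent_cancel():
    """Replay the suspected event sequence for two children of a cancelled job."""
    tracker = LifecycleTracker()
    running_child, scheduled_child = 2, 3
    tracker.handle("openqa_job_create", running_child)
    tracker.handle("openqa_job_create", scheduled_child)
    # Suspected behaviour: the running child eventually reports done...
    tracker.handle("openqa_job_done", running_child)
    # ...while the scheduled child only changes state in the database,
    # so no event ever reaches the tracker for it.
    return tracker.open_jobs
```

In this sketch, simulate_parent_cancel() returns a set that still contains the scheduled child, which is exactly the job the consumer can no longer account for.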
Also, cancelling an ISO emits 'openqa_cancel_iso' and then just calls the database cancel_by_settings (not the web API one, which emits events) on the ISO value. Again, I think this will result in openqa_job_done events for running jobs (I don't totally remember how that happens; I think it's because ultimately a 'stop doing that!' signal is sent to the worker, and the worker winds up going back through the web API to say 'I stopped now!', or something like that), but no specific events for scheduled jobs. Anything trying to keep track of job life cycles would have to catch the cancel_by_settings message and do quite a lot of work to figure out which previously-scheduled jobs just got cancelled.
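As a rough idea of the extra work involved, here is a hedged Python sketch of the reconciliation a consumer might have to do when it sees the ISO cancellation. It assumes the consumer has been keeping its own record of each open job's state and ISO setting; the jobs_closed_by_iso_cancel function and the dict layout are hypothetical, not anything openQA provides.

```python
# Hypothetical reconciliation step run by a consumer when an ISO is cancelled.
# Assumption: the consumer maintains its own mapping of
#   job id -> {"state": "scheduled" | "running", "ISO": "<iso name>"}
# because the cancellation itself names only the ISO, not the affected jobs.


def jobs_closed_by_iso_cancel(open_jobs, iso):
    """Return ids of scheduled jobs on the given ISO, in sorted order.

    Running jobs are skipped: they are expected to report their own end
    later, so only scheduled jobs need to be closed by inference.
    """
    return sorted(
        job_id
        for job_id, info in open_jobs.items()
        if info.get("ISO") == iso and info.get("state") == "scheduled"
    )


def example():
    open_jobs = {
        11: {"state": "scheduled", "ISO": "a.iso"},
        12: {"state": "running", "ISO": "a.iso"},
        13: {"state": "scheduled", "ISO": "b.iso"},
    }
    return jobs_closed_by_iso_cancel(open_jobs, "a.iso")
```

Even this simplified version only works if the consumer already knows each job's state and settings, which is information it would have to collect and keep current on its own.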
I don't know if this is a goal of openQA at all, and if so how high a priority fixing it would be, but I thought it was worth bringing up, at least.