Investigation jobs run because of the lack of automatic takeover size:M
Several investigation jobs were scheduled for a test that was known to fail because of a particular issue:
The "Next & Previous" jobs show that an issue was linked before.
- AC1: Minion jobs are kept long enough to be able to investigate this should it happen again
- AC2: Logs for carryover are verbose enough to be able to investigate this should it happen again
- Investigate this quickly while the logs are fresh and hot
- The bug reference from the existing job wasn't taken over
- Look into the audit table to find out if a takeover comment was deleted
- Increase the time to keep minion jobs in the database. Until then, if we have such a case again, look for the minion entry as fast as possible and copy the minion entry into the ticket
#4 Updated by tinita about 1 month ago
- Subject changed from Investigation jobs run because of the lack of automatic carryover to Investigation jobs run because of the lack of automatic takeover
- Description updated
I looked into the minion_jobs table, but the job was already too old and its entry had been deleted.
I was wondering why the investigation comment was made 75 minutes after the job was finished:
kraih suggested increasing the time after which minion jobs are deleted. The default is 2 days. We should add a setting for it:
probably one line in WebAPI.pm to assign the setting to $self->minion->remove_after, and the setting itself in the settings module
SQL for searching for a certain job in the minion table:
select id, args, notes->'hook_rc' as hook_rc, notes->'hook_result' as hook_result, created, finished from minion_jobs where jsonb_typeof(args->0) = 'number' and cast(args->0 as int) = 8739190
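To avoid editing the job id by hand for the next occurrence, the query above could be wrapped in a small helper. This is only a sketch in Python: the function name minion_job_query is hypothetical, and the %s placeholder assumes a psycopg-style PostgreSQL driver, which nothing in this ticket prescribes. The SQL itself is the query from above, unchanged apart from the parameter.

```python
# Sketch only: build the minion_jobs lookup as a parameterized query.
# Assumption: a psycopg-style driver that uses %s placeholders.
MINION_JOB_SQL = """
select id, args,
       notes->'hook_rc' as hook_rc,
       notes->'hook_result' as hook_result,
       created, finished
from minion_jobs
where jsonb_typeof(args->0) = 'number'
  and cast(args->0 as int) = %s
"""


def minion_job_query(job_id):
    """Return (sql, params) for looking up the minion entry of an openQA job.

    Hypothetical helper; the job id is passed as a bound parameter
    instead of being pasted into the SQL text.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(job_id, bool) or not isinstance(job_id, int):
        raise TypeError("job_id must be an integer")
    return MINION_JOB_SQL, (job_id,)


sql, params = minion_job_query(8739190)
```

With such a driver, the pair would be passed on as cursor.execute(sql, params).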
#5 Updated by tinita about 1 month ago
I checked the audit log to see whether the takeover comment was ever created (and then possibly deleted).
It's a bit hard to tell because comment events have only been logged for a few days, and the oldest comment event in the audit log is from "2022-05-12 05:47:14", CEST I believe.
The job in question finished at 05:25 UTC (07:25 CEST), so the comment event should have been logged.
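The time zone comparison can be double-checked with a few lines of Python. This is a sketch under one explicit assumption: the ticket only gives the time of day the job finished, so the date 2022-05-12 used below is assumed, not taken from the job. The audit timestamp is also treated as CEST, which is a belief rather than a confirmed fact.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+2 offset for CEST; avoids depending on a tz database.
CEST = timezone(timedelta(hours=2), "CEST")

# Oldest comment event in the audit log (assumed to be in CEST).
oldest_audit_event = datetime(2022, 5, 12, 5, 47, 14, tzinfo=CEST)

# ASSUMPTION: the job finished on 2022-05-12; only "05:25 UTC" is known.
job_finished = datetime(2022, 5, 12, 5, 25, tzinfo=timezone.utc)

# Convert the finish time to CEST for a like-for-like comparison.
job_finished_cest = job_finished.astimezone(CEST)

# True if comment logging was already active when the job finished.
logging_was_active = job_finished > oldest_audit_event
# How long logging had been active at that point.
margin = job_finished - oldest_audit_event

print(job_finished_cest.strftime("%H:%M %Z"), logging_was_active, margin)
```

Under that assumption the job finished well after comment logging started, so a created takeover comment would have left an audit entry.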
One problem is that comment events are logged with their id but without the job (or job group) id.
So if a comment is deleted, we can no longer connect the comment audit entry to a job.
That will be fixed by https://github.com/os-autoinst/openQA/pull/4655 once it is merged.
Takeover comments, however, are logged in the audit table with an additional entry
For the job in question that should have been
So I looked for that id:
select * from audit_events where event = 'comment_create' and event_data like '%8292431%';
but got nothing.
So my conclusion is the takeover comment was never created.
But it should have been. I confirmed that locally by putting the carry_over_bugrefs call into a normal job view for the job in question: it reached the code that would have created the comment, so all conditions for a carryover candidate were fulfilled.