action #27454
closed
- Related to action #25970: Profile/Optimize _workers_checker in WebSockets server added
- Target version set to Ready
We stopped updating this field because updating it at sub-second intervals was causing a lot of DB noise.
- Category set to 122
- Parent task set to #32851
This is still related to scheduling (as some of the logic is split off into the WebSocket server).
- Start date changed from 2017-11-07 to 2018-05-05
due to changes in a related task
- Subject changed from [tools] Worker's seen DB field is ignored by WebSocket server when checking for stale jobs to [tools][scheduling] Worker's seen DB field is ignored by WebSocket server when checking for stale jobs
- Category changed from 122 to Feature requests
Is this still valid? Sorry, I don't understand it myself.
Current state: The "last seen" timestamp of a worker is updated in the database when the worker updates the job status. It is also updated when the worker sends its status updates via web sockets. In addition, the web socket server tracks the "last seen" timestamp a second time on its own. This second timestamp is not updated when the worker only uses the REST API, yet it is the only timestamp used to mark stale jobs as incomplete.
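To make the two timestamps easier to picture, here is a minimal Python sketch of the behaviour described above. It is not the actual openQA code: the class, the method names and the 40-second threshold are all hypothetical, and the two dictionaries merely stand in for the database field and the web socket server's own copy.

```python
from dataclasses import dataclass, field

# Hypothetical threshold for illustration only; not taken from the real code base.
STALE_AFTER_SECONDS = 40.0


@dataclass
class WorkerTracker:
    # Stands in for the worker's "last seen" field in the database.
    db_last_seen: dict = field(default_factory=dict)
    # Stands in for the second timestamp kept by the web socket server.
    ws_last_seen: dict = field(default_factory=dict)

    def on_rest_job_update(self, worker_id, now):
        # A job status update via the REST API only touches the database timestamp.
        self.db_last_seen[worker_id] = now

    def on_ws_status_update(self, worker_id, now):
        # A status update via web sockets touches both timestamps.
        self.db_last_seen[worker_id] = now
        self.ws_last_seen[worker_id] = now

    def stale_workers(self, now):
        # Only the web socket timestamp is consulted; the database one is ignored.
        return [
            worker_id
            for worker_id, seen in self.ws_last_seen.items()
            if now - seen > STALE_AFTER_SECONDS
        ]
```

In this sketch, a worker that keeps talking to the REST API but stays silent on the web socket connection is still reported by `stale_workers`, which is the behaviour this ticket is about.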
Having the timestamp twice is redundant and a bit odd. However, since the database timestamp is not updated during the multi-chunk upload, taking it into account would not prevent jobs from being marked as incomplete when the worker is blocking or unresponsive. Updating the database timestamp during the upload might be quite expensive. So although having two timestamps is not nice, I don't see any benefit in refactoring this right now.
Improving the multi-chunk upload and other blocking operations on the worker would be much more beneficial, because it prevents the problem in the first place.
Note that we sometimes see jobs stuck in a perpetual "running" or "uploading" state. I'm afraid this refactoring wouldn't help there either, because in these cases the jobs are not marked as incomplete since the worker-job relation is (somehow) unset.
So while this "curiosity" still exists in our code base, I don't see a big benefit in improving it.
- Status changed from New to In Progress
- Assignee set to mkittler
- Target version changed from Ready to Current Sprint
- Status changed from In Progress to Resolved
- Target version changed from Current Sprint to Done