coordination #47117
closed [epic] Fix worker->websocket->scheduler->webui connection
0%
Description
We have multiple problems with the way we split state between various components. Simple fixes won't help, as our architecture is just too complex for little gain :(
Updated by coolo almost 6 years ago
- Related to action #42986: parallel jobs not reliable killed/restarted added
Updated by coolo almost 6 years ago
- Related to action #47060: [worker service][scheduling] openqaworker2:21 ~ openqaworker2:24 stops getting new jobs for over 1 day. added
Updated by coolo almost 6 years ago
- Related to action #46886: worker3 keeps jobs added
Updated by coolo almost 6 years ago
- Blocks action #41066: Scheduling jobs for IPMI (bare metal) on the same worker (aka FOLLOW_TEST_DIRECTLY aka START_DIRECTLY_AFTER_TEST). added
Updated by coolo almost 6 years ago
- Blocked by action #46802: Replace D-Bus with plain HTTP added
Updated by coolo almost 6 years ago
- Related to action #42980: job stayed in assigned but is dead added
Updated by coolo almost 6 years ago
- Related to action #47087: [scheduling] Workers on openqaworker2 stuck frequently added
Updated by mkittler almost 6 years ago
A few notes from our last tools team meeting regarding the worker:
- Remove engines to simplify the code. The worker was never really independent from isotovideo anyways.
- Use Minion jobs for more than just the cache service (e.g. for uploading results).
- Allow multiple slots per (systemd) service.
Updated by mkittler almost 6 years ago
- Related to action #46187: Create list of "worker responsibilities" added
Updated by okurz over 5 years ago
- Category changed from 122 to Feature requests
Updated by okurz over 5 years ago
- Blocks action #41027: worker disconnects during cleanup added
Updated by mkittler about 5 years ago
- Blocks deleted (action #41066: Scheduling jobs for IPMI (bare metal) on the same worker (aka FOLLOW_TEST_DIRECTLY aka START_DIRECTLY_AFTER_TEST).)
Updated by mkittler about 5 years ago
- Blocks deleted (action #41027: worker disconnects during cleanup)
Updated by okurz over 4 years ago
- Subject changed from EPIC: Fix worker->websocket->scheduler->webui connection to [epic] Fix worker->websocket->scheduler->webui connection
- Status changed from New to Resolved
- Assignee set to okurz
Over the past months a lot of work has happened in the area of communication and architecture. We have definitely not changed the big-picture architecture, but we have reworked stuff and at least added much more coverage. Some points that had been mentioned have been done since then, e.g. "use external processes for uploading results". Beyond that, only more specific tasks would help. I don't see how a ticket "Fix worker…" would help anymore.
Updated by szarate about 4 years ago
- Tracker changed from action to coordination
- Difficulty deleted (hard)
Updated by szarate about 4 years ago
For the reason for the tracker change, see: http://mailman.suse.de/mailman/private/qa-sle/2020-October/002722.html