action #176121
closed
openQA Infrastructure (public) - coordination #161414: [epic] Improved salt based infrastructure management
salt-states-openqa pipeline deploy fails on master, SaltReqTimeoutError: Message timed out
0%
Description
deploy stage failing since 2025-01-23 at 21:58 (triggered by https://gitlab.suse.de/openqa/salt-states-openqa/-/commit/ab6442055afd6a6b8b59580c168992e1161573fa)
first run failed with xml.parsers.expat.ExpatError: no element found: line 1, column 0:
ID: python3-augeas
Function: pkg.installed
Result: False
Comment: Attempt 1: Returned a result of "False", with the following comment: "An exception occurred in this state: Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/salt/state.py", line 2402, in call
*cdata["args"], **cdata["kwargs"]
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
return self.loader.run(run_func, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
return callable(*args, **kwargs)
...
File "/usr/lib64/python3.6/xml/dom/expatbuilder.py", line 223, in parseString
parser.Parse(string, True)
xml.parsers.expat.ExpatError: no element found: line 1, column 0
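"no element found: line 1, column 0" is the message Python's expat parser gives when it is handed an empty string. One plausible reading, which is an assumption and not confirmed anywhere in this ticket, is that the package manager call behind pkg.installed returned no XML output at all. The sketch below reproduces the message; `parse_or_explain` is a hypothetical helper and not part of salt.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def parse_or_explain(output):
    """Parse XML text, turning the empty-input case into a clearer error.

    Hypothetical helper for illustration only; not salt code.
    """
    try:
        return xml.dom.minidom.parseString(output)
    except ExpatError as err:
        if not output.strip():
            # Empty or whitespace-only input: expat reports "no element found".
            raise ValueError("command returned no XML at all (empty output)") from err
        # Any other parse error is passed through unchanged.
        raise


if __name__ == "__main__":
    try:
        xml.dom.minidom.parseString("")
    except ExpatError as err:
        print(err)
```

If the assumption holds, the failure would be a symptom of a command that produced no output, not of malformed XML.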
rerun failed with salt.exceptions.SaltReqTimeoutError: Message timed out:
monitor.qe.nue2.suse.org:
The minion function caused an exception: Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/salt/minion.py", line 1912, in _thread_return
function_name, function_args, executors, opts, data
File "/usr/lib/python3.6/site-packages/salt/minion.py", line 1870, in _execute_job_function
return_data = self.executors[fname](opts, data, func, args, kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
return self.loader.run(run_func, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
return callable(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1249, in _run_as
ret = _func_or_method(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/executors/direct_call.py", line 10, in execute
return func(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
return self.loader.run(run_func, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
return callable(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1249, in _run_as
ret = _func_or_method(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/modules/state.py", line 1161, in highstate
initial_pillar=_get_initial_pillar(opts),
File "/usr/lib/python3.6/site-packages/salt/state.py", line 4953, in __init__
initial_pillar=initial_pillar,
File "/usr/lib/python3.6/site-packages/salt/state.py", line 774, in __init__
self.opts["pillar"] = self._gather_pillar()
File "/usr/lib/python3.6/site-packages/salt/state.py", line 877, in _gather_pillar
return pillar.compile_pillar()
File "/usr/lib/python3.6/site-packages/salt/pillar/__init__.py", line 360, in compile_pillar
dictkey="pillar",
File "/usr/lib/python3.6/site-packages/salt/utils/asynchronous.py", line 112, in wrap
lambda: getattr(self.obj, key)(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/ext/tornado/ioloop.py", line 459, in run_sync
return future_cell[0].result()
File "/usr/lib/python3.6/site-packages/salt/ext/tornado/concurrent.py", line 249, in result
raise_exc_info(self._exc_info)
File "<string>", line 4, in raise_exc_info
File "/usr/lib/python3.6/site-packages/salt/ext/tornado/gen.py", line 1064, in run
yielded = self.gen.throw(*exc_info)
File "/usr/lib/python3.6/site-packages/salt/channel/client.py", line 172, in crypted_transfer_decode_dictentry
timeout=timeout,
File "/usr/lib/python3.6/site-packages/salt/ext/tornado/gen.py", line 1056, in run
value = future.result()
File "/usr/lib/python3.6/site-packages/salt/ext/tornado/concurrent.py", line 249, in result
raise_exc_info(self._exc_info)
File "<string>", line 4, in raise_exc_info
File "/usr/lib/python3.6/site-packages/salt/ext/tornado/gen.py", line 1064, in run
yielded = self.gen.throw(*exc_info)
File "/usr/lib/python3.6/site-packages/salt/transport/zeromq.py", line 920, in send
ret = yield self.message_client.send(load, timeout=timeout)
File "/usr/lib/python3.6/site-packages/salt/ext/tornado/gen.py", line 1056, in run
value = future.result()
File "/usr/lib/python3.6/site-packages/salt/ext/tornado/concurrent.py", line 249, in result
raise_exc_info(self._exc_info)
File "<string>", line 4, in raise_exc_info
File "/usr/lib/python3.6/site-packages/salt/ext/tornado/gen.py", line 1064, in run
yielded = self.gen.throw(*exc_info)
File "/usr/lib/python3.6/site-packages/salt/transport/zeromq.py", line 630, in send
recv = yield future
File "/usr/lib/python3.6/site-packages/salt/ext/tornado/gen.py", line 1056, in run
value = future.result()
File "/usr/lib/python3.6/site-packages/salt/ext/tornado/concurrent.py", line 249, in result
raise_exc_info(self._exc_info)
File "<string>", line 4, in raise_exc_info
salt.exceptions.SaltReqTimeoutError: Message timed out
Workaround
- Retrigger
Updated by robert.richardson about 1 month ago
- Tags changed from alert, reactive work to alert, reactive work, infra
- Priority changed from Normal to Urgent
- Target version set to Ready
related to: https://progress.opensuse.org/issues/175695
Updated by robert.richardson about 1 month ago
- Related to action #175695: salt states sporadically fail in applying the security sensor repo with "xml.parsers.expat.ExpatError: syntax error: line 1, column 0", didn't we remove the security sensor repo? added
Updated by okurz about 1 month ago · Edited
- Description updated (diff)
- Assignee set to okurz
- Priority changed from Urgent to High
I found one upstream issue https://github.com/saltstack/salt/issues/53147 "Salt Tornado API: salt.exceptions.SaltReqTimeoutError: Message timed out". A newer python and salt version might help but is of course not easy to achieve on Leap. I will look into the issue and consider retrying in the corresponding stage. Apparently robert.richardson already retriggered, as the third run of deploy succeeded in https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/3705753
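As a rough illustration of the retry idea, here is a minimal sketch that reruns a command only when its output contains one of the two error names seen in this ticket. It is not the content of any merge request linked here; `run_with_retries`, its parameters and the marker list are hypothetical names chosen for the example.

```python
import subprocess
import time

# Error names taken from the two failures in this ticket; treating them
# as "transient" is an assumption made for this sketch.
TRANSIENT_MARKERS = ("SaltReqTimeoutError", "ExpatError")


def run_with_retries(cmd, attempts=3, delay=30.0, markers=TRANSIENT_MARKERS):
    """Run cmd and retry only if it fails with a known transient marker.

    Returns the CompletedProcess of the first successful run and raises
    subprocess.CalledProcessError for non-transient or exhausted failures.
    """
    for attempt in range(1, attempts + 1):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        if proc.returncode == 0:
            return proc
        transient = any(marker in proc.stdout for marker in markers)
        if not transient or attempt == attempts:
            # Real failures and exhausted retries must still fail the job.
            raise subprocess.CalledProcessError(
                proc.returncode, cmd, output=proc.stdout
            )
        time.sleep(delay)
```

Matching on specific error names keeps genuine state failures visible instead of hiding them behind blanket retries.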
Updated by okurz about 1 month ago
- Due date set to 2025-02-11
- Status changed from New to Feedback
Two related improvements: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1354
Updated by okurz about 1 month ago
- Due date deleted (2025-02-11)
- Status changed from Feedback to Resolved
Merged and deployed. Looks good for now. We will have to see from production whether more related problems come up.
Updated by jbaier_cz 27 days ago
- Related to action #175989: Too big logfiles causing failed systemd services alert: logrotate (monitor, openqaw5-xen, s390zl12) size:S added
Updated by okurz 3 days ago
- Copied to action #178078: salt pipeline deploy fails on master, SaltReqTimeoutError: Message timed out for petrol added