action #176325
opencoordination #161414: [epic] Improved salt based infrastructure management
sporadic: zypper related stack traces in salt pipeline
% Done: 0%
Description
Observation
Sometimes our salt pipelines fail with errors like:
worker36.oqa.prg2.suse.org:
----------
ID: security-sensor
Function: pkg.latest
Name: velociraptor-client
Result: False
Comment: An exception occurred in this state: Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/salt/state.py", line 2402, in call
*cdata["args"], **cdata["kwargs"]
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
return self.loader.run(run_func, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
return callable(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1249, in _run_as
ret = _func_or_method(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1285, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/states/pkg.py", line 2659, in latest
*desired_pkgs, fromrepo=fromrepo, refresh=refresh, **kwargs
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
return self.loader.run(run_func, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
return callable(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1249, in _run_as
ret = _func_or_method(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 828, in latest_version
package_info = info_available(*names, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 752, in info_available
"info", "-t", "package", *batch[:batch_size]
File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 439, in __call
salt.utils.stringutils.to_str(self.__call_result["stdout"])
File "/usr/lib64/python3.6/xml/dom/minidom.py", line 1968, in parseString
return expatbuilder.parseString(string)
File "/usr/lib64/python3.6/xml/dom/expatbuilder.py", line 925, in parseString
return builder.parseString(string)
File "/usr/lib64/python3.6/xml/dom/expatbuilder.py", line 223, in parseString
parser.Parse(string, True)
xml.parsers.expat.ExpatError: syntax error: line 1, column 0
Started: 14:56:34.269887
Duration: 1873.979 ms
Changes:
or
monitor.qe.nue2.suse.org:
----------
ID: grafana
Function: pkg.latest
Result: False
Comment: An exception occurred in this state: Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/salt/state.py", line 2402, in call
*cdata["args"], **cdata["kwargs"]
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
return self.loader.run(run_func, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
return callable(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1249, in _run_as
ret = _func_or_method(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1285, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/states/pkg.py", line 2659, in latest
*desired_pkgs, fromrepo=fromrepo, refresh=refresh, **kwargs
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
return self.loader.run(run_func, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
return callable(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1249, in _run_as
ret = _func_or_method(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 828, in latest_version
package_info = info_available(*names, **kwargs)
File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 752, in info_available
"info", "-t", "package", *batch[:batch_size]
File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 439, in __call
salt.utils.stringutils.to_str(self.__call_result["stdout"])
File "/usr/lib64/python3.6/xml/dom/minidom.py", line 1968, in parseString
return expatbuilder.parseString(string)
File "/usr/lib64/python3.6/xml/dom/expatbuilder.py", line 925, in parseString
return builder.parseString(string)
File "/usr/lib64/python3.6/xml/dom/expatbuilder.py", line 223, in parseString
parser.Parse(string, True)
xml.parsers.expat.ExpatError: syntax error: line 1, column 0
Started: 11:47:06.739267
Also see https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/3726297
The issue happened more often with more "unstable" repositories (e.g. the security-sensor repo), but it still occurs occasionally, e.g. on our monitor host:
https://gitlab.suse.de/openqa/salt-pillars-openqa/-/jobs/3764576#L853
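As a side note on the message itself, the following minimal sketch (not taken from the pipeline) feeds a few stand-in strings to the same `minidom.parseString` call that appears in the trace. On CPython, leading plain text is reported as "syntax error: line 1, column 0", while empty input is reported as "no element found", which hints that zypper may have printed something other than XML rather than nothing at all. The sample strings are invented for illustration; the real zypper output is not known from the logs.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def parse_error(text):
    """Return the expat error message for `text`, or None if it parses."""
    try:
        minidom.parseString(text)
    except ExpatError as exc:
        return str(exc)
    return None


# Hypothetical stand-ins for what zypper might have written to stdout.
samples = {
    "well-formed XML": "<stream><message>ok</message></stream>",
    "plain-text message": "Repository metadata could not be read",
    "empty output": "",
}

for label, text in samples.items():
    print(f"{label!r}: {parse_error(text)}")
```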
Acceptance criteria
- AC1: Temporary errors reading repo metadata don't cause salt pipeline failures
Suggestions
- This occurs with multiple packages including velociraptor and grafana
- The error seems to suggest that the `pkg.latest` state tries to parse some XML from zypper but already fails very early ("line 1, column 0"). From the stack trace you can see `zypper info -t package`, maybe this can be used to replicate our problems?
- Research if there is another way to specify repos/packages in salt to avoid this symptom
- Look up known issues
- Find or file a zypper issue upstream
- Consider changing zypper settings, e.g. disable auto-refresh
- Find or file an OBS issue upstream
- Come up with a reproducer that doesn't depend on a salt pipeline run
  - `zypper info -t package grafana`
- Make the underlying error message visible, i.e. find out whether the "syntax error" is actually an error fetching the data
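The reproducer and the "make the error visible" suggestions could be combined in a small script along these lines. This is only a sketch under assumptions: the command line is guessed from the stack trace, with `--non-interactive` and `--xmlout` added on the assumption that salt requests XML output from zypper; `run_once` and `reproduce` are hypothetical helper names, not part of salt or zypper. `universal_newlines=True` is used instead of `text=True` so that it also runs on the Python 3.6 seen in the traces.

```python
import subprocess
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Assumption: approximates the call seen in the stack trace, with --xmlout
# added because salt parses the result as XML. Adjust to the real invocation.
ZYPPER_CMD = ["zypper", "--non-interactive", "--xmlout",
              "info", "-t", "package", "grafana"]


def run_once(cmd):
    """Run cmd once; return None on parseable XML, else a diagnostic dict."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    try:
        minidom.parseString(proc.stdout)
    except ExpatError as exc:
        # Keep the raw output so the real cause is visible, not just
        # the generic XML parser message.
        return {
            "error": str(exc),
            "returncode": proc.returncode,
            "stdout_head": proc.stdout[:500],
            "stderr_head": proc.stderr[:500],
        }
    return None


def reproduce(cmd=ZYPPER_CMD, attempts=100):
    """Repeat the command and collect a diagnostic for each unparseable run."""
    failures = []
    for attempt in range(1, attempts + 1):
        diagnostic = run_once(cmd)
        if diagnostic is not None:
            diagnostic["attempt"] = attempt
            failures.append(diagnostic)
    return failures
```

Calling `reproduce()` on an affected host and printing the returned list would show, for every failing attempt, the exit code and the first bytes of stdout and stderr, which should reveal whether the "syntax error" hides a metadata fetch problem.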