Project

General

Profile

Actions

action #176325

open

coordination #161414: [epic] Improved salt based infrastructure management

sporadic: zypper related stack traces in salt pipeline

Added by dheidler about 1 month ago. Updated 5 days ago.

Status:
Blocked
Priority:
Normal
Assignee:
Category:
Regressions/Crashes
Start date:
Due date:
% Done:

0%

Estimated time:

Description

Observation

Sometimes our salt pipelines fail with errors like:

worker36.oqa.prg2.suse.org:
----------
          ID: security-sensor
    Function: pkg.latest
        Name: velociraptor-client
      Result: False
     Comment: An exception occurred in this state: Traceback (most recent call last):
                File "/usr/lib/python3.6/site-packages/salt/state.py", line 2402, in call
                  *cdata["args"], **cdata["kwargs"]
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
                  return self.loader.run(run_func, *args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
                  return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
                  return callable(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1249, in _run_as
                  ret = _func_or_method(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1285, in wrapper
                  return f(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/states/pkg.py", line 2659, in latest
                  *desired_pkgs, fromrepo=fromrepo, refresh=refresh, **kwargs
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
                  return self.loader.run(run_func, *args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
                  return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
                  return callable(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1249, in _run_as
                  ret = _func_or_method(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 828, in latest_version
                  package_info = info_available(*names, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 752, in info_available
                  "info", "-t", "package", *batch[:batch_size]
                File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 439, in __call
                  salt.utils.stringutils.to_str(self.__call_result["stdout"])
                File "/usr/lib64/python3.6/xml/dom/minidom.py", line 1968, in parseString
                  return expatbuilder.parseString(string)
                File "/usr/lib64/python3.6/xml/dom/expatbuilder.py", line 925, in parseString
                  return builder.parseString(string)
                File "/usr/lib64/python3.6/xml/dom/expatbuilder.py", line 223, in parseString
                  parser.Parse(string, True)
              xml.parsers.expat.ExpatError: syntax error: line 1, column 0
     Started: 14:56:34.269887
    Duration: 1873.979 ms
     Changes:   

or

monitor.qe.nue2.suse.org:
----------
          ID: grafana
    Function: pkg.latest
      Result: False
     Comment: An exception occurred in this state: Traceback (most recent call last):
                File "/usr/lib/python3.6/site-packages/salt/state.py", line 2402, in call
                  *cdata["args"], **cdata["kwargs"]
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
                  return self.loader.run(run_func, *args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
                  return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
                  return callable(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1249, in _run_as
                  ret = _func_or_method(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1285, in wrapper
                  return f(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/states/pkg.py", line 2659, in latest
                  *desired_pkgs, fromrepo=fromrepo, refresh=refresh, **kwargs
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
                  return self.loader.run(run_func, *args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
                  return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
                  return callable(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1249, in _run_as
                  ret = _func_or_method(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 828, in latest_version
                  package_info = info_available(*names, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 752, in info_available
                  "info", "-t", "package", *batch[:batch_size]
                File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 439, in __call
                  salt.utils.stringutils.to_str(self.__call_result["stdout"])
                File "/usr/lib64/python3.6/xml/dom/minidom.py", line 1968, in parseString
                  return expatbuilder.parseString(string)
                File "/usr/lib64/python3.6/xml/dom/expatbuilder.py", line 925, in parseString
                  return builder.parseString(string)
                File "/usr/lib64/python3.6/xml/dom/expatbuilder.py", line 223, in parseString
                  parser.Parse(string, True)
              xml.parsers.expat.ExpatError: syntax error: line 1, column 0
     Started: 11:47:06.739267

also see https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/3726297
The issue was happening more often with more "unstable" repositories (e.g. the security-sensor repo) but still happens sometimes e.g. on our monitor host:
https://gitlab.suse.de/openqa/salt-pillars-openqa/-/jobs/3764576#L853

Acceptance criteria

  • AC1: Temporary errors reading repo metadata don't cause salt pipeline failures

Suggestions

  • This occurs with multiple packages including velociraptor and grafana
  • The error seems to suggest that the pkg.latest-state tries to parse some XML from zypper but already fails very early ("line 1, column 0"). From the stacktrace you can see zypper info -t package, maybe this can be used to replicate our problems?
    • Research if there is another way to specify repos/packages in salt to avoid this symptom
    • Lookup known issues
    • Find or file a zypper issue upstream
    • Consider changing zypper settings, e.g. disable auto-refresh
    • Find or file an OBS issue upstream
  • Come up with a reproducer that doesn't depend on a salt pipeline run
    • zypper info -t package grafana`
  • Make the error message visible i.e. if the "syntax error" is actually an error fetching the data

Related issues 2 (1 open1 closed)

Related to openQA Infrastructure (public) - action #175695: salt states sporadically fail in applying the security sensor repo with "xml.parsers.expat.ExpatError: syntax error: line 1, column 0", didn't we remove the security sensor repo?Resolvedokurz2025-01-17

Actions
Copied to openQA Infrastructure (public) - action #176949: [timeboxed:10h][research] understand/prevent sporadic zypper related stack traces in salt size:SWorkable

Actions
Actions #1

Updated by tinita about 1 month ago

  • Related to action #175695: salt states sporadically fail in applying the security sensor repo with "xml.parsers.expat.ExpatError: syntax error: line 1, column 0", didn't we remove the security sensor repo? added
Actions #2

Updated by okurz about 1 month ago

  • Tags set to infra, reactive work, alert, salt
  • Status changed from New to In Progress
  • Assignee set to okurz
  • Target version set to Ready
Actions #3

Updated by okurz about 1 month ago

  • Priority changed from Low to High
Actions #4

Updated by okurz about 1 month ago

  • Status changed from In Progress to Feedback
Actions #6

Updated by nicksinger 27 days ago

  • Subject changed from sporadic? security sensor related syntax error in salt pipeline at least for worker36 to sporadic: zypper related stack traces in salt pipeline
  • Status changed from Resolved to Workable
  • Assignee deleted (okurz)
  • Priority changed from High to Normal
  • Start date deleted (2025-01-29)

I just encountered this again but with the grafana state on monitor - there we use the "official" devel:openQA repo:

monitor.qe.nue2.suse.org:
----------
          ID: grafana
    Function: pkg.latest
      Result: False
     Comment: An exception occurred in this state: Traceback (most recent call last):
                File "/usr/lib/python3.6/site-packages/salt/state.py", line 2402, in call
                  *cdata["args"], **cdata["kwargs"]
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
                  return self.loader.run(run_func, *args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
                  return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
                  return callable(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1249, in _run_as
                  ret = _func_or_method(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1285, in wrapper
                  return f(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/states/pkg.py", line 2659, in latest
                  *desired_pkgs, fromrepo=fromrepo, refresh=refresh, **kwargs
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
                  return self.loader.run(run_func, *args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1234, in run
                  return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
                  return callable(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1249, in _run_as
                  ret = _func_or_method(*args, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 828, in latest_version
                  package_info = info_available(*names, **kwargs)
                File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 752, in info_available
                  "info", "-t", "package", *batch[:batch_size]
                File "/usr/lib/python3.6/site-packages/salt/modules/zypperpkg.py", line 439, in __call
                  salt.utils.stringutils.to_str(self.__call_result["stdout"])
                File "/usr/lib64/python3.6/xml/dom/minidom.py", line 1968, in parseString
                  return expatbuilder.parseString(string)
                File "/usr/lib64/python3.6/xml/dom/expatbuilder.py", line 925, in parseString
                  return builder.parseString(string)
                File "/usr/lib64/python3.6/xml/dom/expatbuilder.py", line 223, in parseString
                  parser.Parse(string, True)
              xml.parsers.expat.ExpatError: syntax error: line 1, column 0
     Started: 11:47:06.739267

I searched our tickets but don't think we ever addressed this (maybe upstream) problem in a proper way or implemented workarounds/retries.

Actions #7

Updated by nicksinger 27 days ago

  • Description updated (diff)
Actions #8

Updated by dheidler 27 days ago

Hm - I wonder why it expects xml for this call.

There were no zypper calls emmitting xml on that minute with the info subcommand.

monitor:~ # zypper-log|grep 11:47 | grep info
================================================================================
Collect from /var/log/zypper.log ...

2025-02-05 11:47  25099  1.14.79  zypper --non-interactive --no-refresh info -t package nginx
2025-02-05 11:47  25141  1.14.79  zypper --non-interactive --no-refresh info -t package grafana
2025-02-05 11:47  25707  1.14.79  zypper --non-interactive --no-refresh info -t package nginx
2025-02-05 11:47  25873  1.14.79  zypper --non-interactive --no-refresh info -t package grafana
2025-02-05 11:47  26531  1.14.79  zypper --non-interactive --no-refresh info -t package loki
2025-02-05 11:47  26626  1.14.79  zypper --non-interactive --no-refresh info -t package influxdb


Actions #9

Updated by okurz 21 days ago

  • Status changed from Workable to New
Actions #10

Updated by okurz 21 days ago

  • Parent task set to #155182
Actions #11

Updated by okurz 21 days ago

  • Description updated (diff)
  • Assignee set to okurz
Actions #12

Updated by okurz 21 days ago

  • Description updated (diff)
Actions #13

Updated by okurz 21 days ago

  • Copied to action #176949: [timeboxed:10h][research] understand/prevent sporadic zypper related stack traces in salt size:S added
Actions #14

Updated by okurz 21 days ago

  • Status changed from New to Blocked
Actions #15

Updated by okurz 7 days ago

  • Parent task changed from #155182 to #161414
Actions #16

Updated by okurz 5 days ago

  • Target version changed from Ready to Tools - Next
Actions

Also available in: Atom PDF