action #124398

closed

salt gitlab CI jobs fail due to exceeding log length of 500kB size:M

Added by okurz about 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 2023-02-13
Due date: 2023-03-26
% Done: 0%
Estimated time:
Tags:

Description

Observation

https://gitlab.suse.de/openqa/salt-pillars-openqa/-/jobs/1396813 fails with the message

Job's log exceeded limit of 4194304 bytes.
Job execution will continue but no more output will be collected.

so we don't know what the actual problem was.

Acceptance criteria

  • AC1: salt gitlab CI jobs show obvious error messages in all relevant cases

Suggestions

Actions #1

Updated by jbaier_cz about 1 year ago

  • Status changed from Workable to In Progress
  • Assignee set to jbaier_cz
Actions #2

Updated by jbaier_cz about 1 year ago

Having salt write its output directly into a file would store that file on the remote server, so in case of failure we would still need to download it. Instead, I tried to store the output from the ssh session, which should provide more or less the same: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/798

Actions #3

Updated by openqa_review about 1 year ago

  • Due date set to 2023-03-07

Setting due date based on mean cycle time of SUSE QE Tools

Actions #4

Updated by jbaier_cz about 1 year ago

The whole raw log is nicely served by gitlab, for example the latest artifact right now: https://openqa.io.suse.de/-/salt-states-openqa/-/jobs/1417570/artifacts/salt_highstate.log

Actions #5

Updated by jbaier_cz about 1 year ago

  • Status changed from In Progress to Resolved
Actions #6

Updated by livdywan about 1 year ago

  • Status changed from Resolved to Feedback

It seems like this is still a problem? I can see that the log is trimmed and the pipeline fails, but there are no artifacts available:

https://gitlab.suse.de/openqa/salt-pillars-openqa/-/jobs/1446797

Actions #7

Updated by jbaier_cz about 1 year ago

The reason is in the header:

The script exceeded the maximum execution time set for the job

So the post-fail hook that uploads the artifact did not get a chance to run.

Actions #8

Updated by jbaier_cz about 1 year ago

If the long running time was expected, we should raise the timeout (Settings -> CI/CD -> General pipelines -> Timeout); if there is an unwanted deadlock that we want to investigate, we can sacrifice some verbosity (by piping the output through grep).

From what I see in the raw log, some operations took a surprisingly long time, for example:

          ID: /etc/zypp/zypp.conf
    Function: ini.options_present
      Result: True
     Comment: No anomaly detected
     Started: 09:27:57.362744
    Duration: 19921.266 ms
     Changes:   
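
To spot such slow operations without reading the whole raw log, the state blocks can be scanned for long durations. The following is only a minimal sketch, assuming the log keeps the `ID:` / `Duration: … ms` layout shown above; the function name `slow_states` and the 10 second threshold are hypothetical choices, not part of the salt tooling.

```python
import re

# Matches lines such as "    Duration: 19921.266 ms" in the state output.
DURATION_RE = re.compile(r"^\s*Duration:\s*([0-9.]+)\s*ms\s*$")
# Matches lines such as "          ID: /etc/zypp/zypp.conf".
ID_RE = re.compile(r"^\s*ID:\s*(.+?)\s*$")


def slow_states(lines, threshold_ms=10_000.0):
    """Return (state_id, duration_ms) pairs at or above threshold_ms, slowest first."""
    found = []
    current_id = None
    for line in lines:
        id_match = ID_RE.match(line)
        if id_match:
            # Remember the state this block belongs to.
            current_id = id_match.group(1)
            continue
        duration_match = DURATION_RE.match(line)
        if duration_match and current_id is not None:
            duration = float(duration_match.group(1))
            if duration >= threshold_ms:
                found.append((current_id, duration))
            # One duration per state block; wait for the next ID line.
            current_id = None
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Called with the lines of the downloaded raw log, for example `slow_states(open("salt_highstate.log", encoding="utf-8"))`, it would list the `/etc/zypp/zypp.conf` state above with its roughly 20 second duration.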
Actions #9

Updated by okurz about 1 year ago

  • Due date changed from 2023-03-07 to 2023-03-17
Actions #10

Updated by nicksinger about 1 year ago

  • Status changed from Feedback to Workable

We discussed in the infra daily that we still need to get rid of the error caused by overly large log files. @jbaier_cz wants to look into piping a verbose log into an artifact while showing critical errors on stderr so that they are displayed in gitlab.
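
The idea discussed in the daily could be sketched roughly as follows. This is an illustration only and not the change that was eventually merged: the helper `split_output` and the pattern used to decide which lines count as critical are assumptions.

```python
import re

# Assumption: a line is worth showing in the job console if it mentions
# an error level or a failed state result.
IMPORTANT_RE = re.compile(r"ERROR|CRITICAL|Result: False")


def split_output(lines, write_full, write_console):
    """Send every line to write_full and only important lines to write_console.

    Returns the number of lines forwarded to the console.
    """
    shown = 0
    for line in lines:
        write_full(line)  # complete log, meant to be uploaded as an artifact
        if IMPORTANT_RE.search(line):
            write_console(line)  # short excerpt that stays under the log limit
            shown += 1
    return shown
```

In a CI job this could be wired up as `split_output(sys.stdin, log.write, sys.stderr.write)`, with `log` being an open file such as `salt_highstate.log` and the salt output piped in on stdin.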

Actions #11

Updated by jbaier_cz about 1 year ago

The ideal solution apparently depends on a GitLab feature that is requested but not yet implemented: https://gitlab.com/gitlab-org/gitlab/-/issues/284186

Actions #12

Updated by okurz about 1 year ago

From salt --help there is:

    -l LOG_LEVEL, --log-level=LOG_LEVEL
                        Console logging log level. One of 'all', 'garbage',
                        'trace', 'debug', 'profile', 'info', 'warning',
                        'error', 'critical', 'quiet'. Default: 'warning'.
    --log-file=LOG_FILE
                        Log file path. Default: '/var/log/salt/master'.
    --log-file-level=LOG_LEVEL_LOGFILE
                        Logfile logging log level. One of 'all', 'garbage',
                        'trace', 'debug', 'profile', 'info', 'warning',
                        'error', 'critical', 'quiet'. Default: 'warning'.
…
    --state-output=STATE_OUTPUT, --state_output=STATE_OUTPUT
                        Override the configured state_output value for minion
                        output. One of 'full', 'terse', 'mixed', 'changes' or
                        'filter'. Default: 'none'.

so maybe we can use ssh "salt … --log-file=/dev/stdout --log-file-level=debug --state-output=changes" > log and upload log as gitlab artifact and show stderr in the gitlab console output window (or vice-versa). If piping to /dev/stdout or similar isn't possible then save to an explicit file which is copied over at the end of execution.
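
A minimal sketch of that suggestion, assuming salt accepts `/dev/stdout` as a log file (which is not confirmed here). The helper names, the `host` parameter and the `salt_args` placeholder standing in for the elided part of the command are hypothetical; only the three options are taken from the help text above.

```python
import shlex
import subprocess


def build_remote_command(salt_args, log_level="debug", state_output="changes"):
    """Return the remote salt command line as one shell-quoted string."""
    argv = [
        "salt",
        *salt_args,  # placeholder for the target and function to run
        "--log-file=/dev/stdout",
        f"--log-file-level={log_level}",
        f"--state-output={state_output}",
    ]
    return shlex.join(argv)


def run_over_ssh(host, salt_args, log_path):
    """Run salt on host; stdout is saved to log_path, stderr stays on the console."""
    with open(log_path, "wb") as log:
        result = subprocess.run(
            ["ssh", host, build_remote_command(salt_args)],
            stdout=log,  # verbose output, to be uploaded as an artifact
            check=False,  # let the caller decide how to handle failures
        )
    return result.returncode
```

The returned exit code can be passed on as the job result so that the pipeline still fails when salt fails, while the console only receives what salt writes to stderr.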

Actions #14

Updated by jbaier_cz about 1 year ago

  • Due date changed from 2023-03-17 to 2023-03-26
  • Status changed from Workable to In Progress

The output still looks quite verbose despite the change, so I will need to iterate on it again.

Actions #15

Updated by jbaier_cz about 1 year ago

I had a typo in the --state-output parameter value, https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/812 should fix that.

Actions #16

Updated by jbaier_cz about 1 year ago

  • Status changed from In Progress to Resolved

Now we have a reasonably concise console log that fits within the CI log size limit, with the full debug log available as an artifact.
