action #109792

closed

[qe-core] Offline extraction of logs from Serial console

Added by szarate about 2 years ago. Updated over 1 year ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Enhancement to existing tests
Target version:
Start date:
2022-07-21
Due date:
% Done:

0%

Estimated time:
Difficulty:
Sprint:
QE-Core: August Sprint (Aug 03 - Aug 31)

Description

In some cases it is convenient to extract logs for post-processing via the system journal or the serial console, for example when the SUT has no network access or other limitations apply.

Ideally openQA would do something like:

cat > $serialdev <<EOF
---BEGIN OPENQA LOGFILE $test_module $log_filename ---
$log
---END OPENQA LOGFILE $test_module $log_filename ---
EOF

Or something like this:

   `echo "MARKER" $(echo "$log_string" | gzip | base64 -w0) "END MARKER" > $serialdev`

As a concept, this is what we want: `journalctl -k | tail --lines 100 | gzip | base64 | base64 -d | zcat`. os-autoinst handles the decoding part (`base64 -d | zcat`) to extract the files from the logs.

Upon test completion, a program monitoring the AMQP events would parse the serial log, split it into its individual parts, and upload the results to the corresponding job. gzip combined with base64 encoding could be used to reduce the number of lines in the serial log.
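The framing and splitting described above can be sketched in shell. This is only an illustration under assumptions: the function names `emit_log` and `extract_logs` are hypothetical, a regular file stands in for the serial device, the output naming `<test_module>-<log_filename>` is invented for the example, and module and file names are assumed to contain no spaces.

```shell
#!/bin/sh
# Sketch only: function names and output naming are illustrative assumptions.

# Frame one log file between BEGIN/END markers and append it to the "serial device".
# $1 = test module, $2 = log file, $3 = serial device (any writable path)
emit_log() {
    name=$(basename "$2")
    {
        printf '%s\n' "---BEGIN OPENQA LOGFILE $1 $name ---"
        gzip -c "$2" | base64
        printf '%s\n' "---END OPENQA LOGFILE $1 $name ---"
    } >> "$3"
}

# Split a captured serial log back into the original files.
# $1 = captured serial log, $2 = output directory
extract_logs() {
    mkdir -p "$2"
    awk -v out="$2" '
        { sub(/\r$/, "") }    # drop a trailing CR, common in serial captures
        /^---BEGIN OPENQA LOGFILE / { file = out "/" $4 "-" $5 ".b64"; next }
        /^---END OPENQA LOGFILE /   { if (file != "") close(file); file = ""; next }
        file != ""                  { print > file }
    ' "$1"
    # Decode every collected payload: base64 -> gzip -> original text
    for f in "$2"/*.b64; do
        [ -e "$f" ] || continue
        base64 -d < "$f" | gzip -dc > "${f%.b64}" && rm -f "$f"
    done
}
```

Anything outside a BEGIN/END pair (boot messages, test output) is ignored, so the framed payloads can be interleaved with normal serial traffic as long as no other output lands between a pair of markers.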

See https://amqp.opensuse.org for o3. For OSD, the URL is configured here: https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/server.sls. For a related thread in Slack, see https://suse.slack.com/archives/C02CANHLANP/p1649672715876659?thread_ts=1649671726.929779&cid=C02CANHLANP

Further down the line, this would make isolated systems and certain corner cases easier to investigate and debug.

Acceptance Criteria

  1. A method that allows streaming data to the serial console, compressed and encoded with base64 (check whether JeOS and SLE Micro lack the required binaries)
  2. Write a daemon that listens for events on jobs running on functional, extracts the logs, and uploads them to the corresponding job once it has finished
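The binary check in the first criterion could look roughly like the sketch below. The function names `pick_encoder` and `encode_log` and the fallback order are assumptions made for illustration, not an agreed design; a real implementation would also need to tell the decoder which encoding was used.

```shell
#!/bin/sh
# Sketch only: helper names and fallback order are illustrative assumptions.

# Print which encoding pipeline the installed tools allow.
pick_encoder() {
    if command -v gzip >/dev/null 2>&1 && command -v base64 >/dev/null 2>&1; then
        echo gzip+base64
    elif command -v base64 >/dev/null 2>&1; then
        echo base64
    else
        echo plain
    fi
}

# Encode a log file ($1) with the best available pipeline, writing to stdout.
encode_log() {
    case "$(pick_encoder)" in
        gzip+base64) gzip -c "$1" | base64 ;;
        base64)      base64 < "$1" ;;
        *)           cat "$1" ;;
    esac
}
```

On a minimal image, `encode_log /var/log/messages > /dev/$serialdev` would then degrade to uncompressed or plain output instead of failing outright.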

See also: https://openqa.suse.de/tests/8533774#step/systemd_sapstart_check/44 for a test that runs without network

Constraints

Use the serial console. If it proves unreliable, the system journal should be looked at instead (do mind the kernel rate limiting for messages).

Suggestions:

See https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/14681/files#diff-05f9f9adf140de4c57ec6f1c3f1034ef825689b5d16949ac3de6ea8655d768d3R258

Another idea is to save the logs to the ulogs directory directly from the Perl code.


Related issues 4 (1 open, 3 closed)

Related to openQA Tests - action #109789: [qe-core] generate a list of installed packages in the system at the end of a test (Resolved, dheidler, 2022-04-11)
Related to openQA Tests - action #110097: [sle][migration][sle15sp4] Need investigate how to get scsi id from grub shell page (Resolved, tinawang123, 2022-04-20)
Related to openQA Tests - action #114433: [qe-core] call systemctl list-timers --all prior shutdown of the system (Resolved, mgrifalconi, 2022-07-20)
Related to openQA Tests - coordination #68794: [qe-core][functional][epic] rework postfail hooks (Blocked, szarate, 2020-03-31)

Actions #1

Updated by szarate about 2 years ago

  • Subject changed from [qe-core] Offline extraction of logs to [qe-core] Offline extraction of logs from Serial console
Actions #2

Updated by szarate about 2 years ago

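sub log2serial {
    # Run the pipeline on the SUT and write the encoded kernel log to the serial device
    script_run("journalctl -k | gzip | base64 -w0 > /dev/$serialdev");
}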
Actions #3

Updated by szarate about 2 years ago

  • Related to action #109789: [qe-core] generate a list of installed packages in the system at the end of a test added
Actions #4

Updated by szarate about 2 years ago

  • Description updated (diff)
Actions #5

Updated by szarate about 2 years ago

  • Checklist item deleted (Add method in the test distribution to content to the serial console, taking compression and encoding into account)
  • Checklist item deleted (Write a demon that listens for events on jobs running on functional, that extracts the logs and uploads them to the corresponding job, once it has finished)
  • Description updated (diff)
  • Category set to Enhancement to existing tests
Actions #7

Updated by szarate about 2 years ago

  • Priority changed from Normal to High
Actions #8

Updated by szarate about 2 years ago

  • Description updated (diff)
Actions #9

Updated by szarate about 2 years ago

  • Target version set to QE-Core: Ready
Actions #10

Updated by szarate about 2 years ago

  • Sprint set to QE-Core: April Sprint (Apr 13 - May 11)
Actions #11

Updated by szarate about 2 years ago

  • Status changed from New to Workable
Actions #12

Updated by szarate almost 2 years ago

  • Description updated (diff)
Actions #13

Updated by szarate almost 2 years ago

  • Related to action #110097: [sle][migration][sle15sp4] Need investigate how to get scsi id from grub shell page added
Actions #14

Updated by szarate almost 2 years ago

  • Status changed from Workable to New
Actions #15

Updated by szarate almost 2 years ago

  • Sprint deleted (QE-Core: April Sprint (Apr 13 - May 11))
Actions #16

Updated by szarate almost 2 years ago

  • Sprint set to QE-Core: July Sprint (Jul 06 - Aug 03)
Actions #17

Updated by dheidler almost 2 years ago

Why would we need to have any external service reacting on some AMQP events?
We could simply use script_output().

Actions #18

Updated by szarate almost 2 years ago

dheidler wrote:

Why would we need to have any external service reacting on some AMQP events?
We could simply use script_output().

As long as the test ends up with those files, I'm OK with whichever solution you design :p

Actions #19

Updated by szarate almost 2 years ago

  • Status changed from New to Workable
Actions #20

Updated by szarate almost 2 years ago

  • Description updated (diff)
Actions #21

Updated by zluo almost 2 years ago

  • Assignee set to zluo

will check this.

@szarate which stage should be involved: POST_FAIL_HOOK, or the stage that uploads the logs (a file name would be helpful)?

Actions #22

Updated by szarate almost 2 years ago

@zluo, for the first part, ideally this would happen at the uploading_logs stage. If the filename is not provided, use the name of the test module.

We're looking to wrap the curl call that uploads the logs, detect whether the SUT has no network, and fall back to the serial console in that case. If you're going to save the script_output result to ulogs on the worker, as Dominik mentioned earlier, that's even easier, I would say.

Actions #24

Updated by szarate almost 2 years ago

Well, I found out that this works on my local host:

sub log2serial {
    script_run('journalctl -k | gzip | base64 -w0 > /tmp/test_module.zip');
}

You can do:

use Mojo::File 'path';
my $data = script_output('journalctl -k | gzip | base64 -w0');
path("ulogs/$logname")->spurt($data); # save the logs to the ulogs directory on the worker directly

When the test is finished, the worker should upload the files on its own; script_output should be able to handle that amount of data.

PS: no need to mark the comments as private, this is a public project :)

Actions #25

Updated by zluo almost 2 years ago

  • Status changed from Workable to In Progress
Actions #26

Updated by zluo almost 2 years ago

  • Start date changed from 2022-04-11 to 2022-07-21
Actions #27

Updated by szarate over 1 year ago

  • Related to action #114433: [qe-core] call systemctl list-timers --all prior shutdown of the system added
Actions #28

Updated by szarate over 1 year ago

Santiago to pair with Zaoliang

Actions #29

Updated by szarate over 1 year ago

  • Sprint changed from QE-Core: July Sprint (Jul 06 - Aug 03) to QE-Core: August Sprint (Aug 03 - Aug 31)
Actions #31

Updated by szarate over 1 year ago

Actions #32

Updated by zluo over 1 year ago

  • Status changed from In Progress to Resolved

PR got merged now.
