Project

General

Profile

Actions

action #88053

open

Improve logging in scripts (was: openQA in openQA test fails with cryptic error enumerating jobs)

Added by livdywan almost 4 years ago. Updated almost 4 years ago.

Status:
Workable
Priority:
Low
Assignee:
-
Category:
Regressions/Crashes
Target version:
QA (public, currently private due to #173521) - future
Start date:
2021-01-20
Due date:
% Done:

0%

Estimated time:

Description

start_test appears to fail like so:

Test died: command 'job_id=$(openqa-client --host http://openqa.opensuse.org --json-output jobs get version=Tumbleweed scope=relevant arch=x86_64 build=$last_tw_build flavor=NET latest=1 | jq '.jobs | .[] | select(.test == "minimalx") | .id')' failed at /var/lib/openqa/cache/openqa1-opensuse/tests/openqa//tests/osautoinst/start_test.pm line 22

So wondering what this means, I attempt to reproduce quickly and get a long list of job id's after a considerable delay. So $job_id is not a single id, but why would that fail 🤔

Upon inspection of the screenshots there's a more specific error message:

parse error: Invalid numeric literal at line 1, column 6

I still don't know what that means since the list I saw is of the form 12345 23456 45678. But, this should be visible w/o a screenshot.


Related issues 1 (0 open1 closed)

Related to openQA Project (public) - action #82067: Unable to use openqa-clone-custom-git-refspec - Returns 403 Not authorizedResolvedokurz2021-01-08

Actions
Actions #1

Updated by livdywan almost 4 years ago

So I tried to reproduce running the snippet from start_test.pm in os-autoinst-distri-opensuse thinking it's a quoting issue, because the test output won't seemingly expand $last_tw_build unlike ttest.

It's worth noting the almost identical code this test fails in is start_test.pm in os-autoinst-distri-openQA (the zypper command has an extra command guard).

So, I'm guessing there must be something in the environment, and not with the quoting of the embedded shellscript?

Actions #2

Updated by okurz almost 4 years ago

  • Category set to Regressions/Crashes
  • Status changed from New to Workable
  • Target version set to Ready

First, in theory it should be expected that the openQA API route returns a list of ids. And the jq command is already written in a way that it would walk over multiple ids. The assignment to the bash variable would have list instead of a single job then. But the "parse error" sounds like openqa-client --json-output would not actually output json output.

Multiple things we could make out of this:

  • Find out why openqa-client --json-output should output a plain id list (let's start with that)
  • Fix the actual issue in https://github.com/os-autoinst/os-autoinst-distri-openQA/
  • Suggest to use serial terminal by default for terminal tests
  • Make sure serial terminal based tests still have a nice video (or at least exclude from the video)
Actions #3

Updated by tinita almost 4 years ago

I think I found the bug:

https://openqa.opensuse.org/tests/1594919#step/start_test/15

As you can see, the output from openqa-client is:

Otput: [INFO] Reading config from /etc/openqa/client.conf
{
  "jobs": [
...
Actions #4

Updated by tinita almost 4 years ago

  • Status changed from Workable to In Progress
  • Assignee set to tinita
Actions #5

Updated by livdywan almost 4 years ago

@tinita Yikes. Good catch! That's an info level log, but somehow it's still printed there.

So that explains the "invalid numeral"

Actions #6

Updated by tinita almost 4 years ago

Maybe it would be a good idea to make that output depend on if the command was called interactively.
If it's piped or redirected to a file, like in this case, the output should not appear.
There's a handy module for that: IO::Interactive.

Actions #7

Updated by tinita almost 4 years ago

  • Status changed from In Progress to Workable
  • Assignee deleted (tinita)
Actions #8

Updated by okurz almost 4 years ago

  • Priority changed from Normal to Urgent

Geez, now I see the relation. It's a recent regression. Please fix urgently

Actions #10

Updated by okurz almost 4 years ago

  • Priority changed from Urgent to Low
  • Target version changed from Ready to future

revert was merged. That is making the issue less severe. The ideas we discussed were that we could improve the logging. Underlying libraries should not log directly but be introspectable, e.g. provide the current configuration to be read by applications. Applications should be quiet by default, only output error messages. Log messages should be written to stderr rather than stdout. Higher levels of verbosity could be enabled with a -v|--verbose command line parameter or allowing multiple levels like is done in https://github.com/os-autoinst/openQA/blob/master/script/openqa-label-all#L27

Actions #11

Updated by okurz almost 4 years ago

  • Subject changed from openQA in openQA test fails with cryptic error enumerating jobs to Improve logging in scripts (was: openQA in openQA test fails with cryptic error enumerating jobs)
Actions #12

Updated by tinita almost 4 years ago

  • Related to action #82067: Unable to use openqa-clone-custom-git-refspec - Returns 403 Not authorized added
Actions #13

Updated by livdywan almost 4 years ago

https://github.com/os-autoinst/openQA/pull/3684 In this PR I proposed the "obvious" fix which led to a more general discussion about logging, or not logging as it were. Closed it for now.

Actions

Also available in: Atom PDF