action #121579

open

Logs of openqa-worker-cacheservice-minion are incomplete and inconsistent

Added by nicksinger about 2 years ago. Updated about 2 years ago.

Status: New
Priority: Normal
Assignee: -
Category: Feature requests
Target version: QA (public, currently private due to #173521) - future
Start date: 2022-12-06
Due date:
% Done: 0%
Estimated time:

Description

Observation

While collecting logs for poo#121573 we noticed that the journal of the service differs from the minion log (visible in the webui), and that both differ from what we would expect from reading the code. An example:

minion log on webui:

    [info] [#45742] Cache size of "/var/lib/openqa/cache" is 48 GiB, with limit 50 GiB
    [info] [#45742] Downloading "autoyast_SLES-12-SP4-ppc64le-HA-updated.qcow2" from "http://openqa.suse.de/tests/10091929/asset/hdd/autoyast_SLES-12-SP4-ppc64le-HA-updated.qcow2"
    [info] [#45742] Content of "/var/lib/openqa/cache/openqa.suse.de/autoyast_SLES-12-SP4-ppc64le-HA-updated.qcow2" has not changed, updating last use

journalctl -u openqa-worker-cacheservice-minion.service:

    Dec 06 13:29:15 powerqaworker-qam-1 openqa-worker-cacheservice-minion[50194]: [50194] [i] Downloading: "sle-12-SP4-ppc64le-ha-alpha-alpha-node01.qcow2"
    Dec 06 13:29:52 powerqaworker-qam-1 openqa-worker-cacheservice-minion[50194]: [50194] [i] Cache size of "/var/lib/openqa/cache" is 50 GiB, with limit 50 GiB

@okurz also found code that should print "purging" messages when cached assets are deleted. I can't find these messages in the journal at all.

It also seems that the minion job id is logged in the journal only sometimes, which makes it quite hard to find the minion job that corresponds to a journal entry and vice versa.
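
To make the mismatch between the two excerpts above concrete, here is a minimal Python sketch that parses one line of each format into a dictionary. The regular expressions are derived only from the sample lines quoted in this ticket and are an assumption about the general format; the function names are hypothetical and not part of openQA.

```python
import re

# Format seen in the minion log on the webui (from the sample above):
#   [info] [#45742] message
WEBUI_RE = re.compile(
    r'^\[(?P<level>\w+)\] \[#(?P<job_id>\d+)\] (?P<message>.*)$'
)

# Format seen in journalctl output (from the sample above):
#   Dec 06 13:29:15 host unit[pid]: [pid] [i] message
JOURNAL_RE = re.compile(
    r'^(?P<timestamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) '
    r'(?P<unit>[\w.-]+)\[(?P<pid>\d+)\]: \[\d+\] \[(?P<level>\w+)\] '
    r'(?P<message>.*)$'
)


def parse_webui_line(line):
    """Return the fields of a webui minion log line, or None if it does not match."""
    match = WEBUI_RE.match(line.strip())
    return match.groupdict() if match else None


def parse_journal_line(line):
    """Return the fields of a journal line, or None if it does not match."""
    match = JOURNAL_RE.match(line.strip())
    return match.groupdict() if match else None


webui = parse_webui_line(
    '[info] [#45742] Cache size of "/var/lib/openqa/cache" is 48 GiB, with limit 50 GiB'
)
journal = parse_journal_line(
    'Dec 06 13:29:15 powerqaworker-qam-1 openqa-worker-cacheservice-minion[50194]: '
    '[50194] [i] Downloading: "sle-12-SP4-ppc64le-ha-alpha-alpha-node01.qcow2"'
)

# Fields that exist on only one side: the webui line carries a job id,
# the journal line carries a process id instead.
print(sorted(set(webui) - set(journal)))
print(sorted(set(journal) - set(webui)))
```

In these samples the only fields the two formats share are the level and the message, and even the level is spelled differently (`info` vs. `i`), so there is no common key for joining the two logs.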

Acceptance criteria

  • AC1: The log output of a minion job on the webui is consistent with, and the same as, the system journal
  • AC2: Asset deletion is also logged in the system journal
  • AC3: The system journal includes a reference to the corresponding minion job id
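
As an illustration of what the criteria above ask for, the following Python sketch sends every message through one shared formatter to two sinks and attaches the job id to each record. This is only an analogy: the cache service is not written in Python, and the helper `make_job_logger`, the line format, the in-memory sinks, and the wording of the "Purging" message are all assumptions made for the example.

```python
import io
import logging

# Hypothetical line format: level, minion job id, message.
LINE_FORMAT = '[%(levelname)s] [#%(job_id)s] %(message)s'


def make_job_logger(job_id, *streams):
    """Build a logger that writes identical lines to every given stream."""
    logger = logging.getLogger(f'cache-job-{job_id}')
    logger.setLevel(logging.INFO)
    logger.propagate = False   # keep the example output out of the root logger
    logger.handlers.clear()    # avoid duplicate handlers on repeated calls
    formatter = logging.Formatter(LINE_FORMAT)
    for stream in streams:
        handler = logging.StreamHandler(stream)
        handler.setFormatter(formatter)  # same formatter for every sink (AC1)
        logger.addHandler(handler)
    # The adapter adds job_id to every record (AC3).
    return logging.LoggerAdapter(logger, {'job_id': job_id})


# Two in-memory sinks stand in for the webui job log and the system journal.
job_log_sink = io.StringIO()
journal_sink = io.StringIO()

log = make_job_logger(45742, job_log_sink, journal_sink)
log.info('Cache size of "%s" is %s, with limit %s',
         '/var/lib/openqa/cache', '50 GiB', '50 GiB')
# Hypothetical deletion message; it reaches both sinks like any other (AC2).
log.info('Purging "%s"', 'example.qcow2')

print(journal_sink.getvalue(), end='')
```

Because both sinks share the formatter and the adapter, neither log can omit the job id or a deletion message that the other one contains.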

Suggestions

  • Try to understand why #121573 went wrong and take a look at the several logs we have for the cacheservice. Observe how these log files tell different things and how hard it is to link them together into a complete picture of what happened to the job and the cacheservice at that time.

Related issues 1 (1 open, 0 closed)

Related to openQA Project (public) - action #121573: Asset/HDD goes missing while job is running (New, 2022-12-06)

Actions #1

Updated by nicksinger about 2 years ago

  • Related to action #121573: Asset/HDD goes missing while job is running added
Actions #2

Updated by okurz about 2 years ago

  • Subject changed from Logs of openqa-worker-cacheservice-minion are incomplete to Logs of openqa-worker-cacheservice-minion are incomplete and inconsistent
  • Description updated (diff)
  • Category set to Feature requests
  • Target version set to future