Project

General

Profile

Actions

action #68305

closed

failed systemd services on grenache-1 after bootup

Added by okurz over 4 years ago. Updated over 4 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
Start date:
2020-06-22
Due date:
% Done:

0%

Estimated time:

Description

Observation

grafana alerts about failed systemd services on https://stats.openqa-monitor.qa.suse.de/d/KToPYLEWz/failed-systemd-services

the specific failed alert:

 # systemctl --failed
  UNIT                         LOAD   ACTIVE SUB    DESCRIPTION                                               
● systemd-modules-load.service loaded failed failed Load Kernel Modules                                       

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

1 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.
grenache-1:/home/okurz # systemctl status systemd-modules-load.service
● systemd-modules-load.service - Load Kernel Modules
   Loaded: loaded (/usr/lib/systemd/system/systemd-modules-load.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sun 2020-06-21 03:33:37 CEST; 1 day 10h ago
     Docs: man:systemd-modules-load.service(8)
           man:modules-load.d(5)
 Main PID: 1264 (code=exited, status=1/FAILURE)

Jun 21 03:33:37 grenache-1 systemd-modules-load[1264]: Failed to insert 'kvm_hv': No such device
Jun 21 03:33:37 grenache-1 systemd[1]: systemd-modules-load.service: Main process exited, code=exited, status=>
Jun 21 03:33:37 grenache-1 systemd[1]: Failed to start Load Kernel Modules.
Jun 21 03:33:37 grenache-1 systemd[1]: systemd-modules-load.service: Unit entered failed state.
Jun 21 03:33:37 grenache-1 systemd[1]: systemd-modules-load.service: Failed with result 'exit-code'.

due to

# cat /etc/modules-load.d/kvm.conf
kvm_hv

We need this module for kvm-on-power but not on grenache-1 which is not running any kvm tests.


Related issues 1 (0 open1 closed)

Related to openQA Project (public) - action #32563: [functional][u] fix salt for powerResolvedszarate2018-02-28

Actions
Actions #1

Updated by okurz over 4 years ago

  • Related to action #32563: [functional][u] fix salt for power added
Actions #2

Updated by okurz over 4 years ago

  • Status changed from New to Feedback
  • Assignee set to okurz

Originally this has been introduced with #32563 . Trying to distinguish grenache-1 which IIUC is a powerVM LPAR from the bare metal installations, e.g. with

okurz@openqa:~> sudo salt -l error -C 'G@roles:worker and G@osarch:ppc64le' grains.item os os_family osarch virtual kernel host cpu_flags cpu_model cpuarch
QA-Power8-4-kvm.qa.suse.de:
    ----------
    cpu_flags:
    cpu_model:
        Unknown
    cpuarch:
        ppc64le
    host:
        QA-Power8-4-kvm
    kernel:
        Linux
    os:
        SUSE
    os_family:
        Suse
    osarch:
        ppc64le
    virtual:
        physical
grenache-1.qa.suse.de:
    ----------
    cpu_flags:
    cpu_model:
        Unknown
    cpuarch:
        ppc64le
    host:
        grenache-1
    kernel:
        Linux
    os:
        SUSE
    os_family:
        Suse
    osarch:
        ppc64le
    virtual:
        physical

Found an approach now with custom grain to check on:
https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/322

Actions #3

Updated by okurz over 4 years ago

  • Status changed from Feedback to Resolved

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/323 as followup as the custom grains were not properly synced. I manually deleted the kvm module load file as cleanup. I rebooted grenache-1.qa manually once as the system was not executing any jobs right now and no services failed on startup.

Actions

Also available in: Atom PDF