tickets #62204 (closed)

new VMs are "physical" for salt

Added by cboltz almost 5 years ago. Updated almost 5 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Core services and virtual infrastructure
Target version: -
Start date: 2020-01-16
Due date:
% Done: 100%
Estimated time:

Description

Some newly created VMs get recognized as physical instead of kvm by salt. Besides being funny (and causing a slightly different setup, see pillar/virtual/*), this results in an error when trying to run a highstate:

highstate 
local:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Specified SLS 'virt_cluster.atreju.physical' in environment 'production' is not available on the salt master

Affected VMs are:

# salt \* grains.get virtual | grep -B1 physical  # output slightly beautified
nue-ns1.infra.opensuse.org:
    physical
nue-ns2.infra.opensuse.org:
    physical
jekyll.infra.opensuse.org:
    physical
provo-gate.infra.opensuse.org:
    physical

while all other VMs show kvm for grains.get virtual

Lars, IIRC you mentioned that you used a slightly different libvirt config when setting up these VMs. Could this be the reason why the new VMs look like physical machines to salt?

(We'll probably end up reporting this as a salt bug, but let's first collect some information so that the salt devs can understand and fix this ;-)

Actions #1

Updated by cboltz almost 5 years ago

  • Private changed from Yes to No
Actions #3

Updated by lrupp almost 5 years ago

~> salt 'jekyll*' grains.get manufacturer
   jekyll.infra.opensuse.org:
      QEMU

~> systemd-detect-virt
   kvm

???

Actions #4

Updated by cboltz almost 5 years ago

It seems some VMs (nue-ns1 and jekyll) fixed themselves (no idea why) - but nue-ns2 still thinks it's made of iron, and forum.i.o.o also joined the iron club.

However, grains.get manufacturer results in QEMU and systemd-detect-virt says kvm for both nue-ns2 and forum.

provo-gate currently results in Minion did not return. [Not connected] which is something you might want to check and fix ;-)

Actions #5

Updated by cboltz almost 5 years ago

The code for setting the virtual grain is in /usr/lib/python3.6/site-packages/salt/grains/core.py in the _virtual function.

The default is grains = {'virtual': 'physical'} which is set at the beginning of the function.

Another default is _cmds = ['systemd-detect-virt', 'virt-what', 'dmidecode'] - that's the list of check commands.

However, if virt-what exists (which is only true on a few machines, including those mentioned in this ticket), then _cmds gets set to only virt-what:

    if not salt.utils.platform.is_windows() and osdata['kernel'] not in skip_cmds:
        if salt.utils.path.which('virt-what'):
            _cmds = ['virt-what']

The machines that show up as physical have one thing in common - if you manually call virt-what, you get

virt-what: virt-what-cpuid-helper program not found in $PATH

Therefore it's not too surprising that the default grains[virtual] = physical doesn't get changed.
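To make the failure mode concrete, here is a simplified sketch of that fallback logic - this is not salt's actual core.py code, and the injectable which/run helpers are purely for illustration, but it shows why a present-but-broken virt-what leaves the grain at its default:

```python
import shutil
import subprocess

DEFAULT_CMDS = ("systemd-detect-virt", "virt-what", "dmidecode")

def detect_virtual(which=shutil.which, run=None, cmds=DEFAULT_CMDS):
    """Simplified sketch of salt's _virtual() fallback logic."""
    if run is None:
        def run(path):
            # run a check command, return (exit code, stdout)
            out = subprocess.run([path], capture_output=True, text=True)
            return out.returncode, out.stdout.strip()
    grain = "physical"  # the default, set at the top of _virtual()
    if which("virt-what"):
        # salt trusts virt-what exclusively when it is installed -
        # so if virt-what is installed but broken, no other check
        # command runs and the grain stays 'physical'
        cmds = ("virt-what",)
    for cmd in cmds:
        path = which(cmd)
        if not path:
            continue
        rc, output = run(path)
        if rc == 0 and "kvm" in output:
            grain = "kvm"
            break
    return grain
```

With a broken virt-what (found on disk, but failing at runtime) the loop never overwrites the default, which is exactly what we see on the affected VMs.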

(As a funny sidenote - after reading the source, I noticed that /v/l/salt/minion shows an error message about this ;-)

For completeness: Some other machines (daffy1, daffy2, matomo, jekyll, baloo, svn, os-rt, riesling3, mybackup-opensuse) have a working virt-what that returns kvm as expected.

Everything above means that the problem is in virt-what, not salt itself.

virt-what is a script which changes PATH to include /usr/lib/, so in theory, it should find /usr/lib/virt-what-cpuid-helper - but in practice which virt-what-cpuid-helper fails :-(

The reason for that is even more entertaining - I added a plain which virt-what-cpuid-helper to the script (without the 2>/dev/null that the script originally uses when writing the path into a variable), and the result is (line 90 is the which call I added):

/usr/sbin/virt-what: line 90: which: command not found

Unsurprisingly, the machines listed in this ticket also don't have which installed. There are some more machines without which (especially caasp*) - but those also don't have virt-what installed.
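The underlying point is that spawning an external which binary is an avoidable dependency: a built-in lookup over the extended PATH finds the helper without it. A hypothetical sketch of that idea (virt-what itself is a shell script, where the equivalent would be the POSIX built-in command -v; the function name here is made up):

```python
import os
import shutil

def find_helper(name, extra_dirs=("/usr/lib",)):
    """Locate a helper binary by searching extra directories plus PATH,
    using Python's built-in lookup instead of spawning an external
    `which` - which may simply not be installed, as on these VMs."""
    search = os.pathsep.join(extra_dirs) + os.pathsep + os.environ.get("PATH", "")
    return shutil.which(name, path=search)
```

The same lookup done via an external which dies with "command not found" on a machine without the which package, even though the helper itself is sitting in /usr/lib.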

After so much debugging, having some good news would be good, right?

One zypper in which (on forum.i.o.o) later, virt-what reports kvm, and the virtual grain now also says kvm :-) (salt-minion might need a restart (or time) to update the grains)

Oh, and to make things even more funny - the RPM changelog of which says it was split off util-linux in January 2013, but OTOH virt-what still requires util-linux. I'm quite surprised that nobody noticed this before...

I reported this as https://bugzilla.opensuse.org/show_bug.cgi?id=1161850 and submitted MR 320 on gitlab to get the package installed on all 15.x VMs.

Actions #7

Updated by lrupp almost 5 years ago

  • Status changed from New to In Progress
  • Assignee changed from lrupp to cboltz
  • % Done changed from 0 to 100
Actions #9

Updated by cboltz almost 5 years ago

  • Status changed from In Progress to Closed

The workaround is merged in salt and deployed to the affected machines (except provo-gate, which currently isn't reachable by the salt-master).

Therefore I'd say we can close this ticket as fixed, even if the "fix" is a workaround ;-)
