action #162494

closed

telegraf error on some OSD controlled machines "W! [inputs.diskio] Error gathering disk info: no such file or directory" size:S

Added by okurz 10 months ago. Updated 29 days ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Regressions/Crashes
Start date:
2024-06-19
Due date:
% Done:

0%

Estimated time:

Description

Observation

sudo salt \* cmd.run 'journalctl -u telegraf | grep -c "inputs\.diskio"'

openqaworker18.qa.suse.cz:
    0
backup-qam.qe.nue2.suse.org:
    0
…
worker40.oqa.prg2.suse.org:
    1896
…
worker33.oqa.prg2.suse.org:
    4162

Output shown is an excerpt, not the complete list.

Acceptance criteria

  • AC1: No errors or warnings related to gathering disk info in telegraf journal

Acceptance tests

  • AT1-1: sudo salt \* cmd.run 'journalctl -u telegraf | grep -c "inputs\.diskio"' is 0

Suggestions

  • Web research for W! [inputs.diskio] Error gathering disk info: no such file or directory and try telegraf -test on the affected machines to reproduce. Probably we need to exclude devices that do not exist in /dev from the disk lookup?
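One caveat for AT1-1, simulated locally with a fake journal line (grep behavior only, no salt involved): grep -c prints the match count, but with zero matches it also exits non-zero, which salt's cmd.run may report as a failed command.

```shell
# grep -c prints the number of matching lines; simulated journal excerpt:
printf 'telegraf[123]: W! [inputs.diskio] Error gathering disk info: no such file or directory\n' \
  | grep -c 'inputs\.diskio'
# Note: with zero matches grep -c still prints "0" but exits with status 1;
# appending "|| true" keeps the exit status clean for cmd.run.
```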
Actions #1

Updated by okurz 8 months ago

  • Subject changed from telegraf error on some OSD controlled machines "W! [inputs.diskio] Error gathering disk info: no such file or directory" to telegraf error on some OSD controlled machines "W! [inputs.diskio] Error gathering disk info: no such file or directory" size:S
  • Description updated (diff)
  • Status changed from New to Workable
Actions #2

Updated by tinita 7 months ago

  • Target version changed from Tools - Next to Ready
Actions #3

Updated by okurz 7 months ago

  • Target version changed from Ready to Tools - Next
Actions #4

Updated by okurz about 1 month ago

  • Target version changed from Tools - Next to Ready
Actions #5

Updated by jbaier_cz about 1 month ago

  • Assignee set to jbaier_cz
Actions #6

Updated by jbaier_cz about 1 month ago

  • Status changed from Workable to In Progress
Actions #7

Updated by jbaier_cz about 1 month ago

Meh, this leads to telegraf trying to read a non-existent /dev/nvme0c0n1 (because /sys/block/nvme0c0n1 is detected). But this is some sort of NVMe multipath construct which is not supposed to have a corresponding /dev node, so it looks like the telegraf diskio plugin cannot handle that case nicely.

I was able to find some similar problems in other software[3].

It looks like we do have that feature enabled [2] (it is probably relevant only for some disks, usually the larger ones), and it is probably also enabled by default [1]:

#  cat /sys/module/nvme_core/parameters/multipath
Y

Afaik we do not really use it, so one way out could be to simply disable it. According to [1] it should just be a kernel parameter.

[1]: https://documentation.suse.com/sles/15-SP6/html/SLES-all/cha-nvmeof.html#sec-nvmeof-host-configuration-multipathing
[2]: https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/8/html/configuring_device_mapper_multipath/enabling-multipathing-on-nvme-devices_configuring-device-mapper-multipath#proc_enabling-native-nvme-multipathing_enabling-multipathing-on-nvme-devices
[3]: https://github.com/google/cadvisor/issues/3340
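A minimal sketch of what that kernel parameter change could look like on a grub-based host (the file paths are the usual SLES/openSUSE ones and are assumed here; [1] documents the parameter itself):

```shell
# /etc/default/grub (excerpt): append the module parameter to the kernel cmdline
GRUB_CMDLINE_LINUX_DEFAULT="... nvme_core.multipath=N"

# then regenerate the grub configuration and reboot:
#   grub2-mkconfig -o /boot/grub2/grub.cfg
#   reboot

# verify after the reboot; should print "N":
#   cat /sys/module/nvme_core/parameters/multipath
```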

Actions #8

Updated by jbaier_cz about 1 month ago

  • Status changed from In Progress to Feedback

I created a draft https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1388 and would like to gather some feedback before merging/applying this.

Actions #9

Updated by jbaier_cz about 1 month ago

Maybe it is worth noting that inputs.diskio.tagdrop is useless in this case, as the attempt to read from /dev has already been made by the time tags are evaluated. Explicitly specifying the devices instead of the implicit devices=["*"] might work, but that risks not collecting everything we want/need.
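For illustration, a sketch of that explicit-devices variant (the device names are hypothetical and would need to match each worker's actual disks):

```toml
[[inputs.diskio]]
  # Explicit list instead of the implicit devices = ["*"]; controller-path
  # nodes like nvme0c0n1 are then never matched.
  # Risk: a disk added later is silently not collected until listed here.
  devices = ["sda", "nvme0n1", "nvme1n1"]
```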

Actions #10

Updated by nicksinger about 1 month ago

I took worker33 as an example because you mention it as one of the problematic machines, and indeed I found the device node you mentioned:

worker33:~ # ls -lah /sys/class/block/nvme?c*
lrwxrwxrwx 1 root root 0 Feb 23 03:34 /sys/class/block/nvme1c1n1 -> ../../devices/pci0000:80/0000:80:01.1/0000:81:00.0/nvme/nvme1/nvme1c1n1

Unfortunately I couldn't figure out what exactly in udev causes this node to be created. I checked what we have on worker33 against the kernel docs to see if we would lose any functionality. All three NVMes in worker33 have just one namespace, so there can't be any multipathing here ("The NVMe multipath feature in Linux integrates namespaces with the same identifier into a single block device."). The currently used policy is "NUMA", which selects the shorter path in multi-CPU machines ("The NUMA policy selects the path closest to the NUMA node of the current CPU"), which at least worker33 is not. I think this is why we can currently go ahead with the proposed change, but I can imagine several options to look into:

  1. Check out if the nvme-tool has a way to disable this feature on the disk itself so udev does not create this node
  2. Dig into udev if there is a way to avoid creating these nodes by e.g. setting a flag/env-variable (/usr/lib/udev/rules.d/56-multipath.rules could be interesting)
  3. Understand why telegraf considers them with devices = ["*"] - what does this *-wildcard mean? Can it be influenced?
Actions #11

Updated by jbaier_cz about 1 month ago · Edited

nicksinger wrote in #note-10:

  1. Check out if the nvme-tool has a way to disable this feature on the disk itself so udev does not create this node

I wasn't even able to find a way to list those, but I didn't read the documentation much.

  1. Dig into udev if there is a way to avoid creating these nodes by e.g. setting a flag/env-variable (/usr/lib/udev/rules.d/56-multipath.rules could be interesting)

Good idea. That would probably be more elegant than disabling it directly in the kernel.

  1. Understand why telegraf considers them with devices = ["*"] - what does this *-wildcard mean? Can it be influenced?

I can answer that right away: telegraf scans /sys/block (in a newer version it will read /sys/class/block) for devices matching the mask. All entries there are considered block devices and used (unfortunately, as we can see, not every entry there has a corresponding /dev node, and we see the result). To me it looks like a bug in telegraf, but I am no expert in this area to decide that.
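That scan can be modeled with a small shell sketch (simulated directories stand in for /sys/block and /dev; this illustrates the mismatch, not telegraf's actual Go code):

```shell
# List entries that exist under a (simulated) /sys/block but have no
# corresponding node under a (simulated) /dev -- exactly the devices
# the diskio plugin would fail to open.
find_missing() {
    sys_block=$1
    dev_dir=$2
    for entry in "$sys_block"/*; do
        name=${entry##*/}
        [ -e "$dev_dir/$name" ] || echo "$name"
    done
}

# Example: nvme0n1 has a /dev node, the multipath path nvme0c0n1 does not.
sys_block=$(mktemp -d); dev_dir=$(mktemp -d)
mkdir "$sys_block/nvme0n1" "$sys_block/nvme0c0n1"
touch "$dev_dir/nvme0n1"
find_missing "$sys_block" "$dev_dir"   # prints: nvme0c0n1
```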

Actions #12

Updated by jbaier_cz about 1 month ago

  • Status changed from Feedback to In Progress
Actions #13

Updated by jbaier_cz about 1 month ago

  • Status changed from In Progress to Feedback

I did a test on worker39. Below is the log from udev about the nvme0c0n1 device. To me it looks like it is added as part of standard nvme handling.

udevadm test /sys/class/block/nvme0c0n1
...
nvme0c0n1: /usr/lib/udev/rules.d/56-multipath.rules:32 Importing properties from results of '/sbin/multipath -u nvme0c0n1'
nvme0c0n1: Starting '/sbin/multipath -u nvme0c0n1'
Successfully forked off '(spawn)' as PID 118661.
nvme0c0n1: Process '/sbin/multipath -u nvme0c0n1' failed with exit code 1.
nvme0c0n1: /usr/lib/udev/rules.d/56-multipath.rules:32 Command "/sbin/multipath -u nvme0c0n1" returned 1 (error), ignoring
nvme0c0n1: /usr/lib/udev/rules.d/60-persistent-storage.rules:51 Replaced 1 slash(es) from result of ENV{ID_SERIAL}="$env{ID_MODEL}_$env{ID_SERIAL_SHORT}"
nvme0c0n1: /usr/lib/udev/rules.d/60-persistent-storage.rules:53 Replaced 1 slash(es) from result of ENV{ID_SERIAL}="$env{ID_MODEL}_$env{ID_SERIAL_SHORT}_$env{ID_NSID}"
nvme0c0n1: /usr/lib/udev/rules.d/60-persistent-storage.rules:110 Importing properties from results of builtin command 'path_id'
nvme0c0n1: /usr/lib/udev/rules.d/60-persistent-storage.rules:133 Importing properties from results of builtin command 'blkid'
nvme0c0n1: Failed to get device name: No such file or directory
nvme0c0n1: /usr/lib/udev/rules.d/60-persistent-storage.rules:133 Failed to run builtin 'blkid': No such file or directory
nvme0c0n1: /usr/lib/udev/rules.d/61-persistent-storage-compat.rules:48 Importing properties from '/usr/lib/udev/compat-symlink-generation'
nvme0c0n1: /usr/lib/udev/rules.d/90-iocost.rules:18 Importing properties from results of builtin command 'hwdb 'block::name:SAMSUNG MZPLJ6T4HALA-00007:fwrev:EPK9CB5Q:''
nvme0c0n1: No entry found from hwdb.
nvme0c0n1: /usr/lib/udev/rules.d/90-iocost.rules:18 Failed to run builtin 'hwdb 'block::name:SAMSUNG MZPLJ6T4HALA-00007:fwrev:EPK9CB5Q:'': No data available
nvme0c0n1: sd-device: Created db file '/run/udev/data/+block:nvme0c0n1' for '/devices/pci0000:80/0000:80:01.1/0000:81:00.0/nvme/nvme0/nvme0c0n1'
DEVPATH=/devices/pci0000:80/0000:80:01.1/0000:81:00.0/nvme/nvme0/nvme0c0n1
DEVTYPE=disk
DISKSEQ=1
ACTION=add
SUBSYSTEM=block
.SAVED_FM_WAIT_UNTIL=
ID_SERIAL_SHORT=S55KNC0TA00631
ID_WWN=eui.35354b3054a006310025384300000002
ID_MODEL=SAMSUNG MZPLJ6T4HALA-00007
ID_REVISION=EPK9CB5Q
ID_NSID=1
ID_SERIAL=SAMSUNG_MZPLJ6T4HALA-00007_S55KNC0TA00631_1
ID_PATH=pci-0000:81:00.0-nvme-1
ID_PATH_TAG=pci-0000_81_00_0-nvme-1
COMPAT_SYMLINK_GENERATION=2
.MODEL=SAMSUNG MZPLJ6T4HALA-00007
TAGS=:systemd:
CURRENT_TAGS=:systemd:
USEC_INITIALIZED=734347554766

Nevertheless, I added the kernel parameter to the grub configuration, and after the reboot only devices which telegraf can handle remain.

Actions #14

Updated by okurz 29 days ago

  • Status changed from Feedback to Resolved

MR merged and effective

sudo salt -t 10 \* cmd.run 'journalctl -u telegraf | grep 2025-03-05 | grep -c "inputs\.diskio"' is clean so I assume we are good
