action #170347

closed

NUE2 DHCP server dhcpd says "host unknown" (red), possibly related to squidbilly

Added by okurz 3 months ago. Updated 3 months ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Start date: 2024-11-27
Due date:
% Done: 0%
Estimated time:

Description

Observation

While trying to help pcervinka in https://suse.slack.com/archives/C02CANHLANP/p1732699255390289 with potential DHCP problems, I looked at the output of journalctl -u dhcpd on walter1.qe.nue2.suse.org and saw recurring messages like:

Nov 27 08:40:58 walter1 dhcpd[26087]: walter1: host unknown.
Nov 27 08:40:58 walter1 dhcpd[26087]: DHCPDISCOVER from 0c:42:a1:49:cc:90 via eth0
Nov 27 08:40:58 walter1 dhcpd[26087]: DHCPOFFER on 10.168.193.13 to 0c:42:a1:49:cc:90 via eth0
Nov 27 08:41:08 walter1 dhcpd[26087]: DHCPREQUEST for 10.168.193.13 (10.168.192.2) from 0c:42:a1:49:cc:90 via e>
Nov 27 08:41:08 walter1 dhcpd[26087]: DHCPACK on 10.168.193.13 to 0c:42:a1:49:cc:90 via eth0
Nov 27 08:41:19 walter1 dhcpd[26087]: DHCPDISCOVER from 0c:42:a1:49:cc:90 via eth0
Nov 27 08:41:19 walter1 dhcpd[26087]: DHCPOFFER on 10.168.193.13 to 0c:42:a1:49:cc:90 via eth0
Nov 27 08:41:22 walter1 dhcpd[26087]: DHCPREQUEST for 10.168.193.13 (10.168.192.2) from 0c:42:a1:49:cc:90 via e>
Nov 27 08:41:22 walter1 dhcpd[26087]: DHCPACK on 10.168.193.13 to 0c:42:a1:49:cc:90 via eth0

Maybe this is also a red herring, as in the end there is a DHCPACK, which looks fine according to what I see in https://gitlab.suse.de/search?search=squidbilly&nav_source=navbar&project_id=419&group_id=8&search_code=true&repository_ref=production
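
To judge whether the repeating DISCOVER/OFFER/REQUEST/ACK cycle is a real problem or a red herring, it can help to count message types per client. A minimal Python sketch, assuming log lines shaped like the excerpt above; the function name tally_dhcp_messages and the regular expression are illustrative and not part of any existing tooling:

```python
import re
from collections import Counter

# DHCP message type followed (somewhere later on the line) by a client MAC.
LINE_RE = re.compile(
    r"\b(DHCP[A-Z]+)\b.*?\b((?:[0-9a-f]{2}:){5}[0-9a-f]{2})\b"
)


def tally_dhcp_messages(log_text):
    """Count DHCP message types per client MAC address."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            message_type, mac = match.group(1), match.group(2)
            counts[(mac, message_type)] += 1
    return counts


# Lines copied from the journal excerpt (one is truncated in the source).
sample = """\
Nov 27 08:40:58 walter1 dhcpd[26087]: walter1: host unknown.
Nov 27 08:40:58 walter1 dhcpd[26087]: DHCPDISCOVER from 0c:42:a1:49:cc:90 via eth0
Nov 27 08:40:58 walter1 dhcpd[26087]: DHCPOFFER on 10.168.193.13 to 0c:42:a1:49:cc:90 via eth0
Nov 27 08:41:08 walter1 dhcpd[26087]: DHCPREQUEST for 10.168.193.13 (10.168.192.2) from 0c:42:a1:49:cc:90 via e>
Nov 27 08:41:08 walter1 dhcpd[26087]: DHCPACK on 10.168.193.13 to 0c:42:a1:49:cc:90 via eth0
"""

for (mac, message_type), count in sorted(tally_dhcp_messages(sample).items()):
    print(mac, message_type, count)
```

A client that shows many more DHCPDISCOVER than DHCPACK entries would hint at a lease problem; matching counts, as in the excerpt, suggest the handshake itself completes.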

Actions #1

Updated by dheidler 3 months ago

  • Status changed from New to In Progress
  • Assignee set to dheidler
Actions #2

Updated by dheidler 3 months ago

The error is most likely this one which indicates failure to resolve a DNS name.

For whatever reason the config on walter1 has the entry:

server-identifier walter1;

which doesn't make any sense because the server-identifier is to be set to an IP address and defaults to the IP of the first NIC.
And the server only has one.

        server-identifier hostname;

        The  server-identifier  statement  can be used to define the value that is sent in the DHCP
        Server Identifier option for a given scope.  The value specified must be an IP address  for
        the DHCP server, and must be reachable by all clients served by a particular scope.

        The  use  of the server-identifier statement is not recommended - the only reason to use it
        is to force a value other than the default value to be sent on occasions where the  default
        value  would  be  incorrect.  The default value is the first IP address associated with the
        physical network interface on which the request arrived.

        The usual case where the server-identifier statement needs to be sent is  when  a  physical
        interface has more than one IP address, and the one being sent by default isn't appropriate
        for some or all clients served by that interface.  Another common case is when an alias  is
        defined  for  the  purpose of having a consistent IP address for the DHCP server, and it is
        desired that the clients use this IP address when contacting the server.

So we could either remove these server-identifier entries or set them to the proper hostname so that the DNS lookup actually works.
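
To find such entries across hosts, one could scan the dhcpd config for server-identifier values that are not literal IP addresses. A minimal Python sketch, assuming the statement appears on a single line as quoted above; the function name non_ip_server_identifiers is hypothetical:

```python
import ipaddress
import re

# Matches statements such as "server-identifier walter1;" at line start.
SERVER_ID_RE = re.compile(
    r"^\s*server-identifier\s+([^;\s]+)\s*;", re.MULTILINE
)


def non_ip_server_identifiers(conf_text):
    """Return server-identifier values that are not literal IP addresses."""
    suspicious = []
    for value in SERVER_ID_RE.findall(conf_text):
        try:
            ipaddress.ip_address(value)
        except ValueError:
            # Not an IP literal, so dhcpd has to resolve it as a hostname.
            suspicious.append(value)
    return suspicious


sample_conf = "server-identifier walter1;\nserver-identifier 10.168.192.1;\n"
print(non_ip_server_identifiers(sample_conf))  # ['walter1']
```

A hostname value is not necessarily wrong, but it only works if the name resolves on the DHCP server, which was not the case for walter1.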

Actions #3

Updated by dheidler 3 months ago · Edited

The server-identifier is taken from salt['grains.get']('fqdn') in the salt template.

The difference between suttner1 (which doesn't have this issue) and walter1:

  • The dhcpd config on suttner1 has the full suttner1 FQDN as server-identifier.
  • walter1 didn't have these entries in /etc/hosts, while suttner1 did, so I added them:

    10.168.192.1 walter1.qe.nue2.suse.org walter1
    2a07:de40:a102:5:10:168:192:1 walter1.qe.nue2.suse.org walter1

This allows hostname -f to work on walter1. Also it enables the server-identifier walter1; entry to work.

But as it also changes the grain info, this will fix the config on the next salt deployment:

# before 
% salt-call grains.get fqdn
local:
    walter1

# after
% salt-call grains.get fqdn
local:
    walter1.qe.nue2.suse.org
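
The effect of the /etc/hosts entries can be illustrated with a small lookup: in a hosts file the first name after the address is the canonical name and the rest are aliases, so the short name walter1 now maps to the FQDN. A minimal Python sketch that only mimics the hosts-file lookup (not the real resolver with nsswitch and DNS); the function name canonical_name is hypothetical:

```python
def canonical_name(hosts_text, name):
    """Return the canonical (first) name of the hosts line listing `name`."""
    for line in hosts_text.splitlines():
        # Drop comments, then split into address and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and name in fields[1:]:
            return fields[1]
    return None


# The two entries added on walter1.
hosts = """\
10.168.192.1 walter1.qe.nue2.suse.org walter1
2a07:de40:a102:5:10:168:192:1 walter1.qe.nue2.suse.org walter1
"""

print(canonical_name(hosts, "walter1"))  # walter1.qe.nue2.suse.org
print(canonical_name("", "walter1"))     # None
```

Without the entries the lookup finds nothing, which matches the bare walter1 grain value seen before the change.
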
Actions #4

Updated by dheidler 3 months ago

Same for walter2

Actions #5

Updated by dheidler 3 months ago

  • Status changed from In Progress to Resolved

I'm resolving this issue without actually running salt-call state.apply as it would overwrite the hotfixes that are still pending in https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/5854.
