action #165434

closed

OSD SSL certificates not always refreshed within expected time, probably only after system reboots size:S

Added by okurz 4 months ago. Updated 4 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Regressions/Crashes
Start date:
2024-08-18
Due date:
% Done:

0%

Estimated time:

Description

Observation

See Screenshot_20240818_115546.png

or live on
https://stats.openqa-monitor.qa.suse.de/d/E9tyiQ17k/ssl-certificate-alerts?orgId=1&from=1712673350510&to=1723974839587&viewPanel=5

The SSL certificates are expected to be refreshed as visible on the left-hand side of the graph, so that the validity is always at least 3 weeks. Since around 2024-05-19 the certificates no longer seem to be refreshed as reliably as before, although we have never run out of a valid certificate. Today, 2024-08-18, the certificate validity period dropped to just below the 6-day alerting threshold shortly before OSD rebooted, which seemingly triggered a refresh.
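
To make the threshold arithmetic concrete, here is a minimal sketch of how the remaining validity could be compared against the 6-day alerting threshold. The function names `days_left` and `above_threshold` and the variable `ALERT_DAYS` are hypothetical, and the sketch assumes GNU `date`; it is not how the existing Grafana alert is implemented.

```shell
#!/bin/sh
# Sketch only: days_left, above_threshold and ALERT_DAYS are hypothetical names.
# Requires GNU date (for the -d option).

ALERT_DAYS=6  # alerting threshold mentioned in the observation

# Print the number of whole days between a reference date and the expiry date.
# $1: expiry date in any format GNU date understands
# $2: optional reference date, defaults to the current time
days_left() {
    expiry_s=$(date -u -d "$1" +%s) || return 1
    now_s=$(date -u -d "${2:-now}" +%s) || return 1
    echo $(( (expiry_s - now_s) / 86400 ))
}

# Succeed only if the certificate is still above the alerting threshold.
above_threshold() {
    [ "$(days_left "$1" "$2")" -gt "$ALERT_DAYS" ]
}
```

The expiry date of a certificate file can be read with `openssl x509 -enddate -noout -in cert.pem | cut -d= -f2` (where `cert.pem` is a placeholder path) and passed as the first argument.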

Acceptance criteria

  • AC1: OSD SSL certificates are ensured to refresh consistently well above the alerting threshold

Suggestions


Files

Screenshot_20240818_115546.png (75.9 KB) OSD SSL certificates not always refreshed within expected time, probably only after system reboots (okurz, 2024-08-18 09:56)

Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure (public) - action #167458: openqa.oqa.prg2.suse.org SAN validity alert (Resolved, nicksinger)

Copied to openQA Infrastructure (public) - action #169078: dashboard.qam.suse.de SSL certificate not deployed within expiry size:S (Resolved, robert.richardson, 2024-11-29)

Actions #1

Updated by okurz 4 months ago

  • Subject changed from OSD SSL certificates not always refreshed within expected time, probably only after system reboots to OSD SSL certificates not always refreshed within expected time, probably only after system reboots size:S
  • Description updated (diff)
  • Status changed from New to Workable
Actions #2

Updated by nicksinger 4 months ago

  • Status changed from Workable to Resolved
  • Assignee set to nicksinger

We have a hook script at /etc/dehydrated/postrun-hooks.d/reload-webserver.sh which is populated based on the grain webserver (see https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/master/certificates/dehydrated.sls?ref_type=heads#L37), but this grain was never changed to nginx and was still set to apache2. As a result the hook reloaded apache2 after a renewal, so nginx presumably kept serving the old certificate until its next restart. I manually changed /etc/salt/grains to contain webserver: nginx, called salt 'openqa.suse.de' saltutil.sync_grains, and verified with:

openqa:/etc/dehydrated # salt 'openqa.suse.de' grains.get webserver
openqa.suse.de:
    nginx

and deployed with:

openqa:/etc/dehydrated # salt 'openqa.suse.de' state.sls_id /etc/dehydrated/postrun-hooks.d/reload-webserver.sh certificates.dehydrated
openqa.suse.de:
----------
          ID: /etc/dehydrated/postrun-hooks.d/reload-webserver.sh
    Function: file.managed
      Result: True
     Comment: File /etc/dehydrated/postrun-hooks.d/reload-webserver.sh updated
     Started: 14:27:29.212692
    Duration: 45.262 ms
     Changes:
              ----------
              diff:
                  ---
                  +++
                  @@ -1,2 +1,2 @@
                   #!/bin/sh
                  -systemctl reload apache2
                  +systemctl reload nginx

Summary for openqa.suse.de
------------
Succeeded: 1 (changed=1)
Failed:    0
------------
Total states run:     1
Total run time:  45.262 ms

Unfortunately I reloaded nginx while testing, so we will have to wait another 3 weeks before we can see whether it worked. Our alerts will let us know.
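
As a possible hardening against a stale grain, the hook could detect the active webserver at run time instead of relying on a templated unit name. The following is only a sketch under that assumption, not the script deployed by salt-states-openqa; `CANDIDATES`, `pick_webserver` and `reload_active_webserver` are hypothetical names.

```shell
#!/bin/sh
# Sketch only, not the script deployed by salt-states-openqa.
# CANDIDATES, pick_webserver and reload_active_webserver are hypothetical names.

CANDIDATES="nginx apache2"

# Print the first candidate unit for which the given check command succeeds.
# $1: command used to test a unit, e.g. "systemctl is-active --quiet"
pick_webserver() {
    for unit in $CANDIDATES; do
        # $1 is intentionally unquoted so that it splits into command + options
        if $1 "$unit"; then
            echo "$unit"
            return 0
        fi
    done
    return 1
}

# Reload whichever webserver is currently active.
reload_active_webserver() {
    unit=$(pick_webserver "systemctl is-active --quiet") || {
        echo "no active webserver found" >&2
        return 1
    }
    systemctl reload "$unit"
}
```

A hook file using this would call `reload_active_webserver` as its last line. Taking the check command as a parameter keeps the selection logic testable without systemd.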

Actions #3

Updated by nicksinger 3 months ago

  • Related to action #167458: openqa.oqa.prg2.suse.org SAN validity alert added
Actions #4

Updated by livdywan about 2 months ago

  • Copied to action #169078: dashboard.qam.suse.de SSL certificate not deployed within expiry size:S added