action #103539

Update expired SSL certificate on monitor.qa.suse.de with dehydrated and salt, same as on OSD size:M

Added by okurz about 2 months ago. Updated 10 days ago.

Status:
Resolved
Priority:
High
Assignee:
Target version:
Start date:
2021-12-06
Due date:
% Done:

0%

Estimated time:

Description

Observation

As was done for OSD in #103149, we need to update the SSL certificates on all domains linked to monitor.qa.suse.de (4 domains). See https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/615

Acceptance criteria

  • AC1: monitor.qa.suse.de has a valid certificate
  • AC2: stats.openqa-monitor.qa.suse.de has a valid certificate
  • AC3: there is monitoring for the certificate, same as we have for osd
  • AC4: The certificates are automatically refreshed
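A quick manual check for AC1 and AC2 could look like the following minimal sketch. It is not taken from the merge requests; the function names, the 14-day threshold and the assumption that the system trust store contains the issuing CA are all illustrative.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter string.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    for example "Jan 20 12:00:00 2022 GMT". A negative result means
    the certificate has already expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Return the notAfter field of the certificate served by host:port.

    The default context verifies chain and hostname, so an expired or
    mismatching certificate raises ssl.SSLCertVerificationError here.
    Assumption: the issuing CA is in the system trust store.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def report(domains, warn_days: float = 14.0) -> None:
    """Print remaining validity per domain (warn_days is a hypothetical threshold)."""
    now = datetime.now(timezone.utc)
    for domain in domains:
        remaining = days_until_expiry(fetch_not_after(domain), now)
        status = "OK" if remaining > warn_days else "RENEW SOON"
        print(f"{domain}: {remaining:.1f} days left ({status})")
```

Calling report(["monitor.qa.suse.de", "stats.openqa-monitor.qa.suse.de"]) from a machine that can reach both hosts would cover the two domains named above.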

Suggestions

grafik.png (119 KB) grafik.png cdywan, 2022-01-18 18:23

Related issues

Related to openQA Project - action #103527: osd-deployment pipelines fail and alerts are not handled size:M (Resolved)

History

#1 Updated by cdywan about 2 months ago

  • Subject changed from Update SSL certificates on monitor.qa.suse.de with dehydrated and salt, same as on OSD to Update expired SSL certificate on monitor.qa.suse.de with dehydrated and salt, same as on OSD size:M
  • Description updated (diff)
  • Status changed from New to Workable

#2 Updated by nicksinger about 2 months ago

  • Assignee set to nicksinger

#3 Updated by nicksinger about 2 months ago

  • Status changed from Workable to In Progress

#4 Updated by nicksinger about 2 months ago

  • Status changed from In Progress to Feedback

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/617
Changes are ready to be merged. For testing I already ran the commands on OSD, so even before anyone merges this we already have a valid certificate on https://monitor.qa.suse.de again :)

#5 Updated by okurz about 2 months ago

  • Related to action #103527: osd-deployment pipelines fail and alerts are not handled size:M added

#6 Updated by cdywan about 2 months ago

monitor.qa.suse.de and stats.openqa-monitor.qa.suse.de seem to work - are there any remaining steps wrt making sure renewal happens automatically?

#7 Updated by cdywan about 2 months ago

  • Priority changed from Urgent to Normal

Lowering priority since, as far as I can tell, the sites work without exceptions in the browser

#8 Updated by okurz about 2 months ago

I suggest adding monitoring, same as we already have for OSD

#9 Updated by nicksinger about 1 month ago

  • Description updated (diff)
  • Due date set to 2022-01-07
  • Status changed from Feedback to Workable

Monitoring is left to do. I will carry this over into the new year. If anybody is bored, feel free to take over :) Otherwise I will add the monitoring in the new year. Our current certificate is valid until the 20th of Jan, so we have headroom to implement this monitoring before the current cert expires

#10 Updated by okurz about 1 month ago

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/634 should be a start to bring in the corresponding telegraf config. For this we should read the domain list from pillars, but I realized that pillars have not been updated on monitor.qa since 2021-04, so that should be the first thing to check. Then, in grafana, copy the certificate-related panel from the webui dashboard into https://monitor.qa.suse.de/d/EML0bpuGk/monitoring

#11 Updated by cdywan 23 days ago

  • Due date changed from 2022-01-07 to 2022-01-14
  • Priority changed from Normal to High

Seems like a good thing we already set the due date prematurely. nicksinger I recall your saying you were already looking into it. If you can, please update here at the next opportunity - or if there's something others can help you with, you're welcome to ask.

#12 Updated by nicksinger 16 days ago

  • Status changed from Workable to In Progress

okurz wrote:

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/634 should be a start to bring in the corresponding telegraf config. For this we should read the domain list from pillars, but I realized that pillars have not been updated on monitor.qa since 2021-04, so that should be the first thing to check. Then, in grafana, copy the certificate-related panel from the webui dashboard into https://monitor.qa.suse.de/d/EML0bpuGk/monitoring

Not sure how you determined that the pillar cache has not been updated for that long. According to https://salt-users.narkive.com/vJmNliU0/why-must-saltutil-refresh-pillar-be-run#post2 this should happen with every highstate, and I can't imagine we made no changes to that host in that time frame. Also, checking with salt-call pillar.get "dehydrated" I see the most recent pillar data (which, according to the post on salt-users, should come from the cache).

#13 Updated by nicksinger 16 days ago

I've opened https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/639 as a first draft of how I imagine the monitoring. It should be more generic and work for every host we eventually add in the future. I still need to figure out how to properly loop over the SANs of the hosts.txt entries per host to have a proper check for each of them (this is what happens in https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/639/diffs#85e7a4e39662e846e9a7e8c1660d41d28a0389c3_0_8) - it might even be possible to shrink the certificates.conf down to a single [[inputs.x509_cert]] section
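To illustrate the idea of looping over the SANs and collapsing everything into one [[inputs.x509_cert]] section, here is a rough sketch in Python. The actual MR does this in the salt states, so this is only a stand-in for the logic. It assumes the entries follow dehydrated's one-certificate-per-line layout (primary name followed by its SANs, optional "> alias" suffix, "#" comments); parse_domains and render_x509_inputs are hypothetical names.

```python
from typing import List


def parse_domains(text: str) -> List[List[str]]:
    """Parse dehydrated-style lines into one list of names per certificate.

    Assumed format: "primary san1 san2 ... [> alias]", with "#" starting
    a comment. Empty lines and comment-only lines are skipped.
    """
    certs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Drop the optional "> alias" part, then split names on whitespace.
        names = line.split(">", 1)[0].split()
        if names:
            certs.append(names)
    return certs


def render_x509_inputs(certs: List[List[str]], port: int = 443) -> str:
    """Render a single telegraf x509_cert input covering every name once."""
    sources = [f"https://{name}:{port}" for names in certs for name in names]
    # dict.fromkeys removes duplicates while keeping the original order.
    unique = list(dict.fromkeys(sources))
    quoted = ", ".join(f'"{source}"' for source in unique)
    return f"[[inputs.x509_cert]]\n  sources = [{quoted}]\n"
```

Fed with a line containing monitor.qa.suse.de and stats.openqa-monitor.qa.suse.de, this yields one section whose sources list holds an https:// URL for each name, so a newly added SAN is picked up without touching the telegraf config by hand.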

#14 Updated by cdywan 11 days ago

  • Due date changed from 2022-01-14 to 2022-01-18

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/639 is still in progress, I'm starting to get worried that we'll see alerts for broken certificates after all... bumping the due date to tomorrow in any case

nicksinger Please prepare manual steps to do the update tomorrow

#15 Updated by cdywan 11 days ago

nicksinger wrote:

Monitoring is left to do. I will carry this over into the new year. If anybody is bored, feel free to take over :) Otherwise I will add the monitoring in the new year. Our current certificate is valid until the 20th of Jan, so we have headroom to implement this monitoring before the current cert expires

Apparently the certificate's expiry date is at Feb 14 now, which suggests automated renewal worked at least once.

#16 Updated by nicksinger 11 days ago

cdywan wrote:

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/639 is still in progress, I'm starting to get worried that we'll see alerts for broken certificates after all... bumping the due date to tomorrow in any case

nicksinger Please prepare manual steps to do the update tomorrow

Just to make it clear here in the ticket too:

AC4: The certificates are automatically refreshed
is already done. Therefore we don't need to worry about an expiring certificate. We also won't see any alerts, because creating the monitoring/alerts is exactly why this ticket is still open. I polished up https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/639 and think it can be merged now to receive the metrics. A dashboard for grafana is still left to do, but I'd like to cover it in a separate MR

#17 Updated by nicksinger 11 days ago

I came up with a new dashboard including alerts and proper instructions here: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/640
It might not work after merging because I couldn't really test the template for the dashboard.

#18 Updated by cdywan 10 days ago

nicksinger wrote:

I came up with a new dashboard including alerts and proper instructions here: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/640
It might not work after merging because I couldn't really test the template for the dashboard.

The MR got merged. Looks very nice

#19 Updated by okurz 10 days ago

  • Due date deleted (2022-01-18)
  • Status changed from Feedback to Resolved

I agree. All looks good
