action #88450
Flaky NTP offset alert (closed)
0%
Description
Observation
Metric name: Stratum 3
Value: 3073.250
Alert seen 8.44, OK seen 10.28
Acceptance criteria
- AC1: Alert not flaky
Suggestions
- To remove urgency just bump up the threshold time to prevent flaky alerts
- Check how ntp is kept in sync, adjust timeouts, servers used
- Possibly the same as #71224 but totally guessing here. That was supposed to have been fixed according to the comments.
Updated by okurz over 3 years ago
- Description updated
- Status changed from New to Workable
- Priority changed from Normal to Urgent
- Target version set to Ready
The urgency could be resolved by simply bumping the threshold time.
Updated by mkittler over 3 years ago
- Priority changed from Urgent to High
Considering the frequency of these spikes it doesn't seem to be too urgent: https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?tab=query&editPanel=86&viewPanel=86&orgId=1&from=now-90d&to=now
We need to allow at least a 5 second offset (at least for a certain time period) to clear out the warnings (judging by the data from the last 90 days). That sounds quite high.
Note that the current warning only checks for positive offsets so far and not negative ones. This could also be fixed (if we're touching the warning conditions anyways).
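For illustration, the condition discussed above (tolerate short spikes, only fire when the offset stays beyond the limit for a while, and treat negative offsets like positive ones) could be sketched as follows. This is not the actual alert definition from salt-states-openqa: the function name, the sample format, and the 300-second window are assumptions made for this example; only the 5-second limit comes from the comment above.

```python
from typing import Iterable, Tuple


def ntp_offset_alert(
    samples: Iterable[Tuple[float, float]],
    threshold_s: float = 5.0,      # offset limit taken from the comment above
    min_duration_s: float = 300.0,  # hypothetical "for" period, not from the ticket
) -> bool:
    """Return True if |offset| exceeds threshold_s continuously for min_duration_s.

    samples: (timestamp_s, offset_s) pairs sorted by timestamp.
    """
    breach_start = None
    for timestamp, offset in samples:
        # abs() makes the check symmetric: negative offsets count as well.
        if abs(offset) > threshold_s:
            if breach_start is None:
                breach_start = timestamp
            if timestamp - breach_start >= min_duration_s:
                return True
        else:
            # The offset recovered, so a short spike does not trigger the alert.
            breach_start = None
    return False
```

With these assumed defaults, a single 6-second spike between two healthy samples stays silent, while an offset of -6 seconds that persists for five minutes fires.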
Updated by mkittler over 3 years ago
- Status changed from Workable to In Progress
- Assignee set to mkittler
Updated by mkittler over 3 years ago
SR to adjust the alert: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/452
Updated by okurz over 3 years ago
- Status changed from In Progress to Resolved
Merged, thank you. As we currently don't see the alert, we can close the ticket. We can reopen it if we see a false NTP alert again soon.