action #166721
open
openQA Project (public) - coordination #58184: [saga][epic][use case] full version control awareness within openQA
openQA Project (public) - coordination #152847: [epic] version control awareness within openQA for test distributions
[alert] Waves of emails due to kex_exchange_identification: Connection closed by remote host errors size:S
Added by livdywan 3 months ago.
Updated 15 days ago.
Category:
Regressions/Crashes
Description
Observation¶
Many emails with the subjects Cron <geekotest@ariel> git -C /opt/os-autoinst-scripts pull --quiet --rebase origin master
and Cron <geekotest@ariel> env updateall=1 force=1 /usr/share/openqa/script/fetchneedles:
kex_exchange_identification: Connection closed by remote host
Connection closed by 140.82.121.4 port 22
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
Suggestions¶
- Replace such cron jobs with systemd timers
- Add the timer definition to the git repo
- Copy the service/timer to avoid it being changed if the git repo is rolled back
- Rely on #164898 to deal with fetchneedles
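A minimal sketch of the first three suggestions follows. Only the service name, the geekotest user and the git command appear in this ticket. The timer name, the five-minute schedule, the /usr/bin/git path and the write_units helper are assumptions for illustration. Writing the units into a directory outside the checkout keeps them unchanged if the repo is rolled back.

```shell
#!/bin/bash
# Hypothetical helper: write a service/timer pair that replaces the cron job
# for /opt/os-autoinst-scripts. Unit contents are illustrative assumptions.
write_units() {
    local dest=$1
    mkdir -p "$dest"

    # One-shot service running the same git command the cron job ran.
    cat > "$dest/os-autoinst-scripts-update-git.service" <<'EOF'
[Unit]
Description=Update the os-autoinst-scripts git checkout

[Service]
Type=oneshot
User=geekotest
ExecStart=/usr/bin/git -C /opt/os-autoinst-scripts pull --quiet --rebase origin master
EOF

    # Timer triggering the service; the schedule is an assumption.
    cat > "$dest/os-autoinst-scripts-update-git.timer" <<'EOF'
[Unit]
Description=Periodically update the os-autoinst-scripts git checkout

[Timer]
OnCalendar=*:0/5
Persistent=true

[Install]
WantedBy=timers.target
EOF
}
```

On a real host this could be used as write_units /etc/systemd/system, followed by systemctl daemon-reload and systemctl enable --now os-autoinst-scripts-update-git.timer. Unlike cron, a failing run is then recorded in the journal instead of being mailed.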
- Copied from action #166433: [alert] Waves of emails due to manual changes in /opt/openqa-trigger-from-obs size:S added
- Description updated (diff)
- Related to action #164898: Replace fetchneedles with a minion job for the regular update of git repos size:M added
- Status changed from New to In Progress
- Assignee set to livdywan
- Replace such cron jobs with systemd timers
- Add the timer definition to the git repo
- Copy the service/timer to avoid it being changed if the git repo is rolled back
I'll propose something like in #166433 for os-autoinst-scripts and then block on #164898.
- Status changed from In Progress to Feedback
- Status changed from Feedback to Blocked
livdywan wrote in #note-6:
Blocking on #164898
As this is tracked and in progress, I modified the cron job to env updateall=1 force=1 /usr/share/openqa/script/fetchneedles || true
so we don't see unactionable emails about temporary connectivity issues. The kex_exchange_identification: Connection closed by remote host
errors are sporadic.
Cron sends an email if there is any output, so I replaced the || true
with > /dev/null 2>&1.
It would be better, though, to exclude only the specific problematic lines when they are covered by retries, which probably has to be done within fetchneedles.
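The idea of hiding transient errors only when retries cover them could be sketched as a wrapper like the one below. This is an illustration under assumptions, not what fetchneedles or #164898 actually does: retry_quiet and its parameters are hypothetical names. It stays silent when any attempt succeeds and prints the last attempt's output only if every attempt fails, so cron would still mail persistent problems.

```shell
#!/bin/bash
# Hypothetical helper: retry a command and stay silent unless every attempt fails.
# Usage: retry_quiet ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_quiet() {
    local attempts=$1 delay=$2 output rc=0 i
    shift 2
    for ((i = 1; i <= attempts; i++)); do
        # Capture stdout and stderr together so a successful run prints nothing.
        output=$("$@" 2>&1) && return 0
        rc=$?
        # Wait between attempts, but not after the last one.
        ((i < attempts)) && sleep "$delay"
    done
    # Every attempt failed: show the last output so the failure is actionable.
    printf '%s\n' "$output" >&2
    return "$rc"
}
```

A cron entry could then call, for example, retry_quiet 3 30 env updateall=1 force=1 /usr/share/openqa/script/fetchneedles instead of redirecting everything to /dev/null. The attempt count and delay here are arbitrary.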
okurz wrote in #note-8:
Cron sends an email if there is any output, so I replaced the || true
with > /dev/null 2>&1.
It would be better, though, to exclude only the specific problematic lines when they are covered by retries, which probably has to be done within fetchneedles.
Thanks! Yes, that will be handled in the blocker. This is just temporary.
Still pending on #164898 which I expect we'll address next week.
livdywan wrote in #note-10:
Still pending on #164898 which I expect we'll address next week.
I'm still periodically checking via sudo -u geekotest env updateall=1 force=1 /usr/share/openqa/script/fetchneedles
that needles are being fetched.
#164898 is almost done. We don't expect any progress today due to the German public holiday.
#164898 has deployed.
The last email with this subject was from 13/09. Let's wait to see what Liv finds when she is back.
- Priority changed from High to Low
See #164898#note-47 and further comments. I assume we're still waiting here. And maybe we can make this Low, since this isn't about major breakage but temporary issues on the remote end.
livdywan wrote in #note-5:
https://github.com/os-autoinst/scripts/pull/346
I found out today that the scripts repo on o3 hadn't been updated since September 12. /etc/cron.d/os-autoinst-scripts-update-git
had been removed around that time.
I found no mention of removing /etc/cron.d/os-autoinst-scripts-update-git
in progress, so I enabled it again. But I assume you removed the file as part of this ticket, right?
Now I found this pull request. Apparently that service never got installed on o3:
% systemctl status os-autoinst-scripts-update-git.service
Unit os-autoinst-scripts-update-git.service could not be found.
I'm currently working on the scripts repo, so please coordinate with me if you want to enable the cronjob/timer.
- Category set to Regressions/Crashes
- Status changed from Blocked to Feedback
Blocker resolved. Also, let's explicitly answer the open points raised by tinita.
- Related to action #166772: openqa-label-known-issues overrides size:S added
- Subject changed from [alert] Waves of emails due to kex_exchange_identification: Connection closed by remote host errors to [alert] Waves of emails due to kex_exchange_identification: Connection closed by remote host errors size:S
- Description updated (diff)
- Status changed from Feedback to Blocked
tinita wrote in #note-15:
livdywan wrote in #note-5:
https://github.com/os-autoinst/scripts/pull/346
I found out today that the scripts repo on o3 hadn't been updated since September 12. /etc/cron.d/os-autoinst-scripts-update-git
had been removed around that time.
I found no mention of removing /etc/cron.d/os-autoinst-scripts-update-git
in progress, so I enabled it again. But I assume you removed the file as part of this ticket, right?
Now I found this pull request. Apparently that service never got installed on o3:
% systemctl status os-autoinst-scripts-update-git.service
Unit os-autoinst-scripts-update-git.service could not be found.
I'm currently working on the scripts repo, so please coordinate with me if you want to enable the cronjob/timer.
So I checked with @tinita. The git_auto_update feature is not enabled yet, and it seems we ended up without a ticket for that. Filed #170464 now.
- Parent task set to #152847
- Status changed from Blocked to Workable
So I checked with @tinita. The git_auto_update feature is not enabled yet, and it seems we ended up without a ticket for that. Filed #170464 now.
Apparently I misunderstood. The feature is enabled, meaning this ticket is no longer blocked (see #170464#note-9).
- Assignee deleted (livdywan)