action #92176
[alert] openqaworker-arm-3 offline and CI pipeline unable to send email but stating "passed"
Status: closed
Description
Observation
https://stats.openqa-monitor.qa.suse.de/d/1bNU0StZz/automatic-actions?orgId=1&editPanel=7&tab=alert shows that the machine openqaworker-arm-3 is offline. The job https://gitlab.suse.de/openqa/grafana-webhook-actions/-/jobs/415264 is green, but its log shows:
Attempting to reboot openqaworker-arm-3
Error: Unable to establish IPMI v2 / RMCP+ session
/usr/sbin/sendmail: No such file or directory
. . . message not sent.
So there are two problems: the email could not be sent, and that failure did not fail the pipeline.
Acceptance criteria
- AC1: The pipeline fails if sending email does not work
- AC2: Sending email works again for the case observed above
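One way to approach AC1 is to check up front that every program the notification step depends on is present, and to let a missing one fail the job. The following is only a minimal sketch: the helper name `require_cmds` is hypothetical and not taken from the actual pipeline, and it does not cover a mailer that is present but fails while sending.

```shell
# Hypothetical helper, not part of grafana-webhook-actions.
# Returns non-zero if any of the given commands is missing and
# names each missing command on stderr.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        # "command -v" accepts both plain names and absolute paths.
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "required command not found: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A job script could then run `require_cmds mail /usr/sbin/sendmail || exit 1` before the reboot attempt, so that a container without the expected binaries turns the job red instead of green.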
Updated by okurz over 3 years ago
- Related to action #89815: osd-deployment blocked by openqaworker-arm-3 offline and not recovered automatically added
Updated by mkittler over 3 years ago
- Status changed from Workable to In Progress
I've now rebooted the machine manually and it came up normally. Nothing special was required (… chassis power cycle did the trick).
I will look into the problem with the automatic recovery.
Updated by mkittler over 3 years ago
Looks like the sendmail binary (which would be provided by the postfix package) is configured as the mail application, but it is missing in the container the jobs run in (registry.opensuse.org/home/okurz/container/containers/tumbleweed:ipmitool-ping-nc-mailx). Apparently the intention is to use mailx, so it should likely be configured explicitly. Maybe the following change to the Dockerfile helps: https://build.opensuse.org/package/view_file/home:mkittler:branches:home:okurz:container/ipmitool-ping-nc-mailx/Dockerfile?expand=1
Updated by okurz over 3 years ago
Hm, ok. But previously sending emails was working. Maybe something changed in the package setup. Within osd-deployment we also send emails. AFAIK we use "mutt" for sending emails in these cases, so I suggest using the same here.
Updated by openqa_review over 3 years ago
- Due date set to 2021-05-21
Setting due date based on mean cycle time of SUSE QE Tools
Updated by mkittler over 3 years ago
I've tested your container locally (not my version) and it can resolve the mail command. It links to /etc/alternatives/mail which links to /usr/bin/mailx. Maybe this was just a temporary issue which has already been fixed in the current container version? (The image is based on Tumbleweed and Docker says the latest version is only 4 hours old, so apparently it is updated automatically.)
Where does mutt come into play? Your Dockerfile explicitly installs mailx (and not mutt). The osd-deployment pipeline also uses just the mail command, but with a different image (which doesn't seem to install a special mail client).
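The check described above (which binary the mail command finally points to inside the container) can be repeated with a small helper. This is a sketch only: the function name `resolve_cmd` is made up for illustration, and it assumes a `readlink` that supports `-f`, as GNU coreutils does.

```shell
# Hypothetical helper: print the final target of a command name
# after following all symlinks, e.g. through /etc/alternatives.
resolve_cmd() {
    # Fails with status 1 if the command is not found in PATH.
    path=$(command -v "$1") || return 1
    readlink -f "$path"
}
```

Running `resolve_cmd mail` inside the image would show whether the alternatives chain ends at a mailx binary or somewhere else.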
Updated by okurz over 3 years ago
mkittler wrote:
I've tested your container locally (not my version) and it can resolve the mail command. It links to /etc/alternatives/mail which links to /usr/bin/mailx. Maybe this was just a temporary issue which has already been fixed in the current container version? (The image is based on Tumbleweed and Docker says the latest version is only 4 hours old, so apparently it is updated automatically.)
Where does mutt come into play? Your Dockerfile explicitly installs mailx (and not mutt). The osd-deployment pipeline also uses just the mail command, but with a different image (which doesn't seem to install a special mail client).
Right. I apparently got that confused.
Ok, assuming that the issue might be fixed again upstream we should still not ignore errors when a mail could not be sent, right?
Updated by okurz over 3 years ago
- Related to action #76876: Find a better (automated) way to inform infra about hanging (arm) workers added
Updated by okurz over 3 years ago
Originally we assumed that email sending worked, as that should have been done in #76876, but we never verified it. Together with mkittler I tried to get email sending to work without resorting to what we do in e.g. osd-deployment, where we log in over ssh to osd, which is already capable of sending emails itself. This is what we found to be arguably the simplest way to send emails:
zypper -n in msmtp
echo -e "Subject: email from msmtp\n\ntest" | SMTPSERVER=relay.suse.de msmtp --from okurz@suse.de -t okurz@suse.de
What should be changed, of course, is to use a container image that already provides msmtp and to use variables with defaults instead of hardcoded values.
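The "variables with defaults" idea could look roughly like the sketch below. The variable names (`MAIL_FROM`, `MAIL_TO`, `SMTP_RELAY`, `MSMTP`) and the function name `send_mail` are assumptions chosen for illustration, not what the merged change uses; the defaults are simply the values from the command above.

```shell
# Hypothetical wrapper around the msmtp call shown above.
# All variable names are illustrative; defaults mirror the hardcoded values.
send_mail() {
    subject=$1
    body=$2
    from=${MAIL_FROM:-okurz@suse.de}
    to=${MAIL_TO:-okurz@suse.de}
    # The mailer is the last command of the pipeline, so its exit
    # status becomes the return value of this function.
    printf 'Subject: %s\n\n%s\n' "$subject" "$body" |
        SMTPSERVER=${SMTP_RELAY:-relay.suse.de} "${MSMTP:-msmtp}" --from "$from" -t "$to"
}
```

Because the function returns the status of the mailer, a caller such as `send_mail "worker offline" "details" || exit 1` would also fail the job when sending does not work.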
Updated by mkittler over 3 years ago
- Status changed from In Progress to Resolved
The SR has been merged and I've been testing locally whether sending mails works using the msmtp command within the container.