action #106933
closed
Use PSU capabilities to power cycle openqaworker-arm-[1-3] instead of infra tickets size:M
Added by nicksinger almost 3 years ago.
Updated almost 3 years ago.
Description
Observation
Today we installed a controllable PSU (qaps06nue.qa.suse.de) into the rack for openqaworker-arm-[1-3]. We should make use of it in our automatic recovery pipeline to power cycle the BMC if it is down.
Here is the mapping for each machine:
ARM1: Plug 1
ARM2: Plug 2+3
ARM3: Plug 4+5
ARM4: Plug 6
ARM6: Plug 7
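For later automation the mapping above could be kept as data. This is only a minimal sketch: the names `PSU_PLUGS` and `plugs_for` are hypothetical, and using the full `openqaworker-arm-N` host names as keys for the "ARMN" entries is an assumption.

```python
# Plug assignment on qaps06nue.qa.suse.de, copied from the list above.
# PSU_PLUGS and plugs_for are illustrative names, not part of the pipeline.
PSU_PLUGS = {
    "openqaworker-arm-1": (1,),
    "openqaworker-arm-2": (2, 3),
    "openqaworker-arm-3": (4, 5),
    "openqaworker-arm-4": (6,),
    "openqaworker-arm-6": (7,),
}


def plugs_for(worker):
    """Return the plug numbers for a worker, or raise if it is not on this PSU."""
    try:
        return PSU_PLUGS[worker]
    except KeyError:
        raise ValueError(f"no PSU plug known for {worker!r}") from None
```

Keeping the plugs as tuples makes the two-plug machines (ARM2, ARM3) explicit, so a caller cannot forget the second plug.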
Suggestions
Research how we can automate the power cycle on the PSU side. The PSUs have a web interface which can be scripted/scraped (no real API AFAIK) and several access options such as ssh, telnet and ftp.
Ask nsinger for the password if you want to browse through the web UI. Keep in mind to choose a security-conscious option (e.g. an encrypted channel).
Integrate this automation into our pipeline at https://gitlab.suse.de/openqa/grafana-webhook-actions/-/blob/master/ipmi-recover-worker. It should replace the create_ticket() function (https://gitlab.suse.de/openqa/grafana-webhook-actions/-/blob/master/ipmi-recover-worker#L26-28).
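One way the replacement for create_ticket() might be structured, as a sketch only: switch all plugs of a machine off, wait, then switch them on again. The function `power_cycle`, its parameters and the `set_plug` callback are hypothetical. How a plug is actually toggled on the PSU (web form, ssh, ...) is deliberately left open because it is not specified here.

```python
import time


def power_cycle(plugs, set_plug, off_seconds=10.0, sleep=time.sleep):
    """Power cycle a machine by switching all of its plugs off, then on.

    plugs       -- iterable of plug numbers feeding the machine
    set_plug    -- hypothetical callback set_plug(plug, on) that talks to the PSU
    off_seconds -- how long to keep the machine without power (assumed value)
    sleep       -- injectable so the wait can be skipped in tests
    """
    plugs = list(plugs)
    # Switch every plug off before switching any back on: a machine fed
    # by two plugs may keep running if they are cycled one after the other.
    for plug in plugs:
        set_plug(plug, False)
    sleep(off_seconds)
    for plug in plugs:
        set_plug(plug, True)
```

For ARM2 this would be called with `plugs=[2, 3]`; passing the transport in as a callback keeps the ordering logic independent of whichever PSU access method is chosen.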
Acceptance criteria
- AC1: Infra tickets are no longer created
- AC2: grafana-webhook-actions uses some API of the PSU to automate the power cycle
- Copied from action #102575: Prevent false-positive ticket reporting for openqaworker-arm-3 added
- Copied from deleted (action #102575: Prevent false-positive ticket reporting for openqaworker-arm-3)
- Related to action #102575: Prevent false-positive ticket reporting for openqaworker-arm-3 added
I think a "High" priority is reasonable because we currently flood infra with mails/tickets and they are already overloaded. I asked them to ignore these tickets for now as we have full access to the system ourselves.
- Status changed from New to In Progress
- Assignee set to okurz
I am trying with an expect script or something
- Assignee deleted (okurz)
- Target version deleted (Ready)
I just recently scripted closing the FTP port. You could reuse at least the login part:
import requests

# Host name of the PSU from the description; adjust if another PSU is meant.
host = "qaps06nue.qa.suse.de"

s = requests.Session()
login_params = {
    "login_username": "admin",
    "login_password": "<INSERT_PASSWORD_HERE>",
    "submit": "Log On",
}
disable_ftp_params = {
    "ftpPort": "21",
    "submit": "Apply",
}
headers = {'Content-Type': 'application/x-www-form-urlencoded'}

# Fetch the start page first so the session picks up any cookies.
print("Opening admin page…", end="")
req0 = s.get("http://" + host)
print("Done.")

# Log in through the web form.
print("Login…", end="")
req1 = s.post("http://" + host + "/Forms/login1", data=login_params, headers=headers)
print("Done.")

# Submit the FTP settings form.
print("Disabling FTP…", end="")
req2 = s.post("http://" + host + "/Forms/ftpserv1", data=disable_ftp_params, headers=headers)
print("Done.")

# Log out again so the admin session is not left open.
print("Logout…", end="")
req3 = s.get("http://" + host + "/logout.htm")
print("Done.")
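To reuse the login part, the login and logout steps of the script above could be wrapped in a context manager so that the logout also happens when an action in between fails. This is a sketch under assumptions: `psu_login` is a hypothetical name, and `session` is expected to behave like `requests.Session` (anything with `get` and `post` works). The endpoints are the ones from the script above.

```python
from contextlib import contextmanager


@contextmanager
def psu_login(session, host, password, username="admin"):
    """Log in to the PSU web UI and always log out again.

    session is expected to offer get() and post() like requests.Session.
    Yields the base URL so the caller can post further forms.
    """
    base = "http://" + host
    session.get(base)  # open the start page first, as in the script above
    session.post(
        base + "/Forms/login1",
        data={
            "login_username": username,
            "login_password": password,
            "submit": "Log On",
        },
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    try:
        yield base
    finally:
        # Runs even if the caller's action raised an exception.
        session.get(base + "/logout.htm")
```

A caller would then write `with psu_login(requests.Session(), host, password) as base:` and post whichever form is needed inside the block.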
- Assignee set to okurz
- Target version set to Ready
- Due date set to 2022-03-02
- Status changed from In Progress to Feedback
- Subject changed from Use PSU capabilities to power cycle openqaworker-arm-[1-3] instead of infra tickets to Use PSU capabilities to power cycle openqaworker-arm-[1-3] instead of infra tickets size:M
- Due date deleted (2022-03-02)
- Status changed from Feedback to Resolved