action #106056
Updated by okurz over 2 years ago
## Observation

Refer to the OSD test failure log https://openqa.suse.de/tests/8113762: the connection to openqaipmi5-sp.qa.suse.de fails in our OSD environment.

    [2022-02-07T07:36:27.624742+01:00] [debug] IPMI: Chassis Power Control: Up/On
    [2022-02-07T07:36:40.726270+01:00] [info] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
      ipmitool -I lanplus -H openqaipmi5-sp.qa.suse.de -U admin -P XX chassis power status: Error: Unable to establish IPMI v2 / RMCP+ session at /usr/lib/os-autoinst/backend/ipmi.pm line 45.

However, we can connect to openqaipmi5-sp.qa.suse.de successfully from our testing environment:

    ipmitool -I lanplus -H openqaipmi5-sp.qa.suse.de -U admin -P XX chassis power status
    Chassis Power is on

    ip a
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        link/ether d2:0a:cd:f5:97:40 brd ff:ff:ff:ff:ff:ff
        inet 10.161.159.120/20 brd 10.161.159.255 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 fe80::d00a:cdff:fef5:9740/64 scope link
           valid_lft forever preferred_lft forever

FYI, refer to https://openqa.suse.de/admin/workers/1207 for more details.

## Acceptance criteria

* **AC1:** os-autoinst backend::ipmi retries consistently in more cases of network-related unavailability and instability

## Suggestions

* Try to reproduce the issue and find the current fail-ratio, e.g. using openqaipmi5-sp.qa.suse.de directly, with simple scripted commands as in https://progress.opensuse.org/issues/106056#note-30
* Implement based on pseudo-code from https://progress.opensuse.org/issues/106056#note-15 :
```
ipmi power reset
for i in (1 .. 10):
    ipmi power status && break
    echo "Retrying ipmi connection $i of 10 after sleep"
    sleep 10
...
```
* Crosscheck with production openQA test cases, e.g. using https://progress.opensuse.org/projects/openqatests/wiki/Wiki#Statistical-investigation
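
For reproducing the issue outside of os-autoinst, the retry idea and the fail-ratio measurement from the suggestions could be sketched in plain shell roughly as follows. This is only an illustration, not the backend::ipmi implementation: the helper names `retry` and `count_failures` are hypothetical, and the attempt count and sleep time are parameters rather than values confirmed by the backend code.

```shell
#!/bin/sh
# Hypothetical helper (not part of os-autoinst): run a command up to
# <attempts> times, sleeping <delay> seconds between failed attempts.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    local attempts=$1 delay=$2 i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        if [ "$i" -lt "$attempts" ]; then
            echo "Retrying ($i of $attempts failed) after ${delay}s sleep" >&2
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Hypothetical helper: run a command <runs> times without retrying and
# print how many runs failed, as a rough basis for a fail-ratio.
count_failures() {
    local runs=$1 failures=0 i=1
    shift
    while [ "$i" -le "$runs" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Example invocations (commented out, they need access to the BMC):
# retry 10 10 ipmitool -I lanplus -H openqaipmi5-sp.qa.suse.de -U admin -P XX chassis power status
# count_failures 100 ipmitool -I lanplus -H openqaipmi5-sp.qa.suse.de -U admin -P XX chassis power status
```

Comparing the output of `count_failures` on an OSD worker and on a machine in the testing environment would show whether the failure is specific to the worker's network path, and `retry` mirrors the loop from the pseudo-code with the attempt count and sleep made configurable.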