action #23368
closed
ipmi worker openqaw1:4 fails to connect to the IPMI machine
Description
The hostname of the SUT bound to this worker cannot be resolved to an IP address.
Key log:
DIE ipmitool -I lanplus -H openqa4-sp.qa.suse.de -U admin -P qatesting mc guid: Address lookup for openqa4-sp.qa.suse.de failed
Could not open socket!
Error: Unable to establish IPMI v2 / RMCP+ session at /usr/lib/os-autoinst/backend/ipmi.pm line 62.
Job link:
https://openqa.suse.de/tests/1108863
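A minimal sketch of how the failing step could be checked by hand before involving ipmitool: it only tests whether the configured service processor hostname resolves. The function name `can_resolve` and the injectable `resolver` parameter are assumptions made for illustration; they are not part of os-autoinst.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if `hostname` resolves to at least one address.

    `resolver` is a hypothetical hook so the check can be exercised
    without real DNS; by default the system resolver is used.
    """
    try:
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        # This is the same class of failure ipmitool reports as
        # "Address lookup for <host> failed".
        return False


if __name__ == "__main__":
    # Hostname taken from the log above.
    host = "openqa4-sp.qa.suse.de"
    if can_resolve(host):
        print(f"{host} resolves, the IPMI session can be attempted")
    else:
        print(f"Address lookup for {host} failed, check the worker's IPMI host setting")
```

If the lookup fails, the problem is in the hostname configured for the worker rather than in the IPMI session itself.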
Updated by okurz over 7 years ago
- Category set to Infrastructure
- Status changed from New to Feedback
- Assignee changed from mgriessmeier to okurz
This was a typo introduced by nicksinger which caused the wrong IPMI host to be addressed -> https://gitlab.suse.de/openqa/salt-pillars-openqa/merge_requests/46
Is https://openqa.suse.de/tests/1114208#live working now on openqaw1:4 as expected?
Updated by xlai over 7 years ago
Yes, the reported issue is fixed. The SUT can now be reached via ipmitool.
However, this worker still fails jobs when installing the host via PXE, see job https://openqa.suse.de/tests/1114208. Please temporarily disable the worker.
Updated by okurz over 7 years ago
Sorry, I did not disable the worker myself, but anyone can request it with a merge request on the salt pillars -> https://gitlab.suse.de/openqa/salt-pillars-openqa
But when there is such a problem, IMHO we should be more explicit -> make sure we have an issue and someone is working on it
Updated by nicksinger over 7 years ago
For now openqaw1:3 and openqaw1:4 have 64bit-ipmi_debug as WORKER_CLASS to assign jobs specifically to them.
I've also tried to investigate this issue further and I'm pretty certain that it's no longer a network issue.
What I found out is that the linked test (https://openqa.suse.de/tests/1114208) definitely tries to access media on OSD which is no longer available (e.g. UPGRADE_REPO=ftp://openqa.suse.de/SLE-12-SP3-Server-DVD-x86_64-Build0473-Media1).
What I've noticed is that non-existent assets cause exactly this behavior:
This test worked before: https://openqa.suse.de/tests/1058835
This test uses exactly the same assets but they don't exist anymore: https://openqa.suse.de/tests/1123603
As you can see, it behaves exactly the same as in xlai's linked test.
The real physical machine shows a network interface selection which is for unknown reasons not visible in the SOL session and therefore not visible in the openqa screenshots/live view.
@xlai: I'd kindly ask you to try to fix the assets paths (e.g. use GM instead of "Build0473") and create a new ticket for this since the initial issue with a wrongly configured DNS is already resolved.
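The check described above (does every asset a job points to still exist?) could be sketched roughly as follows. This is only an illustration under assumptions: the helper names `asset_urls`, `url_exists` and `missing_assets` are hypothetical, the job settings are assumed to be available as a plain dict, and "exists" is approximated by whether the URL can be opened at all.

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse

URL_SCHEMES = ("http", "https", "ftp")


def asset_urls(settings):
    """Return {setting_name: url} for settings whose value looks like a URL."""
    return {
        name: value
        for name, value in settings.items()
        if isinstance(value, str) and urlparse(value).scheme in URL_SCHEMES
    }


def url_exists(url, timeout=10):
    """Best-effort check: True if the URL can be opened, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False


def missing_assets(settings, exists=url_exists):
    """Return the sorted names of URL-valued settings that cannot be reached.

    `exists` is injectable so the logic can be tried without network access.
    """
    return sorted(
        name for name, url in asset_urls(settings).items() if not exists(url)
    )


if __name__ == "__main__":
    # Setting value copied from the job discussed above.
    settings = {
        "UPGRADE_REPO": "ftp://openqa.suse.de/SLE-12-SP3-Server-DVD-x86_64-Build0473-Media1",
        "WORKER_CLASS": "64bit-ipmi_debug",
    }
    print(missing_assets(settings))
```

Running something like this against a job's settings before scheduling it would show up front which repositories have been removed from the server.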
Updated by xlai over 7 years ago
- Category deleted (Infrastructure)
- Status changed from Feedback to New
- Assignee changed from okurz to mgriessmeier
nicksinger wrote:
For now openqaw1:3 and openqaw1:4 have 64bit-ipmi_debug as WORKER_CLASS to assign jobs specifically to them.
Thanks for this, and we will not be disturbed any more :).
I've also tried to investigate this issue further and I'm pretty certain that it's no longer a network issue.
What I found out is that the linked test (https://openqa.suse.de/tests/1114208) definitely tries to access media on OSD which is no longer available (e.g. UPGRADE_REPO=ftp://openqa.suse.de/SLE-12-SP3-Server-DVD-x86_64-Build0473-Media1).
What I've noticed is that non-existent assets cause exactly this behavior:
This test worked before: https://openqa.suse.de/tests/1058835
This test uses exactly the same assets but they don't exist anymore: https://openqa.suse.de/tests/1123603
As you can see, it behaves exactly the same as in xlai's linked test.
The root cause here is that the installation media cannot be accessed via openqa.suse.de; it should be accessed via http://openqa.suse.de instead. The UPGRADE_REPO is not used here at all (it is only used in later test steps, after the installation succeeds). I have told Oliver about it and will fix it in the openQA boot_from_pxe test code.
The real physical machine shows a network interface selection which is for unknown reasons not visible in the SOL session and therefore not visible in the openqa screenshots/live view.
Yes, I also believe this to be an issue.
@xlai: I'd kindly ask you to try to fix the assets paths (e.g. use GM instead of "Build0473") and create a new ticket for this since the initial issue with a wrongly configured DNS is already resolved.
I will open a new ticket.
Updated by xlai over 7 years ago
- Related to action #23514: [labs][64bit-ipmi_debug worker] SLE15 shows interface selection (because 2 NICs are connected?) added
Updated by xlai over 7 years ago
- Status changed from New to Resolved
The original access issue is fixed. I have created a new ticket, #23514, to follow up on the issue seen when using the machine as a real IPMI worker.
Updated by nicksinger over 7 years ago
- Copied to action #23724: [infrastructure][ipmi] openQA is unable to reconnect to quinn (openQA ipmi worker) added
Updated by nicksinger over 7 years ago
- Copied to deleted (action #23724: [infrastructure][ipmi] openQA is unable to reconnect to quinn (openQA ipmi worker))