action #158266

Updated by okurz about 2 months ago

## Observation 
 From https://suse.slack.com/archives/C02CANHLANP/p1711700522125619 
 > Warning: tests are failing on ppc64 worker host diesel around 5 hours ago, seem qemu VM can't start. https://openqa.suse.de/admin/workers/3393 https://openqa.suse.de/admin/workers/3388 https://openqa.suse.de/admin/workers/3390 

 autoinst-log.txt says 

 ``` 
 [2024-03-29T09:37:43.496499+01:00] [debug] [pid:18748] QEMU: error: kvm run failed Device or resource busy 
 [2024-03-29T09:37:43.496606+01:00] [debug] [pid:18748] QEMU: This is probably because your SMT is enabled. 
 [2024-03-29T09:37:43.496679+01:00] [debug] [pid:18748] QEMU: VCPU can only run on primary threads with all secondary threads offline. 
 ``` 

 There is the "smt_off" service https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/master/openqa/worker.sls?ref_type=heads#L263, which is meant to fix the SMT problem. The service was running fine; I restarted it anyway and then restarted https://openqa.suse.de/tests/13906928#live, but the problem still seems to reproduce.

 Only diesel is affected, mania and petrol seem fine. 
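
 To check quickly whether a given job hit the same failure, the log can be searched for the QEMU message quoted above. This is only a minimal sketch: the function names `log_has_smt_error` and `report_smt_state` are hypothetical, and `ppc64_cpu` is assumed to be available from powerpc-utils on the worker (it may need root).

```shell
# Hypothetical helper: succeed if the given autoinst-log.txt contains
# the KVM failure message quoted in the observation.
log_has_smt_error() {
    grep -q 'kvm run failed Device or resource busy' "$1"
}

# Hypothetical helper: print the current SMT state if ppc64_cpu is installed,
# otherwise say that it could not be determined.
report_smt_state() {
    if command -v ppc64_cpu >/dev/null 2>&1; then
        ppc64_cpu --smt
    else
        echo "ppc64_cpu not found, cannot determine SMT state"
    fi
}
```

 For example, `log_has_smt_error autoinst-log.txt && report_smt_state` on the worker host shows whether SMT is still reported as enabled even though the "smt_off" service ran.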

 ## Suggestions 
 * *DONE* `ssh osd 'sudo salt-key -y -d diesel.qe.nue2.suse.org'` 
 * *DONE* `ssh diesel.qe.nue2.suse.org 'sed -i 's/qemu_ppc64le,/qemu_ppc64le-poo158266,/' /etc/openqa/workers.ini && systemctl restart openqa-worker-auto-restart@{1..8} && systemctl disable --now salt-minion telegraf'` 
 * *DONE* `host=openqa.suse.de WORKER=diesel result="result='failed'" comment="label:poo158266" ./openqa-advanced-retrigger-jobs` 
 * Investigate what is different on diesel vs. mania+petrol. Maybe mania+petrol are also affected but this has not been noticed yet, e.g. because they have not rebooted yet
 * Fix the problem, optionally waiting for a fix of the reported bug
 * Verify the fix
 * Execute the rollback actions
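
 The comparison between diesel and mania+petrol could be started with a sketch like the following. It is not a definitive procedure: the helper `collect_host_facts` and the `SSH` override are hypothetical, and the selection of facts (SMT state, running kernel, last boot time, installed powerpc-utils version) is an assumption about what might differ.

```shell
# Hypothetical helper: print a few facts about one worker host,
# each line prefixed with the host name.
# SSH can be overridden (e.g. for a dry run); it defaults to plain ssh.
SSH=${SSH:-ssh}

collect_host_facts() {
    local host=$1
    # ppc64_cpu --smt: current SMT state; uname -r: running kernel;
    # uptime -s: time of last boot; rpm -q: installed powerpc-utils version
    $SSH "$host" 'ppc64_cpu --smt; uname -r; uptime -s; rpm -q powerpc-utils' 2>&1 |
        sed "s/^/$host: /"
}
```

 Running `for h in diesel mania petrol; do collect_host_facts "$h.qe.nue2.suse.org"; done` and comparing the lines per host would show whether the other two hosts differ in SMT state or simply have not rebooted yet. The domain for mania and petrol is assumed to match the one used for diesel.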

 ## Rollback actions 
 * *DONE* `ssh diesel.qe.nue2.suse.org 'sed -i 's/qemu_ppc64le-poo158266,/qemu_ppc64le,/' /etc/openqa/workers.ini && systemctl restart openqa-worker-auto-restart@{1..8} && systemctl enable --now salt-minion telegraf'` 
 * *DONE* `ssh osd 'sudo salt-key -y -a diesel.qe.nue2.suse.org'` 
 * `ssh root@kerosene.qe.nue2.suse.org 'zypper rl powerpc-utils && zypper -n in powerpc-utils'` 
 * Revert https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/763
