action #110545
Updated by livdywan almost 2 years ago
## Motivation

See parent #10148. In #109232#note-5 ggardet_arm gave some additional hints that we could try. We should try all of them and run tests as mkittler did in #109232.

## Acceptance criteria

* **AC1:** All concrete ideas from #109232#note-5 have been tried and openQA tests have been executed with a statement regarding stability

## Suggestions

* Remind mkittler that he should always write down the commands he used in tickets, as otherwise his colleagues will ask him anyway what he did in #109232 to run openQA tests ;)
* See my notes on exporting job IDs via `psql`: https://github.com/Martchus/openQA-helper#useful-sql-queries=
* Change the parameters on the systems as written in #109232#note-5, one by one or in combination, rerun the tests and gather stability figures
* Come up with a final assessment

## Concrete ideas to try out

* Disable mitigations (KPTI, etc.)
  * Use the kernel parameter `mitigations=off` (see https://www.kernel.org/doc/html/v5.15-rc1/admin-guide/kernel-parameters.html)
* Enable/disable huge pages
* Disable hardware threading in firmware (this lowers the number of CPUs seen by the kernel)
* Check the actual CPU frequency
* Check the temperature (CPU throttling could lower the CPU frequency and reduce performance)
* Use a single socket instead of dual sockets (may be configurable in the firmware)
* Use a distribution without LSE atomics (known to be slow on TX2)
* You can also run `sudo perf stat` while the system is busy with openQA tests
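
To make the before/after comparison of the ideas above reproducible, the current state of a worker could be recorded before each test run. The following is only a minimal sketch, assuming a Linux worker that exposes the usual sysfs and procfs files; some of these files may be missing on a given aarch64 machine, in which case the corresponding entries stay empty. The function name `collect_cpu_state` and the `root` parameter are hypothetical and not part of openQA.

```python
import json
from pathlib import Path


def read_text(path):
    """Return the stripped file content, or None if the file cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def collect_cpu_state(root="/"):
    """Gather mitigation, SMT, huge page, frequency and temperature data.

    `root` is a hypothetical parameter so the function can be pointed at a
    copy of the sysfs tree; on a real worker the default "/" is used.
    """
    sys_dir = Path(root) / "sys"
    cpu_dir = sys_dir / "devices/system/cpu"
    state = {
        # Shows whether e.g. mitigations=off was actually passed at boot
        "cmdline": read_text(Path(root) / "proc/cmdline"),
        # "1" if hardware threading is active, "0" if not
        "smt_active": read_text(cpu_dir / "smt/active"),
        # Transparent huge page mode, e.g. "always [madvise] never"
        "thp_enabled": read_text(sys_dir / "kernel/mm/transparent_hugepage/enabled"),
        "vulnerabilities": {},
        "freq_khz": {},
        "temp_millicelsius": {},
    }
    # Per-vulnerability mitigation status as reported by the kernel
    for f in sorted(cpu_dir.glob("vulnerabilities/*")):
        state["vulnerabilities"][f.name] = read_text(f)
    # Current frequency per CPU, in kHz
    for f in sorted(cpu_dir.glob("cpu[0-9]*/cpufreq/scaling_cur_freq")):
        value = read_text(f)
        if value is not None and value.isdigit():
            state["freq_khz"][f.parent.parent.name] = int(value)
    # Temperature per thermal zone, in millidegrees Celsius
    for f in sorted((sys_dir / "class/thermal").glob("thermal_zone*/temp")):
        value = read_text(f)
        if value is not None and value.lstrip("-").isdigit():
            state["temp_millicelsius"][f.parent.name] = int(value)
    return state


if __name__ == "__main__":
    print(json.dumps(collect_cpu_state(), indent=2))
```

Saving the JSON output next to the job IDs of each test round would document which combination of settings a given stability figure belongs to.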