action #110545

Updated by livdywan almost 2 years ago

## Motivation 
 See parent #10148. In #109232#note-5, ggardet_arm gave some additional hints that we could try. We should try all of them and run tests as mkittler did in #109232.

 ## Acceptance criteria 
 * **AC1:** All concrete ideas from #109232#note-5 have been tried and openQA tests have been executed with a statement regarding stability 

 ## Suggestions 
 * Remind mkittler that he should always write down the commands he used in tickets, as otherwise his colleagues will ask him anyway what he did in #109232 to run openQA tests ;) 
     * See my notes on exporting job IDs via `psql`: https://github.com/Martchus/openQA-helper#useful-sql-queries= 
 * Change the parameters on the systems as written in #109232#note-5, one by one or in combination, rerun the tests and gather stability figures 
 * Come up with final assessment 
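
To make the "stability figures" comparable between configurations, a small helper like the following could be used. This is only a sketch: the function name `stability`, the choice of which results count as OK, and the sample data are assumptions for illustration, not something taken from #109232.

```python
from collections import Counter

# Assumption: which openQA job results count as "stable" for this comparison.
# Adjust to whatever definition the final assessment should use.
OK_RESULTS = {"passed", "softfailed"}


def stability(results):
    """Return (ok_ratio, counts) for a list of openQA job result strings.

    ok_ratio is None when the list is empty.
    """
    counts = Counter(results)
    total = sum(counts.values())
    if total == 0:
        return None, counts
    ok = sum(n for result, n in counts.items() if result in OK_RESULTS)
    return ok / total, counts


if __name__ == "__main__":
    # Made-up example data, not real measurements.
    baseline = ["passed"] * 8 + ["failed", "incomplete"]
    ratio, counts = stability(baseline)
    print(f"baseline: {ratio:.0%} ok, {dict(counts)}")
```

Running the same helper on the results of each parameter combination gives one figure per configuration that can be quoted in the ticket.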

 ## Concrete ideas to try out 
 * Disable mitigations (KPTI, etc.) 
     * Use kernel parameter `mitigations=off` (see https://www.kernel.org/doc/html/v5.15-rc1/admin-guide/kernel-parameters.html) 
 * Enable/disable huge pages 
 * Disable hardware threading in firmware (it will lower the number of CPUs seen by the kernel) 
 * Check actual CPU frequency 
 * Check temperature (CPU throttling could lower the CPU frequency and reduce performance) 
 * Use single socket instead of dual sockets (may be configurable in the firmware) 
 * Use a distribution without LSE-atomics (known to be slow on TX2) 
 * You can also run `sudo perf stat` while the system is busy with openQA tests
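
To record which of the settings above were actually in effect during a test run, a snapshot of the relevant kernel interfaces can be saved alongside the results. The following is a minimal sketch, assuming a Linux worker that exposes the usual sysfs/procfs files; some of them (for example `cpufreq` or `smt/active`) may be missing on a given machine, in which case the corresponding entry is simply empty or `None`. The helper names are made up for illustration.

```python
import glob
from pathlib import Path


def read_text(path):
    """Return the stripped content of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except (OSError, ValueError):
        return None


def active_choice(text):
    """Extract the bracketed value from e.g. 'always [madvise] never'."""
    if text is None:
        return None
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return text


def snapshot():
    """Collect the settings relevant to the ideas listed above."""
    cpu = "/sys/devices/system/cpu"
    freqs = [read_text(p) for p in sorted(glob.glob(f"{cpu}/cpu[0-9]*/cpufreq/scaling_cur_freq"))]
    temps = [read_text(p) for p in sorted(glob.glob("/sys/class/thermal/thermal_zone*/temp"))]
    return {
        # Shows whether e.g. mitigations=off was passed at boot.
        "cmdline": read_text("/proc/cmdline"),
        # "1" if hardware threading is active, "0" if not.
        "smt_active": read_text(f"{cpu}/smt/active"),
        # Current transparent huge page mode.
        "thp": active_choice(read_text("/sys/kernel/mm/transparent_hugepage/enabled")),
        # scaling_cur_freq is reported in kHz.
        "freq_khz": [int(f) for f in freqs if f and f.isdigit()],
        # Thermal zone temperatures are reported in millidegrees Celsius.
        "temp_c": [int(t) / 1000 for t in temps if t and t.lstrip("-").isdigit()],
    }


if __name__ == "__main__":
    for key, value in snapshot().items():
        print(f"{key}: {value}")
```

Calling this before and during a test run makes it easier to tell afterwards whether a changed parameter was really applied and whether throttling occurred.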
