action #158116
openQA Project - coordination #110833: [saga][epic] Scale up: openQA can handle a schedule of 100k jobs with 1k worker instances
openQA Project - coordination #158110: [epic] Prevent worker overload
typing issue on ppc64 worker - crosscheck performance impact of ffmpeg on ppc64le (Power8 kvm) size:M
Status: Workable
Priority: Normal
Assignee: -
Category: Regressions/Crashes
Target version:
Start date: 2024-03-27
Due date:
% Done: 0%
Estimated time:
Description
Motivation
In #158104, system overload was found on ppc64le machines, likely triggered by #157636. A snapshot of the current process list from htop looks like this:
PID USER PRI NI VIRT RES SHR S DISK R/W CPU% MEM% TIME+ ▽Command
1541 root 20 0 320M 194M 182M S 0.00 B/s 0.0 0.0 2h29:59 /usr/lib/systemd/systemd-j
96369 root 20 0 623M 98880 14336 S 0.00 B/s 0.0 0.0 54:05.86 /usr/bin/python3 /usr/bin/
1 root 20 0 178M 25024 11776 S 0.00 B/s 0.0 0.0 48:46.08 /usr/lib/systemd/systemd n
2000 root 20 0 9728 6208 2176 S 0.00 B/s 0.0 0.0 40:44.69 /usr/sbin/haveged -w 1024
157105 _openqa-wo 20 0 427M 189M 23808 R 0.00 B/s 68.4 0.0 32:22.39 ffmpeg -y -hide_banner -no
157062 _openqa-wo 20 0 427M 193M 23808 R 0.00 B/s 42.1 0.0 32:07.83 ffmpeg -y -hide_banner -no
157107 _openqa-wo 20 0 427M 189M 23808 R 0.00 B/s 68.4 0.0 30:29.03 ffmpeg -y -hide_banner -no
157063 _openqa-wo 20 0 427M 193M 23808 R 0.00 B/s 5.3 0.0 29:30.58 ffmpeg -y -hide_banner -no
6267 _openqa-wo 20 0 427M 193M 23808 R 0.00 B/s 63.2 0.0 25:54.22 ffmpeg -y -hide_banner -no
157108 _openqa-wo 20 0 427M 189M 23808 R 0.00 B/s 63.2 0.0 25:03.79 ffmpeg -y -hide_banner -no
157064 _openqa-wo 20 0 427M 193M 23808 R 0.00 B/s 2.6 0.0 23:50.53 ffmpeg -y -hide_banner -no
156485 _openqa-wo 20 0 427M 189M 23808 R 0.00 B/s 34.2 0.0 22:18.78 ffmpeg -y -hide_banner -no
6268 _openqa-wo 20 0 427M 193M 23808 R 0.00 B/s 57.9 0.0 21:48.92 ffmpeg -y -hide_banner -no
156601 _openqa-wo 20 0 427M 193M 23808 R 0.00 B/s 10.5 0.0 20:19.58 ffmpeg -y -hide_banner -no
6269 _openqa-wo 20 0 427M 193M 23808 R 0.00 B/s 55.3 0.0 16:33.02 ffmpeg -y -hide_banner -no
5898 _openqa-wo 20 0 427M 193M 23808 R 0.00 B/s 86.8 0.0 14:48.15 ffmpeg -y -hide_banner -no
31080 _openqa-wo 20 0 5720M 758M 28416 R 0.00 B/s 57.9 0.1 12:58.63 /usr/bin/qemu-system-ppc64
15778 _openqa-wo 20 0 6767M 1779M 28480 R 0.00 B/s 81.6 0.2 12:50.94 /usr/bin/qemu-system-ppc64
15781 _openqa-wo 20 0 6767M 1779M 28480 S 0.00 B/s 0.0 0.2 10:13.25 /usr/bin/qemu-system-ppc64
156709 _openqa-wo 20 0 6762M 1766M 28288 S 0.00 B/s 13.2 0.2 10:08.67 /usr/bin/qemu-system-ppc64
33559 _openqa-wo 20 0 6756M 1724M 28416 R 0.00 B/s 86.8 0.2 10:05.56 /usr/bin/qemu-system-ppc64
35017 _openqa-wo 20 0 3946M 753M 28416 R 0.00 B/s 84.2 0.1 9:30.77 /usr/bin/qemu-system-ppc64
24085 _openqa-wo 20 0 6901M 1781M 28480 S 0.00 B/s 0.0 0.2 9:13.94 /usr/bin/qemu-system-ppc64
24092 _openqa-wo 20 0 6901M 1781M 28480 R 0.00 B/s 78.9 0.2 8:40.60 /usr/bin/qemu-system-ppc64
28718 _openqa-wo 20 0 7135M 1787M 28480 S 0.00 B/s 50.0 0.2 8:17.91 /usr/bin/qemu-system-ppc64
28720 _openqa-wo 20 0 7135M 1787M 28480 R 0.00 B/s 13.2 0.2 6:51.75 /usr/bin/qemu-system-ppc64
39280 _openqa-wo 20 0 5712M 755M 28416 R 0.00 B/s 65.8 0.1 6:41.38 /usr/bin/qemu-system-ppc64
39683 _openqa-wo 20 0 6731M 1549M 28416 R 0.00 B/s 65.8 0.2 6:24.06 /usr/bin/qemu-system-ppc64
3699 root 20 0 3968 3200 2368 S 0.00 B/s 0.0 0.0 6:04.21 /sbin/agetty -o -p -- \u -
34903 _openqa-wo 20 0 6334M 1483M 28416 R 0.00 B/s 50.0 0.2 5:29.90 /usr/bin/qemu-system-ppc64
34902 _openqa-wo 20 0 6334M 1483M 28416 S 0.00 B/s 0.0 0.2 4:40.00 /usr/bin/qemu-system-ppc64
38988 _openqa-wo 20 0 6790M 1376M 28480 R 0.00 B/s 107.9 0.2 3:52.33 /usr/bin/qemu-system-ppc64
38599 _openqa-wo 20 0 8040M 4187M 28480 R 0.00 B/s 47.4 0.5 3:41.13 /usr/bin/qemu-system-ppc64
45395 _openqa-wo 20 0 3732M 757M 28416 R 0.00 B/s 71.1 0.1 3:38.90 /usr/bin/qemu-system-ppc64
38600 _openqa-wo 20 0 8040M 4187M 28480 S 0.00 B/s 0.0 0.5 3:18.94 /usr/bin/qemu-system-ppc64
43853 _openqa-wo 20 0 5641M 1696M 28480 R 0.00 B/s 63.2 0.2 3:12.66 /usr/bin/qemu-system-ppc64
38456 _openqa-wo 20 0 9087M 4195M 28480 R 0.00 B/s 78.9 0.5 3:08.68 /usr/bin/qemu-system-ppc64
38986 _openqa-wo 20 0 6790M 1376M 28480 R 0.00 B/s 86.8 0.2 3:06.34 /usr/bin/qemu-system-ppc64
So the ffmpeg processes show significantly higher accumulated CPU time than the corresponding qemu processes. We should investigate whether ffmpeg has too high an impact on machine performance, whether it should run with a nice level to prevent typing issues, whether ffmpeg parameters can be tweaked, or whether ffmpeg should be avoided altogether on ppc64le.
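To put a number on this comparison, the TIME+ column can be summed per command. The following is a minimal Python sketch, not part of openQA or os-autoinst: the function names are made up for illustration, and only four rows of the snapshot above are embedded as sample input. htop lists threads by default, so rows with identical memory figures may belong to the same process and the sums are only a rough indicator.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Four rows copied from the htop snapshot; paste the remaining rows to cover all of it.
SNAPSHOT = """\
157105 _openqa-wo 20 0 427M 189M 23808 R 0.00 B/s 68.4 0.0 32:22.39 ffmpeg -y -hide_banner -no
157062 _openqa-wo 20 0 427M 193M 23808 R 0.00 B/s 42.1 0.0 32:07.83 ffmpeg -y -hide_banner -no
31080 _openqa-wo 20 0 5720M 758M 28416 R 0.00 B/s 57.9 0.1 12:58.63 /usr/bin/qemu-system-ppc64
15778 _openqa-wo 20 0 6767M 1779M 28480 R 0.00 B/s 81.6 0.2 12:50.94 /usr/bin/qemu-system-ppc64
"""


def parse_htop_time(value: str) -> float:
    """Convert an htop TIME+ value such as '32:22.39' or '2h29:59' to seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(\d+):(\d+(?:\.\d+)?)", value)
    if match is None:
        raise ValueError(f"unrecognized TIME+ value: {value!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + float(seconds)


def cpu_seconds_by_command(snapshot: str) -> dict:
    """Sum TIME+ per command name, assuming the column layout of the snapshot."""
    totals = defaultdict(float)
    for line in snapshot.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a full process row.
        if len(fields) < 14 or not fields[0].isdigit():
            continue
        # "0.00 B/s" splits into two tokens, so TIME+ is field 12 and the command field 13.
        name = PurePosixPath(fields[13]).name
        totals[name] += parse_htop_time(fields[12])
    return dict(totals)


if __name__ == "__main__":
    for name, seconds in sorted(cpu_seconds_by_command(SNAPSHOT).items()):
        print(f"{name}: {seconds / 60:.1f} CPU minutes")
```

Dividing each sum by the number of rows gives an average per row, which is easier to compare because the snapshot contains more qemu rows than ffmpeg rows.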
Acceptance criteria
- AC1: openQA test video compression does not impact system performance enough to cause typing issues
- AC2: openQA tests pass consistently without typing issues caused by video encoding
- AC3: openQA tests can still provide useful videos, with exceptions where needed (e.g. keep videos completely disabled as a last resort)
Suggestions
- Be aware that as of 2024-04-04 NOVIDEO=1 was again set for ppc64le openQA machine definitions, see #157636
- Check whether the ffmpeg CPU usage visible in the htop output above is expected or unusual
- Run ffmpeg manually on x86_64 and ppc64le and compare the results to see whether encoding is much less efficient on ppc64le
- Consider introducing a nice level for calling ffmpeg in os-autoinst, although this might be counter-productive because the video encoder works on a queue and shouldn't be delayed. Maybe combine it with bigger buffers or a bigger "pipe size"?
- Crosscheck if ffmpeg can be tweaked, in particular for ppc64le qemu workers
- We still have the alternative to not use the external ffmpeg encoder but use the internal OGV encoder
- Decide whether ffmpeg, or even video encoding altogether, should be forbidden on ppc64le, see #157636
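The manual comparison and the nice-level idea from the suggestions above could be checked with a small timing helper. This is a hedged Python sketch under stated assumptions: `with_nice` and `timed_run` are hypothetical helpers, not os-autoinst code, and the workload below is a stand-in Python one-liner because the full ffmpeg command line is truncated in the htop output. Substitute the real encoder invocation and run the same script on an x86_64 and a ppc64le host.

```python
import os
import subprocess
import sys
import time
from typing import List, Optional, Tuple


def with_nice(command: List[str], niceness: Optional[int]) -> List[str]:
    """Prefix the command with the POSIX nice utility when a niceness is given."""
    if niceness is None:
        return list(command)
    return ["nice", "-n", str(niceness), *command]


def timed_run(command: List[str], niceness: Optional[int] = None) -> Tuple[float, float]:
    """Run the command to completion; return (wall seconds, child CPU seconds)."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    subprocess.run(
        with_nice(command, niceness),
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    wall = time.perf_counter() - wall_before
    cpu_after = os.times()
    # Child CPU time is accounted once the child has been waited for.
    cpu = (cpu_after.children_user - cpu_before.children_user) + (
        cpu_after.children_system - cpu_before.children_system
    )
    return wall, cpu


if __name__ == "__main__":
    # Stand-in workload (assumption); replace with the actual ffmpeg command.
    workload = [sys.executable, "-c", "sum(range(10**7))"]
    for niceness in (None, 10):
        wall, cpu = timed_run(workload, niceness)
        print(f"niceness={niceness}: wall={wall:.2f}s cpu={cpu:.2f}s")
```

On an idle machine the niced and un-niced runs should take about the same time. The interesting case is a host that is already loaded with qemu instances, where a growing wall time for the niced run would indicate that the encoder falls behind its queue.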
Out of scope
- Actually enabling/disabling ffmpeg in production is handled as part of #157636