action #90974: Make it obvious if qemu gets terminated unexpectedly due to out-of-memory - openQA Project (public) - openSUSE Project Management Tool


action #90974

closed

coordination #39719: [saga][epic] Detection of "known failures" for stable tests, easy test results review and easy tracking of known issues

coordination #62420: [epic] Distinguish all types of incompletes

Make it obvious if qemu gets terminated unexpectedly due to out-of-memory

Added by okurz over 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: High
Assignee: Xiaojing_liu
Category: Feature requests
Target version: Ready
Start date:
Due date:
% Done:
Estimated time: 40.00 h

Description

Motivation

qemu can need a lot of memory, and how much depends on how openQA users configure their test jobs. This can lead to out-of-memory (OOM) conditions, and we should report such situations back to the test reviewers. #90161 is a recent example: jobs failed on malbec.arch due to OOM, but the feedback was suboptimal. The corresponding openQA test, https://openqa.suse.de/tests/5674784, was incomplete with reason "Reason: backend died: QEMU exited unexpectedly, see log for details" and was labeled by auto-review with #71188, without specifically pointing to an OOM condition.

Acceptance criteria

AC1: If qemu dies because it was killed due to OOM, this is obvious from the incomplete reason

Suggestions

As far as okurz could find out, the best way to detect OOM is to check the system logs, e.g. with dmesg | grep 'Out of memory: Killed process', which also reveals the PID of the killed process. One could then compare that PID against the PID of the qemu process that the qemu backend monitors and feed that information back as the incomplete reason.
Ensure that these conditions are no longer linked to #71188
Crosscheck what other reasons could explain #71188, or close that ticket as well if it is very likely that only OOM explains such problems
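
The PID comparison suggested above could be sketched roughly as follows. This is only an illustration in Python, not the openQA or os-autoinst implementation: the function names, the sample log line and the wording of the OOM-specific reason are all hypothetical, and the pattern assumes the kernel message format quoted in the dmesg example.

```python
import re
import subprocess

# Assumed kernel message format, based on the grep pattern quoted above, e.g.
#   "Out of memory: Killed process 4242 (qemu-system-x86) total-vm:..."
OOM_PATTERN = re.compile(r"Out of memory: Killed process (\d+) \(([^)]*)\)")

# Illustrative sample line (made up for this sketch, not taken from a real log)
SAMPLE_LOG = (
    "[12345.678901] Out of memory: Killed process 4242 (qemu-system-x86) "
    "total-vm:8388608kB, anon-rss:4194304kB, file-rss:0kB\n"
)


def find_oom_kills(log_text):
    """Return a dict mapping each OOM-killed PID to its process name."""
    return {int(pid): name for pid, name in OOM_PATTERN.findall(log_text)}


def incomplete_reason(qemu_pid, log_text):
    """Pick a reason string for a qemu process that exited unexpectedly.

    The OOM-specific wording is a hypothetical example.
    """
    if qemu_pid in find_oom_kills(log_text):
        return f"QEMU terminated: killed due to out-of-memory (PID {qemu_pid})"
    return "QEMU exited unexpectedly, see log for details"


def read_kernel_log():
    """Read the kernel ring buffer via dmesg (may require elevated privileges)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Uses the sample text instead of the live kernel log so it runs anywhere
    print(incomplete_reason(4242, SAMPLE_LOG))
```

In a real backend, the text from read_kernel_log() would be passed in instead of the sample, and the PID would be the one the qemu backend already monitors. Matching on the PID rather than only on the process name avoids blaming an unrelated qemu instance on the same worker host.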

Further references:

https://stackoverflow.com/questions/6132333/how-to-detect-out-of-memory-segfaults
https://unix.stackexchange.com/questions/128642/debug-out-of-memory-with-var-log-messages
It would also be possible to change the memory overcommit setting, i.e. to completely disable or allow memory overcommit, see https://www.eurovps.com/faq/how-to-troubleshoot-high-memory-usage-in-linux/

Related issues 1 (0 open — 1 closed)

Updated by okurz over 3 years ago

Updated by mkittler over 3 years ago

Updated by okurz over 3 years ago

Updated by mkittler over 3 years ago

Updated by Xiaojing_liu over 3 years ago

Updated by Xiaojing_liu over 3 years ago

Updated by openqa_review over 3 years ago

Updated by okurz over 3 years ago

Updated by livdywan over 3 years ago

Updated by livdywan over 3 years ago

Updated by Xiaojing_liu over 3 years ago

Updated by okurz over 3 years ago

Updated by okurz over 3 years ago

Updated by livdywan over 3 years ago

Updated by Xiaojing_liu over 3 years ago

Updated by Xiaojing_liu over 3 years ago

Updated by Xiaojing_liu over 3 years ago

Updated by Xiaojing_liu over 3 years ago

Updated by Xiaojing_liu over 3 years ago

Updated by okurz over 3 years ago

Updated by Xiaojing_liu about 3 years ago

Updated by Xiaojing_liu about 3 years ago