action #56036

[kernel][ltp] Fix killall call in cgroup_fj_stress_cpu

Added by cfconrad about 2 years ago. Updated 11 months ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Bugs in existing tests
Target version:
QE Kernel - QE Kernel Done
Start date:
2019-08-28
Due date:
% Done:

0%

Estimated time:
Difficulty:

Description

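On s390x, killall does not reach every process started by the test.
This may be a race with the creation of the procfs entry: killall finds its targets by scanning /proc, so a process whose entry is not visible yet would be missed.

A possible solution is to collect the PIDs and kill each process individually.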
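
A minimal sketch of that idea, not the actual change made to the LTP test: each background process's PID is recorded from `$!` at spawn time, and the processes are then signalled one by one, so no name lookup in /proc is involved. The `sleep` workers and the variable names `pids` and `survivors` are placeholders chosen for illustration.

```shell
#!/bin/sh
# Hypothetical illustration: record PIDs at spawn time, then kill them
# individually instead of relying on killall's name-based lookup.

pids=""

# Stand-ins for the stress processes the real test would start (assumption).
for i in 1 2 3; do
    sleep 300 &
    pids="$pids $!"
done

# Signal every recorded PID directly.
for pid in $pids; do
    kill "$pid" 2>/dev/null
done

# Reap each child, then check that none of the recorded PIDs is still alive.
survivors=0
for pid in $pids; do
    wait "$pid" 2>/dev/null
    if kill -0 "$pid" 2>/dev/null; then
        survivors=$((survivors + 1))
    fi
done

echo "survivors: $survivors"
```

Because the PID list is built by the parent shell itself, it does not depend on when a child becomes visible under /proc.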


Related issues

Related to openQA Tests - action #54260: [kernel][s390x] Occasional failures due timeout (Rejected, 2019-07-15)

History

#1 Updated by cfconrad about 2 years ago

  • Related to action #54260: [kernel][s390x] Occasional failures due timeout added

#2 Updated by okurz almost 2 years ago

  • Category set to Bugs in existing tests

#3 Updated by cfconrad almost 2 years ago

  • Assignee set to cfconrad

#4 Updated by cfconrad almost 2 years ago

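# List failed openQA jobs that ran the given test module, newest first.
# Optional arguments: module name pattern, product version.
JOB_NAME=${1:-cgroup_fj_stress_cpu}
VERSION=${2:-12-SP5}
# The heredoc is unquoted, so $JOB_NAME and $VERSION are expanded locally
# before the query is sent to psql on the openQA host.
ssh openqa.suse.de -- \
sudo -u geekotest  psql openqa <<EOT
    SELECT id,result,test,version,build,t_finished from jobs where result='failed' AND id in
        (SELECT job_id from job_modules WHERE name LIKE '%$JOB_NAME%' ORDER BY job_id DESC)
    AND version='$VERSION'
    ORDER BY t_finished DESC
;
EOT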

#6 Updated by cfconrad almost 2 years ago

The last failure was 10 builds ago. Looking at the kernels around the failing build:

Build 0331(ok)->0333(fail)->0339(ok)

Kernel-default: 113.1 -> 114.1 -> 115.1
GIT: f4acd955772c2c5d498b0ba420cebc34d511f814..8b1f909bc529e495045d5c89483f3c72ea1ba9c7..21718fdf45194f08a946ed637641f94a2b6e6104

Change log: 114->115 8b1f909bc529e495045d5c89483f3c72ea1ba9c7..21718fdf45194f08a946ed637641f94a2b6e6104
21718fdf451 - (tag: rpm-4.12.14-115--SLE-12-SP5-Server-RC2, tag: rpm-4.12.14-115--SLE-12-SP5-SAP-RC2, tag: rpm-4.12.14-115--SLE-12-SP5-HPC-RC2, tag: rpm-4.12.14-115--SLE-12-SP5-Desktop-RC2, tag: rpm-4.12.14-115, tag: SLE-12-SP5-Server-RC2, tag: SLE-12-SP5-SAP-RC2, tag: SLE-12-SP5-HPC-RC2, tag: SLE-12-SP5-Desktop-RC2) Merge 'users/hare/SLE12-SP5/for-next' into SLE12-SP5 (4 weeks ago)
7173b2d861b - scsi: qla2xxx: Include the header file from qla_dsd.h (bsc#1150973). (4 weeks ago)
c219c413b05 - Drop patches causing I/O errors (bsc#1150973) (4 weeks ago)
53fe9c65a8c - scsi: lpfc: Fix reset recovery paths that are not recovering (bsc#1144375). (4 weeks ago)
4b41fff26fa - Refresh patches.suse/scsi-lpfc-fix-12.4.0.0-GPF-at-boot.patch (4 weeks ago)
d546738efbb - scsi: lpfc: Remove bg debugfs buffers (bsc#1144375). (4 weeks ago)
69b750cfbe8 - scsi: lpfc: Resolve checker warning for lpfc_new_io_buf() (bsc#1144375). (4 weeks ago)
b3f5be86368 - Move qla2xxx patches to upstream section (4 weeks ago)
e5c8c54c416 - scsi: qla2xxx: Use __le64 instead of uint32_t[2] for sending DMA addresses to firmware (bsc#1082635 bsc#1141340 bsc#1143706). (4 weeks ago)
9d37bd4cba0 - scsi: qla2xxx: Introduce the dsd32 and dsd64 data structures (bsc#1082635 bsc#1141340 bsc#1143706). (4 weeks ago)
99c1743f239 - Move patches to upstream section (4 weeks ago)
846b396cb42 - Merge 'users/jroedel/SLE12-SP5/for-next' into SLE12-SP5 (4 weeks ago)
92f27e76ee3 - Merge 'users/msuchanek/SLE12-SP5/for-next' into SLE12-SP5 (4 weeks ago)
b395237fb37 - supported.conf: Add vfio_ccw (bsc#1151192 jsc#SLE-6138). (4 weeks ago)
4d0e53d5ee2 - Update s390 config files (bsc#1151192). (4 weeks ago)
ea9cc185749 - iommu: Don't use sme_active() in generic code (bsc#1151700). (4 weeks ago)
ece19020217 - iommu/dma: Fix for dereferencing before null checking (bsc#1151699). (4 weeks ago)
074ecadd6ef - iommu/iova: Avoid false sharing on fq_timer_on (bsc#1151701). (4 weeks ago)
f591e6d87f9 - iommu/amd: Fix race in increase_address_space() (bsc#1151697). (4 weeks ago)
1b0df2cfca5 - iommu/amd: Flush old domains in kdump kernel (bsc#1151698). (4 weeks ago)

Change log: 113->114 f4acd955772c2c5d498b0ba420cebc34d511f814..8b1f909bc529e495045d5c89483f3c72ea1ba9c7
8b1f909bc52 - Merge 'users/hare/SLE12-SP5/for-next' into SLE12-SP5 (4 weeks ago)
152baf45bda - scsi: lib/sg_pool.c: clear 'first_chunk' in case of no preallocation (bsc#1141707,bsc#1150973). (4 weeks ago)

#7 Updated by cfconrad almost 2 years ago

Still no new failures:

JOB_NAME=cgroup_fj_stress_cpu
VERSION=12-SP5
ARCH=s390x
ssh openqa.suse.de -- \
sudo -u geekotest  psql openqa <<EOT
    SELECT id,result,test,version,build,arch,t_finished from jobs WHERE arch='$ARCH' AND id in
        (SELECT job_id from job_modules WHERE name LIKE '%$JOB_NAME%'  ORDER BY job_id DESC)
    AND version='$VERSION'
    ORDER BY build DESC;
EOT


   id    | result |      test       | version | build | arch  |     t_finished
---------+--------+-----------------+---------+-------+-------+---------------------
3554208 | passed | ltp_controllers | 12-SP5  | 0369  | s390x | 2019-11-05 07:57:12
3529991 | passed | ltp_controllers | 12-SP5  | 0368  | s390x | 2019-10-25 18:11:17
3507508 | passed | ltp_controllers | 12-SP5  | 0368  | s390x | 2019-10-23 05:53:54
3496369 | passed | ltp_controllers | 12-SP5  | 0366  | s390x | 2019-10-20 01:38:16
3485024 | passed | ltp_controllers | 12-SP5  | 0363  | s390x | 2019-10-17 02:42:25
3477702 | passed | ltp_controllers | 12-SP5  | 0358  | s390x | 2019-10-15 17:01:23
3470691 | passed | ltp_controllers | 12-SP5  | 0357  | s390x | 2019-10-12 13:47:34
3453346 | passed | ltp_controllers | 12-SP5  | 0350  | s390x | 2019-10-09 06:04:04
3448216 | passed | ltp_controllers | 12-SP5  | 0341  | s390x | 2019-10-08 07:25:40
3418668 | passed | ltp_controllers | 12-SP5  | 0341  | s390x | 2019-09-28 22:38:17
3414068 | passed | ltp_controllers | 12-SP5  | 0339  | s390x | 2019-09-27 22:21:14
3401080 | failed | ltp_controllers | 12-SP5  | 0333  | s390x | 2019-09-25 15:58:35
3415073 | passed | ltp_controllers | 12-SP5  | 0333  | s390x | 2019-09-28 01:37:04
3395469 | passed | ltp_controllers | 12-SP5  | 0331  | s390x | 2019-09-24 02:22:32
3389399 | passed | ltp_controllers | 12-SP5  | 0330  | s390x | 2019-09-21 17:44:52
3369113 | failed | ltp_controllers | 12-SP5  | 0322  | s390x | 2019-09-17 21:40:38
3354363 | passed | ltp_controllers | 12-SP5  | 0319  | s390x | 2019-09-13 20:14:49
3341218 | passed | ltp_controllers | 12-SP5  | 0313  | s390x | 2019-09-10 01:54:04
3340116 | passed | ltp_controllers | 12-SP5  | 0307  | s390x | 2019-09-10 09:36:27
3321306 | failed | ltp_controllers | 12-SP5  | 0303  | s390x | 2019-09-04 02:31:52
3313411 | failed | ltp_controllers | 12-SP5  | 0301  | s390x | 2019-08-30 22:15:07
3308948 | failed | ltp_controllers | 12-SP5  | 0296  | s390x | 2019-08-29 11:14:59
3276789 | failed | ltp_controllers | 12-SP5  | 0287  | s390x | 2019-08-22 19:08:29
3262695 | failed | ltp_controllers | 12-SP5  | 0283  | s390x | 2019-08-18 22:55:18
3255091 | failed | ltp_controllers | 12-SP5  | 0268  | s390x | 2019-08-16 02:37:53
3235202 | failed | ltp_controllers | 12-SP5  | 0261  | s390x | 2019-08-12 06:09:58
3225238 | failed | ltp_controllers | 12-SP5  | 0259  | s390x | 2019-08-08 22:54:04
3218839 | failed | ltp_controllers | 12-SP5  | 0258  | s390x | 2019-08-07 10:15:23
3205029 | failed | ltp_controllers | 12-SP5  | 0256  | s390x | 2019-08-03 04:54:20
3206873 | failed | ltp_controllers | 12-SP5  | 0256  | s390x | 2019-08-03 14:40:10
3168109 | failed | ltp_controllers | 12-SP5  | 0251  | s390x | 2019-07-30 22:35:45
3063543 | failed | ltp_controllers | 12-SP5  | 0222  | s390x | 2019-07-15 12:04:24
3055029 | failed | ltp_controllers | 12-SP5  | 0222  | s390x | 2019-07-12 11:27:29
3040557 | failed | ltp_controllers | 12-SP5  | 0216  | s390x | 2019-07-08 18:28:20
2977616 | failed | ltp_controllers | 12-SP5  | 0198  | s390x | 2019-06-13 11:30:16
2978844 | failed | ltp_controllers | 12-SP5  | 0198  | s390x | 2019-06-13 13:15:46
2973213 | failed | ltp_controllers | 12-SP5  | 0197  | s390x | 2019-06-13 23:38:21
(37 rows)

#8 Updated by cfconrad almost 2 years ago

A fix could be [1], but I'm not sure about it, because as far as I understand killall should achieve the same thing.

[1] https://github.com/cfconrad/ltp/commit/ac91d4c6486235b4e11572daa46c57c02ac72998

#10 Updated by cfconrad almost 2 years ago

  • Status changed from New to Feedback

#11 Updated by cfconrad almost 2 years ago

  • Status changed from Feedback to Resolved

merged

#12 Updated by jlausuch almost 2 years ago

  • Target version changed from 445 to 457

#13 Updated by pcervinka 11 months ago

  • Target version changed from 457 to QE Kernel Done
