action #56036
[kernel][ltp] Fix killall call in cgroup_fj_stress_cpu
0%
Description
On s390x, killall does not always reach every process.
This may be a race with the creation of the procfs entries.
A possible solution is to collect the PIDs and kill each process individually.
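The idea of collecting PIDs and killing them one by one can be sketched as below. This is only a minimal illustration, not the actual change to the LTP test: the `sleep 60` workers, the loop count, and the `pids` variable are hypothetical stand-ins for the processes the test spawns.

```shell
#!/bin/sh
# Hypothetical sketch: "sleep 60" stands in for the test's worker processes.
pids=""

# Record each PID at spawn time, so no lookup by process name is needed later.
for i in 1 2 3; do
    sleep 60 &
    pids="$pids $!"
done

# Signal every recorded PID directly.
for pid in $pids; do
    kill "$pid" 2>/dev/null || true
done

# Reap the children so that none are left behind.
for pid in $pids; do
    wait "$pid" 2>/dev/null || true
done
```

Because the PIDs are taken from `$!` when each process is started, the kill step does not depend on when a process becomes visible under /proc, which is the suspected weak point of a name-based lookup.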
Related issues
History
#1
Updated by cfconrad almost 4 years ago
- Related to action #54260: [kernel][s390x] Occasional failures due timeout added
#2
Updated by okurz over 3 years ago
- Category set to Bugs in existing tests
#3
Updated by cfconrad over 3 years ago
- Assignee set to cfconrad
#4
Updated by cfconrad over 3 years ago
JOB_NAME=${1:-cgroup_fj_stress_cpu}
VERSION=${2:-12-SP5}
ssh openqa.suse.de -- \
    sudo -u geekotest psql openqa <<EOT
SELECT id,result,test,version,build,t_finished from jobs where result='failed' AND id in (SELECT job_id from job_modules WHERE name LIKE '%$JOB_NAME%' ORDER BY job_id DESC) AND version='$VERSION' ORDER BY t_finished DESC ;
EOT
#5
Updated by cfconrad over 3 years ago
Latest failure: https://openqa.suse.de/tests/3401080
#6
Updated by cfconrad over 3 years ago
The last failure was 10 builds ago. A look at the kernels:
Build 0331(ok)->0333(fail)->0339(ok)
Kernel-default: 113.1 -> 114.1 -> 115.1
GIT: f4acd955772c2c5d498b0ba420cebc34d511f814..8b1f909bc529e495045d5c89483f3c72ea1ba9c7..21718fdf45194f08a946ed637641f94a2b6e6104
Change log: 114->115 8b1f909bc529e495045d5c89483f3c72ea1ba9c7..21718fdf45194f08a946ed637641f94a2b6e6104
21718fdf451 - (tag: rpm-4.12.14-115--SLE-12-SP5-Server-RC2, tag: rpm-4.12.14-115--SLE-12-SP5-SAP-RC2, tag: rpm-4.12.14-115--SLE-12-SP5-HPC-RC2, tag: rpm-4.12.14-115--SLE-12-SP5-Desktop-RC2, tag: rpm-4.12.14-115, tag: SLE-12-SP5-Server-RC2, tag: SLE-12-SP5-SAP-RC2, tag: SLE-12-SP5-HPC-RC2, tag: SLE-12-SP5-Desktop-RC2) Merge 'users/hare/SLE12-SP5/for-next' into SLE12-SP5 (4 weeks ago)
7173b2d861b - scsi: qla2xxx: Include the header file from qla_dsd.h (bsc#1150973). (4 weeks ago)
c219c413b05 - Drop patches causing I/O errors (bsc#1150973) (4 weeks ago)
53fe9c65a8c - scsi: lpfc: Fix reset recovery paths that are not recovering (bsc#1144375). (4 weeks ago)
4b41fff26fa - Refresh patches.suse/scsi-lpfc-fix-12.4.0.0-GPF-at-boot.patch (4 weeks ago)
d546738efbb - scsi: lpfc: Remove bg debugfs buffers (bsc#1144375). (4 weeks ago)
69b750cfbe8 - scsi: lpfc: Resolve checker warning for lpfc_new_io_buf() (bsc#1144375). (4 weeks ago)
b3f5be86368 - Move qla2xxx patches to upstream section (4 weeks ago)
e5c8c54c416 - scsi: qla2xxx: Use __le64 instead of uint32_t[2] for sending DMA addresses to firmware (bsc#1082635 bsc#1141340 bsc#1143706). (4 weeks ago)
9d37bd4cba0 - scsi: qla2xxx: Introduce the dsd32 and dsd64 data structures (bsc#1082635 bsc#1141340 bsc#1143706). (4 weeks ago)
99c1743f239 - Move patches to upstream section (4 weeks ago)
846b396cb42 - Merge 'users/jroedel/SLE12-SP5/for-next' into SLE12-SP5 (4 weeks ago)
92f27e76ee3 - Merge 'users/msuchanek/SLE12-SP5/for-next' into SLE12-SP5 (4 weeks ago)
b395237fb37 - supported.conf: Add vfio_ccw (bsc#1151192 jsc#SLE-6138). (4 weeks ago)
4d0e53d5ee2 - Update s390 config files (bsc#1151192). (4 weeks ago)
ea9cc185749 - iommu: Don't use sme_active() in generic code (bsc#1151700). (4 weeks ago)
ece19020217 - iommu/dma: Fix for dereferencing before null checking (bsc#1151699). (4 weeks ago)
074ecadd6ef - iommu/iova: Avoid false sharing on fq_timer_on (bsc#1151701). (4 weeks ago)
f591e6d87f9 - iommu/amd: Fix race in increase_address_space() (bsc#1151697). (4 weeks ago)
1b0df2cfca5 - iommu/amd: Flush old domains in kdump kernel (bsc#1151698). (4 weeks ago)
Change log: f4acd955772c2c5d498b0ba420cebc34d511f814..8b1f909bc529e495045d5c89483f3c72ea1ba9c7
8b1f909bc52 - Merge 'users/hare/SLE12-SP5/for-next' into SLE12-SP5 (4 weeks ago)
152baf45bda - scsi: lib/sg_pool.c: clear 'first_chunk' in case of no preallocation (bsc#1141707,bsc#1150973). (4 weeks ago)
#7
Updated by cfconrad over 3 years ago
Still no new failures:
JOB_NAME=cgroup_fj_stress_cpu
VERSION=12-SP5
ARCH=s390x
ssh openqa.suse.de -- \
    sudo -u geekotest psql openqa <<EOT
SELECT id,result,test,version,build,arch,t_finished from jobs WHERE arch='$ARCH' AND id in (SELECT job_id from job_modules WHERE name LIKE '%$JOB_NAME%' ORDER BY job_id DESC) AND version='$VERSION' ORDER BY build DESC;
EOT
   id    | result |      test       | version | build | arch  |     t_finished
---------+--------+-----------------+---------+-------+-------+---------------------
 3554208 | passed | ltp_controllers | 12-SP5  | 0369  | s390x | 2019-11-05 07:57:12
 3529991 | passed | ltp_controllers | 12-SP5  | 0368  | s390x | 2019-10-25 18:11:17
 3507508 | passed | ltp_controllers | 12-SP5  | 0368  | s390x | 2019-10-23 05:53:54
 3496369 | passed | ltp_controllers | 12-SP5  | 0366  | s390x | 2019-10-20 01:38:16
 3485024 | passed | ltp_controllers | 12-SP5  | 0363  | s390x | 2019-10-17 02:42:25
 3477702 | passed | ltp_controllers | 12-SP5  | 0358  | s390x | 2019-10-15 17:01:23
 3470691 | passed | ltp_controllers | 12-SP5  | 0357  | s390x | 2019-10-12 13:47:34
 3453346 | passed | ltp_controllers | 12-SP5  | 0350  | s390x | 2019-10-09 06:04:04
 3448216 | passed | ltp_controllers | 12-SP5  | 0341  | s390x | 2019-10-08 07:25:40
 3418668 | passed | ltp_controllers | 12-SP5  | 0341  | s390x | 2019-09-28 22:38:17
 3414068 | passed | ltp_controllers | 12-SP5  | 0339  | s390x | 2019-09-27 22:21:14
 3401080 | failed | ltp_controllers | 12-SP5  | 0333  | s390x | 2019-09-25 15:58:35
 3415073 | passed | ltp_controllers | 12-SP5  | 0333  | s390x | 2019-09-28 01:37:04
 3395469 | passed | ltp_controllers | 12-SP5  | 0331  | s390x | 2019-09-24 02:22:32
 3389399 | passed | ltp_controllers | 12-SP5  | 0330  | s390x | 2019-09-21 17:44:52
 3369113 | failed | ltp_controllers | 12-SP5  | 0322  | s390x | 2019-09-17 21:40:38
 3354363 | passed | ltp_controllers | 12-SP5  | 0319  | s390x | 2019-09-13 20:14:49
 3341218 | passed | ltp_controllers | 12-SP5  | 0313  | s390x | 2019-09-10 01:54:04
 3340116 | passed | ltp_controllers | 12-SP5  | 0307  | s390x | 2019-09-10 09:36:27
 3321306 | failed | ltp_controllers | 12-SP5  | 0303  | s390x | 2019-09-04 02:31:52
 3313411 | failed | ltp_controllers | 12-SP5  | 0301  | s390x | 2019-08-30 22:15:07
 3308948 | failed | ltp_controllers | 12-SP5  | 0296  | s390x | 2019-08-29 11:14:59
 3276789 | failed | ltp_controllers | 12-SP5  | 0287  | s390x | 2019-08-22 19:08:29
 3262695 | failed | ltp_controllers | 12-SP5  | 0283  | s390x | 2019-08-18 22:55:18
 3255091 | failed | ltp_controllers | 12-SP5  | 0268  | s390x | 2019-08-16 02:37:53
 3235202 | failed | ltp_controllers | 12-SP5  | 0261  | s390x | 2019-08-12 06:09:58
 3225238 | failed | ltp_controllers | 12-SP5  | 0259  | s390x | 2019-08-08 22:54:04
 3218839 | failed | ltp_controllers | 12-SP5  | 0258  | s390x | 2019-08-07 10:15:23
 3205029 | failed | ltp_controllers | 12-SP5  | 0256  | s390x | 2019-08-03 04:54:20
 3206873 | failed | ltp_controllers | 12-SP5  | 0256  | s390x | 2019-08-03 14:40:10
 3168109 | failed | ltp_controllers | 12-SP5  | 0251  | s390x | 2019-07-30 22:35:45
 3063543 | failed | ltp_controllers | 12-SP5  | 0222  | s390x | 2019-07-15 12:04:24
 3055029 | failed | ltp_controllers | 12-SP5  | 0222  | s390x | 2019-07-12 11:27:29
 3040557 | failed | ltp_controllers | 12-SP5  | 0216  | s390x | 2019-07-08 18:28:20
 2977616 | failed | ltp_controllers | 12-SP5  | 0198  | s390x | 2019-06-13 11:30:16
 2978844 | failed | ltp_controllers | 12-SP5  | 0198  | s390x | 2019-06-13 13:15:46
 2973213 | failed | ltp_controllers | 12-SP5  | 0197  | s390x | 2019-06-13 23:38:21
(37 rows)
#8
Updated by cfconrad over 3 years ago
A fix could be [1], but I'm not sure about it, because as far as I understand
killall should achieve the same thing.
[1] https://github.com/cfconrad/ltp/commit/ac91d4c6486235b4e11572daa46c57c02ac72998
#9
Updated by cfconrad over 3 years ago
Patch sent to the LTP mailing list: https://patchwork.ozlabs.org/project/ltp/list/?series=140731
#10
Updated by cfconrad over 3 years ago
- Status changed from New to Feedback
#12
Updated by jlausuch over 3 years ago
- Target version changed from 445 to 457
#13
Updated by pcervinka over 2 years ago
- Target version changed from 457 to QE Kernel Done