Project

General

Profile

Actions

action #121903

closed

[sporadic?] CI checks failing in https://github.com/openSUSE/Mojo-IOLoop-ReadWriteProcess/actions size:M

Added by okurz over 1 year ago. Updated over 1 year ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
Start date:
2022-12-13
Due date:
2023-01-20
% Done:

0%

Estimated time:

Description

Observation

See https://github.com/openSUSE/Mojo-IOLoop-ReadWriteProcess/actions

Error: Can't use an undefined value as filehandle reference at lib/Mojo/IOLoop/ReadWriteProcess.pm line 298.
Error: 11262: Sleeping inside locked section at t/13_shared.t line 181.
Error: 11350: Sleeping inside locked section at t/13_shared.t line 260.
Error: 11448: Sleeping inside locked section at t/13_shared.t line 333.

Acceptance Criteria:

AC1: The test passes consistently

Suggestions

  • Try to replicate the failure locally, if that doesn't work try to replicate it in CI (ssh)
  • Run the test like a 100 times to reproduce thie issue locally
  • Exclude the test in CI environments if it cannot be replicated
  • Retry only the failing test module with timeout, e.g. using build.opensuse.org/package/show/openSUSE:Factory/retry, or maybe better a simple for-loop within the perl test module itself, like in https://github.com/os-autoinst/openQA/blob/master/t/43-scheduling-and-worker-scalability.t#L132
Actions #1

Updated by dheidler over 1 year ago

  • Subject changed from CI checks failing in https://github.com/openSUSE/Mojo-IOLoop-ReadWriteProcess/actions to [sporadic?] CI checks failing in https://github.com/openSUSE/Mojo-IOLoop-ReadWriteProcess/actions size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #2

Updated by mkittler over 1 year ago

  • Assignee set to mkittler
Actions #3

Updated by mkittler over 1 year ago

None of the "errors" mentioned in the ticket description are actual errors and the reason why these CI runs have been considered failing. I've nevertheless created a PR to get rid of these warnings: https://github.com/openSUSE/Mojo-IOLoop-ReadWriteProcess/pull/50

The real failure is in

 t/01_run.t             (Wstat: 139 Tests: 12 Failed: 0)
  Non-zero wait status: 139

although I'm not quite sure yet why it fails. That's the test output:

 TEST error print
>> Mojo::IOLoop::ReadWriteProcess::_fork(): Fork: {
    use warnings;
    use strict;
    shift();
    print "$_$/" foreach (@_);
}
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Exit status: 176
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Returns: nothing
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Errors: 
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 10898
>> Mojo::IOLoop::ReadWriteProcess::_fork(): Fork: {
    use warnings;
    use strict;
    shift();
    print "$_$/" foreach (@_);
}
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Exit status: 240
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Returns: nothing
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Errors: 
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 10899
>> Mojo::IOLoop::ReadWriteProcess::_fork(): Fork: {
    use warnings;
    use strict;
    shift();
    print "$_$/" foreach (@_);
}
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Exit status: 232
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Returns: nothing
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Errors: 
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 10900
>> Mojo::IOLoop::ReadWriteProcess::_fork(): Fork: {
    use warnings;
    use strict;
    shift();
    print "$_$/" foreach (@_);
}
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Exit status: 152
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Returns: nothing
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Errors: 
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 10901
>> Mojo::IOLoop::ReadWriteProcess::_fork(): Fork: {
    use warnings;
    use strict;
    is process('execute', '/bin/true')->quirkiness(1)->start->wait_stop->exit_status, 0, 'process(execute) from process(code) -- retval check true';
    is process('execute', '/bin/false')->quirkiness(1)->start->wait_stop->exit_status, 1, 'process(execute) from process(code) -- retval check false';
    is process(sub {
        print 'sub-sub-process';
    }
    )->start->wait_stop->read_all_stdout, 'sub-sub-process', 'process(code) works from process(code)';
    print 'DONE';
}
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Exit status: 40
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Returns: 1
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Errors: 
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 10902
>> Mojo::IOLoop::ReadWriteProcess::_open(): Execute: '/I/do/not/exist'
>> Mojo::IOLoop::ReadWriteProcess::_open(): Execute: 'sleep 0.2'
>> Mojo::IOLoop::ReadWriteProcess::_open_collect_status(): Forked code Process Exit status: 0
>> Mojo::IOLoop::ReadWriteProcess::_open(): Execute: 'perl', '/home/runner/work/Mojo-IOLoop-ReadWriteProcess/Mojo-IOLoop-ReadWriteProcess/t/data/simple_rwp.pl'
>> Mojo::IOLoop::ReadWriteProcess::_open_collect_status(): Forked code Process Exit status: 0
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 10908
>> Mojo::IOLoop::ReadWriteProcess::_open(): Execute: '/home/runner/work/Mojo-IOLoop-ReadWriteProcess/Mojo-IOLoop-ReadWriteProcess/t/data/sigchld_handler.pl'
>> Mojo::IOLoop::ReadWriteProcess::_open_collect_status(): Forked code Process Exit status: 0
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 10910
t/01_run.t ............... 
All 12 subtests passed 
Actions #5

Updated by openqa_review over 1 year ago

  • Due date set to 2023-01-04

Setting due date based on mean cycle time of SUSE QE Tools

Actions #6

Updated by mkittler over 1 year ago

A successful test run with the same verbosity would look like this:

t/01_run.t .. 1/? TEST error print
t/01_run.t .. 8/? >> Mojo::IOLoop::ReadWriteProcess::_fork(): Fork: {
    use warnings;
    use strict;
    shift();
    print "$_$/" foreach (@_);
}
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Exit status: 80
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Returns: nothing
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Errors: 
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 18934
>> Mojo::IOLoop::ReadWriteProcess::_fork(): Fork: {
    use warnings;
    use strict;
    shift();
    print "$_$/" foreach (@_);
}
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Exit status: 144
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Returns: nothing
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Errors: 
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 18935
>> Mojo::IOLoop::ReadWriteProcess::_fork(): Fork: {
    use warnings;
    use strict;
    shift();
    print "$_$/" foreach (@_);
}
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Exit status: 136
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Returns: nothing
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Errors: 
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 18936
>> Mojo::IOLoop::ReadWriteProcess::_fork(): Fork: {
    use warnings;
    use strict;
    shift();
    print "$_$/" foreach (@_);
}
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Exit status: 136
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Returns: nothing
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Errors: 
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 18937
>> Mojo::IOLoop::ReadWriteProcess::_fork(): Fork: {
    use warnings;
    use strict;
    is process('execute', '/bin/true')->quirkiness(1)->start->wait_stop->exit_status, 0, 'process(execute) from process(code) -- retval check true';
    is process('execute', '/bin/false')->quirkiness(1)->start->wait_stop->exit_status, 1, 'process(execute) from process(code) -- retval check false';
    is process(sub {
        print 'sub-sub-process';
    }
    )->start->wait_stop->read_all_stdout, 'sub-sub-process', 'process(code) works from process(code)';
    print 'DONE';
}
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Exit status: 208
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Returns: 1
>> Mojo::IOLoop::ReadWriteProcess::_fork_collect_status(): Forked code Process Errors: 
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 18938
>> Mojo::IOLoop::ReadWriteProcess::_open(): Execute: '/I/do/not/exist'
>> Mojo::IOLoop::ReadWriteProcess::_open(): Execute: 'sleep 0.2'
>> Mojo::IOLoop::ReadWriteProcess::_open_collect_status(): Forked code Process Exit status: 0
>> Mojo::IOLoop::ReadWriteProcess::_open(): Execute: 'perl', '/hdd/openqa-devel/repos/Mojo-IOLoop-ReadWriteProcess/t/data/simple_rwp.pl'
>> Mojo::IOLoop::ReadWriteProcess::_open_collect_status(): Forked code Process Exit status: 0
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 18944
>> Mojo::IOLoop::ReadWriteProcess::_open(): Execute: '/hdd/openqa-devel/repos/Mojo-IOLoop-ReadWriteProcess/t/data/sigchld_handler.pl'
>> Mojo::IOLoop::ReadWriteProcess::_open_collect_status(): Forked code Process Exit status: 0
>> Mojo::IOLoop::ReadWriteProcess::_shutdown(): Shutdown 18946
t/01_run.t .. ok    
All tests successful.
Files=1, Tests=12,  2 wallclock secs ( 0.03 usr  0.01 sys +  0.58 cusr  0.16 csys =  0.78 CPU)
Result: PASS

I fail to see the difference except that the last "ok" is missing. There are also slightly different error codes but 0 is always 0.

Actions #7

Updated by mkittler over 1 year ago

Note that exit code 139 corresponds to a segfault. Not sure why it is happening, though.

By the way, the problem has only happened once so far.

Actions #8

Updated by okurz over 1 year ago

  • Due date changed from 2023-01-04 to 2023-01-20

christmas grace due date bump :)

Actions #9

Updated by okurz over 1 year ago

  • Status changed from In Progress to Workable
Actions #10

Updated by kraih over 1 year ago

If the 139 exit code has only been seen once, it might really have been a freak accident. A segfault would have to have been caused by some of the C code involved that we don't maintain. Maybe a certain version of a library that Perl used in the container. So if it can't be replicated anymore i'd just consider it resolved for now.

Actions #11

Updated by mkittler over 1 year ago

  • Status changed from Workable to Resolved

I only saw the 139 exit code once¹ and it is indeed most likely caused by code we don't maintain. Likely it is some problem in the cleanup of resources before the process would normally exit anyways (because the test itself runs fine). That means I'd also like to resolve it for now. I could at least the fix warnings mentioned in the ticket description (that were not actually causing the fail, though).


¹At least for t/01_run.t. For t/02_parallel.t there's actually still https://github.com/openSUSE/Mojo-IOLoop-ReadWriteProcess/issues/40 but that's a different issue.

Actions

Also available in: Atom PDF