action #62108

auto_review:"Can.t fcntl.*Operation not permitted at .*virtio_terminal.pm" test incompletes

Added by okurz 8 months ago. Updated 8 months ago.

Status: Resolved
Priority: High
Assignee:
Category: Concrete Bugs
Target version: -
Start date: 2020-01-14
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

In https://openqa.suse.de/tests/3782734

/usr/lib/os-autoinst/consoles/vnc_base.pm:62:{
  "connect_timeout" => 3,
  "hostname" => "localhost",
  "port" => 6007
}
[2020-01-14T03:06:11.810 CET] [debug] <<< consoles::virtio_terminal::open_pipe(pipe_prefix="/var/lib/openqa/pool/17/virtio_console")
[2020-01-14T03:06:11.810 CET] [info] ::: consoles::virtio_terminal::open_pipe: Set PIPE_SZ from 65536 to 1048576
[2020-01-14T03:06:11.810 CET] [info] ::: consoles::virtio_terminal::open_pipe: Set PIPE_SZ from 65536 to 1048576
[2020-01-14T03:06:11.811 CET] [debug] <<< consoles::virtio_terminal::open_pipe(pipe_prefix="/var/lib/openqa/pool/17/virtio_console1")
[2020-01-14T03:06:11.811 CET] [info] ::: consoles::virtio_terminal::open_pipe: Set PIPE_SZ from 65536 to 1048576
[2020-01-14T03:06:11.813 CET] [debug] Backend process died, backend errors are reported below in the following lines:
Can't fcntl($fh, '1031', '1048576'): Operation not permitted at /usr/lib/os-autoinst/consoles/virtio_terminal.pm line 147
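
For orientation: on Linux, fcntl command 1031 is F_SETPIPE_SZ, so the failing call is the attempt to grow the pipe to 1048576 bytes. fcntl(2) documents EPERM for an unprivileged caller requesting more than /proc/sys/fs/pipe-max-size, and pipe(7) describes per-user limits under which capacity increases are denied; whether either applies on this worker is not established in this ticket. The following is a minimal Python sketch of the same call, not the os-autoinst code (which is Perl). The helper name try_set_pipe_size is hypothetical, and treating EPERM as non-fatal is an assumption made for illustration, not the fix adopted here.

```python
import errno
import fcntl
import os

# Linux fcntl command numbers; 1031 is the value shown in the error message.
# Python 3.10+ exposes them by name, older versions need the raw numbers.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def try_set_pipe_size(fd, wanted):
    """Hypothetical helper: ask for a larger pipe, tolerate a refusal.

    Returns the capacity that is actually in effect afterwards.
    """
    try:
        fcntl.fcntl(fd, F_SETPIPE_SZ, wanted)
    except OSError as err:
        if err.errno != errno.EPERM:
            raise
        # Assumption for this sketch: EPERM is non-fatal and the pipe
        # simply keeps its current capacity.
    return fcntl.fcntl(fd, F_GETPIPE_SZ)


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    print("pipe capacity:", try_set_pipe_size(write_fd, 1048576))
    os.close(read_fd)
    os.close(write_fd)
```

Reading the capacity back with F_GETPIPE_SZ shows whether the request took effect, which is the information the "Set PIPE_SZ from ... to ..." log lines report.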


Related issues

Blocks openQA Infrastructure - action #59858: "Migrate to file failed, it has been running for more than 240 at /usr/lib/os-autoinst/backend/qemu.pm line 260." broken NVMe on openqaworker13, jobs incomplete trying to save snapshots (Resolved, 2019-11-14)

History

#1 Updated by okurz 8 months ago

  • Assignee set to cfconrad
  • Priority changed from Normal to High

cfconrad this sounds like something for you :)

#2 Updated by pvorel 8 months ago

Oh no, permissions on virtio strike again :).
I remember somebody checked the AppArmor profile before, but couldn't this be caused by it?
BTW wicked has VIRTIO_CONSOLE_NUM=2 and thus virtio_console1.log (IMHO not a commonly used setup).

#3 Updated by cfconrad 8 months ago

Hmmm very strange. Maybe the error message isn't really helpful here.

We can see that it worked for /var/lib/openqa/pool/17/virtio_console but failed just 1 ms later on /var/lib/openqa/pool/17/virtio_console1.
And during the same job it worked successfully 9 times before it failed.

#4 Updated by cfconrad 8 months ago

Maybe it is a system load problem. We could do something like https://github.com/os-autoinst/os-autoinst/pull/1338

But I wonder why this test always fails on osd. I just cloned it in my environment... and it ran through: http://cfconrad-vm.qa.suse.de/tests/6693#dependencies

It looks like something is wrong with that serial console, even when the PIPE_SZ syscall works:

https://openqa.suse.de/tests/3787042/file/autoinst-log.txt

The console 'sut' is not responding (half-open socket?). Make sure the console is reachable or disable stall detection on expected disconnects with '$console->disable_vnc_stalls', for example in case of intended machine shutdown

#5 Updated by okurz 8 months ago

  • Related to action #62417: os-autoinst occasionally crashing on startup added

#6 Updated by cfconrad 8 months ago

  • Status changed from New to Resolved

#7 Updated by mkittler 8 months ago

  • Related to deleted (action #62417: os-autoinst occasionally crashing on startup)

#8 Updated by okurz 8 months ago

  • Status changed from Resolved to Workable

Yes, the PR was merged but unfortunately the problem is not fixed.

E.g. see https://openqa.suse.de/tests/3860547/file/autoinst-log.txt which is running a current version, "4.6.1580718127.98503bd5":

Can't fcntl($fh, '1031', '1048576'): Operation not permitted at /usr/lib/os-autoinst/consoles/virtio_terminal.pm line 128
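
Note that the line number changed from 147 to 128, yet the auto_review pattern from the ticket title still covers this message. A small Python check of that, as a sketch: the pattern and both log lines are copied from this ticket, while applying it as an unanchored search is an assumption about how openQA evaluates auto_review patterns.

```python
import re

# Pattern copied from the ticket title.
AUTO_REVIEW = re.compile(
    r"Can.t fcntl.*Operation not permitted at .*virtio_terminal.pm"
)

# Error lines copied from the two logs quoted in this ticket.
LINES = [
    "Can't fcntl($fh, '1031', '1048576'): Operation not permitted at "
    "/usr/lib/os-autoinst/consoles/virtio_terminal.pm line 147",
    "Can't fcntl($fh, '1031', '1048576'): Operation not permitted at "
    "/usr/lib/os-autoinst/consoles/virtio_terminal.pm line 128",
]


def matches(line):
    """Return True if the auto_review pattern is found anywhere in the line."""
    return AUTO_REVIEW.search(line) is not None


if __name__ == "__main__":
    for line in LINES:
        print(matches(line), line[-8:])
```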

#9 Updated by cfconrad 8 months ago

My first assumption is that it's not working because "autodie" has lexical scope: https://perldoc.perl.org/autodie.html

The autodie pragma has lexical scope, meaning that functions and subroutines altered with autodie will only change their behaviour until the end of the enclosing block, file, or eval.

So my latest change, which moved the fcntl() calls into separate methods so they could be mocked, broke it :(

But before I provide a fix, I will verify it on the test setup on openqaworker13 that okurz offered me. Thx!

#10 Updated by cfconrad 8 months ago

  • Blocks action #59858: "Migrate to file failed, it has been running for more than 240 at /usr/lib/os-autoinst/backend/qemu.pm line 260." broken NVMe on openqaworker13, jobs incomplete trying to save snapshots added

#11 Updated by cfconrad 8 months ago

  • Status changed from Workable to In Progress

#13 Updated by okurz 8 months ago

PR was merged. Do you have any further points you want to follow up with or do you consider this issue done? Do you still need openqaworker13?

#14 Updated by cfconrad 8 months ago

  • Status changed from In Progress to Resolved

Thx for asking,
from my side I consider it done and don't need openqaworker13 anymore.
