action #34267

osd instance unresponsive (HTTP 502)

Added by szarate almost 2 years ago. Updated almost 2 years ago.

Status: Resolved
Start date: 04/04/2018
Priority: Immediate
Due date:
Assignee: EDiGiacinto
% Done: 0%
Category: Feature requests
Target version: Done
Difficulty:
Duration:

Description

From the journal log:

Apr 04 20:45:17 openqa openqa[13612]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log: No space left on device at /usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.
Apr 04 20:45:17 openqa openqa[13612]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log: No space left on device at /usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.
Apr 04 20:45:17 openqa openqa[13612]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log: No space left on device at /usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.
Apr 04 20:45:17 openqa openqa[13612]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log: No space left on device at /usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.

From /var/log/openqa:

[2018-04-04T21:05:42.0767 CEST] [debug] Worker openqaworker5:1 not seen since 820 seconds
[2018-04-04T21:05:42.0770 CEST] [debug] Failed dead job detection : {UNKNOWN}: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 327. at /usr/share/openqa/script/../lib/OpenQA/WebSockets/Server.pm line 422

However, a call to df -ah shows that everything seems to be in order: http://paste.suse.de/17292
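Since df -ah reports block usage only, one thing it would not show is inode exhaustion, which also surfaces as "No space left on device". Below is a minimal Python sketch of a check for both. It assumes the log sits on the filesystem under /var/log. The helper names fs_usage and looks_exhausted are made up for illustration, and nothing in this ticket confirms that inodes were the cause.

```python
import os


def fs_usage(path):
    """Return block and inode usage for the filesystem containing `path`."""
    st = os.statvfs(path)
    return {
        "path": path,
        "bytes_total": st.f_blocks * st.f_frsize,
        # f_bavail / f_favail count what an unprivileged process may still use
        "bytes_available": st.f_bavail * st.f_frsize,
        "inodes_total": st.f_files,
        "inodes_available": st.f_favail,
    }


def looks_exhausted(usage, min_bytes=1, min_inodes=1):
    """True if a write on this filesystem would likely fail with ENOSPC."""
    out_of_blocks = usage["bytes_available"] < min_bytes
    # Some filesystems report 0 total inodes because they allocate them
    # dynamically; the inode check is skipped in that case.
    out_of_inodes = (
        usage["inodes_total"] > 0 and usage["inodes_available"] < min_inodes
    )
    return out_of_blocks or out_of_inodes


if __name__ == "__main__":
    # Paths are assumptions; adjust to wherever the openQA logs actually live.
    for candidate in ("/", "/var/log"):
        if os.path.exists(candidate):
            usage = fs_usage(candidate)
            print(usage, "exhausted:", looks_exhausted(usage))
```

If both values look healthy, the disk itself is probably not the culprit and the error is more likely coming from somewhere else in the stack.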

And eventually websockets service also has the same problem:

Apr 04 21:18:55 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm line 125.
Apr 04 21:18:55 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm line 125.
Apr 04 21:19:42 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 327.
Apr 04 21:20:37 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.
Apr 04 21:20:37 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.
Apr 04 21:20:37 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.
Apr 04 21:20:37 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.
Apr 04 21:20:37 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.
Apr 04 21:20:37 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.
Apr 04 21:20:37 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.
Apr 04 21:20:37 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.
Apr 04 21:20:42 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 327.
Apr 04 21:20:52 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: Can't write to log: Resource temporarily unavailable at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm line 125.
Apr 04 21:20:52 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: Can't write to log: Resource temporarily unavailable at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm line 125.
Apr 04 21:20:52 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: Can't write to log: Resource temporarily unavailable at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm line 125.
Apr 04 21:20:52 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: Can't write to log: Resource temporarily unavailable at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm line 125.
Apr 04 21:20:52 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: Can't write to log: Resource temporarily unavailable at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm line 125.
Apr 04 21:20:52 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: Can't write to log: Resource temporarily unavailable at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm line 125.
Apr 04 21:20:52 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: Can't write to log: Resource temporarily unavailable at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm line 125.
Apr 04 21:20:52 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: Can't write to log: Resource temporarily unavailable at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm line 125.
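The reason attached to "Can't write to log" changes over time in the excerpts above (first "No space left on device", then empty, then "Resource temporarily unavailable"). A rough Python sketch for tallying such lines per service and reason follows. The regex is an assumption fitted to the lines quoted in this ticket, and the names LINE_RE and tally_write_failures are hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "openqa-websockets[13607]: ... Can't write to log: <reason> at /path line N"
LINE_RE = re.compile(
    r"(?P<unit>[\w.-]+)\[\d+\]: .*?Can't write to log:\s*"
    r"(?P<reason>.*?)\s*at (?P<file>/\S+) line (?P<line>\d+)"
)


def tally_write_failures(lines):
    """Count log-write failures keyed by (service, reason)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a log-write failure line
        reason = match.group("reason") or "(no reason given)"
        counts[(match.group("unit"), reason)] += 1
    return counts


if __name__ == "__main__":
    # Three lines copied from the journal excerpts in this ticket.
    sample = [
        "Apr 04 20:45:17 openqa openqa[13612]: Mojo::Reactor::Poll: I/O watcher failed: "
        "Can't write to log: No space left on device at "
        "/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm line 97.",
        "Apr 04 21:18:55 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: "
        "Can't write to log:  at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm line 125.",
        "Apr 04 21:20:52 openqa openqa-websockets[13607]: Mojo::Reactor::Poll: Timer failed: "
        "Can't write to log: Resource temporarily unavailable at "
        "/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm line 125.",
    ]
    for (unit, reason), count in tally_write_failures(sample).items():
        print(unit, reason, count)
```

Run over the full journal output, this would show at a glance when the reason switched and which service was affected.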

Related issues

Related to openQA Project - action #34042: [tools] Worker goes to infinite loop during upload of scr... Resolved 05/04/2018

History

#1 Updated by szarate almost 2 years ago

  • Status changed from New to Feedback

Somehow I forgot to update the salt recipe when upgrading to Leap 42.3, which caused rwp to stay at version 0.19. That still doesn't tell me why these errors occurred; however, openQA seems to be working fine now.

https://gitlab.suse.de/openqa/salt-states-openqa/merge_requests/41

#2 Updated by EDiGiacinto almost 2 years ago

szarate wrote:

Somehow I forgot to update the salt recipe when upgrading to Leap 42.3, which caused rwp to stay at version 0.19. That still doesn't tell me why these errors occurred; however, openQA seems to be working fine now.


https://gitlab.suse.de/openqa/salt-states-openqa/merge_requests/41

The webui does not use rwp (indeed, the version requirement is enforced only on the worker package).

[2018-04-04T21:05:42.0770 CEST] [debug] Failed dead job detection : {UNKNOWN}: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 327. at /usr/share/openqa/script/../lib/OpenQA/WebSockets/Server.pm line 422

This sounds like an apparmor issue.

#3 Updated by coolo almost 2 years ago

No apparmor involved.

#4 Updated by szarate almost 2 years ago

Indeed, it wasn't apparmor, but the last update seems to have fixed the problem (maybe something in the Leap 42.2 repos?). In any case, should we disable the repos for Leap 42.2?

#5 Updated by EDiGiacinto almost 2 years ago

For reference, a similar thing (maybe related to some set of updates) happened to the workers as well: #34042 (and the disk wasn't full there either).

#6 Updated by szarate almost 2 years ago

  • Related to action #34042: [tools] Worker goes to infinite loop during upload of screenshots in case of writing failure added

#7 Updated by szarate almost 2 years ago

  • Status changed from Feedback to Resolved
  • Assignee set to EDiGiacinto

So I think the theory about the updates holds, as in the end we couldn't find anything that would tell us more.

mudler, if you have more info or things to add, now is the time :)

NOTE: At coolo's suggestion, services are not to be restarted until we know what happened.

PS: Closing the ticket, as the problem doesn't seem to have happened again as of today.

#8 Updated by szarate almost 2 years ago

  • Target version changed from Current Sprint to Done
