Project

General

Profile

Actions

action #34042

closed

[tools] Worker goes to infinite loop during upload of screenshots in case of writing failure

Added by EDiGiacinto over 6 years ago. Updated over 4 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Regressions/Crashes
Target version:
Start date:
2018-04-05
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)

Description

This happened on loewe, the disk wasn't full during the upload of the screenshots:

Mar 29 13:51:08 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:51:08 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:50:58 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:50:58 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:50:40 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:50:40 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:50:28 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:50:27 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't call method "write" on an undefined value at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/UserAgent.pm line 362.
Mar 29 13:50:24 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:50:05 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:50:05 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:49:53 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:49:51 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:49:43 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:49:43 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:49:30 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:49:30 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:49:27 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:49:18 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:49:10 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:48:55 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:48:55 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:48:47 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't call method "write" on an undefined value at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/UserAgent.pm line 362.
Mar 29 13:48:45 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:48:21 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:48:21 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:48:17 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:48:17 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:48:12 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:48:12 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:47:46 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:47:46 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:47:39 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:47:39 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:47:11 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:47:11 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:47:07 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't call method "write" on an undefined value at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/UserAgent.pm line 362.
Mar 29 13:47:06 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:47:06 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:46:52 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:46:52 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:46:42 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:46:36 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:46:36 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:46:33 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:46:02 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:46:01 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:46:01 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:46:00 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:45:27 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:45:27 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:45:27 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:45:27 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log: No space left on device at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:45:27 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log: No space left on device at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:45:27 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log:  at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:45:27 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log: No space left on device at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:45:27 loewe worker[43184]: Mojo::Reactor::Poll: Timer failed: Can't write to log: No space left on device at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:45:27 loewe worker[43184]: Mojo::Reactor::Poll: I/O watcher failed: Can't write to log: No space left on device at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 311.
Mar 29 13:45:27 loewe worker[43184]: [info] Isotovideo exit status: 28

Aside from the disk full issue, imo the worker should not loop forever (maybe set a retrial limit?) in case of persistent error, as it prevented the job to be cleaned up, and left an inconsistent state in the worker.

Also possibly: implement catching errors in all our log_* functions, as they are relied in all other parts in the code and would prevent cleanup phases from being executed.


Subtasks 2 (0 open2 closed)

action #34282: implement catching errors in all our log_* functionsResolvedmkittler2018-04-05

Actions
action #34285: Add retry limit on the worker for errors related to job handlingResolvedokurz2018-04-05

Actions

Related issues 3 (0 open3 closed)

Related to openQA Project (public) - action #34267: osd instance unresponsive (HTTP 502)ResolvedEDiGiacinto2018-04-04

Actions
Related to openQA Project (public) - action #40220: Scheduler died due to Mojo::Log failing to write to logRejected2018-08-24

Actions
Related to openQA Project (public) - action #40862: Out of disk space killed the webui (on osd)Rejected2018-09-11

Actions
Actions

Also available in: Atom PDF