action #167105

closed

Worker imagetester:5 status is broken

Added by xlai about 1 month ago. Updated about 1 month ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2024-09-20
Due date:
% Done: 0%
Estimated time:

Description

This worker is problematic today.

See broken status below:

There are already 33 jobs re-triggered automatically on it, and all of them fail in the same way.

See per-job failure:


Actions #1

Updated by nicksinger about 1 month ago · Edited

imagetester:~ # systemctl list-units | grep openqa-worker
  openqa-worker-auto-restart@1.service                                                     loaded active running   openQA Worker #1
  openqa-worker-auto-restart@10.service                                                    loaded active running   openQA Worker #10
  openqa-worker-auto-restart@11.service                                                    loaded active running   openQA Worker #11
  openqa-worker-auto-restart@12.service                                                    loaded active running   openQA Worker #12
  openqa-worker-auto-restart@13.service                                                    loaded active running   openQA Worker #13
  openqa-worker-auto-restart@14.service                                                    loaded active running   openQA Worker #14
  openqa-worker-auto-restart@15.service                                                    loaded active running   openQA Worker #15
  openqa-worker-auto-restart@16.service                                                    loaded active running   openQA Worker #16
  openqa-worker-auto-restart@2.service                                                     loaded active running   openQA Worker #2
  openqa-worker-auto-restart@3.service                                                     loaded active running   openQA Worker #3
  openqa-worker-auto-restart@4.service                                                     loaded active running   openQA Worker #4
  openqa-worker-auto-restart@5.service                                                     loaded active running   openQA Worker #5
  openqa-worker-auto-restart@6.service                                                     loaded active running   openQA Worker #6
  openqa-worker-auto-restart@7.service                                                     loaded active running   openQA Worker #7
  openqa-worker-auto-restart@8.service                                                     loaded active running   openQA Worker #8
  openqa-worker-auto-restart@9.service                                                     loaded active running   openQA Worker #9
  openqa-worker-cacheservice-minion.service                                                loaded active running   OpenQA Worker Cache Service Minion
  openqa-worker-cacheservice.service                                                       loaded active running   OpenQA Worker Cache Service
  openqa-worker.slice                                                                      loaded active active    Slice for openqa-worker units
imagetester:~ # ls -lah /var/lib/openqa/pool/5/
total 4.0K
drwxr-xr-x 1 _openqa-worker root     14 Sep 20 09:48 .
drwxr-xr-x 1 root           root    110 Sep 18 20:43 ..
-rw-r--r-- 1 _openqa-worker nogroup   6 Sep 19 05:17 .locked
imagetester:~ # lsof /var/lib/openqa/pool/5/
COMMAND   PID           USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME
worker   4931 _openqa-worker  cwd    DIR   0,61       14 19533 /var/lib/openqa/pool/5
worker  12227 _openqa-worker  cwd    DIR   0,61       14 19533 /var/lib/openqa/pool/5
imagetester:~ # ps aux | grep ^C
imagetester:~ # systemctl status 4931
● openqa-worker-auto-restart@5.service - openQA Worker #5
     Loaded: loaded (/usr/lib/systemd/system/openqa-worker-auto-restart@.service; enabled; vendor preset: disabled)
    Drop-In: /etc/systemd/system/openqa-worker-auto-restart@.service.d
             └─30-openqa-max-inactive-caching-downloads.conf
     Active: active (running) since Fri 2024-09-20 08:40:02 UTC; 1h 15min ago
   Main PID: 4931 (worker)
      Tasks: 2 (limit: 4915)
     CGroup: /openqa.slice/openqa-worker.slice/openqa-worker-auto-restart@5.service
             ├─  4931 /usr/bin/perl /usr/share/openqa/script/worker --instance 5
             └─ 12227 /usr/bin/perl /usr/share/openqa/script/worker --instance 5

Sep 20 09:55:17 imagetester worker[4931]:         Mojo::EventEmitter::emit(Mojo::IOLoop::Stream=HASH(0x5631d94462f0), "read", "HTTP/1.1 101 Switching Protocols\>
Sep 20 09:55:17 imagetester worker[4931]:         Mojo::IOLoop::Stream::_read(Mojo::IOLoop::Stream=HASH(0x5631d94462f0)) called at /usr/lib/perl5/vendor_perl/5.>
Sep 20 09:55:17 imagetester worker[4931]:         Mojo::IOLoop::Stream::__ANON__(Mojo::Reactor::Poll=HASH(0x5631d5fa8290)) called at /usr/lib/perl5/vendor_perl/>
Sep 20 09:55:17 imagetester worker[4931]:         eval {...} called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/Reactor/Poll.pm line 141
Sep 20 09:55:17 imagetester worker[4931]:         Mojo::Reactor::Poll::_try(Mojo::Reactor::Poll=HASH(0x5631d5fa8290), "I/O watcher", CODE(0x5631d70e0548), 0) ca>
Sep 20 09:55:17 imagetester worker[4931]:         Mojo::Reactor::Poll::one_tick(Mojo::Reactor::Poll=HASH(0x5631d5fa8290)) called at /usr/lib/perl5/vendor_perl/5>
Sep 20 09:55:17 imagetester worker[4931]:         Mojo::Reactor::Poll::start(Mojo::Reactor::Poll=HASH(0x5631d5fa8290)) called at /usr/lib/perl5/vendor_perl/5.26>
Sep 20 09:55:17 imagetester worker[4931]:         Mojo::IOLoop::start(Mojo::IOLoop=HASH(0x5631d7760430)) called at /usr/share/openqa/script/../lib/OpenQA/Worker>
Sep 20 09:55:17 imagetester worker[4931]:         OpenQA::Worker::exec(OpenQA::Worker=HASH(0x5631d920b228)) called at /usr/share/openqa/script/worker line 125
Sep 20 09:55:17 imagetester worker[4931]:  - checking again for web UI 'openqa.suse.de' in 126.99 s
imagetester:~ # systemctl status 12227
● openqa-worker-auto-restart@5.service - openQA Worker #5
     Loaded: loaded (/usr/lib/systemd/system/openqa-worker-auto-restart@.service; enabled; vendor preset: disabled)
    Drop-In: /etc/systemd/system/openqa-worker-auto-restart@.service.d
             └─30-openqa-max-inactive-caching-downloads.conf
     Active: active (running) since Fri 2024-09-20 08:40:02 UTC; 1h 15min ago
   Main PID: 4931 (worker)
      Tasks: 2 (limit: 4915)
     CGroup: /openqa.slice/openqa-worker.slice/openqa-worker-auto-restart@5.service
             ├─  4931 /usr/bin/perl /usr/share/openqa/script/worker --instance 5
             └─ 12227 /usr/bin/perl /usr/share/openqa/script/worker --instance 5

For some reason, instance 5 was started twice.
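A quick way to spot this on other instances is to count worker processes per `--instance` number. The following is only a sketch: the function name `find_duplicate_instances` is made up for illustration, and the script path is taken from the CGroup listing above. A second process with the same command line is not necessarily a fault, so treat a hit as a hint to look closer, not as proof.

```shell
#!/bin/sh
# Hypothetical helper: reads process command lines on stdin and prints every
# worker instance number that shows up more than once.
find_duplicate_instances() {
    # Keep only openQA worker command lines (path as seen in the CGroup output).
    grep -- '/usr/share/openqa/script/worker --instance ' \
        | sed -n 's/.*--instance \([0-9][0-9]*\).*/\1/p' \
        | sort -n \
        | uniq -d
}

# Example use on a live host (assumes ps from procps):
# ps -eo args | find_duplicate_instances
```

On the host above this would be expected to list instance 5, since PIDs 4931 and 12227 share the same command line.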

Actions #2

Updated by okurz about 1 month ago

  • Tags set to infra, reactive work
  • Category set to Regressions/Crashes
  • Status changed from New to In Progress
  • Assignee set to okurz
  • Target version set to Ready

I might have caused this as I found the service disabled initially. I will reboot the complete host to ensure consistency.
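To check after the reboot that no instance is left in a mixed state (e.g. running but disabled), something like the following could be used. It is a minimal sketch, not what was actually run here: `report_worker_units` and the `SYSTEMCTL` override are hypothetical names introduced for illustration, and the unit name pattern is taken from the `systemctl list-units` output in #note-1.

```shell
#!/bin/sh
# Sketch only: SYSTEMCTL is an assumed override hook so the function can be
# tried without systemd; it defaults to the real systemctl.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Print "unit enabled-state active-state" for worker instances 1..$1.
report_worker_units() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        unit="openqa-worker-auto-restart@$i.service"
        # is-enabled / is-active print the state and exit non-zero for
        # states such as "disabled" or "inactive", hence the "|| true".
        enabled=$("$SYSTEMCTL" is-enabled "$unit" 2>/dev/null || true)
        active=$("$SYSTEMCTL" is-active "$unit" 2>/dev/null || true)
        printf '%s %s %s\n' "$unit" "${enabled:-unknown}" "${active:-unknown}"
        i=$((i + 1))
    done
}

# Example: report_worker_units 16
```

Any line that does not read `enabled active` would be worth a second look.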

Actions #3

Updated by okurz about 1 month ago · Edited

@xlai, in future tickets please include the normal content from the "report issue" action in openQA for more context, e.g. links to the failing scenario.

Actions #4

Updated by okurz about 1 month ago

  • Status changed from In Progress to Resolved

Actions #5

Updated by xlai about 1 month ago

okurz wrote in #note-3:

@xlai, in future tickets please include the normal content from the "report issue" action in openQA for more context, e.g. links to the failing scenario.

Sure, thank you guys for the fix!
