action #55061

openqa-metrics.service failed on openqaworker-arm-2 since "Jul 24 17:07:08"

Added by okurz 7 months ago. Updated 7 months ago.

Status: Resolved
Priority: High
Assignee: nicksinger
Category: -
Target version: openQA Project - Current Sprint
Start date: 30/07/2019
Due date:
% Done: 0%
Duration:

Description

Observation

On openqaworker-arm-2, openqa-metrics.service has been failing since Jul 24 17:07:08. Output of journalctl -u openqa-metrics.service:

Jul 24 17:07:06 openqaworker-arm-2 systemd[1]: Starting Collect system and openqa metrics and send them to graphite server...
Jul 24 17:07:06 openqaworker-arm-2 sudo[87325]:     root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/bin/journalctl --since=-5min --no-pager
Jul 24 17:07:06 openqaworker-arm-2 sudo[87325]: pam_unix(sudo:session): session opened for user root by (uid=0)
Jul 24 17:07:07 openqaworker-arm-2 sudo[87336]:     root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/bin/coredumpctl --no-pager --no-legend
Jul 24 17:07:07 openqaworker-arm-2 sudo[87336]: pam_unix(sudo:session): session opened for user root by (uid=0)
Jul 24 17:07:08 openqaworker-arm-2 systemd[1]: openqa-metrics.service: Main process exited, code=exited, status=1/FAILURE
Jul 24 17:07:08 openqaworker-arm-2 systemd[1]: Failed to start Collect system and openqa metrics and send them to graphite server.
Jul 24 17:07:08 openqaworker-arm-2 systemd[1]: openqa-metrics.service: Unit entered failed state.
Jul 24 17:07:08 openqaworker-arm-2 systemd[1]: openqa-metrics.service: Failed with result 'exit-code'.
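
To pin down when a unit first started failing, the journal text can be filtered for systemd's "Main process exited" message. The following is only a minimal sketch, assuming the default short journal format shown in the excerpt above; the function name first_failure is hypothetical and not part of any openQA tooling.

```shell
#!/bin/sh
# Hypothetical helper (not part of openQA): read journal text in the default
# short format from stdin and print the timestamp (month, day, time) of the
# first "Main process exited" line logged for the given unit.
first_failure() {
    unit="$1"
    awk -v unit="$unit" '
        # index() returns a non-zero position when the substring is present
        index($0, unit ": Main process exited") {
            print $1, $2, $3
            exit
        }
    '
}

# Possible use on a live host (assumes journalctl is available there):
#   journalctl -u openqa-metrics.service --no-pager | first_failure openqa-metrics.service
```

The filter works on plain text, so it can also be run against a saved copy of the journal output.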

Related issues

Copied from openQA Project - action #54869: improve feedback in case job is incompleted due to too lo... (New, 30/07/2019)
Copied to openQA Infrastructure - action #55064: nscd.service failed on openqaworker-arm-2 (and other arm ... (Resolved, 30/07/2019, 23/10/2019)

History

#1 Updated by okurz 7 months ago

  • Copied from action #54869: improve feedback in case job is incompleted due to too long uploading (was: Test fails as incomplete most of the time, no clue what happens from the logs.) added

#2 Updated by okurz 7 months ago

  • Copied to action #55064: nscd.service failed on openqaworker-arm-2 (and other arm machines as well) added

#3 Updated by nicksinger 7 months ago

  • Status changed from New to Resolved

Based on https://gitlab.suse.de/openqa/salt-states-openqa/merge_requests/93/diffs#a20543c308ca03fc9cac519240c1919cf6ebd95d, I removed the service's unit file from this worker and cleared its failed state:

nsinger@openqaworker-arm-2:~> systemctl status openqa-metrics.service
● openqa-metrics.service - Collect system and openqa metrics and send them to graphite server
   Loaded: loaded (/etc/systemd/system/openqa-metrics.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2019-08-06 12:58:25 UTC; 1min 19s ago
  Process: 7769 ExecStart=/usr/local/share/get-metrics 10.86.0.11 (code=exited, status=1/FAILURE)
 Main PID: 7769 (code=exited, status=1/FAILURE)
nsinger@openqaworker-arm-2:~> sudo -i
openqaworker-arm-2:~ # cd /etc/systemd/system/
openqaworker-arm-2:/etc/systemd/system # rm /etc/systemd/system/openqa-metrics.service
openqaworker-arm-2:/etc/systemd/system # systemctl daemon-reload
openqaworker-arm-2:/etc/systemd/system # systemctl reset-failed
openqaworker-arm-2:/etc/systemd/system # systemctl --failed
0 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.
openqaworker-arm-2:/etc/systemd/system # 
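
The manual steps above could be wrapped in a small helper for repeating them on other workers. This is a sketch under assumptions, not the procedure from the linked merge request: the names remove_unit, _run and DRY_RUN are hypothetical, and by default the helper only prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical helpers (illustration only). _run prints its arguments when
# DRY_RUN is 1 (the default here) and executes them otherwise.
_run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remove a locally installed unit file, reload systemd, clear the failed
# state and list the remaining failed units, mirroring the transcript above.
remove_unit() {
    unit="$1"
    unit_dir="${2:-/etc/systemd/system}"
    _run rm "$unit_dir/$unit"
    _run systemctl daemon-reload
    _run systemctl reset-failed
    _run systemctl --failed
}

# Dry run: prints the four commands without touching the system.
remove_unit openqa-metrics.service
```

Setting DRY_RUN=0 as root would execute the commands for real; reviewing the dry-run output first helps avoid deleting the wrong file.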
