action #56447
closed
openqaworker-arm-2 is out-of-space on / was: openQA on osd fails with empty logs
0%
Description
For sle12sp5 on arm, some jobs are failing with empty logs in the latest build, Build0303:
https://openqa.suse.de/tests/3323071
https://openqa.suse.de/tests/3323085
https://openqa.suse.de/tests/3323086
https://openqa.suse.de/tests/3323089
https://openqa.suse.de/tests/3323090
Updated by JERiveraMoya about 5 years ago
- Copied from action #54902: openQA on osd fails at "incomplete" status when uploading, "502 response: Proxy Error" added
Updated by JERiveraMoya about 5 years ago
- Copied from deleted (action #54902: openQA on osd fails at "incomplete" status when uploading, "502 response: Proxy Error")
Updated by okurz about 5 years ago
- Related to action #55328: job is considered incomplete by openQA but worker still pushes updates so that "job is not considered dead" added
Updated by okurz about 5 years ago
- Status changed from New to Workable
- Assignee set to kraih
- Priority changed from Normal to High
- Target version set to Current Sprint
As this is very likely caused by https://github.com/os-autoinst/openQA/pull/2270, assigning to you to crosscheck.
Updated by okurz about 5 years ago
- Status changed from Workable to New
- Assignee deleted (kraih)
- Target version deleted (Current Sprint)
Sorry, I was all wrong: the PR is not yet deployed.
Updated by okurz about 5 years ago
- Subject changed from openQA on osd fails with empty logs to openqaworker-arm-2 is out-of-space on / was: openQA on osd fails with empty logs
- Status changed from New to In Progress
- Assignee set to okurz
- Priority changed from High to Urgent
- Target version set to Current Sprint
Updated by okurz about 5 years ago
- Related to action #41882: all arm worker die after some time added
Updated by okurz about 5 years ago
- Related to action #54128: [tools] openqaworker-arm-3 is broken added
Updated by okurz about 5 years ago
I stopped salt-minion and openqa-worker.target. It looks like /var/lib/openqa/pool is on the same partition as /. I don't know what changed or how it looked before. Probably pool should be on NVMe as well. According to systemctl cat openqa_nvme_prepare.service, the service creates the pool but does not seem to do anything with it. This looks similar to #53261, only about "pool", not "cache". Could it be that we deleted the "pool" symlink by mistake and should use a bind mount as well? Probably to be done properly with salt.
- change to bind mount for all
- add that to salt
- apply the same to openqaworker-arm-3 and monitor all three
The first two are done with https://gitlab.suse.de/openqa/salt-states-openqa/merge_requests/160
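For the monitoring part, a minimal sketch of how one could check whether the pool directory still shares a filesystem with / and how full that filesystem is. This is not what the salt states or openqa_nvme_prepare.service do. The function names and the 90% threshold are made up for illustration, and only the path /var/lib/openqa/pool comes from this ticket. Caveat: the comparison relies on st_dev, which can differ between btrfs subvolumes of the same device, so a "different filesystem" result does not prove that the pool is on NVMe.

```python
import os
import shutil
from typing import List


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both paths report the same device id (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def used_fraction(path: str) -> float:
    """Fraction (0.0 to 1.0) of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo filesystems report a total size of zero.
        return 0.0
    return usage.used / usage.total


def check_pool(pool: str = "/var/lib/openqa/pool",
               root: str = "/",
               limit: float = 0.9) -> List[str]:
    """Collect warnings about the pool; an empty list means nothing suspicious.

    `limit` is a hypothetical threshold, not a value taken from the ticket.
    """
    warnings = []
    if same_filesystem(pool, root):
        warnings.append(f"{pool} shares a filesystem with {root}")
    if used_fraction(pool) > limit:
        warnings.append(f"filesystem of {pool} is more than {limit:.0%} full")
    return warnings


if __name__ == "__main__":
    for message in check_pool():
        print("WARNING:", message)
```

Run on a worker host, this prints one warning per problem found and nothing if the pool is on its own filesystem with enough free space.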
Updated by okurz about 5 years ago
Today we have some incompletes on aarch64, but it seems they occur only on openqaworker-arm-2. I disabled the worker target on the host and will retrigger the incompletes. They should be picked up by openqaworker-arm-1. See e.g. https://openqa.suse.de/tests/3326179 from https://openqa.suse.de/tests/?&resultfilter=Incomplete
Updated by okurz about 5 years ago
- Status changed from In Progress to Resolved
All problems are resolved. The NVMe preparation is now done as provided in salt, and a workaround for nscd is applied, see https://gitlab.suse.de/openqa/salt-states-openqa/merge_requests/162 . The worker was able to successfully test build 0307 of SLES12SP5, so we should be good.