action #47858

closed

[u] test fails in first_boot - pflash overlay deleted causing: mkdir vm-snapshots: Structure needs cleaning

Added by szarate about 5 years ago. Updated about 5 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: -
Target version: SUSE QA - Milestone 24
Start date: 2019-02-13
Due date:
% Done: 0%
Estimated time:

Description

Observation

Some jobs end up incomplete, apparently because of a problem related to the pflash overlays:

[2019-02-13T12:33:17.767 UTC] [debug] Saving snapshot (Current VM state is running).
[2019-02-13T12:33:17.818 UTC] [debug] EVENT {"event":"STOP","timestamp":{"microseconds":818501,"seconds":1550061197}}
[2019-02-13T12:33:17.828 UTC] [debug] blockdev-snapshot-sync({'arguments' => {'format' => 'qcow2','node-name' => 'hd0','snapshot-file' => '/var/lib/openqa/pool/16/raid/hd0-overlay1','snapshot-node-name' => 'hd0-overlay1'},'execute' => 'blockdev-snapshot-sync'}) -> {'return' => {}}
[2019-02-13T12:33:17.837 UTC] [debug] blockdev-snapshot-sync({'arguments' => {'format' => 'qcow2','node-name' => 'cd0-overlay0','snapshot-file' => '/var/lib/openqa/pool/16/raid/cd0-overlay1','snapshot-node-name' => 'cd0-overlay1'},'execute' => 'blockdev-snapshot-sync'}) -> {'return' => {}}
[2019-02-13T12:33:17.842 UTC] [debug] blockdev-snapshot-sync({'arguments' => {'format' => 'qcow2','node-name' => 'pflash-code-overlay0','snapshot-file' => '/var/lib/openqa/pool/16/raid/pflash-code-overlay1','snapshot-node-name' => 'pflash-code-overlay1'},'execute' => 'blockdev-snapshot-sync'}) -> {'error' => {'class' => 'GenericError','desc' => 'Cannot find device= nor node_name=pflash-code-overlay0'}}
[2019-02-13T12:33:17.851 UTC] [debug] blockdev-snapshot-sync({'arguments' => {'device' => 'pflash-code-overlay0','format' => 'qcow2','snapshot-file' => '/var/lib/openqa/pool/16/raid/pflash-code-overlay1','snapshot-node-name' => 'pflash-code-overlay1'},'execute' => 'blockdev-snapshot-sync'}) -> {'return' => {}}
[2019-02-13T12:33:17.856 UTC] [debug] blockdev-snapshot-sync({'arguments' => {'format' => 'qcow2','node-name' => 'pflash-vars-overlay0','snapshot-file' => '/var/lib/openqa/pool/16/raid/pflash-vars-overlay1','snapshot-node-name' => 'pflash-vars-overlay1'},'execute' => 'blockdev-snapshot-sync'}) -> {'error' => {'class' => 'GenericError','desc' => 'Cannot find device= nor node_name=pflash-vars-overlay0'}}
[2019-02-13T12:33:17.865 UTC] [debug] blockdev-snapshot-sync({'arguments' => {'device' => 'pflash-vars-overlay0','format' => 'qcow2','snapshot-file' => '/var/lib/openqa/pool/16/raid/pflash-vars-overlay1','snapshot-node-name' => 'pflash-vars-overlay1'},'execute' => 'blockdev-snapshot-sync'}) -> {'return' => {}}
[2019-02-13T12:33:17.867 UTC] [debug] Backend process died, backend errors are reported below in the following lines:
mkdir vm-snapshots: Structure needs cleaning at /usr/lib/os-autoinst/backend/qemu.pm line 413.
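
Reading the log above, the two 'GenericError' replies for the pflash overlays are each followed by a successful retry that passes 'device' instead of 'node-name', so they do not look like the fatal part; the backend only dies at the mkdir. A minimal sketch of that retry pattern, for illustration only: the `send` callable and the function name are hypothetical stand-ins, not the os-autoinst implementation.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: takes a QMP command dict, returns the reply dict.
QmpSend = Callable[[Dict[str, Any]], Dict[str, Any]]


def snapshot_with_fallback(send: QmpSend, node: str,
                           snapshot_file: str, snapshot_node: str) -> str:
    """Issue blockdev-snapshot-sync by 'node-name', then retry by 'device'.

    Returns the argument key that succeeded, mirroring the order of
    attempts visible in the log excerpt.
    """
    base = {
        "format": "qcow2",
        "snapshot-file": snapshot_file,
        "snapshot-node-name": snapshot_node,
    }
    last_error = None
    for key in ("node-name", "device"):
        reply = send({"execute": "blockdev-snapshot-sync",
                      "arguments": {**base, key: node}})
        if "error" not in reply:
            return key
        last_error = reply["error"]
    raise RuntimeError(f"snapshot of {node} failed: {last_error}")
```

With a transport that rejects 'node-name' for the pflash overlays, the function would report 'device' for them and 'node-name' for hd0, which matches the sequence of calls in the log.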

Reproducible

It's sporadic:


Related issues 1 (0 open, 1 closed)

Has duplicate: openQA Infrastructure - action #49583: [arm][fs] cannot access '/var/lib/openqa/pool/16/vm-snapshots': Structure needs cleaning on openqaworker-arm-1 (Resolved, nicksinger, 2019-03-22)

Actions #1

Updated by szarate about 5 years ago

  • Assignee set to rpalethorpe

I wonder if Richie can shed some light here

Actions #2

Updated by rpalethorpe about 5 years ago

  • Status changed from New to In Progress
  • Assignee changed from rpalethorpe to szarate

The inner error message appears to be from the file system:
https://unix.stackexchange.com/questions/330742/cannot-remove-file-structure-needs-cleaning#330767

So the directory or its inode may be getting corrupted.

Looks like the worker has this problem on every job, but other workers don't. It always happens with the pflash overlay, but that might just be because it is always the first overlay QEMU tries to access. So it is probably an issue with the worker's file system.
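
"Structure needs cleaning" is the message Linux attaches to the EUCLEAN errno, which file systems return when they detect on-disk corruption. A rough sketch of how one could probe a pool directory for this condition, assuming Linux; the function names are made up for illustration and the numeric fallback 117 is the value on common Linux architectures.

```python
import errno
import os

# EUCLEAN is Linux-specific; fall back to 117 (its value on common Linux
# architectures) where the errno module does not define it.
EUCLEAN = getattr(errno, "EUCLEAN", 117)


def is_structure_needs_cleaning(exc: OSError) -> bool:
    """True if the OSError carries the 'Structure needs cleaning' errno."""
    return exc.errno == EUCLEAN


def probe_mkdir(path: str) -> str:
    """Try to create `path` the way the backend creates vm-snapshots.

    Returns "ok" on success and "corrupt" if the file system reports
    EUCLEAN; any other error is re-raised unchanged.
    """
    try:
        os.makedirs(path, exist_ok=True)
    except OSError as exc:
        if is_structure_needs_cleaning(exc):
            return "corrupt"
        raise
    return "ok"
```

A "corrupt" result points at the worker's file system rather than at QEMU or the test, which would call for a file system check on that worker.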

Actions #3

Updated by szarate about 5 years ago

  • Project changed from openQA Project to openQA Infrastructure
  • Subject changed from test fails in first_boot - pflash overlay deleted causing: mkdir vm-snapshots: Structure needs cleaning to [u] test fails in first_boot - pflash overlay deleted causing: mkdir vm-snapshots: Structure needs cleaning
  • Status changed from In Progress to Workable
  • Assignee deleted (szarate)

Moving to infrastructure for the time being, adding a tag to keep it under radar :). I guess it will become annoying in the near future.

Actions #4

Updated by okurz about 5 years ago

  • Target version set to Milestone 24

"under radar"? Wouldn't that imply we don't see it? ;)

Actions #5

Updated by okurz about 5 years ago

  • Status changed from Workable to Rejected
  • Assignee set to okurz

In the same scenario I have seen incompletes in two consecutive jobs but none later: https://openqa.suse.de/tests/latest?distri=sle&arch=aarch64&flavor=Installer-DVD&machine=aarch64&version=15-SP1&test=allmodules%2Ballpatterns#next_previous . We already have many more jobs after that which look ok again, so I assume for now that we do not need to do anything. I am also not aware of this error appearing anywhere else, so I think we are good to call it "Rejected" for now: the problem has not appeared again so far, even though we did not change anything. Please reopen if you see it again. @rpalethorpe thanks for the investigation help and the helpful explanation.

Actions #6

Updated by okurz about 5 years ago

  • Status changed from Rejected to Workable
  • Assignee deleted (okurz)

Actions #7

Updated by SLindoMansilla about 5 years ago

  • Has duplicate action #49583: [arm][fs] cannot access '/var/lib/openqa/pool/16/vm-snapshots': Structure needs cleaning on openqaworker-arm-1 added

Actions #8

Updated by SLindoMansilla about 5 years ago

  • Status changed from Workable to Rejected

Resolved in #49583
