
action #91347

coordination #64746: [saga][epic] Scale up: Efficient handling of large storage to be able to run current tests efficiently but keep big archives of old results

coordination #80546: [epic] Scale up: Enable to store more results

[spike][timeboxed:18h] Support for archived jobs

Added by okurz 3 months ago. Updated 3 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Feature requests
Target version:
Start date:
2021-04-19
Due date:
% Done:

0%

Estimated time:
Difficulty:

Description

Motivation

See ideas in parent #80546

Suggestions

  • add an "archived" flag and check what would be needed to read job results based on it
  • move one test job to a different path, e.g. /archive/…/$job
  • if archived, read the result directory with the configured archive path prefix instead of the normal resultdir prefix
  • add a minion job to archive a job; triggered on what conditions?
  • add feedback in the UI: "Archived job: Loading can take longer than usual"
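
The suggested lookup can be sketched as follows. This is a minimal illustration, assuming an "archived" flag on the job and a configured archive prefix; the paths and the `Job` class are assumptions, not openQA's actual API.

```python
from dataclasses import dataclass
from pathlib import Path

RESULTS_PREFIX = Path("/var/lib/openqa/testresults")  # assumed normal prefix
ARCHIVE_PREFIX = Path("/var/lib/openqa/archive")      # assumed archive prefix

@dataclass
class Job:
    id: int
    archived: bool = False

def result_dir(job: Job) -> Path:
    """Resolve the result directory based on the archived flag."""
    prefix = ARCHIVE_PREFIX if job.archived else RESULTS_PREFIX
    return prefix / f"{job.id:08d}"

print(result_dir(Job(91347)))                 # -> /var/lib/openqa/testresults/00091347
print(result_dir(Job(91347, archived=True)))  # -> /var/lib/openqa/archive/00091347
```

The point of keeping the decision in one helper is that everything reading job results goes through a single place that honors the flag.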

Open questions to answer

  • which jobs to archive? Maybe a boolean config: if "archive" is set, move important jobs to the archive when they expire instead of deleting them; or treat all jobs the same
  • when/how to trigger the archiving minion job?
  • is it performant enough to have a single minion job for the whole run, one minion job per archived job, or some batching in between?
  • How can admins and users control the archiving decisions?
  • Is there a need to unarchive (move back to non-archived results)?
  • How to monitor and cleanup from the archive?
  • How does the current cleanup handle archived results: does it delete them, or fail?
  • Should we consider "archived" == "important" after some grace period when an important job is moved to a potentially slower archive? That is, when we trigger the cleanup for non-important jobs, we would also look at the important jobs and make sure they are archived or queued for archiving (if archiving is a more costly process)

History

#1 Updated by okurz 3 months ago

  • Tracker changed from coordination to action

#2 Updated by mkittler 3 months ago

  • Assignee set to mkittler

#3 Updated by openqa_review 3 months ago

  • Due date set to 2021-05-04

Setting due date based on mean cycle time of SUSE QE Tools

#4 Updated by mkittler 3 months ago

  • Status changed from Workable to In Progress

Draft implementing some of the suggestions: https://github.com/os-autoinst/openQA/pull/3858

#5 Updated by mkittler 3 months ago

  • Status changed from In Progress to Feedback

which jobs to archive? Maybe a boolean config: if "archive" is set, move important jobs to the archive when they expire instead of deleting them; or treat all jobs the same
when/how to trigger the archiving minion job?
Should we consider "archived" == "important" after some grace period when an important job is moved to a potentially slower archive? That is, when we trigger the cleanup for non-important jobs, we would also look at the important jobs and make sure they are archived or queued for archiving (if archiving is a more costly process)

The draft PR now triggers the archiving during cleanup. It archives logs of important jobs which are only preserved because the job is considered important; this should be what the last question asked for. Introducing the archiving feature this way has the advantage that we don't need to introduce new retention periods. The disadvantage is of course that only important jobs benefit from the archiving. However, I suppose we can still improve that later, so it is still a good start.
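
A sketch of that cleanup pass, assuming simplified job attributes: expired non-important jobs are deleted as before, while important jobs whose logs would otherwise only survive due to their importance get archived instead. All names here are hypothetical, not the PR's actual code.

```python
from dataclasses import dataclass

@dataclass
class Job:
    id: int
    important: bool
    expired: bool          # normal retention period is over
    archived: bool = False

def cleanup(jobs):
    """Return (archived ids, deleted ids) after one cleanup pass."""
    archived, deleted = [], []
    for job in jobs:
        if not job.expired:
            continue  # still within the normal retention period
        if job.important:
            if not job.archived:
                job.archived = True  # would move logs to the archive here
                archived.append(job.id)
        else:
            deleted.append(job.id)  # normal cleanup path
    return archived, deleted

jobs = [Job(1, important=True, expired=True),
        Job(2, important=False, expired=True),
        Job(3, important=False, expired=False)]
print(cleanup(jobs))  # -> ([1], [2])
```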


is it performant enough to have a single minion job for the whole run, one minion job per archived job, or some batching in between?

The cleanup is already one long Minion job itself, so I've split the archiving up. I assume archiving can take a while if lots of jobs are considered at the same time; I suppose it all depends on the I/O performance.
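
One way to frame the batching question from above: instead of a single task covering every job or one task per job, enqueue one archiving task per fixed-size batch. The batch size is a made-up tuning knob for illustration.

```python
def batches(job_ids, size):
    """Split a list of job ids into chunks of at most `size` elements."""
    for i in range(0, len(job_ids), size):
        yield job_ids[i:i + size]

# Each chunk would become one enqueued archiving task.
tasks = [batch for batch in batches(list(range(1, 11)), 4)]
print(tasks)  # -> [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
```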


Is there a need to unarchive (move back to non-archived results)?

Not covered so far, but it wouldn't be hard to implement the reverse operation, and an admin could trigger it by enqueuing a Minion job manually on the command line.
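
A hedged sketch of what that reverse operation could look like: move the result directory back out of the archive and clear the flag. The directory layout, the `Job` class, and the function name are assumptions for illustration only.

```python
import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path

@dataclass
class Job:
    id: int
    archived: bool = True

def unarchive(job: Job, archive_prefix: Path, results_prefix: Path) -> Path:
    """Move the job's result directory back and clear the archived flag."""
    src = archive_prefix / f"{job.id:08d}"
    dst = results_prefix / f"{job.id:08d}"
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    job.archived = False
    return dst

# Demonstration on a throwaway directory tree:
root = Path(tempfile.mkdtemp())
(root / "archive" / "00091347").mkdir(parents=True)
job = Job(91347)
restored = unarchive(job, root / "archive", root / "testresults")
```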


How to monitor and cleanup from the archive?
How does the current cleanup encounter archived results, delete them? fail?

I suppose we should add file system checks for the monitoring host in the same way we have them for OSD.
Since I've implemented archiving as an intermediate step of the cleanup (see the answer to the first question), the archive is cleaned up as part of the usual cleanup once the important job expires completely.


Screenshots haven't been considered at all because they're shared between jobs, which would complicate things. Of course we could still consider them in the future.

#6 Updated by okurz 3 months ago

  • Status changed from Feedback to Resolved

Perfect. I created three new stories #91785, #91782, #91779 so I think we are good here. Thank you, perfect work! :)

#7 Updated by okurz 3 months ago

  • Due date deleted (2021-05-04)
