action #88546

openQA Project - coordination #64746: [saga][epic] Scale up: Efficient handling of large storage to be able to run current tests efficiently but keep big archives of old results

openQA Project - coordination #80546: [epic] Scale up: Enable to store more results

Make use of the new "Storage Server", e.g. complete OSD backup

Added by okurz 5 months ago. Updated 19 days ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
Start date:
Due date:
% Done:

0%

Estimated time:

Description

Acceptance criteria

  • AC1: The SUSE QA storage server is used within our production and we (the team) know what it is used for

Suggestions

  • Ask nsinger where to connect to and what steps to start with. The hostname is storage.qa.suse.de
  • Try to connect to storage.qa with the same credentials as for OSD machines maintained with salt; just SSH with your user should work at this point
  • Add changes and make sure the changes are in salt
    • When populating the btrfs filesystem storage.qa.suse.de:/storage it would make sense to create dedicated subvolumes for different things
      • e.g. do a full or partial backup of OSD
      • e.g. mount storage.qa.suse.de:/storage on OSD and configure the archiving feature to use it
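
The subvolume suggestion above could be sketched as follows. This is only a sketch: the subvolume names backup-osd and archive are assumptions, not an agreed layout. With DRY_RUN=1 (the default) it only prints the commands instead of running them:

```shell
# Sketch: create dedicated btrfs subvolumes under /storage on
# storage.qa.suse.de. Subvolume names are assumptions.
DRY_RUN="${DRY_RUN:-1}"
planned=""
for subvol in backup-osd archive; do
  cmd="btrfs subvolume create /storage/$subvol"
  planned="$planned$cmd
"
  if [ "$DRY_RUN" = 1 ]; then
    echo "would run: $cmd"
  else
    # requires root and an actual btrfs filesystem at /storage
    $cmd
  fi
done
```

Dedicated subvolumes keep the backup and the archive independently snapshottable and deletable.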

Further details

By conducting a "complete OSD backup" we can also learn about the performance impact, e.g. how long the initial synchronization takes and how long individual (e.g. daily) syncs take.
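
For a first impression of that performance impact, simply timing each sync is enough. A minimal sketch, assuming rsync over SSH is used for the backup; the commented-out source and destination paths are hypothetical:

```shell
# Sketch: time an rsync-based sync so the initial run can be compared
# with later incremental runs.
measure_sync() {
  start=$(date +%s)
  # -a: archive mode, -H: preserve hardlinks, --delete: mirror deletions
  rsync -aH --delete "$1" "$2"
  end=$(date +%s)
  echo "sync of $1 took $((end - start))s"
}

# e.g. on the storage server (hypothetical paths):
# measure_sync root@openqa.suse.de:/etc/ /storage/backup-osd/etc/
```

Running the same function daily and logging its output would directly answer the "initial vs. daily sync" question.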


Related issues

Related to openQA Infrastructure - action #92701: backup of etc/ from both o3 was not working for some days due to OOM on backup.qa.suse.de (was: … and osd not updated anymore since 2019) (Resolved, 2021-05-14 to 2021-06-30)

Blocks openQA Project - action #92788: Use openQA archiving feature on osd size:S (Blocked)

Copied from openQA Infrastructure - action #69577: Handle installation of the new "Storage Server" (Resolved, 2020-08-04)

History

#1 Updated by okurz 5 months ago

  • Copied from action #69577: Handle installation of the new "Storage Server" added

#2 Updated by cdywan 4 months ago

Could we add some suggestions here and make it Workable? Like where to connect to, what steps to start with?

#3 Updated by okurz 4 months ago

  • Description updated (diff)
  • Status changed from New to Workable

Yes, we should. I can't do much on that on my own, though. nsinger knows more.

#4 Updated by okurz 3 months ago

  • Description updated (diff)

#5 Updated by okurz 3 months ago

  • Description updated (diff)

#6 Updated by okurz 3 months ago

  • Description updated (diff)

#7 Updated by mkittler about 2 months ago

Now with the archiving feature enabled one could try to mount storage.qa.suse.de:/storage on OSD and configure the archiving feature to use it.
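
A minimal sketch of such a mount, assuming /storage is exported via NFS and that the archive ends up in the default openQA archive directory /var/lib/openqa/archive (both are assumptions, not confirmed in this ticket):

```
# /etc/fstab entry on OSD (hypothetical; assumes an NFS export of /storage
# with a dedicated "archive" subvolume/subdirectory)
storage.qa.suse.de:/storage/archive  /var/lib/openqa/archive  nfs  defaults,_netdev  0 0
```

The _netdev option makes sure the mount is only attempted once the network is up.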

#8 Updated by mkittler about 2 months ago

  • Related to action #92701: backup of etc/ from both o3 was not working since some days due to OOM on backup.qa.suse.de (was: … and osd not updated anymore since 2019) added

#9 Updated by mkittler about 2 months ago

  • Description updated (diff)

I've been updating the ticket description:

  • There's an overlap between this ticket and #92701. I suppose if we opt for the full backup of OSD here we wouldn't need #92701 anymore. It also leads to the idea of only backing up /etc (and maybe some other important directories) first.
  • It looks like /storage on storage.qa.suse.de is using btrfs. That makes sense, and if we populate it with various things, e.g. an archive or backups, we should create a dedicated subvolume for each of them.

#10 Updated by mkittler about 2 months ago

  • Status changed from Workable to In Progress
  • Assignee set to mkittler

Now that we have the archiving feature, enabling it is likely the easiest use of the storage server, so I'll start with that.

#11 Updated by openqa_review about 2 months ago

  • Due date set to 2021-06-16

Setting due date based on mean cycle time of SUSE QE Tools

#13 Updated by okurz about 2 months ago

Before we accept the MR we should do #91779 first. Also see this morning's problem with storage.qa: #93683

#14 Updated by okurz about 2 months ago

We need to rethink. With https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/501 storage.qa.suse.de would become a critical component. So far what we have is a central VM backed by multiple levels of high-grade redundancy, confirmed by years of flawless availability, and all the workers combined also provide built-in redundancy through our scheduling algorithm. The physical machine storage.qa.suse.de, however, is a single point of failure, as visible in #93683. I still see using storage.qa.suse.de as a backup target as a good first approach, and I recommend doing that first. For the archiving feature we can still make use of it immediately, as on OSD we have both expensive+fast and cheap+slow storage to be used.

EDIT: Because we talked about whether rsnapshot supports btrfs snapshots: the Gentoo wiki mentions http://web.archive.org/web/20190910001551/http://it.werther-web.de/2011/10/23/migrate-rsnapshot-based-backup-to-btrfs-snapshots/ (the original page yields a 404)
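
For reference, a classical (non-btrfs) rsnapshot setup for such a backup could look like the fragment below. The snapshot root, retention values and backup source are assumptions; note that rsnapshot requires TAB characters, not spaces, between config fields:

```
# /etc/rsnapshot.conf fragment (fields must be TAB-separated)
snapshot_root	/storage/backup-osd/
retain	daily	7
retain	weekly	4
# start small, e.g. only /etc, before attempting a full OSD backup
backup	root@openqa.suse.de:/etc/	osd/
```

The approach from the linked article would replace rsnapshot's hardlink-based rotation with btrfs snapshots of the snapshot_root.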

#15 Updated by mkittler about 1 month ago

  • Status changed from In Progress to Workable

#16 Updated by mkittler about 1 month ago

  • Assignee deleted (mkittler)

I haven't progressed here since we decided to focus on #92701 first. I'm unassigning because I won't be able to work on this until next Tuesday.

#17 Updated by okurz 26 days ago

  • Due date deleted (2021-06-16)

#18 Updated by okurz 19 days ago

  • Status changed from Workable to New

Moving all tickets without size confirmation by the team back to "New". The team should move the tickets back after estimating and agreeing on a consistent size.

#19 Updated by okurz 9 days ago

  • Blocks action #92788: Use openQA archiving feature on osd size:S added
