action #88546
closedopenQA Project - coordination #64746: [saga][epic] Scale up: Efficient handling of large storage to be able to run current tests efficiently but keep big archives of old results
openQA Project - coordination #80546: [epic] Scale up: Enable to store more results
Make use of the new "Storage Server", e.g. complete OSD backup
Description
Acceptance criteria
- AC1: The SUSE QA storage server is used within our production and we (the team) know what it is used for
Suggestions
- Ask nsinger where to connect to, what steps to start with? The hostname is storage.qa.suse.de
- Try to connect to storage.qa with the same credentials as for osd machines maintained with salt, e.g. just ssh with your user should work. SSH access should work at this point
- Add changes and make sure the changes are in salt
- When populating the btrfs filesystem storage.qa.suse.de:/storage it would make sense to create dedicated subvolumes for different things
- e.g. do a full or partial backup of OSD
- e.g. mount storage.qa.suse.de:/storage on OSD and configure the archiving feature to use it
- What is included in a complete OSD backup?: To be answered by #96269
- Also include postgres? okurz: No, to be covered by #94015
- Which backup solution to use, e.g. rsnapshot?: okurz: Yes, use rsnapshot, same as we currently do on backup.qa.suse.de already
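As an illustration of the rsnapshot suggestion, here is a minimal sketch that renders a fragment of an rsnapshot.conf. The directives `config_version`, `snapshot_root`, `retain` and `backup` are standard rsnapshot ones and fields must be separated by tabs. The snapshot root, the retention count and the source host `osd.example.org` are assumptions made up for this example, not values taken from backup.qa.suse.de. A real configuration needs further directives (for example `cmd_rsync` and `cmd_ssh`), so this is not a complete file.

```python
def rsnapshot_conf(snapshot_root, retain, backups):
    """Render a fragment of rsnapshot.conf; rsnapshot requires tabs between fields."""
    lines = ["config_version\t1.2", f"snapshot_root\t{snapshot_root}"]
    # One "retain" line per backup level, e.g. ("daily", 7).
    lines += [f"retain\t{name}\t{count}" for name, count in retain]
    # One "backup" line per (source, destination directory) pair.
    lines += [f"backup\t{source}\t{dest}" for source, dest in backups]
    return "\n".join(lines) + "\n"


# Hypothetical values: path, retention and host are placeholders.
print(rsnapshot_conf(
    "/storage/rsnapshot/",
    [("daily", 7)],
    [("root@osd.example.org:/etc/", "osd.example.org/")],
))
```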
Further details
If we conduct a "complete OSD backup" we can also learn about the performance impact, e.g. how long the initial synchronization takes and how long individual syncs, e.g. daily ones, take
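One simple way to collect such numbers is to wrap the sync command in a timer. The following is only a sketch: the helper name `timed_run` is made up here, and the rsync invocation in the comment uses placeholder paths, not the actual OSD or storage.qa.suse.de layout.

```python
import subprocess
import time


def timed_run(cmd):
    """Run cmd (a list of arguments) and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    result = subprocess.run(cmd, check=False)
    return result.returncode, time.monotonic() - start


# Hypothetical usage, comparing the initial run with later daily runs:
#   code, seconds = timed_run(["rsync", "-a", "/etc/", "/tmp/etc-backup/"])
#   print(f"exit code {code}, took {seconds:.1f} s")
```

Running the same command once for the initial sync and again on following days would give the two durations asked about above.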
Updated by okurz over 3 years ago
- Copied from action #69577: Handle installation of the new "Storage Server" added
Updated by livdywan over 3 years ago
Could we add some suggestions here and make it Workable? Like where to connect to, what steps to start with?
Updated by okurz over 3 years ago
- Description updated (diff)
- Status changed from New to Workable
Yes, we should. I can't do much on that on my own though. nsinger knows more
Updated by mkittler over 3 years ago
Now with the archiving feature enabled one could try to mount storage.qa.suse.de:/storage
on OSD and configure the archiving feature to use it.
Updated by mkittler over 3 years ago
- Related to action #92701: backup of etc/ from both o3 was not working since some days due to OOM on backup.qa.suse.de (was: … and osd not updated anymore since 2019) added
Updated by mkittler over 3 years ago
- Description updated (diff)
I've been updating the ticket description:
- There's an overlap between this ticket and #92701. I suppose if we opt for the full backup of OSD here we wouldn't need #92701 anymore. It also leads to the idea of only backing up /etc (and maybe some other important directories) first.
- It looks like /storage on storage.qa.suse.de is using btrfs. That makes sense and I suppose if we populate it with various things, e.g. an archive or backups, we should create a separate subvolume for each of these.
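A minimal sketch of what creating such subvolumes could look like, using the standard `btrfs subvolume create` command. The subvolume names `osd-backup` and `osd-archive` are hypothetical; nothing in this ticket prescribes them. The code only prints the commands so they can be reviewed (or put into salt) rather than executed directly.

```python
import shlex


def subvolume_commands(root, names):
    """Return one 'btrfs subvolume create' argument list per subvolume name."""
    base = root.rstrip("/")
    return [["btrfs", "subvolume", "create", f"{base}/{name}"] for name in names]


# Hypothetical subvolume names, one per purpose.
for cmd in subvolume_commands("/storage", ["osd-backup", "osd-archive"]):
    print(shlex.join(cmd))
```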
Updated by mkittler over 3 years ago
- Status changed from Workable to In Progress
- Assignee set to mkittler
Now that we have the archiving feature, enabling it is likely the easiest use of the storage server, so I'll start with that.
Updated by openqa_review over 3 years ago
- Due date set to 2021-06-16
Setting due date based on mean cycle time of SUSE QE Tools
Updated by mkittler over 3 years ago
Updated by okurz over 3 years ago
We need to rethink. With https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/501 storage.qa.suse.de would become a critical component. So far what we have is a central VM that is backed by multiple levels of high-grade redundancy, which we can see confirmed in years of flawless availability. All the workers combined also provide built-in redundancy with our scheduling algorithm. The physical machine storage.qa.suse.de, in contrast, is a single point of failure, as visible in #93683. I still see using storage.qa.suse.de as a backup target as a good first approach, and I recommend doing that first. For the archiving feature we can still make use of that immediately, as on osd we have expensive+fast and cheap+slow storage to be used.
EDIT: Because we talked about whether rsnapshot supports btrfs snapshots: the Gentoo wiki mentions http://web.archive.org/web/20190910001551/http://it.werther-web.de/2011/10/23/migrate-rsnapshot-based-backup-to-btrfs-snapshots/ (the original page yields 404)
Updated by mkittler over 3 years ago
- Status changed from In Progress to Workable
Updated by mkittler over 3 years ago
- Assignee deleted (mkittler)
I haven't progressed here since we decided to focus on #92701 first. I'm unassigning because I won't be able to work on this until next Tuesday.
Updated by okurz over 3 years ago
- Status changed from Workable to New
Moving all tickets without size confirmation by the team back to "New". The team should move the tickets back after estimating and agreeing on a consistent size.
Updated by okurz over 3 years ago
- Blocks action #92788: Use openQA archiving feature on osd size:S added
Updated by ilausuch over 3 years ago
- Description updated (diff)
We need to answer the last two questions in the Suggestions section before making this Workable.
Updated by okurz over 3 years ago
- Copied to action #96269: Define what a "complete OSD backup" should or can include added
Updated by okurz over 3 years ago
- Related to action #44078: Implement proper backups for o3 size:M added
Updated by okurz over 3 years ago
- Status changed from New to Blocked
- Assignee set to okurz
#44078 first
Updated by okurz almost 3 years ago
- Status changed from Blocked to Resolved
With #44078 completed we make active use of the storage space on storage.qa.suse.de and also that host is fully controlled with salt and actively monitored. Team agreed that we have AC1 covered :)
Updated by okurz almost 2 years ago
- Related to action #121282: Recover storage.qa.suse.de size:S added